| Title: | Dynamic Modeling and Machine Learning Environment |
|---|---|
| Description: | Estimates, predicts and forecasts dynamic models and computes Machine Learning metrics that assist in model selection for further analysis. The package also provides tools and metrics that are useful in machine learning and modeling, for example a quick summary, a percent sign formatter and Mallows's Cp. The package is aimed at the analysis of economic data for national development. It is so far stable, reliable, efficient and time-saving. |
| Authors: | Job Nmadu [aut, cre] (ORCID: <https://orcid.org/0000-0002-1320-8957>) |
| Maintainer: | Job Nmadu <[email protected]> |
| License: | GPL (>= 3) + file LICENSE |
| Version: | 11.11.26 |
| Built: | 2026-06-04 07:02:37 UTC |
| Source: | https://github.com/JobNmadu/Dyn4cast |
This function estimates the lower and upper 80% and 95% forecasts of the Model. The final values are within the lower and upper limits of the base data. It is used in conjunction with the <scaled_logit> and <inv_scaled_logit> functions, which are adapted from Hyndman & Athanasopoulos (2021) and modified for independent use rather than being restricted to a particular package.
constrainedforecast(model10, lower, upper)
model10 |
This is the exponential values from the |
lower |
The lower limit of the forecast |
upper |
The upper limit of the forecast |
A list of forecast values within the 80% and 95% confidence bands. The values are:
Lower 80% |
Forecast at lower 80% confidence level. |
Upper 80% |
Forecast at upper 80% confidence level. |
Lower 95% |
Forecast at lower 95% confidence level. |
Upper 95% |
Forecast at upper 95% confidence level. |
    library(Dyn4cast)
    library(splines)
    library(forecast)
    library(readr)
    lower <- 1
    upper <- 37
    Model <- lm(states ~ bs(sequence, knots = c(30, 115)), data = Data)
    FitModel <- scaledlogit(x2 = fitted.values(Model), lower = lower, upper = upper)
    ForecastModel <- forecast(FitModel, h = length(200))
    ForecastValues <- constrainedforecast(model10 = ForecastModel, lower, upper)
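The idea of keeping forecast bands inside the limits of the base data can be sketched in base R. This is a minimal illustration and not the package's implementation: the helper names `scaled_logit` and `inv_scaled_logit`, the simulated series and the use of `lm` prediction intervals are all assumptions made for the example, following the scaled logit form described by Hyndman & Athanasopoulos.

```r
# Textbook scaled logit and its inverse (assumed forms, not the package source)
scaled_logit <- function(x, lower, upper) log((x - lower) / (upper - x))
inv_scaled_logit <- function(z, lower, upper) {
  (upper - lower) * exp(z) / (1 + exp(z)) + lower
}

lower <- 0
upper <- 40

# Hypothetical bounded series that stays strictly inside (lower, upper)
set.seed(1)
tt <- 1:60
y <- 5 + 30 / (1 + exp(-(tt - 30) / 8)) + rnorm(60, sd = 0.5)

# Fit on the transformed scale, where the response is unbounded
z <- scaled_logit(y, lower, upper)
fit <- lm(z ~ tt)

# Prediction intervals for 10 future steps at the 80% and 95% levels
new <- data.frame(tt = 61:70)
p80 <- predict(fit, newdata = new, interval = "prediction", level = 0.80)
p95 <- predict(fit, newdata = new, interval = "prediction", level = 0.95)

# Back-transform each bound so it falls inside the limits of the base data
bands <- data.frame(
  "Lower 95%" = inv_scaled_logit(p95[, "lwr"], lower, upper),
  "Lower 80%" = inv_scaled_logit(p80[, "lwr"], lower, upper),
  "Upper 80%" = inv_scaled_logit(p80[, "upr"], lower, upper),
  "Upper 95%" = inv_scaled_logit(p95[, "upr"], lower, upper),
  check.names = FALSE
)
bands
```

Because the inverse transform is monotone, the ordering of the bounds on the transformed scale is preserved after back-transformation.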
This is a custom plot for correlation matrix in which the coefficients are displayed along with graphics showing the magnitude of each coefficient.
corplot(r)
r |
Correlation matrix of the data for the plot |
The function returns a custom plot of the correlation matrix
corplot |
The custom plot of the correlation matrix |
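As a sketch of how the `r` argument might be prepared, a correlation matrix can be computed with base R's `cor`. The choice of the built-in `mtcars` data and of the four columns is an assumption made only for illustration.

```r
# Select a few numeric columns from a built-in dataset (illustrative choice)
num <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Pairwise Pearson correlations; this matrix plays the role of `r`
r <- cor(num, use = "pairwise.complete.obs")
round(r, 2)

# With Dyn4cast attached, the matrix would then be passed on, per the usage above:
# corplot(r)
```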
data.frame for comparable Machine Learning prediction and visualization

Often, economic and other Machine Learning data are of different units
or sizes, making estimation, interpretation or visualization difficult.
These issues can be handled if the data are transformed into unitless
values or values of similar magnitude. This is what data_transform
is set to do. It is simple and straightforward to use.
data_transform(data, method, margin = 2)
data |
A |
method |
The type of transformation. There are three options. |
margin |
Option to either transform the data |
This function returns the output of the data transformation process as
data_transformed |
A new |
    library(Dyn4cast)
    library(tidyverse)
    # View the data without transformation
    data0 <- Transform %>%
      pivot_longer(!X, names_to = "Factors", values_to = "Data")
    ggplot(data = data0, aes(x = X, y = Data, fill = Factors, color = Factors)) +
      geom_line() +
      scale_fill_brewer(palette = "Set1") +
      scale_color_brewer(palette = "Set1") +
      labs(y = "Data", x = "Series", color = "Factors") +
      theme_bw(base_size = 12)
    # Example 1: Transformation by `min-max` method.
    # You could also transform the `X column` but it is better not to.
    data1 <- data_transform(Transform[, -1], 1)
    data1 <- cbind(Transform[, 1], data1)
    data1 <- data1 %>%
      pivot_longer(!X, names_to = "Factors", values_to = "Data")
    ggplot(data = data1, aes(x = X, y = Data, fill = Factors, color = Factors)) +
      geom_line() +
      scale_fill_brewer(palette = "Set1") +
      scale_color_brewer(palette = "Set1") +
      labs(y = "Data", x = "Series", color = "Factors") +
      theme_bw(base_size = 12)
    # Example 2: `log` transformation
    data2 <- data_transform(Transform[, -1], 2)
    data2 <- cbind(Transform[, 1], data2)
    data2 <- data2 %>%
      pivot_longer(!X, names_to = "Factors", values_to = "Data")
    ggplot(data = data2, aes(x = X, y = Data, fill = Factors, color = Factors)) +
      geom_line() +
      scale_fill_brewer(palette = "Set1") +
      scale_color_brewer(palette = "Set1") +
      labs(y = "Data", x = "Series", color = "Factors") +
      theme_bw(base_size = 12)
    # Example 3: `Mean-SD` transformation
    data3 <- data_transform(Transform[, -1], 3)
    data3 <- cbind(Transform[, 1], data3)
    data3 <- data3 %>%
      pivot_longer(!X, names_to = "Factors", values_to = "Data")
    ggplot(data = data3, aes(x = X, y = Data, fill = Factors, color = Factors)) +
      geom_line() +
      scale_fill_brewer(palette = "Set1") +
      scale_color_brewer(palette = "Set1") +
      labs(y = "Data", x = "Series", color = "Factors") +
      theme_bw(base_size = 12)
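The three transformations named in the examples (min-max, log and Mean-SD) can be sketched column by column in base R. The formulas below are the usual textbook definitions and are assumed here; the package may differ in detail, and the helper names and the small `raw` data frame are hypothetical.

```r
# Usual definitions, applied to one numeric column at a time (assumed forms)
min_max <- function(x) (x - min(x)) / (max(x) - min(x))  # rescales to [0, 1]
mean_sd <- function(x) (x - mean(x)) / sd(x)             # mean 0, sd 1

# Hypothetical data with columns of very different magnitude
raw <- data.frame(
  income = c(1200, 3500, 800, 15000),
  age    = c(23, 45, 31, 60)
)

# The second argument of apply (2) selects columns, mirroring `margin = 2`
scaled_minmax <- apply(raw, 2, min_max)
scaled_log    <- apply(raw, 2, log)
scaled_z      <- apply(raw, 2, mean_sd)

round(scaled_minmax, 3)
round(scaled_log, 3)
round(scaled_z, 3)
```

After any of the three, the columns share a comparable scale and can be drawn on one axis, which is what the plots in the examples above show.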
These functions have been removed (defunct) as indicated against their names.
Model_factors(...)
... |
Not used. |
None, function now defunct.
# None
The function estimates, predicts and forecasts time series data with models,
and also makes subset forecasts within the length of the entire trend of the
data. The recognized models are lm, smooth spline, polynomial splines with or
without knots, quadratic polynomial, and ARIMA. The robust output includes
the models' estimates, time-varying forecasts and plots based on themes
from ggplot. The main attraction of this function is that the number of
forecasts equals the length of the trend in the data (equal days forecast).
The function takes daily, monthly and yearly data sets for now.
DynamicForecast(Data, date, series, dyrima, Trend, Type, MaximumDate, x = 0, x100 = 0, BREAKS = 0, ORIGIN = NULL, origin = "1970-01-01", Length = 0, ...)
A list with the following components:
Spline without knots |
The estimated spline model without the breaks (knots). |
Spline with knots |
The estimated spline model with the breaks (knots). |
Smooth Spline |
The smooth spline estimates. |
ARIMA |
Estimated Auto Regressive Integrated Moving Average model. |
Quadratic |
The estimated quadratic polynomial model. |
Ensembled with equal weight |
Estimated Ensemble model with equal weight given to each of the models. To get this, the fitted values of each of the models are divided by the number of models and summed together. |
Ensembled based on weight |
Estimated Ensemble model based on the weight of each model. To do this, the fitted values of each model serve as independent variables and are regressed against the trend, with interaction among the variables. |
Ensembled based on summed weight |
Estimated Ensemble model based on the summed weight of each model. To do this, the fitted values of each model serve as independent variables and are regressed against the trend. |
Ensembled based on weight of fit |
Estimated Ensemble model. The fit of each model is measured by the rmse. |
Unconstrained Forecast |
The forecast if the response variable is continuous. The number of forecasts is equivalent to the length of the dataset (equal days forecast). |
Constrained Forecast |
The forecast if the response variable is integer. The number of forecasts is equivalent to the length of the dataset (equal days forecast). |
RMSE |
Root Mean Square Error (rmse) for each forecast. |
Unconstrained forecast Plot |
The combined plots of the unconstrained forecasts using ggplot. |
Constrained forecast Plot |
The combined plots of the constrained forecasts using ggplot. |
Date |
This is the date range for the forecast. |
Fitted plot |
This is the plot of the fitted models. |
Estimated coefficients |
This is the estimated coefficients of the various models in the forecast. |
    # library(readr)
    # library(forecast)
    # COVID19$Date <- zoo::as.Date(COVID19$Date, format = '%m/%d/%Y')
    # # The date is formatted to R format
    # LEN <- length(COVID19$Case)
    # Dss <- seq(COVID19$Date[1], by = "day", length.out = LEN)
    # # data length for forecast
    # ORIGIN = "2020-02-29"
    # lastdayfo21 <- Dss[length(Dss)] # The maximum length # uncomment to run
    # Data <- COVID19[COVID19$Date <= lastdayfo21 - 28, ]
    # # desired length of forecast
    # BREAKS <- c(70, 131, 173, 228, 274) # The default breaks for the data
    # dyrima <- auto.arima(Data$Case)
    # DynamicForecast(date = Data$Date, series = Data$Case, dyrima = dyrima,
    #   BREAKS = BREAKS, Trend = "Day", Length = 0, Type = "Integer", x100 = 0)
    #
    # lastdayfo21 <- Dss[length(Dss)]
    # Data <- COVID19[COVID19$Date <= lastdayfo21 - 14, ]
    # BREAKS = c(70, 131, 173, 228, 274)
    # dyrima <- auto.arima(Data$Case)
    # DynamicForecast(date = Data$Date, series = Data$Case, dyrima = dyrima,
    #   BREAKS = BREAKS, Trend = "Day", Length = 0, Type = "Integer", x100 = 0)
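The equal-weight ensemble described in the value list (fitted values of each model divided by the number of models and summed) can be sketched with base R and the `splines` package. The simulated series, the particular set of models and the `rmse` helper are assumptions for illustration; ARIMA and the smooth spline are left out to keep the sketch short.

```r
library(splines)

# Hypothetical daily series with trend and a slow cycle
set.seed(42)
day <- 1:120
cases <- 50 + 0.8 * day + 15 * sin(day / 12) + rnorm(120, sd = 4)
dat <- data.frame(day = day, cases = cases)

# A few of the model families named in the description
fits <- list(
  linear          = lm(cases ~ day, data = dat),
  quadratic       = lm(cases ~ poly(day, 2), data = dat),
  spline_no_knots = lm(cases ~ bs(day), data = dat),
  spline_knots    = lm(cases ~ bs(day, knots = c(40, 80)), data = dat)
)

# One column of fitted values per model
fitted_mat <- sapply(fits, fitted)

# Equal weight: divide each model's fitted values by the number of models, then sum
ensemble <- rowSums(fitted_mat / ncol(fitted_mat))

# Root mean square error of each model and of the ensemble
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
scores <- c(
  apply(fitted_mat, 2, rmse, actual = dat$cases),
  ensemble = rmse(dat$cases, ensemble)
)
round(scores, 3)
```

Dividing and then summing is the same as taking the row mean of the fitted values, so no single model dominates the combined fit.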
This function provides graphic displays of the estimated coefficients in the order of their significance in the models. This assists in assessing models to decide which can be used for further analysis, prediction and policy consideration.
estimate_plot(model25, limit)
model25 |
Estimated model for which the estimated coefficients would be plotted |
limit |
Number of variables to be included in the coefficients plots |
The function returns a plot of the order of importance of the estimated coefficients
estimate_plot |
The plot of the order of importance of estimated coefficients |
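The ordering step behind such a plot can be sketched in base R by sorting a model's coefficient table by p-value and keeping the first `limit` rows. This is an assumed reading of "order of significance", not the package's code; the `mtcars` model is only an illustration.

```r
# Illustrative model on a built-in dataset
model <- lm(mpg ~ wt + hp + disp + qsec + drat, data = mtcars)
limit <- 3  # number of coefficients to keep, mirroring the `limit` argument

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- as.data.frame(summary(model)$coefficients)
coefs$term <- rownames(coefs)

# Drop the intercept and order by p-value, most significant first
coefs <- coefs[coefs$term != "(Intercept)", ]
coefs <- coefs[order(coefs[["Pr(>|t|)"]]), ]
top <- head(coefs, limit)
top[, c("term", "Estimate", "Pr(>|t|)")]

# A simple display of the retained estimates could then be drawn, e.g.:
# barplot(top$Estimate, names.arg = top$term)
```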
Often, when continuous data are converted to factors using the base R
cut function, the resulting Class Interval column contains labels in
scientific notation, which can be confusing to interpret,
especially for casual data scientists. This function provides a more
user-friendly, formatted output. It is easy to use.
formattedcut(data, breaks, cut = FALSE)
data |
A vector of the data to be converted to factors if not cut already or the vector of a cut data |
breaks |
Number of classes to break the data into |
cut |
|
The function returns a data frame with three or four columns,
i.e., Lower class, Upper class, Class interval and Frequency (if
cut is FALSE).
Cut |
The |
    library(tidyverse)
    DD <- rnorm(100000)
    formattedcut(DD, 12, FALSE)
    DD1 <- cut(DD, 12)
    DDK <- formattedcut(DD1, 12, TRUE)
    DDK
    # if data is not from a data frame, the frequency distribution is required.
    as.data.frame(DDK %>%
      group_by(`Lower class`, `Upper class`, `Class interval`) %>%
      tally())
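A frequency table with plainly formatted class limits can be sketched in base R as follows. This is not the package's implementation: the equal-width breaks, the `fmt` helper and the simulated data are assumptions, chosen to show how scientific notation in the interval labels can be avoided.

```r
# Hypothetical data large enough for cut() labels to switch to scientific notation
set.seed(7)
x <- rnorm(1000, mean = 50000, sd = 20000)
k <- 12  # number of classes, mirroring the `breaks` argument

# Equal-width class limits spanning the data
edges <- seq(min(x), max(x), length.out = k + 1)
edges[c(1, k + 1)] <- range(x)  # guard: keep both end points exact

# Integer class codes instead of factor labels
bins <- cut(x, breaks = edges, include.lowest = TRUE, labels = FALSE)

lower <- edges[-(k + 1)]
upper <- edges[-1]

# Fixed-point formatting with thousands separators (illustrative helper)
fmt <- function(v) formatC(v, format = "f", digits = 2, big.mark = ",")

tab <- data.frame(
  "Lower class"    = lower,
  "Upper class"    = upper,
  "Class interval" = paste(fmt(lower), "-", fmt(upper)),
  Frequency        = tabulate(bins, nbins = k),
  check.names = FALSE
)
tab
```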
There are three main types of ranking: Standard competition, Ordinal and Fractional. Garrett's Ranking Technique is the application of fractional ranking in which the data points are ordered and given an ordinal number/rank. The ordering and ranking provide additional information which may not be available from a frequency distribution. The ordering is based on the level of seriousness or severity of the data point from the viewpoint of the respondent. Ranking enables ease of comparison and makes grouping more meaningful. It is used in social science, psychology and other survey types of research. This function performs Garrett Ranking of up to 15 ranks.
garrett_ranking(data, num_rank, ranking = NULL, m_rank = c(2:15))
data |
The data for the Garrett Ranking, must be a |
num_rank |
A vector representing the number of ranks applied to the data. If the data is a five-point Likert-type data, then number of ranks is 5. |
ranking |
A vector or list representing the ranks applied to the data. If not available, positional ranks are applied. |
m_rank |
The scope of the ranking methods which is between 2 and 15. |
A list with the following components:
RII |
Relative importance index. |
Garrett ranked data |
Table of data ranked using Garrett mean score. |
Garrett value |
Table of ranking Garrett values |
    library(readr)
    garrett_data <- data.frame(garrett_data)
    ranking <- c("Serious constraint", "Constraint",
                 "Not certain it is a constraint", "Not a constraint",
                 "Not a serious constraint")
    ## ranking is supplied
    garrett_ranking(garrett_data, 5, ranking)
    # ranking not supplied
    garrett_ranking(garrett_data, 5)
    # you can rank subset of the data
    garrett_ranking(garrett_data, 8)
    garrett_ranking(garrett_data, 4)
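Two quantities that commonly appear in Garrett ranking can be sketched in base R: the percent position of a rank, 100 × (R − 0.5) / N, and the relative importance index, ΣW / (A × N). Both are the commonly cited definitions and are assumed here rather than taken from the package; the conversion of percent positions to Garrett values relies on Garrett's published table, which is not reproduced. The `responses` vector is hypothetical.

```r
# Percent position of each rank R out of N ranks: 100 * (R - 0.5) / N
garrett_percent_position <- function(rank, n_ranks) {
  100 * (rank - 0.5) / n_ranks
}

# Five-point scale, as in the five-point Likert-type example
pp <- garrett_percent_position(1:5, 5)
pp

# Relative importance index for one item (assumed definition):
# sum of the ratings divided by (highest possible rating * number of respondents)
responses <- c(5, 4, 4, 3, 5, 2, 1, 4)  # hypothetical ratings from 8 respondents
rii <- sum(responses) / (5 * length(responses))
rii
```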
Often, there is a need to differentiate between sex and gender. Many wonder if there is any difference at all. This function clarifies the distinction between them.
gender(data)
data |
data frame containing Age and Sex variables |
The data.frame with:
Gender |
data frame with two additional variables. |
    # df <- data.frame(Age = c(49, 30, 44, 37, 29, 56, 28, 26, 33, 45, 45, 19,
    #   32, 22, 19, 28, 28, 36, 56, 34),
    #   Sex = c("male", "female", "female", "male", "male", "male", "female",
    #   "female", "Prefer not to say", "male", "male", "female", "female", "male",
    #   "Non-binary/third gender", "male", "female", "female", "male", "male"))
    # gender(df)
Vulnerability is the state or quality of being susceptible to physical or emotional harm, damage, or attack, usually indicating a lack of defense or protection that makes a person or system more likely to be affected by external factors or threats. A vulnerability index is therefore a quantitative, standardized measure of that state, which makes comparisons between households, communities or systems possible. The index is made up of three main components: exposure, sensitivity and adaptive capacity. Each component has multiple indicators drawn from a wide range of sources, including social, medical and psychological factors and extreme events such as floods, drought and earthquakes. This function converts exposure or sensitivity indicators into a vector of indices through normalization and weighting. The resulting index from each component is then combined via an appropriate model into the vulnerability index.
index_construction(data)
data |
Data frame of indicators of Exposure or Sensitivity. The data frame must be numeric. |
A list with the following components:
Indexed data |
|
Index |
A vector of indices representing the variable of interest, either Exposure or Sensitivity. |
    library(readr)
    garrett_data <- data.frame(garrett_data)
    index_construction(garrett_data)
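Normalization followed by weighting can be sketched in base R as below. The min-max normalization and the equal weights are assumptions made for illustration; the package may normalize and weight differently. The indicator names and values are hypothetical.

```r
# Hypothetical exposure indicators for four households
indicators <- data.frame(
  flood_events   = c(2, 5, 1, 8),
  drought_months = c(3, 1, 6, 4),
  crop_loss_pct  = c(10, 40, 5, 60)
)

# Min-max normalization puts every indicator on a 0 to 1 scale (assumed method)
normalise <- function(x) (x - min(x)) / (max(x) - min(x))
indexed <- as.data.frame(lapply(indicators, normalise))

# Equal weights across indicators (assumption); the weights sum to 1
weights <- rep(1 / ncol(indexed), ncol(indexed))

# One index value per household: weighted sum of the normalized indicators
index <- as.vector(as.matrix(indexed) %*% weights)
round(index, 3)
```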
This function is used to estimate exponential lower (80% and 95%) and upper
(80% and 95%) values from the outcome of the scaledlogit function.
The exponentiation ensures that the forecast does not go beyond the upper
and lower limits of the base data.
invscaledlogit(x, x3, lower, upper)
x |
|
x3 |
The forecast values from constrained forecast package. Please specify the appropriate column containing the forecast values. |
lower |
Lower limits of the forecast values |
upper |
Upper limits of the forecast values |
    x3 <- 1:35
    lower <- 1
    upper <- 35
    invscaledlogit(x3 = x3, lower = lower, upper = upper)
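Why exponentiation keeps values inside the limits can be seen from a short round trip in base R. The two helpers are the textbook scaled logit and its inverse, assumed here for illustration and not copied from the package.

```r
# Textbook forms (assumed): map (lower, upper) to the real line and back
scaled_logit <- function(x, lower, upper) log((x - lower) / (upper - x))
inv_scaled_logit <- function(z, lower, upper) {
  (upper - lower) * exp(z) / (1 + exp(z)) + lower
}

lower <- 1
upper <- 35

# Values strictly inside the limits survive the round trip unchanged
x <- seq(2, 34, by = 4)
back <- inv_scaled_logit(scaled_logit(x, lower, upper), lower, upper)

# Even extreme values on the transformed scale map back inside the limits
extreme <- inv_scaled_logit(c(-20, 20), lower, upper)
extreme
```

The ratio exp(z) / (1 + exp(z)) always lies between 0 and 1, so the back-transformed value lies between `lower` and `upper`.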
The linear model remains a reference point for advanced modeling of many datasets and a foundation for Machine Learning, Data Science and Artificial Intelligence, in spite of some of its weaknesses. A major task in modeling is to compare various models before one is selected, either for use or for more advanced modeling. Often, trial and error is used to decide which model to select. This is where this function is unique. It estimates 14 different linear models and provides their coefficients in a formatted table for quick comparison, saving time and energy. The function is simple to use: it is a single line of code.
Linearsystems(y, x, mod, limit, Test = NA)
y |
Vector of the dependent variable. This must be numeric. |
x |
Data frame of the explanatory variables. |
mod |
The group of linear models to be estimated. It takes values from 0 to 6: 0 = EDA (correlation, summary tables, visual means), 1 = linear systems, 2 = power models, 3 = polynomial models, 4 = root models, 5 = inverse models, 6 = all 14 models. |
limit |
Number of variables to be included in the coefficients plots |
Test |
test data to be used to predict y. If not supplied, the fitted y is used hence may be identical with the fitted value. It is important to be cautious if the data is to be divided between train and test subsets in order to train and test the model. If the sample size is not sufficient to have enough data for the test, errors are thrown up. |
A list with the following components:
Visual means of the numeric variable |
Plot of the means of the numeric variables. |
Correlation plot |
Plot of the Correlation Matrix of the
numeric variables. To recover the plot, please use this canonical form
object$ |
Linear |
The full estimates of the Linear Model. |
Linear with interaction |
The full estimates of the Linear Model with full interaction among the numeric variables. |
Semilog |
The full estimates of the Semilog Model. Here the independent variable(s) is/are log-transformed. |
Growth |
The full estimates of the Growth Model. Here the dependent variable is log-transformed. |
Double Log |
The full estimates of the double-log Model. Here both the dependent and independent variables are log-transformed. |
Mixed-power model |
The full estimates of the Mixed-power Model. This is a combination of linear and double log models. It has significant gains over the two models separately. |
Translog model |
The full estimates of the double-log Model with full interaction of the numeric variables. |
Quadratic |
The full estimates of the Quadratic Model. Here the square of numeric independent variable(s) is/are included as independent variables. |
Cubic model |
The full estimates of the Cubic Model. Here the third-power (x^3) of numeric independent variable(s) is/are included as independent variables. |
Inverse y |
The full estimates of the Inverse Model. Here the dependent variable is inverse-transformed (1 / y). |
Inverse x |
The full estimates of the Inverse Model. Here the independent variable is inverse-transformed (1 / x). |
Inverse y & x |
The full estimates of the Inverse Model. Here the dependent and independent variables are inverse-transformed (1 / y & 1 / x). |
Square root |
The full estimates of the Square root Model. Here the independent variable is square root-transformed (x^0.5). |
Cubic root |
The full estimates of the cubic root Model. Here the independent variable is cubic root-transformed (x^(1/3)). |
Significant plot of Linear |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Linear with interaction |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Semilog |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Growth |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Double Log |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Mixed-power model |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Translog model |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Quadratic |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Cubic model |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Inverse y |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Inverse x |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Inverse y & x |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Square root |
Plots of order of importance and significance of estimated coefficients of the model. |
Significant plot of Cubic root |
Plots of order of importance and significance of estimated coefficients of the model. |
Model Table |
Formatted Tables of the coefficient estimates of all the models |
Machine Learning Metrics |
Metrics (47) for assessing model performance and metrics for diagnostic analysis of the error in estimation. |
Table of Marginal effects |
Tables of marginal effects of each model. Because of computational limitations, if you choose to estimate all the 14 models, the Tables are produced separately for the major transformations. They can easily be compiled into one. |
Fitted plots long format |
Plots of the fitted estimates from each of the models. |
Fitted plots wide format |
Plots of the fitted estimates from each of the models. |
Prediction plots long format |
Plots of the predicted estimates from each of the models. |
Prediction plots wide format |
Plots of the predicted estimates from each of the models. |
Naive effects plots long format |
Plots of the |
Naive effects plots wide format |
Plots of the |
Summary of numeric variables |
of the dataset. |
Summary of character variables |
of the dataset. |
    ## Without test data (not run)
    # library(tidyverse)
    # library(ggtext)
    #
    # y <- linearsystems$MKTcost # to run all the exercises, uncomment.
    # x <- select(linearsystems, -MKTcost)
    # Linearsystems(y, x, 6, 15) # NaNs produced if run
    ## Without test data (not run)
    # x <- sampling[, -1]
    # y <- sampling$qOutput
    # limit <- 20
    # mod <- 3
    # Test <- NA
    # Linearsystems(y, x, 3, 15) # NaNs produced if run
    #
    # # with test data
    # x <- sampling[, -1]
    # y <- sampling$qOutput
    # Data <- cbind(y, x)
    # # 80% of data is sampled
    # sampling <- sample(1 : nrow(Data), 0.8 * nrow(Data))
    # # for training the model
    # train <- Data[sampling, ]
    # Test <- Data[-sampling, ]
    # # 20% of data is reserved for testing (predicting) the model
    # y <- train$y
    # x <- train[, -1]
    # mod <- 4
    # Linearsystems(y, x, 4, 15, Test) # NaNs produced if run
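Several of the functional forms listed in the value table can be sketched by hand with base R's `lm`, which shows the kind of comparison the function automates. The `mtcars` variables, the subset of eight forms and the summary columns are assumptions for illustration and do not reproduce the package's output.

```r
dat <- mtcars[, c("mpg", "wt", "hp")]

# A subset of the forms described above, written as lm formulas
forms <- list(
  "Linear"                  = mpg ~ wt + hp,
  "Linear with interaction" = mpg ~ wt * hp,
  "Semilog"                 = mpg ~ log(wt) + log(hp),
  "Growth"                  = log(mpg) ~ wt + hp,
  "Double Log"              = log(mpg) ~ log(wt) + log(hp),
  "Quadratic"               = mpg ~ wt + hp + I(wt^2) + I(hp^2),
  "Inverse y"               = I(1 / mpg) ~ wt + hp,
  "Square root"             = mpg ~ sqrt(wt) + sqrt(hp)
)

# Estimate every form on the same data
models <- lapply(forms, lm, data = dat)

# Side-by-side summary; note that R-squared values are only directly
# comparable between models that share the same response transformation
comparison <- data.frame(
  Model            = names(models),
  Terms            = sapply(models, function(m) length(coef(m))),
  "Adj. R-squared" = sapply(models, function(m) summary(m)$adj.r.squared),
  check.names = FALSE,
  row.names = NULL
)
comparison
```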
Mallows's Cp is one of the very useful metrics and selection criteria for
machine learning algorithms (models). It is used to estimate the closest
number to the number of predictors and the intercept (approximate number of
explanatory variables) of linear and non-linear based models. The function
inherits residuals from the estimated model. The uniqueness of this
function compared to other procedures for computing Mallows's Cp is that it
does not require nested models for computation and it is not limited to lm
based models only.
MallowsCp(model2, y, x, type, Nlevels = 0)
model2 |
The estimated model from which the Mallows Cp would be computed |
y |
The vector of the LHS variable of the estimated model |
x |
The matrix of the RHS variable of the estimated model. Note
that if the model adds additional factor variables into the output, then
the number of additional factors |
type |
The type of model ( |
Nlevels |
Optional number of additional variables created if the model has categorical variables that generates additional dummy variables during estimation or the number of additional variables created if the model involves interaction terms. |
A list with the following components
MallowsCp |
of the Model. |
    library(Dyn4cast)
    ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
    trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
    x <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
    y <- c(ctl, trt)
    Model <- lm(y ~ x)
    Type <- "LM"
    MallowsCp(model2 = Model, y = y, x = x, type = Type, Nlevels = 0)
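For comparison, the classical nested-model form of the statistic, Cp = SSE_p / MSE_full − n + 2p, can be sketched in base R. This is the textbook calculation, which needs a full reference model; it is not the procedure used by `MallowsCp`, which the description says avoids nested models. The helper name `mallows_cp` and the `mtcars` models are assumptions for illustration.

```r
# Classical Mallows's Cp for a sub-model relative to a full model
mallows_cp <- function(sub_model, full_model) {
  n <- nobs(full_model)
  p <- length(coef(sub_model))                  # predictors plus intercept
  sse_p <- sum(residuals(sub_model)^2)          # error sum of squares, sub-model
  mse_full <- sum(residuals(full_model)^2) / df.residual(full_model)
  sse_p / mse_full - n + 2 * p
}

full <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
sub  <- lm(mpg ~ wt + hp, data = mtcars)

cp_sub  <- mallows_cp(sub, full)
cp_full <- mallows_cp(full, full)  # equals the number of coefficients by construction

c(sub = cp_sub, full = cp_full)
```

A Cp close to the number of estimated coefficients suggests little bias from the omitted terms, which is the sense in which the statistic approximates the number of predictors plus the intercept.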
This function computes the indices and all associated measures of
multidimensional poverty sequentially and dynamically. Sequentially,
the function computes the Incidence of poverty (H = q / n), the
Adjusted incidence of poverty (H / (q / D)), the Deprivation Score of each
dimension in the computation, the Intensity of poverty (A), the
Multidimensional poverty index (MDPI = H * A), the Contribution in
% of each of the dimensions to MDPI, and the
Average deprivation among the deprived (A * D). Dynamically, it
computes the various indices for between three and nine dimensions (D).
The first five dimensions included in the computations are Health,
Education, Living standard, Social security, and
Employment and Income, depending on the choice of the user. Four
additional dimensions can be included in the computations. The
computations are carried out either for the national sample data or
disaggregated by grouping factors such as region, sex,
gender, marital status or any other suitable one. The cut-off mark
separating poor (q) from non-poor (n - q) members of the sample (n)
defaults to 0.4 but can be varied as dictated by the
interests or needs of the computation. The computations follow
procedures already outlined in the literature, starting
with the work of Alkire et al. (2015), but have been expanded from
three dimensions to nine. Each dimension is given equal weight in
the computation, but all indicators are weighted in line with
existing guidelines in Alkire & Foster (2011) and Alkire & Santos
(2010). See also Alkire & Santos (2014) and Chan & Wong (2024).
mdpi(data, dm, Bar = 0.4, id_addn = NULL, Factor = NULL, plots = NULL, id = c("Health", "Education", "Living standard"), id_add = "Social security", id_add1 = "Employment and Income", Echo = TRUE)
data |
|
dm |
list of vectors of indicators making up each dimension to be computed |
Bar |
an optional vector of cut-off values used to divide the population into those in the poverty category and those that are not. Defaults to 0.4 if not supplied. |
id_addn |
an optional vector of additional dimensions to be used for the computation up to a maximum of four. |
Factor |
an optional grouping factor for the computation which must be a variable in the data. If not supplied, only the national MDPI will be computed. |
plots |
plots of the various measures. For this to be possible, the
number of options in the |
id |
a vector of the first three dimensions used in the computation
given as Health, Education and Living standard. Can be redefined
but must match the indicators and cannot be |
id_add |
a vector of the fourth dimension in the computation given
as Social security. Can be re-defined but never |
id_add1 |
a vector of the fifth dimension in the computation given
as Employment and Income. Can be re-defined but never |
Echo |
Optional; indicates whether the progress note is shown. Defaults to TRUE. |
A list with the following components:
MDPI_p |
Publication-ready table of the factor and national
MDPI prepared with |
MDPI |
|
MDPI mean |
|
MDPI SD |
|
national |
|
dimensions |
|
Score |
|
Alkire, S. & Foster, J. (2011). Counting and Multidimensional Poverty Measurement. Journal of Public Economics 95(7-8): 476–87. https://doi.org/10.1016/j.jpubeco.2010.11.006.
Alkire, S., Foster, J. E., Seth, S., Santos, M. E., Roche, J., & Ballon, P. (2015). Multidimensional poverty measurement and analysis. Oxford University Press.
Alkire, S. & Santos, M. E. (2010). Acute Multidimensional Poverty: A New Index for Developing Countries. Oxford Poverty and Human Development Initiative (OPHI) Working Paper No. 38.
Alkire, S. & Santos, M. E. (2014). Measuring Acute Poverty in the Developing World: Robustness and Scope of the Multidimensional Poverty Index. World Development 59:251-274. https://doi.org/10.1016/j.worlddev.2014.01.026.
Siu Ming Chan & Hung Wong (2024): Measurement and determinants of multidimensional poverty: the case of Hong Kong, Journal of Asian Public Policy, DOI: 10.1080/17516234.2024.2325857
# Not run, uncomment to run
# data from `MPI` package
# data <- mdpi1
# dm <- list(d1 = c("Child.Mortality", "Access.to.health.care"),
#            d2 = c("Years.of.education", "School.attendance", "School.lag"),
#            d3 = c("Cooking.Fuel", "Access.to.clean.source.of.water",
#                   "Access.to.an.improve.sanatation", "Electricity",
#                   "Housing.Materials", "Asset.ownership"))
# mdpi(data, dm, plots = "t", Factor = "Region")
# mdpi(data, dm, plots = "t")
#
# data from `mpitbR` package
# data <- mdpi2
# dm <- list(d1 = c("d_nutr","d_cm"),
#            d2 = c("d_satt","d_educ"),
#            d3 = c("d_elct","d_sani","d_wtr","d_hsg","d_ckfl","d_asst"))
# mdpi(data, dm, plots = "t", Factor = "region")
# mdpi(data, dm, plots = "t")
This function estimates over 40 metrics for assessing the quality of Machine Learning Models. The purpose is to provide a wrapper that brings all the metrics together and makes it easier to use them to select a model.
MLMetrics(Observed, yvalue, modeli, K, Name, Form, kutuf, TTy)
Observed |
The Observed data in a data frame format |
yvalue |
The Response variable of the estimated Model |
modeli |
The Estimated Model (Model = a + bx) |
K |
The number of variables in the estimated Model to consider |
Name |
The name of the model, which must be specified. The options are ARIMA; Values, if the model computes the fitted values without estimation, like ensembles; SMOOTH (smooth.spline); Logit; EssemWet, for ensembles based on weight; QUADRATIC polynomial; and SPLINE polynomial. |
Form |
Form of the Model Estimated (LM, ALM, GLM, N-LM, ARDL) |
kutuf |
Cutoff for the Estimated values (defaults to 0.5 if not specified) |
TTy |
Type of response variable (Numeric or Response - like binary) |
A list with the following components:
Absolute Error |
of the Model. |
Absolute Percent Error |
of the Model. |
Accuracy |
of the Model. |
Adjusted R Square |
of the Model. |
`Akaike's` Information Criterion AIC |
of the Model. |
Area under the ROC curve (AUC) |
of the Model. |
Average Precision at k |
of the Model. |
Bias |
of the Model. |
Brier score |
of the Model. |
Classification Error |
of the Model. |
F1 Score |
of the Model. |
fScore |
of the Model. |
GINI Coefficient |
of the Model. |
kappa statistic |
of the Model. |
Log Loss |
of the Model. |
`Mallow's` cp |
of the Model. |
Matthews Correlation Coefficient |
of the Model. |
Mean Log Loss |
of the Model. |
Mean Absolute Error |
of the Model. |
Mean Absolute Percent Error |
of the Model. |
Mean Average Precision at k |
of the Model. |
Mean Absolute Scaled Error |
of the Model. |
Median Absolute Error |
of the Model. |
Mean Squared Error |
of the Model. |
Mean Squared Log Error |
of the Model. |
Model turning point error |
of the Model. |
Negative Predictive Value |
of the Model. |
Percent Bias |
of the Model. |
Positive Predictive Value |
of the Model. |
Precision |
of the Model. |
Predictive Residual Sum of Squares |
of the Model. |
R Square |
of the Model. |
Relative Absolute Error |
of the Model. |
Recall |
of the Model. |
Root Mean Squared Error |
of the Model. |
Root Mean Squared Log Error |
of the Model. |
Root Relative Squared Error |
of the Model. |
Relative Squared Error |
of the Model. |
`Schwarz's` Bayesian criterion BIC |
of the Model. |
Sensitivity |
of the Model. |
specificity |
of the Model. |
Squared Error |
of the Model. |
Squared Log Error |
of the Model. |
Symmetric Mean Absolute Percentage Error |
of the Model. |
Sum of Squared Errors |
of the Model. |
True negative rate |
of the Model. |
True positive rate |
of the Model. |
library(splines)
library(readr)
Model <- lm(states ~ bs(sequence, knots = c(30, 115)), data = Data)
MLMetrics(Observed = Data, yvalue = Data$states, modeli = Model, K = 2,
          Name = "Linear", Form = "LM", kutuf = 0, TTy = "Number")
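To make a few of the listed metrics concrete, the sketch below computes the Mean Absolute Error, Root Mean Squared Error, Mean Absolute Percent Error and R Square by hand in base R. The `observed` and `predicted` vectors are hypothetical, and the formulas are the usual textbook definitions; `MLMetrics()` may differ in scaling or naming.

```r
# Hypothetical observed and predicted values, for illustration only
observed  <- c(3, 5, 2, 7, 4)
predicted <- c(2.5, 5, 3, 6, 4.5)

err <- observed - predicted

mae  <- mean(abs(err))             # Mean Absolute Error
rmse <- sqrt(mean(err^2))          # Root Mean Squared Error
mape <- mean(abs(err / observed))  # Mean Absolute Percent Error (as a proportion)

# R Square: 1 - residual sum of squares / total sum of squares
r2 <- 1 - sum(err^2) / sum((observed - mean(observed))^2)

round(c(MAE = mae, RMSE = rmse, MAPE = mape, R2 = r2), 4)
```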
This function retrieves the latent factors and their variable loadings which
can be used as R objects to perform other analysis.
model_factors(data, DATA)
data |
An |
DATA |
A |
A list with the following components:
Loadings data |
|
Factors extracted |
|
factored data |
|
Factors list |
A list of vectors of individual latent factors
recovered from the data. However, to make it usable, the vector should
be |
Resilence capacity |
A vector of the resilience capacity. |
library(psych)
library(readr)
Data <- Quicksummary
GGn <- names(Data)
GG <- ncol(Data)
GGx <- c(paste0('x0', 1 : 9), paste("x", 10 : ncol(Data), sep = ""))
names(Data) <- GGx
lll <- fa.parallel(Data, fm = "minres", fa = "fa")
dat <- fa(Data, nfactors = lll[["nfact"]], rotate = "varimax",fm = "minres")
model_factors(data = dat, DATA = Data)
This function computes odds ratios, percentage changes, and confidence intervals from fitted binary and categorical regression models. It standardizes statistical inference outputs and highlights significant predictors for rapid interpretation. The call is a single line with a single argument.
odds_summary(model)
model |
An |
A list or a data.frame, depending on the model. The model must
have converged; otherwise nothing is returned and an error is thrown.
# library(Dyn4cast) # uncomment to run
# library(tidyverse)
# counts <- c(18,17,15,20,10,20,25,13,12)
# outcome <- gl(3,1,9)
# treatment <- gl(3,3)
# ddc <- data.frame(treatment, outcome, counts) # showing data
# glm.D93 <- glm(counts ~ ., data = ddc, family = poisson())
# odds_summary(glm.D93)
# library(MASS) # anorexia
# anorex.1 <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
#                 family = gaussian, data = anorexia)
# odds_summary(anorex.1)
# clotting <- data.frame(
#   u = c(5,10,15,20,30,40,60,80,100),
#   lot1 = c(118,58,42,35,27,25,21,19,18),
#   lot2 = c(69,35,26,21,18,16,13,12,12))
# lot1 <- glm(lot1 ~ log(u), data = clotting, family = Gamma)
# odds_summary(lot1)
# lot2 <- glm(lot2 ~ log(u), data = clotting, family = Gamma)
# odds_summary(lot2)
# fS <- glm(lot2 ~ log(u) + log(u^2), data = clotting, family = Gamma)
# #odds_summary(fS) #error because there is no convergence
# x <- rnorm(100)
# y <- rpois(100, exp(1+x))
# lm2 <- glm(y ~ x, family = quasi(variance = "mu", link = "log"))
# odds_summary(lm2)
# lm3 <- glm(y ~ x, family = poisson)
# odds_summary(lm3)
# lm4 <- glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))
# #odds_summary(lm4) #error
# y <- rbinom(100, 1, plogis(x))
# lm5 <- glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"),
#            start = c(0,1))
# odds_summary(lm5)
# library(betareg)
# data("GasolineYield")
# gy <- betareg(yield ~ batch + temp, data = GasolineYield)
# odds_summary(gy)
# library(mvProbit)
# ## generate a simulated data set
# set.seed( 123 )
# # number of observations
# nObs <- 50
# # generate explanatory variables
# xMat <- cbind(const = rep(1, nObs), x1 = as.numeric(rnorm(nObs) > 0),
#               x2 = rnorm(nObs))
# # model coefficients
# beta <- cbind(c(0.8, 1.2, -0.8), c(-0.6, 1.0, -1.6), c(0.5, -0.6, 1.2))
# # covariance matrix of error terms
# library(miscTools)
# sigma <- symMatrix(c(1, 0.2, 0.4, 1, -0.1, 1))
# # generate dependent variables
# yMatLin <- xMat %*% beta
# yMat <- (yMatLin + rmvnorm(nObs, sigma = sigma)) > 0
# colnames(yMat) <- paste("y", 1:3, sep = "")
# estResultStart <- mvProbit(cbind(y1, y2, y3) ~ x1 + x2, start = c(beta),
#   startSigma = sigma, data = as.data.frame(cbind(xMat, yMat)), iterlim = 1,
#   nGHK = 50)
# odds_summary(estResultStart)
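For orientation, the quantities reported for a binary model (odds ratio, percentage change, confidence interval) can be derived by hand from a fitted `glm`. The sketch below uses simulated data and Wald intervals from `confint.default()`; the object names and the choice of Wald intervals are assumptions for illustration, and `odds_summary()` may compute or format its output differently.

```r
# Simulated binary outcome, for illustration only
set.seed(42)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.8 * x))

fit <- glm(y ~ x, family = binomial())

odds_ratio <- exp(coef(fit))             # exponentiated coefficients
ci         <- exp(confint.default(fit))  # Wald 95% interval on the odds-ratio scale
pct_change <- (odds_ratio - 1) * 100     # percentage change in the odds per unit of x

out <- data.frame(
  odds_ratio = odds_ratio,
  pct_change = pct_change,
  lower      = ci[, 1],
  upper      = ci[, 2]
)
out
```

An interval on the odds-ratio scale that excludes 1 corresponds to a coefficient that differs from zero at the chosen level.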
This function is a wrapper for easy affixing of the per cent sign (%) to a value or a vector or a data frame of values.
Percent(Data, Type, format = "f", ...)
Data |
The data to which the percent sign is to be affixed. For the Frame option, the data must be in raw form, since the per cent value of each cell is calculated before the sign is affixed. |
Type |
The type of data. The options for this argument are Value for scalar or vector numeric data or Frame for a numeric vector or data.frame data. In the case of frame, whether vector or columns, the per cent value of each cell is calculated before the per cent sign is affixed. |
format |
The format of the output which is internal and the default is a character factor |
... |
Additional arguments that may be passed to the function |
This function returns the result as
percent |
values with the percentage sign (%) affixed. |
Data <- c(1.2, 0.5, 0.103, 7, 0.1501)
Percent(Data = Data, Type = "Frame") # Value, Frame
Data <- 1.2
Percent(Data = Data, Type = "Value") # Value, Frame
# Each column is a named numeric vector
df <- data.frame(
  A = c(2320, 5760, 4800, 2600, 5700, 7800, 3000, 6300, 2400, 10000, 2220, 3740),
  B = c(0, 0, 1620, 3600, 1200, 1200, 1200, 4250, 14000, 10000, 1850, 1850),
  C = c(3000, 3000, 7800, 5400, 3900, 7800, 1950, 2400, 2400, 7000, 1850, 1850),
  D = c(2900, 5760, 3750, 5400, 4095, 3150, 2080, 7800, 1920, 1200, 5000, 1950),
  E = c(2900, 2030, 0, 5400, 5760, 1800, 2000, 1950, 1850, 3600, 5200, 5760),
  F = c(2800, 5760, 1820, 4340, 7500, 2400, 2300, 1680, 1850, 0, 2800, 8000),
  G = c(5760, 4600, 13000, 7800, 6270, 1200, 1440, 8000, 1200, 2025, 4800, 2600),
  H = c(2100, 5760, 8250, 3900, 1800, 1200, 4800, 1800, 7800, 2035, 8000, 3000))
Percent(Data = df, Type = "Frame") # Value, Frame
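The basic idea of affixing the sign to already-computed values can be sketched in base R. The helper `add_percent_sketch()` below is hypothetical and only formats and pastes; it does not reproduce the per-cell calculation that `Percent()` performs for the Frame option, and the two-decimal default is an assumption.

```r
# Hypothetical helper: format numbers with fixed decimals and append "%"
add_percent_sketch <- function(x, digits = 2) {
  paste0(formatC(x, format = "f", digits = digits), "%")
}

add_percent_sketch(c(1.2, 0.5))
add_percent_sketch(7, digits = 0)
```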
Plots of Multidimensional Poverty Measures
plot_mdpi(data, kala, dma, factor = NULL)
data |
|
kala |
color palette with at least 15 colors; the number of colors must be equal to or greater than the number of options in the factor argument |
dma |
number of |
factor |
the optional grouping factor used in the computation measures. If not supplied only the national plots will be produced irrespective of whether the factor was used in the computation. |
A list of the following plots:
Multidimensional poverty index |
plot. |
Deprivation Score |
plot. |
Adjusted incidence of poverty |
plot. |
Intensity of poverty |
plot. |
Average deprivation among the deprived |
plot. |
Contribution of each Dimension |
plot. |
combined dimensions |
plot. |
national |
plot. |
combined dimensions of national |
plot. |
# Not run, uncomment to run
# library(MPI)
# data("examplePovertydf")
# data <- examplePovertydf
# dm <- list(d1 = c("Child.Mortality", "Access.to.health.care"),
#            d2 = c("Years.of.education", "School.attendance", "School.lag"),
#            d3 = c("Cooking.Fuel", "Access.to.clean.source.of.water",
#                   "Access.to.an.improve.sanatation", "Electricity",
#                   "Housing.Materials", "Asset.ownership"))
# dp <- mdpi(data, dm, Factor = "Region")
# library(MetBrewer)
# kala <- met.brewer("OKeeffe1", 15, type = "continuous")
# dma <- 3
# plot_mdpi(dp$MDPI, kala, dma, "Region")
There is an increasing need for user-friendly, production-ready tables for machine learning data. This function provides a simplified, quick summary whose output is a formatted table. It is handy for those who do not have the time to write code for user-friendly summaries.
quicksummary(
  x,
  Type,
  Cut = deprecated(),
  Up = deprecated(),
  Down = deprecated(),
  Dig = 2,
  ci = 0.95
)
x |
The data to be summarised. Only numeric data is allowed. |
Type |
The type of data to be summarised. There are two options here 1
or 2, 1 = |
Cut |
|
Up |
|
Down |
|
Dig |
Number of significant digits, which defaults to 2. |
ci |
Confidence interval, which defaults to 0.95. |
The function returns formatted tables of the Quick summary
Summary |
List of two |
library(tidyverse)
# Likert-type data
quicksummary(x = Quicksummary, Type = 2)
# Continuous data
x <- select(linearsystems, 1:6)
quicksummary(x = x, Type = 1)
Adaptive capacity refers to the ability of systems—biological, social, or institutional—to adjust to environmental changes, capitalize on emerging opportunities, and mitigate potential threats in order to preserve essential functions. In the context of climate change, adaptive capacity denotes the competence of social-ecological systems to cope with present variability and prepare for uncertain future conditions.
From a machine learning perspective, adaptive capacity is closely linked to the system’s ability to process large-scale, heterogeneous data sources, identify patterns, and support the development of predictive models and adaptive strategies. Key components of adaptive capacity include access to relevant and reliable information, computational infrastructure, financial and human capital, and strong social and institutional networks.
Machine learning can enhance adaptive capacity by enabling dynamic learning from historical and real-time data, improving climate risk assessments, and optimizing adaptation strategies. Moreover, the iterative nature of model refinement and feedback integration mirrors the learning processes inherent in adaptive systems. Thus, adaptive capacity in this context involves not only the ability to design and implement effective interventions but also to learn from outcomes and continuously update strategies in light of new data and evolving conditions.
This function is for knowledge-based Adaptive Capacity. The indices from
the various knowledge areas like Awareness, availability, affordability,
accessibility, benefits, adequacy, usage, effectiveness, etc can be
obtained individually and converted to adaptive capacity. Adaptive
capacity based on financial and human capital, strong social and
institutional networks can be obtained from model_factors().
relative_likert(data, Likert = NULL, Ranks = NULL, Option = "text", Echo = TRUE)
data |
Data frame of likert data either in text or scores. |
Likert |
Vector of likert-type factors in descending order as in the data frame which must be given if the data frame is in text. |
Ranks |
Optional vector of number of levels which is required if the data frame is in scores rather than text. There are only four choices i.e. 3, 5, 7, 9. |
Option |
Optional vector indicating whether the data frame is in text or scores format. Defaults to text if not given. |
Echo |
Optional; indicates whether the progress note is shown. Defaults to TRUE. |
A list with the following components:
Likert-scores |
|
Relative-likert-scores |
|
Summary-of-relative-scores |
|
Vector-of-indices |
A vector of indices for Adaptive Capacity. |
library(readr)
garrett_data <- data.frame(garrett_data)
relative_likert(garrett_data, Ranks = 3, Option = "sccore")
relative_likert(garrett_data, Ranks = 5, Option = "sccore")
relative_likert(garrett_data, Ranks = 7, Option = "sccore")
relative_likert(garrett_data, Ranks = 9, Option = "sccore")
relative_likert(Quicksummary, Ranks = 5, Option = "sccore")
library(tidyverse)
data_l <- garrett_data %>%
  pivot_longer(cols = everything()) %>%
  mutate(value = case_when(value == 5 ~ "Serious constraint",
                           value == 4 ~ "Constraint",
                           value == 3 ~ "Not certain it is a constraint",
                           value == 2 ~ "Not a constraint",
                           value == 1 ~ "Not a serious constraint",
                           .default = "None")) %>%
  group_by(name) %>%
  mutate(row = row_number()) %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-row) %>%
  unnest(cols = everything())
ranking <- c("Serious constraint", "Constraint",
             "Not certain it is a constraint", "Not a constraint",
             "Not a serious constraint")
relative_likert(data_l, Likert = ranking)
This function is a wrapper for scaling the fitted (predicted) values of a one-sided (positive or negative only) integer response variable of supported models. The scaling involves some log transformation of the fitted (predicted) values.
scaledlogit(x, x2, lower, upper)
x |
|
x2 |
The parameter to be scaled, which is the fitted values from supported models. The scaled parameter is used mainly for constrained forecasting of a response variable positive (0 - inf) or negative (-inf - 0). The scaling involves log transformation of the parameter |
lower |
Integer or variable representing the lower limit for the scaling (-inf or 0) |
upper |
Integer or variable representing the upper limit for the scaling (0 or inf) |
library(Dyn4cast)
library(splines)
lower <- 1
upper <- 37
Model <- lm(states ~ bs(sequence, knots = c(30, 115)), data = Data)
scaledlogit(x2 = fitted.values(Model), lower = lower, upper = upper)
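The log transformation behind constrained forecasting can be sketched as follows. This is the scaled logit as given in Hyndman & Athanasopoulos (2021), y = log((x - a) / (b - x)), together with its inverse; the function names below are hypothetical, and the package's own `scaledlogit()` may handle its arguments differently.

```r
# Hypothetical helpers illustrating the scaled logit and its inverse.
# Values strictly between lower and upper map onto the whole real line.
scaled_logit_sketch <- function(x, lower, upper) {
  log((x - lower) / (upper - x))
}

# The inverse maps any real number back into (lower, upper)
inv_scaled_logit_sketch <- function(y, lower, upper) {
  (upper - lower) * exp(y) / (1 + exp(y)) + lower
}

lower <- 1
upper <- 37
x <- c(5, 18, 30)

y    <- scaled_logit_sketch(x, lower, upper)
back <- inv_scaled_logit_sketch(y, lower, upper)

cbind(x, y, back)
```

Forecasting on the transformed scale and then applying the inverse keeps the back-transformed forecasts and their interval limits between `lower` and `upper`.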
An observational study evaluates the outcomes of participants who were not randomly assigned to treatments or exposures. To assess the effects of a treatment, participants are matched using propensity scores (PSM), which makes it possible to compare the treated against those who were not treated. Most earlier functions for this analysis estimate only the average treatment effect on the treated (ATT), with the other treatment effects optional. This function differs in that five average treatment effects are estimated simultaneously from a one-line call. The five treatment effects are:
Average treatment effect for the entire (ATE) population
Average treatment effect for the treated (ATT) population
Average treatment effect for the controlled (ATC) population
Average treatment effect for the evenly matched (ATM) population
Average treatment effect for the overlap (ATO) population.
There are excellent materials dealing with each of the treatment effects; see, for example, "Understanding propensity score weighting".
treatment_model(Treatment, x_data)
Treatment |
Vector of binary data (0 = control population, 1 = treated population), the LHS of the model for treatment effects estimation |
x_data |
Data frame of explanatory variables for the RHS of the model for treatment effects estimation |
A list with the following components:
Model |
Estimated treatment effects model. |
Effect |
Data frame of the estimated various treatment effects. |
P_score |
Vector of estimated propensity scores from the model |
Fitted_estimate |
Vector of fitted values from the model |
Residuals |
Residuals of the estimated model |
`Experiment plot` |
Plot of the propensity scores from the model faceted into Treated and control populations |
`ATE plot` |
Plot of the average treatment effect for the entire population |
`ATT plot` |
Plot of the average treatment effect for the treated population |
`ATC plot` |
Plot of the average treatment effect for the controlled population |
`ATM plot` |
Plot of the average treatment effect for the evenly matched population |
`ATO plot` |
Plot of the average Treatment effect for the overlap population |
weights |
Estimated weights for each of the treatment effects |
library(readr)
Treatment <- treatments$treatment
data <- treatments[, c(2:3)]
treatment_model(Treatment, data)
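The five estimands correspond to five sets of propensity score weights. The sketch below fits a logistic propensity model on simulated data and computes the commonly used weight for each estimand; the data, the variable names and the exact formulas are assumptions for illustration, and `treatment_model()` may compute its weights differently.

```r
# Simulated treatment indicator and two covariates, for illustration only
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
z  <- rbinom(n, 1, plogis(0.5 * x1 - 0.3 * x2))  # 1 = treated, 0 = control

# Propensity score: estimated probability of treatment given the covariates
ps_model <- glm(z ~ x1 + x2, family = binomial())
e <- fitted(ps_model)

# Weights for each estimand
w <- data.frame(
  ATE = z / e + (1 - z) / (1 - e),                     # entire population
  ATT = z + (1 - z) * e / (1 - e),                     # treated population
  ATC = z * (1 - e) / e + (1 - z),                     # control population
  ATM = pmin(e, 1 - e) / (z * e + (1 - z) * (1 - e)),  # evenly matched population
  ATO = z * (1 - e) + (1 - z) * e                      # overlap population
)

summary(w)
```

Treated units receive a weight of 1 under ATT and control units a weight of 1 under ATC, while the ATM and ATO weights stay bounded, which makes them less sensitive to extreme propensity scores.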