Title: | Non-Smooth Regularization for Structural Equation Models |
---|---|
Description: | Provides regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on 'lavaan'. The package is heavily inspired by the ['regsem'](<https://github.com/Rjacobucci/regsem>) and ['lslx'](<https://github.com/psyphh/lslx>) packages. |
Authors: | Jannik H. Orzek [aut, cre, cph] |
Maintainer: | Jannik H. Orzek <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.5.5 |
Built: | 2024-11-04 05:08:41 UTC |
Source: | https://github.com/jhorzek/lessSEM
WLS needs smaller breaking points than ML
.adaptBreakingForWls(lavaanModel, currentBreaking, selectedDefault)
lavaanModel |
single model or vector of models |
currentBreaking |
current breaking condition value |
selectedDefault |
was default breaking condition selected? |
updated breaking condition
Internal function to check a mixedPenalty object
.checkPenalties(mixedPenalty)
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addLasso, addLsp, addMcp, and addScad functions. |
Adds labels to unlabeled parameters in the lavaan parameter table. Also removes fixed parameters.
.labelLavaanParameters(lavaanModel)
lavaanModel |
fitted lavaan model |
parameterTable with labeled parameters
Updates a lavaan model. lavaan has an update function that does exactly that, but it does not seem to work with testthat. This is an attempt to work around the issue.
.updateLavaan(lavaanModel, key, value)
lavaanModel |
fitted lavaan model |
key |
label of the element that should be updated |
value |
new value for the updated element |
lavaan model
Internal function checking if elastic net is used
.useElasticNet(mixedPenalty)
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addLasso, addLsp, addMcp, and addScad functions. |
TRUE if elastic net, FALSE otherwise
Implements adaptive lasso regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda w_j |\theta_j|$, where $\theta_j$ is the $j$-th regularized parameter and $w_j$ its weight (by default the inverse of the absolute value of the unregularized estimate). Adaptive lasso regularization will set parameters to zero if $\lambda$ is large enough.
adaptiveLasso( lavaanModel, regularized, weights = NULL, lambdas = NULL, nLambdas = NULL, reverse = TRUE, curve = 1, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
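If you are unsure how the parameters of your model are labeled, the getLavaanParameters function mentioned in the arguments above returns a labeled parameter vector (a minimal sketch, assuming a fitted lavaan model as in the examples below):

getLavaanParameters(lavaanModel)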
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
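A minimal sketch of this workflow, assuming the lavaanSyntax and dataset objects from the examples below and that dataset contains missing values:

# fit the lavaan model with full information maximum likelihood:
lavaanModelFiml <- lavaan::sem(lavaanSyntax,
                               data = dataset,
                               meanstructure = TRUE,
                               std.lv = TRUE,
                               missing = "ml")
# lessSEM detects the missing = 'ml' setting and also switches to full
# information maximum likelihood:
lsemFiml <- adaptiveLasso(lavaanModel = lavaanModelFiml,
                          regularized = paste0("l", 6:15),
                          nLambdas = 50)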
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

lsem <- adaptiveLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  # in case of lasso and adaptive lasso, we can specify the number of lambda
  # values to use. lessSEM will automatically find lambda_max and fit
  # models for nLambda values between 0 and lambda_max. For the other
  # penalty functions, lambdas must be specified explicitly
  nLambdas = 50)

# use the plot-function to plot the regularized parameters:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# fit measures:
fitIndices(lsem)

# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")

#### Advanced ###
# Switching the optimizer #
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- adaptiveLasso(
  lavaanModel = lavaanModel,
  regularized = paste0("l", 6:15),
  nLambdas = 50,
  method = "ista",
  control = controlIsta())

# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
Implements cappedL1 regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda \min(|\theta_j|, \theta)$, where $\theta > 0$. The cappedL1 penalty is identical to the lasso for parameters which are below $\theta$ and identical to a constant for parameters above $\theta$. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
addCappedL1(mixedPenalty, regularized, lambdas, thetas)
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class mixedPenalty. Use the fit() function to fit the model.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:

regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addCappedL1(regularized = paste0("l", 11:15),
              lambdas = seq(0, 1, .1),
              thetas = 2.3) |>
  # fit the model:
  fit()
Adds an elastic net penalty to specified parameters. The penalty function is given by: $p(\theta_j) = \alpha \lambda |\theta_j| + (1 - \alpha) \lambda \theta_j^2$. Note that the elastic net combines ridge and lasso regularization. If $\alpha = 0$, the elastic net reduces to ridge regularization. If $\alpha = 1$ it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
addElasticNet(mixedPenalty, regularized, alphas, lambdas, weights = 1)
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
alphas |
numeric vector: values for the tuning parameter alpha. Set to 1 for lasso and to zero for ridge. Anything in between is an elastic net penalty. |
lambdas |
numeric vector: values for the tuning parameter lambda |
weights |
can be used to give different weights to the different parameters |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class mixedPenalty. Use the fit() function to fit the model.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:

regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addElasticNet(regularized = paste0("l", 11:15),
                lambdas = seq(0, 1, .1),
                alphas = .4) |>
  # fit the model:
  fit()
Implements lasso regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda |\theta_j|$. Lasso regularization will set parameters to zero if $\lambda$ is large enough.
addLasso(mixedPenalty, regularized, weights = 1, lambdas)
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
can be used to give different weights to the different parameters |
lambdas |
numeric vector: values for the tuning parameter lambda |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class mixedPenalty. Use the fit() function to fit the model.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:

regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addLasso(regularized = paste0("l", 11:15),
           lambdas = seq(0, 1, .1)) |>
  # fit the model:
  fit()
Implements lsp (log-sum penalty) regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda \log(1 + |\theta_j| / \theta)$, where $\theta > 0$.
addLsp(mixedPenalty, regularized, lambdas, thetas)
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class mixedPenalty. Use the fit() function to fit the model.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:

regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addLsp(regularized = paste0("l", 11:15),
         lambdas = seq(0, 1, .1),
         thetas = 2.3) |>
  # fit the model:
  fit()
Implements mcp (minimax concave penalty) regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda |\theta_j| - \theta_j^2 / (2\theta)$ if $|\theta_j| \le \theta\lambda$ and $p(\theta_j) = \theta\lambda^2 / 2$ if $|\theta_j| > \theta\lambda$, where $\theta > 0$.
addMcp(mixedPenalty, regularized, lambdas, thetas)
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class mixedPenalty. Use the fit() function to fit the model.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:

regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addMcp(regularized = paste0("l", 11:15),
         lambdas = seq(0, 1, .1),
         thetas = 2.3) |>
  # fit the model:
  fit()
Implements scad (smoothly clipped absolute deviation) regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda |\theta_j|$ if $|\theta_j| \le \lambda$, $p(\theta_j) = \frac{-\theta_j^2 + 2\theta\lambda|\theta_j| - \lambda^2}{2(\theta - 1)}$ if $\lambda < |\theta_j| \le \lambda\theta$, and $p(\theta_j) = \frac{(\theta + 1)\lambda^2}{2}$ if $|\theta_j| > \lambda\theta$, where $\theta > 2$.
addScad(mixedPenalty, regularized, lambdas, thetas)
mixedPenalty |
model of class mixedPenalty created with the mixedPenalty function (see ?mixedPenalty) |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class mixedPenalty. Use the fit() function to fit the model.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:

regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addScad(regularized = paste0("l", 11:15),
          lambdas = seq(0, 1, .1),
          thetas = 3.1) |>
  # fit the model:
  fit()
returns the AIC
## S4 method for signature 'gpRegularized' AIC(object, ..., k = 2)
object |
object of class gpRegularized |
... |
not used |
k |
multiplier for number of parameters |
data frame with fit values, appended with AIC
AIC
## S4 method for signature 'Rcpp_mgSEM' AIC(object, ..., k = 2)
object |
object of class Rcpp_mgSEM |
... |
not used |
k |
multiplier for number of parameters |
AIC values
AIC
## S4 method for signature 'Rcpp_SEMCpp' AIC(object, ..., k = 2)
object |
object of class Rcpp_SEMCpp |
... |
not used |
k |
multiplier for number of parameters |
AIC values
returns the AIC
## S4 method for signature 'regularizedSEM' AIC(object, ..., k = 2)
object |
object of class regularizedSEM |
... |
not used |
k |
multiplier for number of parameters |
AIC values
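For example, with the regularizedSEM object lsem created in the adaptiveLasso examples above, the AIC for each tuning-parameter setting could be requested as follows (a minimal sketch; k = 2 is the default multiplier documented above):

AIC(lsem)
# equivalently, with the multiplier for the number of parameters made explicit:
AIC(lsem, k = 2)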
returns the AIC
## S4 method for signature 'regularizedSEMMixedPenalty' AIC(object, ..., k = 2)
object |
object of class regularizedSEMMixedPenalty |
... |
not used |
k |
multiplier for number of parameters |
AIC values
This function allows for optimizing models built in lavaan using the BFGS optimizer implemented in lessSEM. Its elements can be accessed with the "@" operator (see examples). The main purpose is to make transformations of lavaan models more accessible.
bfgs( lavaanModel, modifyModel = lessSEM::modifyModel(), control = lessSEM::controlBFGS() )
lavaanModel |
model of class lavaan |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. See ?controlBFGS for more details. |
Model of class regularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

lsem <- bfgs(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters
Object for smoothly approximated elastic net optimization with bfgs optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
returns the BIC
## S4 method for signature 'gpRegularized' BIC(object, ...)
object |
object of class gpRegularized |
... |
not used |
data frame with fit values, appended with BIC
BIC
## S4 method for signature 'Rcpp_mgSEM' BIC(object, ...)
object |
object of class Rcpp_mgSEM |
... |
not used |
BIC values
BIC
## S4 method for signature 'Rcpp_SEMCpp' BIC(object, ...)
object |
object of class Rcpp_SEMCpp |
... |
not used |
BIC values
returns the BIC
## S4 method for signature 'regularizedSEM' BIC(object, ...)
object |
object of class regularizedSEM |
... |
not used |
BIC values
returns the BIC
## S4 method for signature 'regularizedSEMMixedPenalty' BIC(object, ...)
object |
object of class regularizedSEMMixedPenalty |
... |
not used |
BIC values
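As a brief sketch, using the regularizedSEMMixedPenalty object called regularized that is created in the mixed-penalty examples above (e.g., via addLasso() and fit()):

BIC(regularized)
# the AIC method documented above works the same way:
AIC(regularized)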
Wrapper to call a user-defined fit function
callFitFunction(fitFunctionSEXP, parameters, userSuppliedElements)
fitFunctionSEXP |
pointer to fit function |
parameters |
vector with parameter values |
userSuppliedElements |
list with additional elements |
fit value (double)
Implements cappedL1 regularization for structural equation models. The penalty function is given by: $p(\theta_j) = \lambda \min(|\theta_j|, \theta)$, where $\theta > 0$. The cappedL1 penalty is identical to the lasso for parameters which are below $\theta$ and identical to a constant for parameters above $\theta$. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
cappedL1( lavaanModel, regularized, lambdas, thetas, modifyModel = lessSEM::modifyModel(), method = "glmnet", control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures |
control |
used to control the optimizer. This element is generated with the controlIsta (see ?controlIsta) |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

lsem <- cappedL1(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 20),
  thetas = seq(0.01, 2, length.out = 5))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# fit measures:
fitIndices(lsem)

# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")

# optional: plotting the paths requires installation of plotly
# plot(lsem)
Returns the parameter estimates of a cvRegularizedSEM
## S4 method for signature 'cvRegularizedSEM' coef(object, ...)
object |
object of class cvRegularizedSEM |
... |
not used |
the parameter estimates of a cvRegularizedSEM
Returns the parameter estimates of a gpRegularized
## S4 method for signature 'gpRegularized' coef(object, ...)
object |
object of class gpRegularized |
... |
criterion can be one of: "AIC", "BIC". If set to NULL, all parameters will be returned |
parameter estimates
coef
## S4 method for signature 'Rcpp_mgSEM' coef(object, ...)
object |
object of class Rcpp_mgSEM |
... |
not used |
all coefficients of the model in transformed form
coef
## S4 method for signature 'Rcpp_SEMCpp' coef(object, ...)
object |
object of class Rcpp_SEMCpp |
... |
not used |
all coefficients of the model in transformed form
Returns the parameter estimates of a regularizedSEM
## S4 method for signature 'regularizedSEM' coef(object, ...)
object |
object of class regularizedSEM |
... |
criterion can be one of the ones returned by fitIndices. If set to NULL, all parameters will be returned |
parameters of the model as data.frame
Returns the parameter estimates of a regularizedSEMMixedPenalty
## S4 method for signature 'regularizedSEMMixedPenalty' coef(object, ...)
object |
object of class regularizedSEMMixedPenalty |
... |
criterion can be one of: "AIC", "BIC". If set to NULL, all parameters will be returned |
parameters of the model as data.frame
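A brief sketch, again using the mixed-penalty object regularized from the examples above:

# all estimates and tuning parameters:
coef(regularized)
# only the estimates of the model selected by the BIC:
coef(regularized, criterion = "BIC")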
Control the BFGS optimizer.
controlBFGS( startingValues = "est", initialHessian = ifelse(all(startingValues == "est"), "lavaan", "compute"), saveDetails = FALSE, stepSize = 0.9, sigma = 1e-05, gamma = 0, maxIterOut = 1000, maxIterIn = 1000, maxIterLine = 500, breakOuter = 1e-08, breakInner = 1e-10, convergenceCriterion = 0, verbose = 0, nCores = 1 )
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
initialHessian |
option to provide an initial Hessian to the optimizer. Must have row and column names corresponding to the parameter labels. Use getLavaanParameters(lavaanModel) to see those labels. If set to "gradNorm", the maximum of the gradients at the starting values times the stepSize will be used. This is adapted from Optim.jl (https://github.com/JuliaNLSolvers/Optim.jl/blob/f43e6084aacf2dabb2b142952acd3fbb0e268439/src/multivariate/solvers/first_order/bfgs.jl#L104). If set to a single value, a diagonal matrix with the single value along the diagonal will be used. The default is "lavaan", which extracts the Hessian from the lavaanModel. This Hessian will typically deviate from that of the internal SEM representation of lessSEM (due to the transformation of the variances), but works quite well in practice. |
saveDetails |
when set to TRUE, additional details about the individual models are saved. Currently, these are the Hessian and the implied means and covariances. Note: This may take a lot of memory! |
stepSize |
Initial stepSize of the outer iteration (theta_next = theta_previous + stepSize * Stepdirection) |
sigma |
only relevant when lineSearch = 'GLMNET'. Controls the sigma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. |
gamma |
Controls the gamma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. Defaults to 0. |
maxIterOut |
Maximal number of outer iterations |
maxIterIn |
Maximal number of inner iterations |
maxIterLine |
Maximal number of iterations for the line search procedure |
breakOuter |
Stopping criterion for outer iterations |
breakInner |
Stopping criterion for inner iterations |
convergenceCriterion |
which convergence criterion should be used for the outer iterations? possible are 0 = GLMNET, 1 = fitChange, 2 = gradients. Note that in case of gradients and GLMNET, we divide the gradients (and the Hessian) of the log-Likelihood by N as it would otherwise be considerably more difficult for larger sample sizes to reach the convergence criteria. |
verbose |
0 prints no additional information, > 0 prints GLMNET iterations |
nCores |
number of cores to use. Multi-core support is provided by RcppParallel and is only supported for SEM, not for general purpose optimization. |
object of class controlBFGS
control <- controlBFGS()
Control the GLMNET optimizer.
controlGlmnet( startingValues = "est", initialHessian = ifelse(all(startingValues == "est"), "lavaan", "compute"), saveDetails = FALSE, stepSize = 0.9, sigma = 1e-05, gamma = 0, maxIterOut = 1000, maxIterIn = 1000, maxIterLine = 500, breakOuter = 1e-08, breakInner = 1e-10, convergenceCriterion = 0, verbose = 0, nCores = 1 )
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
initialHessian |
option to provide an initial Hessian to the optimizer. Must have row and column names corresponding to the parameter labels. Use getLavaanParameters(lavaanModel) to see those labels. If set to "gradNorm", the maximum of the gradients at the starting values times the stepSize will be used. This is adapted from Optim.jl (https://github.com/JuliaNLSolvers/Optim.jl/blob/f43e6084aacf2dabb2b142952acd3fbb0e268439/src/multivariate/solvers/first_order/bfgs.jl#L104). If set to "compute", the initial Hessian will be computed. If set to a single value, a diagonal matrix with the single value along the diagonal will be used. The default is "lavaan", which extracts the Hessian from the lavaanModel. This Hessian will typically deviate from that of the internal SEM representation of lessSEM (due to the transformation of the variances), but works quite well in practice. |
saveDetails |
when set to TRUE, additional details about the individual models are saved. Currently, these are the Hessian and the implied means and covariances. Note: This may take a lot of memory! |
stepSize |
Initial stepSize of the outer iteration (theta_next = theta_previous + stepSize * Stepdirection) |
sigma |
only relevant when lineSearch = 'GLMNET'. Controls the sigma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. |
gamma |
Controls the gamma parameter in Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421. Defaults to 0. |
maxIterOut |
Maximal number of outer iterations |
maxIterIn |
Maximal number of inner iterations |
maxIterLine |
Maximal number of iterations for the line search procedure |
breakOuter |
Stopping criterion for outer iterations |
breakInner |
Stopping criterion for inner iterations |
convergenceCriterion |
which convergence criterion should be used for the outer iterations? possible are 0 = GLMNET, 1 = fitChange, 2 = gradients. Note that in case of gradients and GLMNET, we divide the gradients (and the Hessian) of the log-Likelihood by N as it would otherwise be considerably more difficult for larger sample sizes to reach the convergence criteria. |
verbose |
0 prints no additional information, > 0 prints GLMNET iterations |
nCores |
number of cores to use. Multi-core support is provided by RcppParallel and is only supported for SEM, not for general purpose optimization. |
object of class controlGlmnet
control <- controlGlmnet()
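A control object like this is typically passed on to one of the fitting functions documented below via their control argument. The following is a minimal illustrative sketch (not part of the package documentation); it assumes a fitted lavaan model named lavaanModel as created in the cvLasso examples further down:

# Sketch: customize the GLMNET optimizer settings and pass them to cvLasso.
control <- controlGlmnet(maxIterOut = 500,    # fewer outer iterations
                         breakOuter = 1e-06,  # looser outer stopping criterion
                         nCores = 1)
fit <- cvLasso(lavaanModel = lavaanModel,
               regularized = paste0("l", 6:15),
               lambdas = seq(0, 1, .1),
               method = "glmnet",
               control = control)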
controlIsta
controlIsta( startingValues = "est", saveDetails = FALSE, L0 = 0.1, eta = 2, accelerate = TRUE, maxIterOut = 10000, maxIterIn = 1000, breakOuter = 1e-08, convCritInner = 1, sigma = 0.1, stepSizeInheritance = ifelse(accelerate, 1, 3), verbose = 0, nCores = 1 )
startingValues |
option to provide initial starting values. Only used for the first lambda. Three options are supported. Setting to "est" will use the estimates from the lavaan model object. Setting to "start" will use the starting values of the lavaan model. Finally, a labeled vector with parameter values can be passed to the function which will then be used as starting values. |
saveDetails |
when set to TRUE, additional details about the individual models are saved. Currently, these are the implied means and covariances. Note: This may take a lot of memory! |
L0 |
L0 controls the step size used in the first iteration |
eta |
eta controls by how much the step size changes in the inner iterations with (eta^i)*L, where i is the inner iteration |
accelerate |
boolean: Should the acceleration outlined in Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231., p. 152 be used? |
maxIterOut |
maximal number of outer iterations |
maxIterIn |
maximal number of inner iterations |
breakOuter |
change in fit required to break the outer iteration. Note: The value will be multiplied internally with sample size N, as the -2 log-likelihood depends directly on the sample size. |
convCritInner |
this is related to the inner breaking condition: 0 = ista, as presented by Beck & Teboulle (2009), see Remark 3.1 on p. 191 (ISTA with backtracking); 1 = gist, as presented by Gong et al. (2013), Equation 3. |
sigma |
sigma in (0,1) is used by the gist convergence criterion. A larger sigma enforces a larger improvement in fit. |
stepSizeInheritance |
how should step sizes be carried forward from iteration to iteration? 0 = resets the step size to L0 in each iteration; 1 = takes the previous step size as initial value for the next iteration; 3 = Barzilai-Borwein procedure; 4 = Barzilai-Borwein procedure, but sometimes resets the step size, which can help when the optimizer is caught in a bad spot. |
verbose |
if set to a value > 0, the fit is printed every "verbose" iterations. |
nCores |
number of cores to use. Multi-core support is provided by RcppParallel and is only supported for SEM, not for general purpose optimization. |
object of class controlIsta
control <- controlIsta()
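As with controlGlmnet, the resulting object is passed to the fitting functions via their control argument when method = "ista" is used. A minimal sketch (assuming a fitted lavaan model named lavaanModel as in the cvLasso examples below):

# Sketch: switch to the (accelerated) ista optimizer with customized settings.
control <- controlIsta(accelerate = TRUE,
                       maxIterOut = 5000)
fit <- cvLasso(lavaanModel = lavaanModel,
               regularized = paste0("l", 6:15),
               lambdas = seq(0, 1, .1),
               method = "ista",
               control = control)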
Extract the labels of all covariances found in a lavaan model.
covariances(lavaanModel)
lavaanModel |
fitted lavaan model |
vector with parameter labels
# The following is adapted from ?lavaan::sem
library(lessSEM)

model <- '
  # latent variable definitions
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + a*y2 + b*y3 + c*y4
  dem65 =~ y5 + a*y6 + b*y7 + c*y8
  # regressions
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
  # residual correlations
  y1 ~~ y5
  y2 ~~ y4 + y6
  y3 ~~ y7
  y4 ~~ y8
  y6 ~~ y8
'

fit <- sem(model, data = PoliticalDemocracy)
covariances(fit)
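The returned labels can, for instance, be passed to the regularized argument of the penalty functions. The following sketch is purely illustrative (it is not part of the original example) and reuses the fit object from above:

# Sketch: regularize all covariances of the model with a cross-validated lasso.
covLabels <- covariances(fit)
regFit <- cvLasso(lavaanModel = fit,
                  regularized = covLabels,
                  lambdas = seq(0, 1, .1),
                  k = 5)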
create subsets for cross-validation
createSubsets(N, k)
N |
number of samples in the data set |
k |
number of subsets to create |
matrix with subsets
createSubsets(N = 100, k = 5)
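The resulting matrix can be passed to the k argument of the cross-validation functions instead of a fold count. A minimal sketch (lavaanModel is assumed to be a fitted lavaan model with 100 observations, as in the cvLasso examples):

# Sketch: supply a custom subset matrix instead of the default folds.
subsets <- createSubsets(N = 100, k = 5)
fit <- cvLasso(lavaanModel = lavaanModel,
               regularized = paste0("l", 6:15),
               lambdas = seq(0, 1, .1),
               k = subsets)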
generates lambda values between 0 and maxLambda using the function described here: https://math.stackexchange.com/questions/384613/exponential-function-with-values-between-0-and-1-for-x-values-between-0-and-1. The function is identical to the one implemented in the regCtsem package.
curveLambda(maxLambda, lambdasAutoCurve, lambdasAutoLength)
maxLambda |
maximal lambda value |
lambdasAutoCurve |
controls the curve. A value close to 1 will result in a linear increase, while larger values result in lambda values that are more concentrated close to 0. |
lambdasAutoLength |
number of lambda values to generate |
numeric vector
library(lessSEM)
plot(curveLambda(maxLambda = 10, lambdasAutoCurve = 1, lambdasAutoLength = 100))
plot(curveLambda(maxLambda = 10, lambdasAutoCurve = 5, lambdasAutoLength = 100))
plot(curveLambda(maxLambda = 10, lambdasAutoCurve = 100, lambdasAutoLength = 100))
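The generated values can be used wherever a lambda grid is required. A minimal sketch (assuming a fitted lavaan model named lavaanModel as in the cvLasso examples below; the choice of maxLambda = 1 is arbitrary):

# Sketch: use curveLambda to build a lambda grid that is denser near zero.
lambdas <- curveLambda(maxLambda = 1, lambdasAutoCurve = 5, lambdasAutoLength = 10)
fit <- cvLasso(lavaanModel = lavaanModel,
               regularized = paste0("l", 6:15),
               lambdas = lambdas)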
Implements cross-validated adaptive lasso regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * w_j * |x_j|,

where w_j is the weight for parameter x_j (by default the inverse of the absolute value of the unregularized estimate). Adaptive lasso regularization will set parameters to zero if lambda is large enough.
cvAdaptiveLasso( lavaanModel, regularized, weights = NULL, lambdas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvAdaptiveLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, .1))

# use the plot-function to plot the cross-validation fit
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# The best parameters can also be extracted with:
estimates(lsem)
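The weights argument allows replacing the default adaptive-lasso weights. The following sketch (not part of the original example) simply reconstructs the documented default, the inverse of the absolute unregularized estimates, and passes it explicitly:

# Sketch: pass the adaptive-lasso weights manually.
weights <- 1 / abs(getLavaanParameters(lavaanModel))
lsemW <- cvAdaptiveLasso(lavaanModel = lavaanModel,
                         regularized = paste0("l", 6:15),
                         weights = weights,
                         lambdas = seq(0, 1, .1))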
Implements cappedL1 regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * min(|x_j|, theta),

where theta > 0. The cappedL1 penalty is identical to the lasso for parameters which are below theta and identical to a constant for parameters above theta. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
cvCappedL1( lavaanModel, regularized, lambdas, thetas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), method = "glmnet", control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvCappedL1(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 5),
  thetas = seq(0.01, 2, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
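Setting returnSubsetParameters = TRUE stores the estimates of every training set in the returned object. A minimal sketch building on the example above:

# Sketch: keep the parameter estimates of each training set.
lsemSub <- cvCappedL1(lavaanModel = lavaanModel,
                      regularized = paste0("l", 6:15),
                      lambdas = seq(0, 1, length.out = 5),
                      thetas = seq(0.01, 2, length.out = 3),
                      returnSubsetParameters = TRUE)
lsemSub@subsetParameters  # estimates for all tuning-parameter combinations in all subsets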
Implements elastic net regularization for structural equation models. The penalty function is given by:

p(x_j) = alpha * lambda * |x_j| + (1 - alpha) * lambda * x_j^2.

Note that the elastic net combines ridge and lasso regularization. If alpha = 0, the elastic net reduces to ridge regularization. If alpha = 1, it reduces to lasso regularization. In between, elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
cvElasticNet( lavaanModel, regularized, lambdas, alphas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvElasticNet(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 5),
  alphas = seq(0, 1, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
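Because the elastic net is tuned over both lambda and alpha, it can be useful to look at the cross-validation fit of every combination. A minimal sketch building on the example above; the combination with the smallest cross-validation fit is the one selected for the reported estimates:

# Sketch: inspect the cross-validation fits for all lambda-alpha combinations.
cvFits <- lsem@cvfits
head(cvFits)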
Implements cross-validated lasso regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * |x_j|.

Lasso regularization will set parameters to zero if lambda is large enough.
cvLasso( lavaanModel, regularized, lambdas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, .1),
  k = 5,              # number of cross-validation folds
  standardize = TRUE) # automatic standardization

# use the plot-function to plot the cross-validation fit:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# The best parameters can also be extracted with:
estimates(lsem)
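As noted in the Details, missing data can be handled with full information maximum likelihood by fitting the lavaan model with missing = 'ml'. A minimal sketch building on the example above (the missing values are introduced artificially here purely for illustration):

# Sketch: cross-validated lasso with full information maximum likelihood.
datasetMissing <- dataset
datasetMissing[sample(nrow(datasetMissing), 10), "y1"] <- NA
lavaanModelMissing <- lavaan::sem(lavaanSyntax,
                                  data = datasetMissing,
                                  meanstructure = TRUE,
                                  std.lv = TRUE,
                                  missing = "ml")
lsemMissing <- cvLasso(lavaanModel = lavaanModelMissing,
                       regularized = paste0("l", 6:15),
                       lambdas = seq(0, 1, .1))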
Implements lsp (log-sum penalty) regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * log(1 + |x_j| / theta),

where theta > 0.
cvLsp( lavaanModel, regularized, lambdas, thetas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), method = "glmnet", control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvLsp(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 5),
  thetas = seq(0.01, 2, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
Implements mcp (minimax concave penalty) regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * |x_j| - x_j^2 / (2 * theta)   if |x_j| <= theta * lambda
p(x_j) = theta * lambda^2 / 2                   if |x_j| > theta * lambda,

where theta > 0.
cvMcp( lavaanModel, regularized, lambdas, thetas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), method = "ista", control = lessSEM::controlIsta() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvMcp(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 5),
  thetas = seq(0.01, 2, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
Class for cross-validated regularized SEM
parameters
data.frame with parameter estimates for the best combination of the tuning parameters
transformations
transformed parameters
cvfits
data.frame with all combinations of the tuning parameters and the sum of the cross-validation fits
parameterLabels
character vector with names of all parameters
regularized
character vector with names of regularized parameters
cvfitsDetails
data.frame with cross-validation fits for each subset
subsets
matrix indicating which person is in which subset
subsetParameters
optional: data.frame with parameter estimates for all combinations of the tuning parameters in all subsets
misc
list with additional return elements
notes
internal notes that have come up when fitting the model
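The slots listed above are accessed with the @ operator. A minimal sketch, assuming that lsem is the result of one of the cross-validation functions (e.g., cvLasso):

# Sketch: accessing the slots of a cvRegularizedSEM object.
lsem@parameters      # parameter estimates for the best tuning-parameter combination
lsem@cvfits          # all tuning-parameter combinations and their cross-validation fits
lsem@cvfitsDetails   # cross-validation fits for each subset
lsem@subsets         # which person belongs to which subset
lsem@regularized     # names of the regularized parameters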
Implements ridge regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * x_j^2.

Note that ridge regularization will not set any of the parameters to zero but results in a shrinkage towards zero.
cvRidge( lavaanModel, regularized, lambdas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvRidge(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 20))

# use the plot-function to plot the cross-validation fit:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters
Implements cross-validated ridge regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * x_j^2.

Note that ridge regularization will not set any of the parameters to zero but results in a shrinkage towards zero.
cvRidgeBfgs( lavaanModel, regularized, lambdas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), control = lessSEM::controlBFGS() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE)

# Regularization:
lsem <- cvRidgeBfgs(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 20))

# use the plot-function to plot the cross-validation fit:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters
Implements scad (smoothly clipped absolute deviation) regularization for structural equation models. The penalty function is given by:

p(x_j) = lambda * |x_j|                                                          if |x_j| <= lambda
p(x_j) = (-x_j^2 + 2 * theta * lambda * |x_j| - lambda^2) / (2 * (theta - 1))    if lambda < |x_j| <= theta * lambda
p(x_j) = (theta + 1) * lambda^2 / 2                                              if |x_j| > theta * lambda,

where theta > 2.
cvScad( lavaanModel, regularized, lambdas, thetas, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), method = "glmnet", control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix with booleans (TRUE, FALSE) which indicates for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures. |
control |
used to control the optimizer. This element is generated with the controlIsta function. See ?controlIsta |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:
lsem <- cvScad(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 3),
  thetas = seq(2.01, 5, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
Uses the means and standard deviations of the training set to standardize the test set. See, e.g., https://scikit-learn.org/stable/modules/cross_validation.html.
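Conceptually, this amounts to subtracting the training-set means and dividing by the training-set standard deviations column-wise. A minimal sketch of the same computation in base R, assuming a numeric matrix testSet and vectors means and standardDeviations of matching length (this is an illustration of the idea, not the package's internal code):

# column-wise standardization of the test set with training-set statistics
scaled <- sweep(testSet, MARGIN = 2, STATS = means, FUN = "-")
scaled <- sweep(scaled, MARGIN = 2, STATS = standardDeviations, FUN = "/")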
cvScaler(testSet, means, standardDeviations)
testSet |
test data set |
means |
means of the training set |
standardDeviations |
standard deviations of the training set |
scaled test set
library(lessSEM)

data <- matrix(rnorm(50), 10, 5)

cvScaler(testSet = data,
         means = 1:5,
         standardDeviations = 1:5)
Implements cross-validated smooth adaptive lasso regularization for structural equation models. The penalty function is given by:

$$p(x_j) = \lambda \cdot w_j \cdot \sqrt{x_j^2 + \epsilon},$$

where $\epsilon > 0$ is a small constant and $w_j$ is the adaptive lasso weight of parameter $x_j$.
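The "smooth" part refers to replacing the non-differentiable absolute value |x| with the differentiable approximation sqrt(x^2 + epsilon), as used in the comparison code elsewhere in this manual. A minimal sketch showing how epsilon controls the quality of the approximation (the epsilon values are illustrative):

# the smooth approximation converges to abs(x) as epsilon approaches zero
smoothAbs <- function(x, epsilon) sqrt(x^2 + epsilon)

x <- seq(-1, 1, by = .25)
rbind(abs      = abs(x),
      eps_1e_2 = smoothAbs(x, epsilon = 1e-2),
      eps_1e_8 = smoothAbs(x, epsilon = 1e-8))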
cvSmoothAdaptiveLasso( lavaanModel, regularized, weights = NULL, lambdas, epsilon, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), control = lessSEM::controlBFGS() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix of booleans (TRUE, FALSE) indicating for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml')
.
lessSEM will then automatically switch to full information maximum likelihood
as well.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:
lsem <- cvSmoothAdaptiveLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, .1),
  epsilon = 1e-8)

# use the plot-function to plot the cross-validation fit
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# The best parameters can also be extracted with:
coef(lsem)
Implements cross-validated smooth elastic net regularization for structural equation models. The penalty function is given by:

$$p(x_j) = \alpha\lambda\sqrt{x_j^2 + \epsilon} + (1-\alpha)\lambda x_j^2.$$

Note that the smooth elastic net combines ridge and smooth lasso regularization. If $\alpha = 0$, the elastic net reduces to ridge regularization. If $\alpha = 1$ it reduces to smooth lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
cvSmoothElasticNet( lavaanModel, regularized, lambdas, alphas, epsilon, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), control = lessSEM::controlBFGS() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix of booleans (TRUE, FALSE) indicating for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml')
.
lessSEM will then automatically switch to full information maximum likelihood
as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:
lsem <- cvSmoothElasticNet(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  epsilon = 1e-8,
  lambdas = seq(0, 1, length.out = 5),
  alphas = .3)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# optional: plotting the cross-validation fit requires installation of plotly
# plot(lsem)
Implements cross-validated smooth lasso regularization for structural equation models. The penalty function is given by:

$$p(x_j) = \lambda\sqrt{x_j^2 + \epsilon},$$

where $\epsilon > 0$ is a small constant that controls the smoothness of the approximation to the absolute value.
cvSmoothLasso( lavaanModel, regularized, lambdas, epsilon, k = 5, standardize = FALSE, returnSubsetParameters = FALSE, modifyModel = lessSEM::modifyModel(), control = lessSEM::controlBFGS() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
k |
the number of cross-validation folds. Alternatively, you can pass a matrix of booleans (TRUE, FALSE) indicating for each person which subset it belongs to. See ?lessSEM::createSubsets for an example of what this matrix should look like. |
standardize |
Standardizing your data prior to the analysis can undermine the cross-validation. Set standardize=TRUE to automatically standardize the data. |
returnSubsetParameters |
set to TRUE to return the parameters for each training set |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml')
.
lessSEM will then automatically switch to full information maximum likelihood
as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
model of class cvRegularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:
lsem <- cvSmoothLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, .1),
  k = 5, # number of cross-validation folds
  epsilon = 1e-8,
  standardize = TRUE) # automatic standardization

# use the plot-function to plot the cross-validation fit:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters

# The best parameters can also be extracted with:
coef(lsem)
Implements elastic net regularization for structural equation models. The penalty function is given by:

$$p(x_j) = \alpha\lambda|x_j| + (1-\alpha)\lambda x_j^2.$$

Note that the elastic net combines ridge and lasso regularization. If $\alpha = 0$, the elastic net reduces to ridge regularization. If $\alpha = 1$ it reduces to lasso regularization. In between, the elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
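To make the mixing explicit, the penalty above can be evaluated at the two extremes. This is a minimal sketch based on the formula as reconstructed above; the function name elasticNetPenalty and the chosen values are illustrative only:

# elastic net penalty for a single parameter value x
elasticNetPenalty <- function(x, lambda, alpha) {
  alpha * lambda * abs(x) + (1 - alpha) * lambda * x^2
}

x <- .8; lambda <- .4
elasticNetPenalty(x, lambda, alpha = 0)   # ridge:  lambda * x^2
elasticNetPenalty(x, lambda, alpha = 1)   # lasso:  lambda * |x|
elasticNetPenalty(x, lambda, alpha = .5)  # compromise between both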
elasticNet( lavaanModel, regularized, lambdas, alphas, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the lessSEM::controlIsta() and controlGlmnet() functions. |
Identical to regsem, models are specified using lavaan. Currently,
most standard SEM are supported. lessSEM also provides full information
maximum likelihood for missing data. To use this functionality,
fit your lavaan model with the argument sem(..., missing = 'ml')
.
lessSEM will then automatically switch to full information maximum likelihood
as well.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:
lsem <- elasticNet(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 5),
  alphas = seq(0, 1, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# optional: plotting the paths requires installation of plotly
# plot(lsem)

#### Advanced ###
# Switching the optimizer
#
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- elasticNet(
  lavaanModel = lavaanModel,
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 5),
  alphas = seq(0, 1, length.out = 3),
  method = "ista",
  control = controlIsta())

# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
S4 method to extract the estimates of an object
estimates(object, criterion = NULL, transformations = FALSE)
object |
a model fitted with lessSEM |
criterion |
fit index (e.g., AIC) used to select the parameters |
transformations |
boolean: Should transformations be returned? |
returns a matrix with estimates
estimates
## S4 method for signature 'cvRegularizedSEM' estimates(object, criterion = NULL, transformations = FALSE)
object |
object of class cvRegularizedSEM |
criterion |
not used |
transformations |
boolean: Should transformations be returned? |
returns a matrix with estimates
estimates
## S4 method for signature 'regularizedSEM' estimates(object, criterion = NULL, transformations = FALSE)
object |
object of class regularizedSEM |
criterion |
fit index (e.g., AIC) used to select the parameters |
transformations |
boolean: Should transformations be returned? |
returns a matrix with estimates
estimates
## S4 method for signature 'regularizedSEMMixedPenalty' estimates(object, criterion = NULL, transformations = FALSE)
object |
object of class regularizedSEMMixedPenalty |
criterion |
fit index (e.g., AIC) used to select the parameters |
transformations |
boolean: Should transformations be returned? |
returns a matrix with estimates
Optimizes an object with mixed penalty. See ?mixedPenalty for more details.
fit(mixedPenalty)
mixedPenalty |
object of class mixedPenalty. This object can be created with the mixedPenalty function. Penalties can be added with the addCappedL1, addElasticNet, addLasso, addLsp, addMcp, and addScad functions. |
throws error in case of undefined penalty combinations.
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:
regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addElasticNet(regularized = paste0("l", 11:15),
                lambdas = seq(0, 1, .1),
                alphas = .4) |>
  # fit the model:
  fit()
S4 method to compute fit indices (e.g., AIC, BIC, ...)
fitIndices(object)
object |
a model fitted with lessSEM |
returns a data.frame with fit indices
fitIndices
## S4 method for signature 'cvRegularizedSEM' fitIndices(object)
object |
object of class cvRegularizedSEM |
returns a data.frame with fit indices
fitIndices
## S4 method for signature 'regularizedSEM' fitIndices(object)
object |
object of class regularizedSEM |
returns a data.frame with fit indices
fitIndices
## S4 method for signature 'regularizedSEMMixedPenalty' fitIndices(object)
object |
object of class regularizedSEMMixedPenalty |
returns a data.frame with fit indices
helper function: returns a labeled vector with parameters from lavaan
getLavaanParameters(lavaanModel, removeDuplicates = TRUE)
lavaanModel |
model of class lavaan |
removeDuplicates |
should duplicated parameters be removed? |
returns a labeled vector with parameters from lavaan
library(lessSEM)

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

getLavaanParameters(lavaanModel)
Returns the lambda, theta, and alpha values for the tuning parameters of a regularized SEM with mixed penalty.
getTuningParameterConfiguration( regularizedSEMMixedPenalty, tuningParameterConfiguration )
regularizedSEMMixedPenalty |
object of type regularizedSEMMixedPenalty (see ?mixedPenalty) |
tuningParameterConfiguration |
integer indicating which tuningParameterConfiguration should be extracted (e.g., 1). See the entry in the row tuningParameterConfiguration of regularizedSEMMixedPenalty@fits and regularizedSEMMixedPenalty@parameters. |
data frame with penalty and tuning parameter settings
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# We can add mixed penalties as follows:
regularized <- lavaanModel |>
  # create template for regularized model with mixed penalty:
  mixedPenalty() |>
  # add penalty on loadings l11 - l15:
  addLsp(regularized = paste0("l", 11:15),
         lambdas = seq(0, 1, .1),
         thetas = 2.3) |>
  # fit the model:
  fit()

getTuningParameterConfiguration(regularizedSEMMixedPenalty = regularized,
                                tuningParameterConfiguration = 2)
Object for cappedL1 optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for cappedL1 optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for elastic net optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, an R function to compute the fit, an R function to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
Object for elastic net optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEXP function pointer to compute the fit, a SEXP function pointer to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
Object for elastic net optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
Object for elastic net optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
Object for lsp optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for lsp optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mcp optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mcp optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mixed optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mixed optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mixed optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mixed optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for scad optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for scad optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires a list with control elements
setHessian
changes the Hessian of the model. Expects a matrix
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Implements adaptive lasso regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \lambda \cdot w_j \cdot |x_j|,$$

where $w_j$ is the weight of parameter $x_j$. Adaptive lasso regularization will set parameters to zero if $\lambda$ is large enough.
gpAdaptiveLasso( par, regularized, weights = NULL, fn, gr = NULL, lambdas = NULL, nLambdas = NULL, reverse = TRUE, curve = 1, ..., method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
weights |
labeled vector with adaptive lasso weights. NULL will use 1/abs(par) |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambda: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambda values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed via the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
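As an illustration of the kind of gradient function that could be supplied, here is a hedged sketch of an analytic gradient for the least-squares fitting function used in the example below; the scaling by 1/N mirrors that fitting function, and the function name gradientFunction as well as the assumption that the returned vector should carry the parameter labels are illustrative:

# analytic gradient of the (.5/N) * sum((y - X %*% par)^2) fitting function
gradientFunction <- function(par, y, X, N){
  residuals <- y - X %*% matrix(par, ncol = 1)
  gradients <- -(1/N) * t(X) %*% residuals
  # return a labeled vector matching the parameter labels:
  gradients <- as.vector(gradients)
  names(gradients) <- names(par)
  return(gradients)
}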
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(lessSEM)
set.seed(123)

# first, we simulate data for our
# linear regression.
N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)

# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.

# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
  # par is the parameter vector
  # y is the observed dependent variable
  # X is the design matrix
  # N is the sample size
  pred <- X %*% matrix(par, ncol = 1) # be explicit here:
  # we need par to be a column vector
  sse <- sum((y - pred)^2)
  # we scale with .5/N to get the same results as glmnet
  return((.5/N)*sse)
}

# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))

# names of regularized parameters
regularized <- paste0("b",1:p)

# define the weight for each of the parameters
weights <- 1/abs(b)
# we will re-scale the weights for equivalence to glmnet.
# see ?glmnet for more details
weights <- length(b)*weights/sum(weights)

# optimize
adaptiveLassoPen <- gpAdaptiveLasso(
  par = b,
  regularized = regularized,
  weights = weights,
  fn = fittingFunction,
  lambdas = seq(0,1,.01),
  X = X,
  y = y,
  N = N
)

plot(adaptiveLassoPen)

# You can access the fit results as follows:
adaptiveLassoPen@fits
# Note that we won't compute any fit measures automatically, as
# we cannot be sure how the AIC, BIC, etc. are defined for your objective function

# for comparison:
# library(glmnet)
# coef(glmnet(x = X,
#             y = y,
#             penalty.factor = weights,
#             lambda = adaptiveLassoPen@fits$lambda[20],
#             intercept = FALSE,
#             standardize = FALSE))[,1]
# adaptiveLassoPen@parameters[20,]
Implements adaptive lasso regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \lambda \cdot w_j \cdot |x_j|,$$

where $w_j$ is the weight of parameter $x_j$. Adaptive lasso regularization will set parameters to zero if $\lambda$ is large enough.
gpAdaptiveLassoCpp( par, regularized, weights = NULL, fn, gr, lambdas = NULL, nLambdas = NULL, curve = 1, additionalArguments, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
weights |
labeled vector with adaptive lasso weights. NULL will use 1/abs(par) |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambda: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambda values between 0 and the computed lambda. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. These functions must take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& as their second argument.
Adaptive lasso regularization:
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(Rcpp)
library(lessSEM)

linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * y.n_elem);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/y.n_rows);

  return(gradients);
}

// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
                            Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;

typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
                                       Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;

// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
  return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}

// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
  return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'

Rcpp::sourceCpp(code = linreg)

ffp <- fitfunPtr()
gfp <- gradfunPtr()

N <- 100 # number of persons
p <- 10 # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2)

data <- list("y" = y,
             "X" = cbind(1,X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))

al1 <- gpAdaptiveLassoCpp(par = parameters,
                          regularized = paste0("b", 1:(length(b)-1)),
                          fn = ffp,
                          gr = gfp,
                          lambdas = seq(0,1,.1),
                          additionalArguments = data)

al1@parameters
Implements cappedL1 regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \lambda \min(|x_j|, \theta),$$

where $\theta > 0$. The cappedL1 penalty is identical to the lasso for parameters which are below $\theta$ and identical to a constant for parameters above $\theta$. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
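A minimal sketch of this penalty in R, mirroring the comparison code in the example below (the function name cappedL1Penalty and the chosen lambda and theta values are illustrative):

# cappedL1 penalty: lasso below theta, constant above theta
cappedL1Penalty <- function(x, lambda, theta) {
  lambda * pmin(abs(x), theta)
}

x <- seq(-2, 2, by = .5)
cappedL1Penalty(x, lambda = .5, theta = 1) # the penalty is capped at lambda * theta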
gpCappedL1( par, fn, gr = NULL, ..., regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument; the remaining arguments are passed via the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
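As a sketch of what such a gradient function could look like for the linear-regression objective used in the example below (assuming the same .5/N scaling and the same additional arguments as the fit function), one might write:

# illustration only: analytic gradient of the (.5/N) * sum of squared errors
# objective used in the example below; it accepts the same arguments as fn
gradientFunction <- function(par, y, X, N){
  pred <- X %*% matrix(par, ncol = 1)
  # gradient of (.5/N) * sum((y - pred)^2) with respect to par:
  grad <- -(1/N) * t(X) %*% (y - pred)
  grad <- as.vector(grad)
  names(grad) <- names(par) # keep the parameter labels
  return(grad)
}
# it could then be passed via the gr argument, e.g.
# gpCappedL1(par = b, regularized = regularized,
#            fn = fittingFunction, gr = gradientFunction, ...)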
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers # for other objective functions. We will use # a linear regression as an example. Note that # this is not a useful application of the optimizers # as there are specialized packages for linear regression # (e.g., glmnet) library(lessSEM) set.seed(123) # first, we simulate data for our # linear regression. N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) # First, we must construct a fiting function # which returns a single value. We will use # the residual sum squared as fitting function. # Let's start setting up the fitting function: fittingFunction <- function(par, y, X, N){ # par is the parameter vector # y is the observed dependent variable # X is the design matrix # N is the sample size pred <- X %*% matrix(par, ncol = 1) #be explicit here: # we need par to be a column vector sse <- sum((y - pred)^2) # we scale with .5/N to get the same results as glmnet return((.5/N)*sse) } # let's define the starting values: b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates names(b) <- paste0("b", 1:length(b)) # names of regularized parameters regularized <- paste0("b",1:p) # optimize cL1 <- gpCappedL1( par = b, regularized = regularized, fn = fittingFunction, lambdas = seq(0,1,.1), thetas = c(0.001, .5, 1), X = X, y = y, N = N ) # optional: plot requires plotly package # plot(cL1) # for comparison fittingFunction <- function(par, y, X, N, lambda, theta){ pred <- X %*% matrix(par, ncol = 1) sse <- sum((y - pred)^2) smoothAbs <- sqrt(par^2 + 1e-8) pen <- lambda * ifelse(smoothAbs < theta, smoothAbs, theta) return((.5/N)*sse + sum(pen)) } round( optim(par = b, fn = fittingFunction, y = y, X = X, N = N, lambda = cL1@fits$lambda[15], theta = cL1@fits$theta[15], method = "BFGS")$par, 4) cL1@parameters[15,]
Implements cappedL1 regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \lambda \min(|x_j|, \theta),$$

where $\theta > 0$. The cappedL1 penalty is identical to the lasso for parameters which are below $\theta$ and identical to a constant for parameters above $\theta$. As adding a constant to the fitting function will not change its minimum, larger parameters can stay unregularized while smaller ones are set to zero.
gpCappedL1Cpp( par, fn, gr, additionalArguments, regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
parameters whose absolute value is above this threshold will be penalized with a constant (theta) |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but is a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both functions must take a const Rcpp::NumericVector& with the parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument.
CappedL1 regularization:
Zhang, T. (2010). Analysis of Multi-stage Convex Relaxation for Sparse Regularization. Journal of Machine Learning Research, 11, 1081–1107.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers # for C++ objective functions. We will use # a linear regression as an example. Note that # this is not a useful application of the optimizers # as there are specialized packages for linear regression # (e.g., glmnet) library(Rcpp) library(lessSEM) linreg <- ' // [[Rcpp::depends(RcppArmadillo)]] #include <RcppArmadillo.h> // [[Rcpp::export]] double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){ // extract all required elements: arma::colvec b = Rcpp::as<arma::colvec>(parameters); arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix // compute the sum of squared errors: arma::mat sse = arma::trans(y-X*b)*(y-X*b); // other packages, such as glmnet, scale the sse with // 1/(2*N), where N is the sample size. We will do that here as well sse *= 1.0/(2.0 * y.n_elem); // note: We must return a double, but the sse is a matrix // To get a double, just return the single value that is in // this matrix: return(sse(0,0)); } // [[Rcpp::export]] arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){ // extract all required elements: arma::colvec b = Rcpp::as<arma::colvec>(parameters); arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix // note: we want to return our gradients as row-vector; therefore, // we have to transpose the resulting column-vector: arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b); // other packages, such as glmnet, scale the sse with // 1/(2*N), where N is the sample size. We will do that here as well gradients *= (.5/y.n_rows); return(gradients); } // https://gallery.rcpp.org/articles/passing-cpp-function-pointers/ typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters Rcpp::List& //additional elements ); typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t; typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters Rcpp::List& //additional elements ); typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t; // [[Rcpp::export]] fitFunPtr_t fitfunPtr() { return(fitFunPtr_t(new fitFunPtr(&fitfunction))); } // [[Rcpp::export]] gradientFunPtr_t gradfunPtr() { return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction))); } ' Rcpp::sourceCpp(code = linreg) ffp <- fitfunPtr() gfp <- gradfunPtr() N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) data <- list("y" = y, "X" = cbind(1,X)) parameters <- rep(0, ncol(data$X)) names(parameters) <- paste0("b", 0:(length(parameters)-1)) cL1 <- gpCappedL1Cpp(par = parameters, regularized = paste0("b", 1:(length(b)-1)), fn = ffp, gr = gfp, lambdas = seq(0,1,.1), thetas = seq(0.1,1,.1), additionalArguments = data) cL1@parameters
Implements elastic net regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \alpha \lambda |x_j| + (1-\alpha) \lambda x_j^2.$$

Note that the elastic net combines ridge and lasso regularization. If $\alpha = 0$, the elastic net reduces to ridge regularization. If $\alpha = 1$, it reduces to lasso regularization. In between, elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
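A small R sketch (illustration only; the helper elasticNetPenalty is made up for this purpose) shows how the two limiting cases arise:

# illustration only: elastic net penalty for a single parameter value x
elasticNetPenalty <- function(x, lambda, alpha) {
  alpha * lambda * abs(x) + (1 - alpha) * lambda * x^2
}
x <- 0.5
elasticNetPenalty(x, lambda = 0.2, alpha = 1)   # pure lasso:  0.2 * |0.5|  = 0.1
elasticNetPenalty(x, lambda = 0.2, alpha = 0)   # pure ridge:  0.2 * 0.5^2  = 0.05
elasticNetPenalty(x, lambda = 0.2, alpha = 0.5) # compromise:  0.075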
gpElasticNet( par, regularized, fn, gr = NULL, lambdas, alphas, ..., method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument; the remaining arguments are passed via the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers # for other objective functions. We will use # a linear regression as an example. Note that # this is not a useful application of the optimizers # as there are specialized packages for linear regression # (e.g., glmnet) library(lessSEM) set.seed(123) # first, we simulate data for our # linear regression. N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) # First, we must construct a fiting function # which returns a single value. We will use # the residual sum squared as fitting function. # Let's start setting up the fitting function: fittingFunction <- function(par, y, X, N){ # par is the parameter vector # y is the observed dependent variable # X is the design matrix # N is the sample size pred <- X %*% matrix(par, ncol = 1) #be explicit here: # we need par to be a column vector sse <- sum((y - pred)^2) # we scale with .5/N to get the same results as glmnet return((.5/N)*sse) } # let's define the starting values: b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates names(b) <- paste0("b", 1:length(b)) # names of regularized parameters regularized <- paste0("b",1:p) # optimize elasticNetPen <- gpElasticNet( par = b, regularized = regularized, fn = fittingFunction, lambdas = seq(0,1,.1), alphas = c(0, .5, 1), X = X, y = y, N = N ) # optional: plot requires plotly package # plot(elasticNetPen) # for comparison: fittingFunction <- function(par, y, X, N, lambda, alpha){ pred <- X %*% matrix(par, ncol = 1) sse <- sum((y - pred)^2) return((.5/N)*sse + (1-alpha)*lambda * sum(par^2) + alpha*lambda *sum(sqrt(par^2 + 1e-8))) } round( optim(par = b, fn = fittingFunction, y = y, X = X, N = N, lambda = elasticNetPen@fits$lambda[15], alpha = elasticNetPen@fits$alpha[15], method = "BFGS")$par, 4) elasticNetPen@parameters[15,]
Implements elastic net regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \alpha \lambda |x_j| + (1-\alpha) \lambda x_j^2.$$

Note that the elastic net combines ridge and lasso regularization. If $\alpha = 0$, the elastic net reduces to ridge regularization. If $\alpha = 1$, it reduces to lasso regularization. In between, elastic net is a compromise between the shrinkage of the lasso and the ridge penalty.
gpElasticNetCpp( par, regularized, fn, gr, lambdas, alphas, additionalArguments, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function |
lambdas |
numeric vector: values for the tuning parameter lambda |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but is a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both functions must take a const Rcpp::NumericVector& with the parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument.
Elastic net regularization:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers # for C++ objective functions. We will use # a linear regression as an example. Note that # this is not a useful application of the optimizers # as there are specialized packages for linear regression # (e.g., glmnet) library(Rcpp) library(lessSEM) linreg <- ' // [[Rcpp::depends(RcppArmadillo)]] #include <RcppArmadillo.h> // [[Rcpp::export]] double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){ // extract all required elements: arma::colvec b = Rcpp::as<arma::colvec>(parameters); arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix // compute the sum of squared errors: arma::mat sse = arma::trans(y-X*b)*(y-X*b); // other packages, such as glmnet, scale the sse with // 1/(2*N), where N is the sample size. We will do that here as well sse *= 1.0/(2.0 * y.n_elem); // note: We must return a double, but the sse is a matrix // To get a double, just return the single value that is in // this matrix: return(sse(0,0)); } // [[Rcpp::export]] arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){ // extract all required elements: arma::colvec b = Rcpp::as<arma::colvec>(parameters); arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix // note: we want to return our gradients as row-vector; therefore, // we have to transpose the resulting column-vector: arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b); // other packages, such as glmnet, scale the sse with // 1/(2*N), where N is the sample size. We will do that here as well gradients *= (.5/y.n_rows); return(gradients); } // Dirk Eddelbuettel at // https://gallery.rcpp.org/articles/passing-cpp-function-pointers/ typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters Rcpp::List& //additional elements ); typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t; typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters Rcpp::List& //additional elements ); typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t; // [[Rcpp::export]] fitFunPtr_t fitfunPtr() { return(fitFunPtr_t(new fitFunPtr(&fitfunction))); } // [[Rcpp::export]] gradientFunPtr_t gradfunPtr() { return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction))); } ' Rcpp::sourceCpp(code = linreg) ffp <- fitfunPtr() gfp <- gradfunPtr() N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) data <- list("y" = y, "X" = cbind(1,X)) parameters <- rep(0, ncol(data$X)) names(parameters) <- paste0("b", 0:(length(parameters)-1)) en <- gpElasticNetCpp(par = parameters, regularized = paste0("b", 1:(length(b)-1)), fn = ffp, gr = gfp, lambdas = seq(0,1,.1), alphas = c(0,.5,1), additionalArguments = data) en@parameters
Implements lasso regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \lambda |x_j|.$$

Lasso regularization will set parameters to zero if $\lambda$ is large enough.
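The reason the lasso can set parameters exactly to zero is the soft-thresholding operation used by proximal-gradient methods such as ISTA (see the references below); a minimal R sketch, purely for illustration (this is not the package's internal code):

# illustration only: soft-thresholding, the proximal operator of lambda * |x|
softThreshold <- function(x, lambda) {
  sign(x) * pmax(abs(x) - lambda, 0)
}
# parameters whose absolute value is below lambda are set exactly to zero:
softThreshold(x = c(-2, -0.3, 0.1, 1.5), lambda = 0.5)
# -1.5  0.0  0.0  1.0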
gpLasso( par, regularized, fn, gr = NULL, lambdas = NULL, nLambdas = NULL, reverse = TRUE, curve = 1, ..., method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambda: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambda values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument; the remaining arguments are passed via the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers # for other objective functions. We will use # a linear regression as an example. Note that # this is not a useful application of the optimizers # as there are specialized packages for linear regression # (e.g., glmnet) library(lessSEM) set.seed(123) # first, we simulate data for our # linear regression. N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) # First, we must construct a fiting function # which returns a single value. We will use # the residual sum squared as fitting function. # Let's start setting up the fitting function: fittingFunction <- function(par, y, X, N){ # par is the parameter vector # y is the observed dependent variable # X is the design matrix # N is the sample size pred <- X %*% matrix(par, ncol = 1) #be explicit here: # we need par to be a column vector sse <- sum((y - pred)^2) # we scale with .5/N to get the same results as glmnet return((.5/N)*sse) } # let's define the starting values: b <- rep(0,p) names(b) <- paste0("b", 1:length(b)) # names of regularized parameters regularized <- paste0("b",1:p) # optimize lassoPen <- gpLasso( par = b, regularized = regularized, fn = fittingFunction, nLambdas = 100, X = X, y = y, N = N ) plot(lassoPen) # You can access the fit results as follows: lassoPen@fits # Note that we won't compute any fit measures automatically, as # we cannot be sure how the AIC, BIC, etc are defined for your objective function
Implements lasso regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \lambda |x_j|.$$

Lasso regularization will set parameters to zero if $\lambda$ is large enough.
gpLassoCpp( par, regularized, fn, gr, lambdas = NULL, nLambdas = NULL, curve = 1, additionalArguments, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
pointer to Rcpp function which takes the parameters as input and returns the fit value (a single value) |
gr |
pointer to Rcpp function which takes the parameters as input and returns the gradients of the objective function. |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambda: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambda values between 0 and the computed lambda. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but is a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both functions must take a const Rcpp::NumericVector& with the parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers # for C++ objective functions. We will use # a linear regression as an example. Note that # this is not a useful application of the optimizers # as there are specialized packages for linear regression # (e.g., glmnet) library(Rcpp) library(lessSEM) linreg <- ' // [[Rcpp::depends(RcppArmadillo)]] #include <RcppArmadillo.h> // [[Rcpp::export]] double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){ // extract all required elements: arma::colvec b = Rcpp::as<arma::colvec>(parameters); arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix // compute the sum of squared errors: arma::mat sse = arma::trans(y-X*b)*(y-X*b); // other packages, such as glmnet, scale the sse with // 1/(2*N), where N is the sample size. We will do that here as well sse *= 1.0/(2.0 * y.n_elem); // note: We must return a double, but the sse is a matrix // To get a double, just return the single value that is in // this matrix: return(sse(0,0)); } // [[Rcpp::export]] arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){ // extract all required elements: arma::colvec b = Rcpp::as<arma::colvec>(parameters); arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix // note: we want to return our gradients as row-vector; therefore, // we have to transpose the resulting column-vector: arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b); // other packages, such as glmnet, scale the sse with // 1/(2*N), where N is the sample size. We will do that here as well gradients *= (.5/y.n_rows); return(gradients); } // Dirk Eddelbuettel at // https://gallery.rcpp.org/articles/passing-cpp-function-pointers/ typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters Rcpp::List& //additional elements ); typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t; typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters Rcpp::List& //additional elements ); typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t; // [[Rcpp::export]] fitFunPtr_t fitfunPtr() { return(fitFunPtr_t(new fitFunPtr(&fitfunction))); } // [[Rcpp::export]] gradientFunPtr_t gradfunPtr() { return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction))); } ' Rcpp::sourceCpp(code = linreg) ffp <- fitfunPtr() gfp <- gradfunPtr() N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) data <- list("y" = y, "X" = cbind(1,X)) parameters <- rep(0, ncol(data$X)) names(parameters) <- paste0("b", 0:(length(parameters)-1)) l1 <- gpLassoCpp(par = parameters, regularized = paste0("b", 1:(length(b)-1)), fn = ffp, gr = gfp, lambdas = seq(0,1,.1), additionalArguments = data) l1@parameters
Implements lsp regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \lambda \log\left(1 + \frac{|x_j|}{\theta}\right),$$

where $\theta > 0$.
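As a rough illustration (not package code; the helpers below are made up for this purpose), the lsp penalty grows much more slowly than the lasso for large parameters, which is why it penalizes large coefficients less severely:

# illustration only: lsp penalty versus lasso penalty
lspPenalty <- function(x, lambda, theta) {
  lambda * log(1 + abs(x) / theta)
}
lassoPenalty <- function(x, lambda) lambda * abs(x)
x <- c(0.1, 1, 10)
lspPenalty(x, lambda = 0.5, theta = 1)   # approx. 0.048 0.347 1.199
lassoPenalty(x, lambda = 0.5)            # 0.05 0.50 5.00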
gpLsp( par, fn, gr = NULL, ..., regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument; the remaining arguments are passed via the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
library(lessSEM) set.seed(123) # first, we simulate data for our # linear regression. N <- 100 # number of persons p <- 10 # number of predictors X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix b <- c(rep(1,4), rep(0,6)) # true regression weights y <- X%*%matrix(b,ncol = 1) + rnorm(N,0,.2) # First, we must construct a fiting function # which returns a single value. We will use # the residual sum squared as fitting function. # Let's start setting up the fitting function: fittingFunction <- function(par, y, X, N){ # par is the parameter vector # y is the observed dependent variable # X is the design matrix # N is the sample size pred <- X %*% matrix(par, ncol = 1) #be explicit here: # we need par to be a column vector sse <- sum((y - pred)^2) # we scale with .5/N to get the same results as glmnet return((.5/N)*sse) } # let's define the starting values: b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates names(b) <- paste0("b", 1:length(b)) # names of regularized parameters regularized <- paste0("b",1:p) # optimize lspPen <- gpLsp( par = b, regularized = regularized, fn = fittingFunction, lambdas = seq(0,1,.1), thetas = c(0.001, .5, 1), X = X, y = y, N = N ) # optional: plot requires plotly package # plot(lspPen) # for comparison fittingFunction <- function(par, y, X, N, lambda, theta){ pred <- X %*% matrix(par, ncol = 1) sse <- sum((y - pred)^2) smoothAbs <- sqrt(par^2 + 1e-8) pen <- lambda * log(1.0 + smoothAbs / theta) return((.5/N)*sse + sum(pen)) } round( optim(par = b, fn = fittingFunction, y = y, X = X, N = N, lambda = lspPen@fits$lambda[15], theta = lspPen@fits$theta[15], method = "BFGS")$par, 4) lspPen@parameters[15,]
Implements lsp regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \lambda \log\left(1 + \frac{|x_j|}{\theta}\right)$$

where $\theta > 0$.
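To make the roles of lambda and theta concrete, the following short R sketch evaluates the penalty for a few parameter values. The helper lspPenalty is purely illustrative and not part of lessSEM; it simply restates the formula above.

# illustrative only: lsp penalty for parameter values x,
# restating the formula above
lspPenalty <- function(x, lambda, theta) {
  lambda * log(1 + abs(x) / theta)
}
lspPenalty(x = c(-1, 0, .5), lambda = .3, theta = .5)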
gpLspCpp( par, fn, gr, additionalArguments, regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
pointer to a compiled C++ fit function (see Details and the example) which takes the parameters as a const Rcpp::NumericVector& and an Rcpp::List& with additional arguments, and returns the fit value (a single double) |
gr |
pointer to a compiled C++ gradient function which takes the same arguments as fn and returns the gradients of the objective function as a row vector |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both must be compiled C++ functions that take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument (see the example below).
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(Rcpp)
library(lessSEM)

linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * y.n_elem);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/y.n_rows);

  return(gradients);
}

// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
                            Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;

typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
                                       Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;

// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
  return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}

// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
  return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'

Rcpp::sourceCpp(code = linreg)

ffp <- fitfunPtr()
gfp <- gradfunPtr()

N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

data <- list("y" = y,
             "X" = cbind(1, X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))

l <- gpLspCpp(par = parameters,
              regularized = paste0("b", 1:(length(b)-1)),
              fn = ffp,
              gr = gfp,
              lambdas = seq(0,1,.1),
              thetas = seq(0.1,1,.1),
              additionalArguments = data)

l@parameters
Implements mcp regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \begin{cases} \lambda |x_j| - \dfrac{x_j^2}{2\theta} & \text{if } |x_j| \le \theta\lambda \\ \dfrac{\theta\lambda^2}{2} & \text{if } |x_j| > \theta\lambda \end{cases}$$

where $\theta > 0$.
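To make the piecewise definition concrete, the following short R sketch evaluates the penalty for a few parameter values. The helper mcpPenalty is purely illustrative and not part of lessSEM; it simply restates the formula above.

# illustrative only: mcp penalty for parameter values x,
# restating the piecewise formula above
mcpPenalty <- function(x, lambda, theta) {
  ifelse(abs(x) <= theta * lambda,
         lambda * abs(x) - x^2 / (2 * theta),
         theta * lambda^2 / 2)
}
mcpPenalty(x = c(0, .5, 5), lambda = .4, theta = 1.5)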
gpMcp( par, fn, gr = NULL, ..., regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed with the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
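For the residual sum of squares fit used in the example below, an analytic gradient could be supplied as sketched here. The name fittingGradient is hypothetical; the only assumption is that gr accepts the same arguments as fn (see the argument description above) and returns the gradient vector.

# illustrative analytic gradient for the (.5/N)*sum((y - X %*% par)^2) fit
# used in the example below; typically faster than the numDeriv approximation
fittingGradient <- function(par, y, X, N){
  pred <- X %*% matrix(par, ncol = 1)
  grad <- as.vector((-1/N) * t(X) %*% (y - pred)) # derivative of (.5/N)*sse
  names(grad) <- names(par) # keep the parameter labels
  return(grad)
}
# pass it with, e.g., gpMcp(..., gr = fittingGradient, ...)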
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(lessSEM)
set.seed(123)

# first, we simulate data for our linear regression.
N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.

# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
  # par is the parameter vector
  # y is the observed dependent variable
  # X is the design matrix
  # N is the sample size
  pred <- X %*% matrix(par, ncol = 1) # be explicit here:
  # we need par to be a column vector
  sse <- sum((y - pred)^2)
  # we scale with .5/N to get the same results as glmnet
  return((.5/N)*sse)
}

# let's define the starting values:
# first, let's add an intercept
X <- cbind(1, X)

b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 0:(length(b)-1))

# names of regularized parameters
regularized <- paste0("b", 1:p)

# optimize
mcpPen <- gpMcp(
  par = b,
  regularized = regularized,
  fn = fittingFunction,
  lambdas = seq(0,1,.1),
  thetas = c(1.001, 1.5, 2),
  X = X,
  y = y,
  N = N
)

# optional: plot requires plotly package
# plot(mcpPen)
Implements mcp regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \begin{cases} \lambda |x_j| - \dfrac{x_j^2}{2\theta} & \text{if } |x_j| \le \theta\lambda \\ \dfrac{\theta\lambda^2}{2} & \text{if } |x_j| > \theta\lambda \end{cases}$$

where $\theta > 0$.
gpMcpCpp( par, fn, gr, additionalArguments, regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
pointer to a compiled C++ fit function (see Details and the example) which takes the parameters as a const Rcpp::NumericVector& and an Rcpp::List& with additional arguments, and returns the fit value (a single double) |
gr |
pointer to a compiled C++ gradient function which takes the same arguments as fn and returns the gradients of the objective function as a row vector |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both must be compiled C++ functions that take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument (see the example below).
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(Rcpp)
library(lessSEM)

linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * y.n_elem);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/y.n_rows);

  return(gradients);
}

// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
                            Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;

typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
                                       Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;

// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
  return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}

// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
  return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'

Rcpp::sourceCpp(code = linreg)

ffp <- fitfunPtr()
gfp <- gradfunPtr()

N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

data <- list("y" = y,
             "X" = cbind(1, X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))

m <- gpMcpCpp(par = parameters,
              regularized = paste0("b", 1:(length(b)-1)),
              fn = ffp,
              gr = gfp,
              lambdas = seq(0,1,.1),
              thetas = seq(.1,1,.1),
              additionalArguments = data)

m@parameters
Class for regularized model using general purpose optimization interface
penalty
penalty used (e.g., "lasso")
parameters
data.frame with all parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
weights
vector with weights given to each of the parameters in the penalty
regularized
character vector with names of regularized parameters
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general purpose optimizer
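For orientation, the slots listed above can be accessed with the @ operator. The object name fit below is a placeholder for a gpRegularized object returned by one of the general purpose functions documented here (e.g., gpMcp).

# 'fit' is assumed to be an object of class gpRegularized
fit@penalty          # which penalty was used (e.g., "lasso")
head(fit@fits)       # fit results for each tuning parameter setting
head(fit@parameters) # parameter estimates for each tuning parameter setting
fit@regularized      # names of the regularized parameters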
Implements ridge regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \lambda x_j^2$$

Note that ridge regularization will not set any of the parameters to zero; it only shrinks them towards zero.
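For illustration, the following helper (ridgePenalty is purely illustrative and not part of lessSEM) restates the penalty term that is added to the fit:

# illustrative only: ridge penalty for parameter values x
ridgePenalty <- function(x, lambda) {
  lambda * x^2
}
ridgePenalty(x = c(-1, 0, 1), lambda = .2) # smooth shrinkage, no exact zeros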
gpRidge( par, regularized, fn, gr = NULL, lambdas, ..., method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
R function which takes the parameters as input and returns the fit value (a single value) |
gr |
R function which takes the parameters as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
lambdas |
numeric vector: values for the tuning parameter lambda |
... |
additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed with the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(lessSEM)
set.seed(123)

# first, we simulate data for our linear regression.
N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.

# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
  # par is the parameter vector
  # y is the observed dependent variable
  # X is the design matrix
  # N is the sample size
  pred <- X %*% matrix(par, ncol = 1) # be explicit here:
  # we need par to be a column vector
  sse <- sum((y - pred)^2)
  # we scale with .5/N to get the same results as glmnet
  return((.5/N)*sse)
}

# let's define the starting values:
b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 1:length(b))

# names of regularized parameters
regularized <- paste0("b", 1:p)

# optimize
ridgePen <- gpRidge(
  par = b,
  regularized = regularized,
  fn = fittingFunction,
  lambdas = seq(0,1,.01),
  X = X,
  y = y,
  N = N
)

plot(ridgePen)

# for comparison:
# fittingFunction <- function(par, y, X, N, lambda){
#   pred <- X %*% matrix(par, ncol = 1)
#   sse <- sum((y - pred)^2)
#   return((.5/N)*sse + lambda * sum(par^2))
# }
#
# optim(par = b,
#       fn = fittingFunction,
#       y = y,
#       X = X,
#       N = N,
#       lambda = ridgePen@fits$lambda[20],
#       method = "BFGS")$par
# ridgePen@parameters[20,]
Implements ridge regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \lambda x_j^2$$

Note that ridge regularization will not set any of the parameters to zero; it only shrinks them towards zero.
gpRidgeCpp( par, regularized, fn, gr, lambdas, additionalArguments, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
regularized |
vector with names of parameters which are to be regularized. |
fn |
pointer to a compiled C++ fit function (see Details and the example) which takes the parameters as a const Rcpp::NumericVector& and an Rcpp::List& with additional arguments, and returns the fit value (a single double) |
gr |
pointer to a compiled C++ gradient function which takes the same arguments as fn and returns the gradients of the objective function as a row vector |
lambdas |
numeric vector: values for the tuning parameter lambda |
additionalArguments |
list with additional arguments passed to fn and gr |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both must be compiled C++ functions that take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument (see the example below).
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(Rcpp)
library(lessSEM)

linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * y.n_elem);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/y.n_rows);

  return(gradients);
}

// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
                            Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;

typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
                                       Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;

// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
  return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}

// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
  return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'

Rcpp::sourceCpp(code = linreg)

ffp <- fitfunPtr()
gfp <- gradfunPtr()

N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

data <- list("y" = y,
             "X" = cbind(1, X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))

r <- gpRidgeCpp(par = parameters,
                regularized = paste0("b", 1:(length(b)-1)),
                fn = ffp,
                gr = gfp,
                lambdas = seq(0,1,.1),
                additionalArguments = data)

r@parameters
Implements scad regularization for general purpose optimization problems. The penalty function is given by:

$$p(x_j) = \begin{cases} \lambda |x_j| & \text{if } |x_j| \le \lambda \\ \dfrac{-x_j^2 + 2\theta\lambda |x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \le \lambda\theta \\ \dfrac{(\theta + 1)\lambda^2}{2} & \text{if } |x_j| \ge \lambda\theta \end{cases}$$

where $\theta > 2$.
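To make the three branches concrete, the following short R sketch evaluates the penalty for a few parameter values. The helper scadPenalty is purely illustrative and not part of lessSEM; it simply restates the formula above.

# illustrative only: scad penalty for parameter values x,
# restating the piecewise formula above
scadPenalty <- function(x, lambda, theta) {
  absx <- abs(x)
  ifelse(absx <= lambda,
         lambda * absx,
         ifelse(absx <= theta * lambda,
                (-x^2 + 2 * theta * lambda * absx - lambda^2) / (2 * (theta - 1)),
                (theta + 1) * lambda^2 / 2))
}
scadPenalty(x = c(0, .3, 2), lambda = .4, theta = 2.5)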
gpScad( par, fn, gr = NULL, ..., regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
R function which takes the parameters AND their labels as input and returns the fit value (a single value) |
gr |
R function which takes the parameters AND their labels as input and returns the gradients of the objective function. If set to NULL, numDeriv will be used to approximate the gradients |
... |
additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is similar to that of optim. Users have to supply a vector with starting values (important: this vector must have labels) and a fitting function. This fitting function must take a labeled vector with parameter values as its first argument. The remaining arguments are passed with the ... argument, just as in optim.
The gradient function gr is optional. If set to NULL, the numDeriv package will be used to approximate the gradients. Supplying a gradient function can result in considerable speed improvements.
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for other objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(lessSEM)
set.seed(123)

# first, we simulate data for our linear regression.
N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

# First, we must construct a fitting function
# which returns a single value. We will use
# the residual sum of squares as fitting function.

# Let's start setting up the fitting function:
fittingFunction <- function(par, y, X, N){
  # par is the parameter vector
  # y is the observed dependent variable
  # X is the design matrix
  # N is the sample size
  pred <- X %*% matrix(par, ncol = 1) # be explicit here:
  # we need par to be a column vector
  sse <- sum((y - pred)^2)
  # we scale with .5/N to get the same results as glmnet
  return((.5/N)*sse)
}

# let's define the starting values:
# first, let's add an intercept
X <- cbind(1, X)

b <- c(solve(t(X)%*%X)%*%t(X)%*%y) # we will use the lm estimates
names(b) <- paste0("b", 0:(length(b)-1))

# names of regularized parameters
regularized <- paste0("b", 1:p)

# optimize
scadPen <- gpScad(
  par = b,
  regularized = regularized,
  fn = fittingFunction,
  lambdas = seq(0,1,.1),
  thetas = c(2.001, 2.5, 5),
  X = X,
  y = y,
  N = N
)

# optional: plot requires plotly package
# plot(scadPen)

# for comparison
# library(ncvreg)
# scadFit <- ncvreg(X = X[,-1],
#                   y = y,
#                   penalty = "SCAD",
#                   lambda = scadPen@fits$lambda[15],
#                   gamma = scadPen@fits$theta[15])
# coef(scadFit)
# scadPen@parameters[15,]
Implements scad regularization for general purpose optimization problems with C++ functions. The penalty function is given by:

$$p(x_j) = \begin{cases} \lambda |x_j| & \text{if } |x_j| \le \lambda \\ \dfrac{-x_j^2 + 2\theta\lambda |x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \le \lambda\theta \\ \dfrac{(\theta + 1)\lambda^2}{2} & \text{if } |x_j| \ge \lambda\theta \end{cases}$$

where $\theta > 2$.
gpScadCpp( par, fn, gr, additionalArguments, regularized, lambdas, thetas, method = "glmnet", control = lessSEM::controlGlmnet() )
par |
labeled vector with starting values |
fn |
pointer to a compiled C++ fit function (see Details and the example) which takes the parameters as a const Rcpp::NumericVector& and an Rcpp::List& with additional arguments, and returns the fit value (a single double) |
gr |
pointer to a compiled C++ gradient function which takes the same arguments as fn and returns the gradients of the objective function as a row vector |
additionalArguments |
list with additional arguments passed to fn and gr |
regularized |
vector with names of parameters which are to be regularized. |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta |
method |
which optimizer should be used? Currently implemented are ista and glmnet. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The interface is inspired by optim, but a bit more restrictive. Users have to supply a vector with starting values (important: this vector must have labels), a fitting function, and a gradient function. Both must be compiled C++ functions that take a const Rcpp::NumericVector& with parameter values as their first argument and an Rcpp::List& with additional arguments as their second argument (see the example below).
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Object of class gpRegularized
# This example shows how to use the optimizers
# for C++ objective functions. We will use
# a linear regression as an example. Note that
# this is not a useful application of the optimizers
# as there are specialized packages for linear regression
# (e.g., glmnet)

library(Rcpp)
library(lessSEM)

linreg <- '
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::export]]
double fitfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // compute the sum of squared errors:
  arma::mat sse = arma::trans(y-X*b)*(y-X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  sse *= 1.0/(2.0 * y.n_elem);

  // note: We must return a double, but the sse is a matrix
  // To get a double, just return the single value that is in
  // this matrix:
  return(sse(0,0));
}

// [[Rcpp::export]]
arma::rowvec gradientfunction(const Rcpp::NumericVector& parameters, Rcpp::List& data){
  // extract all required elements:
  arma::colvec b = Rcpp::as<arma::colvec>(parameters);
  arma::colvec y = Rcpp::as<arma::colvec>(data["y"]); // the dependent variable
  arma::mat X = Rcpp::as<arma::mat>(data["X"]); // the design matrix

  // note: we want to return our gradients as row-vector; therefore,
  // we have to transpose the resulting column-vector:
  arma::rowvec gradients = arma::trans(-2.0*X.t() * y + 2.0*X.t()*X*b);

  // other packages, such as glmnet, scale the sse with
  // 1/(2*N), where N is the sample size. We will do that here as well
  gradients *= (.5/y.n_rows);

  return(gradients);
}

// Dirk Eddelbuettel at
// https://gallery.rcpp.org/articles/passing-cpp-function-pointers/
typedef double (*fitFunPtr)(const Rcpp::NumericVector&, //parameters
                            Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<fitFunPtr> fitFunPtr_t;

typedef arma::rowvec (*gradientFunPtr)(const Rcpp::NumericVector&, //parameters
                                       Rcpp::List& //additional elements
);
typedef Rcpp::XPtr<gradientFunPtr> gradientFunPtr_t;

// [[Rcpp::export]]
fitFunPtr_t fitfunPtr() {
  return(fitFunPtr_t(new fitFunPtr(&fitfunction)));
}

// [[Rcpp::export]]
gradientFunPtr_t gradfunPtr() {
  return(gradientFunPtr_t(new gradientFunPtr(&gradientfunction)));
}
'

Rcpp::sourceCpp(code = linreg)

ffp <- fitfunPtr()
gfp <- gradfunPtr()

N <- 100 # number of persons
p <- 10  # number of predictors
X <- matrix(rnorm(N*p), nrow = N, ncol = p) # design matrix
b <- c(rep(1,4), rep(0,6)) # true regression weights
y <- X%*%matrix(b, ncol = 1) + rnorm(N, 0, .2)

data <- list("y" = y,
             "X" = cbind(1, X))
parameters <- rep(0, ncol(data$X))
names(parameters) <- paste0("b", 0:(length(parameters)-1))

s <- gpScadCpp(par = parameters,
               regularized = paste0("b", 1:(length(b)-1)),
               fn = ffp,
               gr = gfp,
               lambdas = seq(0,1,.1),
               thetas = seq(2.1,3,.1),
               additionalArguments = data)

s@parameters
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, an R function to compute the fit, an R function to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEXP function pointer to compute the fit, a SEXP function pointer to compute the gradients, a list with elements the fit and gradient function require, a lambda and an alpha value.
Object for elastic net optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
Object for elastic net optimization with glmnet optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a lambda and an alpha value.
Object for lsp optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for lsp optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mcp optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for mcp optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object.
optimize
optimize the model.
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter, (2) a vector indicating which penalty is used, and (3) a list with control elements
optimize
optimize the model.
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter, (2) a vector indicating which penalty is used, and (3) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
Object for elastic net optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter, (2) a vector indicating which penalty is used, and (3) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta value, a lambda and an alpha value (alpha must be 1).
Object for scad optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Object for scad optimization with ista optimizer
a list with fit results
new
creates a new object. Requires (1) a vector with weights for each parameter and (2) a list with control elements
optimize
optimize the model. Expects a vector with starting values, a SEM of type SEM_Cpp, a theta and a lambda value.
Implements lasso regularization for structural equation models. The penalty function is given by:
p(x_j) = λ |x_j|
Lasso regularization will set parameters to zero if λ is large enough.
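To make the reconstructed penalty concrete, the following minimal R sketch evaluates the lasso penalty term for a few parameter values; lassoPenaltyValue is a hypothetical helper written here for illustration and is not part of lessSEM.

# hypothetical helper, not part of lessSEM: evaluates p(x) = lambda * |x|
lassoPenaltyValue <- function(x, lambda) {
  lambda * abs(x)
}
lassoPenaltyValue(x = c(-0.5, 0, 0.3), lambda = 0.2)
# larger lambda values penalize non-zero parameters more strongly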
lasso( lavaanModel, regularized, lambdas = NULL, nLambdas = NULL, reverse = TRUE, curve = 1, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambdas: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambdas values between 0 and the computed lambda. |
reverse |
if set to TRUE and nLambdas is used, lessSEM will start with the largest lambda and gradually decrease lambda. Otherwise, lessSEM will start with the smallest lambda and gradually increase it. |
curve |
Allows for unequally spaced lambda steps (e.g., .01, .02, .05, 1, 5, 20). If curve is close to 1, all lambda values will be equally spaced; if curve is large, lambda values will be more concentrated close to 0. See ?lessSEM::curveLambda for more information. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Lasso regularization:
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: lsem <- lasso( # pass the fitted lavaan model lavaanModel = lavaanModel, # names of the regularized parameters: regularized = paste0("l", 6:15), # in case of lasso and adaptive lasso, we can specify the number of lambda # values to use. lessSEM will automatically find lambda_max and fit # models for nLambda values between 0 and lambda_max. For the other # penalty functions, lambdas must be specified explicitly nLambdas = 50) # use the plot-function to plot the regularized parameters: plot(lsem) # the coefficients can be accessed with: coef(lsem) # if you are only interested in the estimates and not the tuning parameters, use coef(lsem)@estimates # or estimates(lsem) # elements of lsem can be accessed with the @ operator: lsem@parameters[1,] # fit Measures: fitIndices(lsem) # The best parameters can also be extracted with: coef(lsem, criterion = "AIC") # or estimates(lsem, criterion = "AIC") #### Advanced ### # Switching the optimizer # # Use the "method" argument to switch the optimizer. The control argument # must also be changed to the corresponding function: lsemIsta <- lasso( lavaanModel = lavaanModel, regularized = paste0("l", 6:15), nLambdas = 50, method = "ista", control = controlIsta()) # Note: The results are basically identical: lsemIsta@parameters - lsem@parameters
helper function: lslx and lavaan use slightly different parameter labels. This function can be used to get both sets of labels.
lavaan2lslxLabels(lavaanModel)
lavaanModel |
model of class lavaan |
list with lavaan labels and lslx labels
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) lavaan2lslxLabels(lavaanModel)
Creates a lavaan model object from lessSEM (only if possible). Pass either a criterion or a combination of lambda, alpha, and theta.
lessSEM2Lavaan( regularizedSEM, criterion = NULL, lambda = NULL, alpha = NULL, theta = NULL )
regularizedSEM |
object created with lessSEM |
criterion |
criterion used for model selection. Currently supported are "AIC" or "BIC" |
lambda |
value for tuning parameter lambda |
alpha |
value for tuning parameter alpha |
theta |
value for tuning parameter theta |
lavaan model
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: regularized <- lasso(lavaanModel, regularized = paste0("l", 11:15), lambdas = seq(0,1,.1)) # using criterion lessSEM2Lavaan(regularizedSEM = regularized, criterion = "AIC") # using tuning parameters (note: we only have to specify the tuning # parameters that are actually used by the penalty function. In case # of lasso, this is lambda): lessSEM2Lavaan(regularizedSEM = regularized, lambda = 1)
Class for the coefficients estimated by lessSEM.
tuningParameters
tuning parameters
estimates
parameter estimates
transformations
transformations of parameters
Extract the labels of all loadings found in a lavaan model.
loadings(lavaanModel)
lavaanModel |
fitted lavaan model |
vector with parameter labels
# The following is adapted from ?lavaan::sem library(lessSEM) model <- ' # latent variable definitions ind60 =~ x1 + x2 + x3 dem60 =~ y1 + a*y2 + b*y3 + c*y4 dem65 =~ y5 + a*y6 + b*y7 + c*y8 # regressions dem60 ~ ind60 dem65 ~ ind60 + dem60 # residual correlations y1 ~~ y5 y2 ~~ y4 + y6 y3 ~~ y7 y4 ~~ y8 y6 ~~ y8 ' fit <- sem(model, data = PoliticalDemocracy) loadings(fit)
Returns the rows for which all elements of a boolean matrix X are equal to the elements in boolean vector x
logicalMatch(X, x)
X |
matrix with booleans |
x |
vector of booleans |
numerical vector with indices of matching rows
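The manual provides no example for this helper. The following minimal sketch assumes that logicalMatch is callable as documented above and only illustrates the described behaviour.

library(lessSEM)
# a small boolean matrix with three rows:
X <- matrix(c(TRUE, FALSE,
              TRUE, TRUE,
              TRUE, FALSE),
            nrow = 3, byrow = TRUE)
# pattern to look for:
x <- c(TRUE, FALSE)
# returns the indices of the rows of X that are identical to x
# (here, the first and the third row match):
logicalMatch(X, x)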
logLik
## S4 method for signature 'Rcpp_mgSEM' logLik(object, ...)
object |
object of class Rcpp_mgSEM |
... |
not used |
log-likelihood of the model
logLik
## S4 method for signature 'Rcpp_SEMCpp' logLik(object, ...)
object |
object of class Rcpp_SEMCpp |
... |
not used |
log-likelihood of the model
Class for the log-likelihood of regularized SEM. Note: we define a custom logLik function because the generic one uses df = number of parameters, which might be confusing.
logLik
log-Likelihood
nParameters
number of parameters in the model
N
number of persons in the data set
Implements lsp regularization for structural equation models. The penalty function is given by:
p(x_j) = λ log(1 + |x_j| / θ),
where θ > 0.
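As a quick illustration of the reconstructed log-sum penalty, the sketch below evaluates it in plain R; lspPenaltyValue is a hypothetical name used only for illustration and is not a lessSEM function.

# hypothetical helper, not part of lessSEM:
# evaluates p(x) = lambda * log(1 + |x| / theta)
lspPenaltyValue <- function(x, lambda, theta) {
  lambda * log(1 + abs(x) / theta)
}
lspPenaltyValue(x = c(-0.5, 0, 0.3), lambda = 0.2, theta = 0.1)
# in contrast to the lasso, the penalty grows only slowly for large |x|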
lsp( lavaanModel, regularized, lambdas, thetas, modifyModel = lessSEM::modifyModel(), method = "glmnet", control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta of the lsp penalty |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures |
control |
used to control the optimizer. This element is generated with the controlIsta function (see ?controlIsta). |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
lsp regularization:
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing Sparsity by Reweighted l1 Minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905. https://doi.org/10.1007/s00041-008-9045-x
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: lsem <- lsp( # pass the fitted lavaan model lavaanModel = lavaanModel, # names of the regularized parameters: regularized = paste0("l", 6:15), lambdas = seq(0,1,length.out = 20), thetas = seq(0.01,2,length.out = 5)) # the coefficients can be accessed with: coef(lsem) # if you are only interested in the estimates and not the tuning parameters, use coef(lsem)@estimates # or estimates(lsem) # elements of lsem can be accessed with the @ operator: lsem@parameters[1,] # fit Measures: fitIndices(lsem) # The best parameters can also be extracted with: coef(lsem, criterion = "AIC") # or estimates(lsem, criterion = "AIC") # optional: plotting the paths requires installation of plotly # plot(lsem)
This function helps you create the pointers necessary to use the Cpp interface
makePtrs(fitFunName, gradFunName)
fitFunName |
name of your C++ fit function (IMPORTANT: This must be the name used in C++) |
gradFunName |
name of your C++ gradient function (IMPORTANT: This must be the name used in C++) |
a string which can be copied in the C++ function to create the pointers.
# see vignette("General-Purpose-Optimization", package = "lessSEM") for an example
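As a minimal sketch of the workflow, the returned string can be printed and pasted into your C++ source. The function names passed below are hypothetical placeholders; replace them with the names used in your own C++ code.

library(lessSEM)
ptrCode <- makePtrs(fitFunName = "myFitFunction",      # hypothetical C++ fit function
                    gradFunName = "myGradientFunction") # hypothetical C++ gradient function
cat(ptrCode) # copy the printed code into your C++ file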
Implements mcp regularization for structural equation models. The penalty function is given by:
p(x_j) = λ|x_j| - x_j^2 / (2θ) if |x_j| <= θλ
p(x_j) = θλ^2 / 2 if |x_j| > θλ,
where θ > 0.
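To make the piecewise form concrete, the following sketch evaluates the reconstructed mcp penalty in plain R; mcpPenaltyValue is a hypothetical name used only for illustration. The exported mcpPenalty_C helper documented further below should return the corresponding value for a single parameter.

# hypothetical helper, not part of lessSEM: evaluates the mcp penalty
mcpPenaltyValue <- function(x, lambda, theta) {
  ifelse(abs(x) <= theta * lambda,
         lambda * abs(x) - x^2 / (2 * theta),
         theta * lambda^2 / 2)
}
mcpPenaltyValue(x = c(-0.5, 0, 2), lambda = 0.2, theta = 3)
# beyond |x| = theta * lambda the penalty stays constant at theta * lambda^2 / 2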
mcp( lavaanModel, regularized, lambdas, thetas, modifyModel = lessSEM::modifyModel(), method = "ista", control = lessSEM::controlIsta() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta of the mcp penalty |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
control |
used to control the optimizer. This element is generated with the controlIsta function (see ?controlIsta). |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
In our experience, the glmnet optimizer can run into issues with the mcp penalty. Therefore, we default to using ista.
mcp regularization:
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2), 894–942. https://doi.org/10.1214/09-AOS729
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: lsem <- mcp( # pass the fitted lavaan model lavaanModel = lavaanModel, # names of the regularized parameters: regularized = paste0("l", 6:15), lambdas = seq(0,1,length.out = 20), thetas = seq(0.01,2,length.out = 5)) # the coefficients can be accessed with: coef(lsem) # if you are only interested in the estimates and not the tuning parameters, use coef(lsem)@estimates # or estimates(lsem) # elements of lsem can be accessed with the @ operator: lsem@parameters[1,] # fit Measures: fitIndices(lsem) # The best parameters can also be extracted with: coef(lsem, criterion = "AIC") # or estimates(lsem, criterion = "AIC") # optional: plotting the paths requires installation of plotly # plot(lsem)
mcpPenalty_C
mcpPenalty_C(par, lambda_p, theta)
par |
single parameter value |
lambda_p |
lambda value for this parameter |
theta |
theta value for this parameter |
penalty value
internal mgSEM representation
new
Creates a new mgSEM.
addModel
add a model. Expects Rcpp::List
addTransformation
adds transformations to a model
implied
Computes implied means and covariance matrix
fit
Fits the model. Returns objective value of the fitting function
getParameters
Returns a data frame with model parameters.
getParameterLabels
Returns a vector with unique parameter labels as used internally.
getEstimator
Returns a vector with names of the estimators used in the submodels.
getGradients
Returns the gradients of the model.
getScores
Returns a matrix with scores. Not yet implemented
getHessian
Returns the hessian of the model. Expects the labels of the parameters and the values of the parameters as well as a boolean indicating if these are raw. Finally, a double (eps) controls the precision of the approximation.
computeTransformations
compute the transformations.
setTransformationGradientStepSize
change the step size of the gradient computation for the transformations
Provides the possibility to impose different penalties on different parameters.
mixedPenalty( lavaanModel, modifyModel = lessSEM::modifyModel(), method = "glmnet", control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently supported are "glmnet" and "ista". |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
The mixedPenalty function allows you to add multiple penalties to a single model. For instance, you may want to regularize both loadings and regressions in a SEM. In this case, using the same penalty (e.g., lasso) for both parameter types may not be what you want, because the penalty function is sensitive to the scales of the parameters. Instead, you may want to use two separate lasso penalties for loadings and regressions. Similarly, separate penalties for different parameters have been proposed for multi-group models (Geminiani et al., 2021).
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well. Models are fitted with the glmnet or ista optimizer. Note that the optimizers differ in which penalties they support. The following table provides an overview:
Penalty | Function | glmnet | ista |
lasso | addLasso | x | x |
elastic net | addElasticNet | x* | - |
cappedL1 | addCappedL1 | x | x |
lsp | addLsp | x | x |
scad | addScad | x | x |
mcp | addMcp | x | x |
By default, glmnet will be used. Note that the elastic net penalty can only be combined with other elastic net penalties.
Check vignette(topic = "Mixed-Penalties", package = "lessSEM") for more details.
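For instance, switching to the ista optimizer only requires changing the method and control arguments when creating the penalty template. A minimal sketch, assuming lavaanModel is a fitted lavaan model as in the example further below:

regularizedIsta <- lavaanModel |>
  # request the ista optimizer instead of the default glmnet:
  mixedPenalty(method = "ista",
               control = controlIsta()) |>
  # lasso penalty on the loadings l6 - l10:
  addLasso(regularized = paste0("l", 6:10),
           lambdas = seq(0, 1, length.out = 4)) |>
  # fit the model:
  fit()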
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Geminiani, E., Marra, G., & Moustaki, I. (2021). Single- and multiple-group penalized factor analysis: A trust-region algorithm approach with integrated automatic multiple tuning parameter selection. Psychometrika, 86(1), 65–95. https://doi.org/10.1007/s11336-021-09751-8
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: # In this example, we want to regularize the loadings l6-l10 # independently of the loadings l11-15. This could, for instance, # reflect that the items y6-y10 and y11-y15 may belong to different # subscales. regularized <- lavaanModel |> # create template for regularized model with mixed penalty: mixedPenalty() |> # add lasso penalty on loadings l6 - l10: addLasso(regularized = paste0("l", 6:10), lambdas = seq(0,1,length.out = 4)) |> # add scad penalty on loadings l11 - l15: addScad(regularized = paste0("l", 11:15), lambdas = seq(0,1,length.out = 3), thetas = 3.1) |> # fit the model: fit() # elements of regularized can be accessed with the @ operator: regularized@parameters[1,] # AIC and BIC: AIC(regularized) BIC(regularized) # The best parameters can also be extracted with: coef(regularized, criterion = "AIC") coef(regularized, criterion = "BIC") # The tuningParameterConfiguration corresponds to the rows # in the lambda, theta, and alpha matrices in regularized@tuningParamterConfigurations. # Configuration 3, for example, is given by regularized@tuningParameterConfigurations$lambda[3,] regularized@tuningParameterConfigurations$theta[3,] regularized@tuningParameterConfigurations$alpha[3,] # Note that lambda, theta, and alpha may correspond to tuning parameters # of different penalties for different parameters (e.g., lambda for l6 is the lambda # of the lasso penalty, while lambda for l12 is the lambda of the scad penalty).
Modify the model from lavaan to fit your needs
modifyModel( addMeans = FALSE, activeSet = NULL, dataSet = NULL, transformations = NULL, transformationList = list(), transformationGradientStepSize = 1e-06 )
addMeans |
If lavaanModel has meanstructure = FALSE, addMeans = TRUE will add a mean structure. FALSE will set the means of the observed variables to their observed means. |
activeSet |
Option to only use a subset of the individuals in the data set. Logical vector of length N indicating which subjects should remain in the sample. |
dataSet |
option to replace the data set in the lavaan model with a different data set. Can be useful for cross-validation |
transformations |
allows for transformations of parameters - useful for measurement invariance tests etc. |
transformationList |
optional list used within the transformations. NOTE: This must be used as an Rcpp::List. |
transformationGradientStepSize |
step size used to compute the gradients of the transformations |
Object of class modifyModel
modification <- modifyModel(addMeans = TRUE) # adds intercepts to a lavaan object # that was fitted without explicit intercepts
Assign a new value to the parameter tau used by the approximate optimization. Any regularized value below tau will be treated as zero, which directly impacts the AIC, BIC, etc.
newTau(regularizedSEM, tau)
regularizedSEM |
object fitted with approximate optimization |
tau |
new tau value |
regularizedSEM, but with new regularizedSEM@fits$nonZeroParameters
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: lsem <- smoothLasso( # pass the fitted lavaan model lavaanModel = lavaanModel, # names of the regularized parameters: regularized = paste0("l", 6:15), epsilon = 1e-10, tau = 1e-4, lambdas = seq(0,1,length.out = 50)) newTau(regularizedSEM = lsem, tau = .1)
plots the cross-validation fits
## S4 method for signature 'cvRegularizedSEM,missing' plot(x, y, ...)
x |
object of class cvRegularizedSEM |
y |
not used |
... |
not used |
either an object of ggplot2 or of plotly
plots the regularized and unregularized parameters for all levels of lambda
## S4 method for signature 'gpRegularized,missing' plot(x, y, ...)
x |
object of class gpRegularized |
y |
not used |
... |
use regularizedOnly=FALSE to plot all parameters |
either an object of ggplot2 or of plotly
plots the regularized and unregularized parameters for all levels of lambda
## S4 method for signature 'regularizedSEM,missing' plot(x, y, ...)
x |
object of class regularizedSEM |
y |
not used |
... |
use regularizedOnly=FALSE to plot all parameters |
either an object of ggplot2 or of plotly
plots the regularized and unregularized parameters for all levels of the tuning parameters
## S4 method for signature 'stabSel,missing' plot(x, y, ...)
x |
object of class stabSel |
y |
not used |
... |
use regularizedOnly=FALSE to plot all parameters |
either an object of ggplot2 or of plotly
Extract the labels of all regressions found in a lavaan model.
regressions(lavaanModel)
lavaanModel |
fitted lavaan model |
vector with parameter labels
# The following is adapted from ?lavaan::sem library(lessSEM) model <- ' # latent variable definitions ind60 =~ x1 + x2 + x3 dem60 =~ y1 + a*y2 + b*y3 + c*y4 dem65 =~ y5 + a*y6 + b*y7 + c*y8 # regressions dem60 ~ ind60 dem65 ~ ind60 + dem60 # residual correlations y1 ~~ y5 y2 ~~ y4 + y6 y3 ~~ y7 y4 ~~ y8 y6 ~~ y8 ' fit <- sem(model, data = PoliticalDemocracy) regressions(fit)
helper function: regsem and lavaan use slightly different parameter labels. This function can be used to translate the parameter labels of a cv_regsem object to lavaan labels
regsem2LavaanParameters(regsemModel, lavaanModel)
regsemModel |
model of class regsem |
lavaanModel |
model of class lavaan |
regsem parameters with lavaan labels
## The following is adapted from ?regsem::regsem. #library(lessSEM) #library(regsem) ## put variables on same scale for regsem #HS <- data.frame(scale(HolzingerSwineford1939[,7:15])) # #mod <- ' #f =~ 1*x1 + l1*x2 + l2*x3 + l3*x4 + l4*x5 + l5*x6 + l6*x7 + l7*x8 + l8*x9 #' ## Recommended to specify meanstructure in lavaan #lavaanModel <- cfa(mod, HS, meanstructure=TRUE) # #regsemModel <- regsem(lavaanModel, # lambda = 0.3, # gradFun = "ram", # type="lasso", # pars_pen=c("l1", "l2", "l6", "l7", "l8")) # regsem2LavaanParameters(regsemModel = regsemModel, # lavaanModel = lavaanModel)
Class for regularized SEM
penalty
penalty used (e.g., "lasso")
parameters
data.frame with parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
weights
vector with weights given to each of the parameters in the penalty
regularized
character vector with names of regularized parameters
transformations
if the model has transformations, the transformed parameters are returned
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general
notes
internal notes that have come up when fitting the model
Class for regularized SEM
penalty
penalty used (e.g., "lasso")
tuningParameterConfigurations
list with settings for the lambda, theta, and alpha tuning parameters.
parameters
data.frame with parameter estimates
fits
data.frame with all fit results
parameterLabels
character vector with names of all parameters
weights
vector with weights given to each of the parameters in the penalty
regularized
character vector with names of regularized parameters
transformations
if the model has transformations, the transformed parameters are returned
internalOptimization
list of elements used internally
inputArguments
list with elements passed by the user to the general
notes
internal notes that have come up when fitting the model
Implements ridge regularization for structural equation models. The penalty function is given by:
p(x_j) = λ x_j^2
Note that ridge regularization will not set any of the parameters to zero but will instead shrink them towards zero.
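For comparison with the lasso penalty above, the following sketch evaluates the reconstructed ridge penalty in plain R; ridgePenaltyValue is a hypothetical name used only for illustration and is not a lessSEM function.

# hypothetical helper, not part of lessSEM: evaluates p(x) = lambda * x^2
ridgePenaltyValue <- function(x, lambda) {
  lambda * x^2
}
ridgePenaltyValue(x = c(-0.5, 0, 0.3), lambda = 0.2)
# the quadratic penalty shrinks estimates towards zero, but its smooth
# minimum at zero means parameters are not set to exactly zero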
ridge( lavaanModel, regularized, lambdas, method = "glmnet", modifyModel = lessSEM::modifyModel(), control = lessSEM::controlGlmnet() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlIsta and controlGlmnet functions. See ?controlIsta and ?controlGlmnet for more details. |
Identical to regsem, models are specified using lavaan. Currently, most standard SEMs are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
Ridge regularization:
Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: lsem <- ridge( # pass the fitted lavaan model lavaanModel = lavaanModel, # names of the regularized parameters: regularized = paste0("l", 6:15), lambdas = seq(0,1,length.out = 20)) # use the plot-function to plot the regularized parameters: plot(lsem) # the coefficients can be accessed with: coef(lsem) # elements of lsem can be accessed with the @ operator: lsem@parameters[1,] #### Advanced ### # Switching the optimizer # # Use the "method" argument to switch the optimizer. The control argument # must also be changed to the corresponding function: lsemIsta <- ridge( lavaanModel = lavaanModel, regularized = paste0("l", 6:15), lambdas = seq(0,1,length.out = 20), method = "ista", control = controlIsta()) # Note: The results are basically identical: lsemIsta@parameters - lsem@parameters
This function allows for regularization of models built in lavaan with the ridge penalty. Its elements can be accessed with the "@" operator (see examples).
ridgeBfgs( lavaanModel, regularized, lambdas = NULL, modifyModel = lessSEM::modifyModel(), control = lessSEM::controlBFGS() )
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
For more details, see:
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Model of class regularizedSEM
library(lessSEM) # Identical to regsem, lessSEM builds on the lavaan # package for model specification. The first step # therefore is to implement the model in lavaan. dataset <- simulateExampleData() lavaanSyntax <- " f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 + l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 + l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15 f ~~ 1*f " lavaanModel <- lavaan::sem(lavaanSyntax, data = dataset, meanstructure = TRUE, std.lv = TRUE) # Regularization: # names of the regularized parameters: regularized = paste0("l", 6:15) lsem <- ridgeBfgs( # pass the fitted lavaan model lavaanModel = lavaanModel, regularized = regularized, lambdas = seq(0,1,length.out = 50)) plot(lsem) # the coefficients can be accessed with: coef(lsem) # elements of lsem can be accessed with the @ operator: lsem@parameters[1,]
Implements scad regularization for structural equation models. The penalty function (Fan & Li, 2001) is given by:

$$p(x_j) = \begin{cases} \lambda |x_j| & \text{if } |x_j| \leq \lambda\\ \dfrac{-x_j^2 + 2\theta\lambda|x_j| - \lambda^2}{2(\theta - 1)} & \text{if } \lambda < |x_j| \leq \theta\lambda\\ \dfrac{(\theta + 1)\lambda^2}{2} & \text{if } |x_j| > \theta\lambda \end{cases}$$

where $\theta > 2$.
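For illustration, the following is a direct R transcription of this piecewise penalty for a single parameter value. It is a minimal sketch for clarity only, not the compiled implementation used by the package:

# Piecewise scad penalty for a single parameter value x,
# following Fan & Li (2001); requires lambda >= 0 and theta > 2.
scad_penalty <- function(x, lambda, theta) {
  ax <- abs(x)
  if (ax <= lambda) {
    lambda * ax
  } else if (ax <= theta * lambda) {
    (-x^2 + 2 * theta * lambda * ax - lambda^2) / (2 * (theta - 1))
  } else {
    (theta + 1) * lambda^2 / 2
  }
}

scad_penalty(x = 0.3, lambda = 0.2, theta = 3.7)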
scad(
  lavaanModel,
  regularized,
  lambdas,
  thetas,
  modifyModel = lessSEM::modifyModel(),
  method = "glmnet",
  control = lessSEM::controlGlmnet()
)
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
thetas |
numeric vector: values for the tuning parameter theta (must be > 2). For parameters whose absolute value exceeds theta*lambda, the scad penalty is constant |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
method |
which optimizer should be used? Currently implemented are ista and glmnet. With ista, the control argument can be used to switch to related procedures (currently gist). |
control |
used to control the optimizer. This element is generated with the controlIsta (see ?controlIsta) or controlGlmnet (see ?controlGlmnet) function, depending on the method |
Identical to regsem, models are specified using lavaan. Currently, most standard SEM are supported. lessSEM also provides full information maximum likelihood for missing data. To use this functionality, fit your lavaan model with the argument sem(..., missing = 'ml'). lessSEM will then automatically switch to full information maximum likelihood as well.
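A minimal sketch of this workflow, reusing the lavaanSyntax object defined in the examples and introducing missing values only for illustration:

# simulate example data with 10 percent missing values (illustration only):
dataMissing <- simulateExampleData(N = 100, percentMissing = 10)

# fit the lavaan model with full information maximum likelihood:
lavaanModelFiml <- lavaan::sem(lavaanSyntax,
                               data = dataMissing,
                               meanstructure = TRUE,
                               std.lv = TRUE,
                               missing = "ml")

# lessSEM now uses full information maximum likelihood as well:
lsemFiml <- scad(lavaanModel = lavaanModelFiml,
                 regularized = paste0("l", 6:15),
                 lambdas = seq(0, 1, length.out = 20),
                 thetas = seq(2.01, 5, length.out = 5))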
scad regularization:
Fan, J., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360. https://doi.org/10.1198/016214501753382273
Regularized SEM
Huang, P.-H., Chen, H., & Weng, L.-J. (2017). A Penalized Likelihood Method for Structural Equation Modeling. Psychometrika, 82(2), 329–354. https://doi.org/10.1007/s11336-017-9566-9
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
For more details on GLMNET, see:
Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–20. https://doi.org/10.18637/jss.v033.i01
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., & Lin, C.-J. (2010). A Comparison of Optimization Methods and Software for Large-scale L1-regularized Linear Classification. Journal of Machine Learning Research, 11, 3183–3234.
Yuan, G.-X., Ho, C.-H., & Lin, C.-J. (2012). An improved GLMNET for l1-regularized logistic regression. The Journal of Machine Learning Research, 13, 1999–2030. https://doi.org/10.1145/2020408.2020421
For more details on ISTA, see:
Beck, A., & Teboulle, M. (2009). A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1), 183–202. https://doi.org/10.1137/080716542
Gong, P., Zhang, C., Lu, Z., Huang, J., & Ye, J. (2013). A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems. Proceedings of the 30th International Conference on Machine Learning, 28(2)(2), 37–45.
Parikh, N., & Boyd, S. (2013). Proximal Algorithms. Foundations and Trends in Optimization, 1(3), 123–231.
Model of class regularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

lsem <- scad(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  lambdas = seq(0, 1, length.out = 20),
  thetas = seq(2.01, 5, length.out = 5))

# the coefficients can be accessed with:
coef(lsem)
# if you are only interested in the estimates and not the tuning parameters, use
coef(lsem)@estimates
# or
estimates(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# fit Measures:
fitIndices(lsem)

# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
# or
estimates(lsem, criterion = "AIC")

# optional: plotting the paths requires installation of plotly
# plot(lsem)
scadPenalty_C
scadPenalty_C(par, lambda_p, theta)
par |
single parameter value |
lambda_p |
lambda value for this parameter |
theta |
theta value for this parameter |
penalty value
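A minimal usage sketch. The input values below are made up for illustration, and this assumes the function is callable as documented here:

library(lessSEM)
# scad penalty value for a single parameter with lambda = 0.2 and theta = 3.7:
scadPenalty_C(par = 0.3, lambda_p = 0.2, theta = 3.7)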
internal SEM representation
new
Creates a new SEMCpp.
fill
fills the SEM with the elements from an Rcpp::List
addTransformation
adds transformations to a model
implied
Computes implied means and covariance matrix
fit
Fits the model. Returns objective value of the fitting function
getParameters
Returns a data frame with model parameters.
getEstimator
returns the estimator used in the model (e.g., fiml)
getParameterLabels
Returns a vector with unique parameter labels as used internally.
getGradients
Returns the gradients of the model.
getScores
Returns a matrix with scores.
getHessian
Returns the hessian of the model. Expects the labels of the parameters and the values of the parameters as well as a boolean indicating if these are raw. Finally, a double (eps) controls the precision of the approximation.
computeTransformations
compute the transformations.
setTransformationGradientStepSize
change the step size of the gradient computation for the transformations
cvRegularizedSEM
Show method for objects of class cvRegularizedSEM.
## S4 method for signature 'cvRegularizedSEM' show(object)
object |
object of class cvRegularizedSEM |
No return value, just prints estimates
show
## S4 method for signature 'gpRegularized' show(object)
object |
object of class gpRegularized |
No return value, just prints estimates
show
## S4 method for signature 'lessSEMCoef' show(object)
object |
object of class lessSEMCoef |
No return value, just prints estimates
show
## S4 method for signature 'logLikelihood' show(object)
object |
object of class logLikelihood |
No return value, just prints estimates
show
## S4 method for signature 'Rcpp_mgSEM' show(object)
object |
object of class Rcpp_mgSEM |
No return value, just prints estimates
show
## S4 method for signature 'Rcpp_SEMCpp' show(object)
object |
object of class Rcpp_SEMCpp |
No return value, just prints estimates
show
## S4 method for signature 'regularizedSEM' show(object)
object |
object of class regularizedSEM |
No return value, just prints estimates
show
## S4 method for signature 'regularizedSEMMixedPenalty' show(object)
object |
object of class regularizedSEMMixedPenalty |
No return value, just prints estimates
show
## S4 method for signature 'stabSel' show(object)
object |
object of class stabSel |
No return value, just prints estimates
Simulate data for a simple CFA model
simulateExampleData( N = 100, loadings = c(rep(1, 5), rep(0.4, 5), rep(0, 5)), percentMissing = 0 )
N |
number of persons in the data set |
loadings |
loadings of the latent variable on the manifest observations |
percentMissing |
percentage of missing data |
data set for a single-factor CFA.
y <- lessSEM::simulateExampleData()
This function allows for regularization of models built in lavaan with the smooth adaptive lasso penalty. The returned object is an S4 class; its elements can be accessed with the "@" operator (see examples).
smoothAdaptiveLasso(
  lavaanModel,
  regularized,
  weights = NULL,
  lambdas,
  epsilon,
  tau,
  modifyModel = lessSEM::modifyModel(),
  control = lessSEM::controlBFGS()
)
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
weights |
labeled vector with weights for each of the parameters in the model. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object. If set to NULL, the default weights will be used: the inverse of the absolute values of the unregularized parameter estimates |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters below threshold tau will be seen as zeroed |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
For more details, see:
Zou, H. (2006). The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 101(476), 1418–1429. https://doi.org/10.1198/016214506000000735
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.
Model of class regularizedSEM
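The role of epsilon and tau can be illustrated with the smooth approximation of the absolute value that underlies the smoothed penalties (following Lee et al., 2006). The sketch below is illustrative only and assumes the approximation sqrt(x^2 + epsilon); the package evaluates the penalty internally:

# Smooth approximation of |x| used by the smoothed penalties:
# larger epsilon -> smoother (but less accurate) approximation.
smoothAbs <- function(x, epsilon) sqrt(x^2 + epsilon)

smoothAbs(x = 0.5, epsilon = 1e-10)  # close to 0.5
smoothAbs(x = 0.0, epsilon = 1e-10)  # close to, but not exactly, 0

# Because parameters are never shrunk to exactly zero by the smooth
# approximation, estimates with an absolute value below tau are treated
# as zeroed when interpreting the solution.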
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

# names of the regularized parameters:
regularized = paste0("l", 6:15)

# define adaptive lasso weights:
# We use the inverse of the absolute unregularized parameters
# (this is the default in adaptiveLasso and can also be specified
# by setting weights = NULL)
weights <- 1/abs(getLavaanParameters(lavaanModel))
weights[!names(weights) %in% regularized] <- 0

lsem <- smoothAdaptiveLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  regularized = regularized,
  weights = weights,
  epsilon = 1e-10,
  tau = 1e-4,
  lambdas = seq(0, 1, length.out = 50))

# use the plot-function to plot the regularized parameters:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# AIC and BIC:
AIC(lsem)
BIC(lsem)

# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
coef(lsem, criterion = "BIC")
This function allows for regularization of models built in lavaan with the smooth elastic net penalty. Its elements can be accessed with the "@" operator (see examples).
smoothElasticNet(
  lavaanModel,
  regularized,
  lambdas = NULL,
  nLambdas = NULL,
  alphas,
  epsilon,
  tau,
  modifyModel = lessSEM::modifyModel(),
  control = lessSEM::controlBFGS()
)
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
nLambdas |
alternative to lambda: If alpha = 1, lessSEM can automatically compute the first lambda value which sets all regularized parameters to zero. It will then generate nLambda values between 0 and the computed lambda. |
alphas |
numeric vector with values of the tuning parameter alpha. Must be between 0 and 1. 0 = ridge, 1 = lasso. |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters below threshold tau will be seen as zeroed |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
For more details, see:
Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x for the details of this regularization technique.
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.
Model of class regularizedSEM
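To make the role of alpha concrete, the elastic net mixes the lasso and ridge penalties. The sketch below is purely illustrative and assumes one common parameterization, lambda * (alpha * |x| + (1 - alpha) * x^2); the exact scaling used internally may differ, and the smoothed variant additionally replaces |x| with its smooth approximation:

# Illustrative elastic net penalty for a single parameter value x:
# alpha = 1 keeps only the lasso term, alpha = 0 only the ridge term.
elasticNetPenalty <- function(x, lambda, alpha) {
  lambda * (alpha * abs(x) + (1 - alpha) * x^2)
}

elasticNetPenalty(x = 0.5, lambda = 0.1, alpha = 0)    # pure ridge
elasticNetPenalty(x = 0.5, lambda = 0.1, alpha = 1)    # pure lasso
elasticNetPenalty(x = 0.5, lambda = 0.1, alpha = 0.5)  # mixture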
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

# names of the regularized parameters:
regularized = paste0("l", 6:15)

lsem <- smoothElasticNet(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  regularized = regularized,
  epsilon = 1e-10,
  tau = 1e-4,
  lambdas = seq(0, 1, length.out = 5),
  alphas = seq(0, 1, length.out = 3))

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
This function allows for regularization of models built in lavaan with the smoothed lasso penalty. The returned object is an S4 class; its elements can be accessed with the "@" operator (see examples). We don't recommend using this function. Use lasso() instead.
smoothLasso(
  lavaanModel,
  regularized,
  lambdas,
  epsilon,
  tau,
  modifyModel = lessSEM::modifyModel(),
  control = lessSEM::controlBFGS()
)
lavaanModel |
model of class lavaan |
regularized |
vector with names of parameters which are to be regularized. If you are unsure what these parameters are called, use getLavaanParameters(model) with your lavaan model object |
lambdas |
numeric vector: values for the tuning parameter lambda |
epsilon |
epsilon > 0; controls the smoothness of the approximation. Larger values = smoother |
tau |
parameters below threshold tau will be seen as zeroed |
modifyModel |
used to modify the lavaanModel. See ?modifyModel. |
control |
used to control the optimizer. This element is generated with the controlBFGS function. See ?controlBFGS for more details. |
For more details, see:
Lee, S.-I., Lee, H., Abbeel, P., & Ng, A. Y. (2006). Efficient L1 Regularized Logistic Regression. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI-06), 401–408.
Jacobucci, R., Grimm, K. J., & McArdle, J. J. (2016). Regularized Structural Equation Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 23(4), 555–566. https://doi.org/10.1080/10705511.2016.1154793
Model of class regularizedSEM
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Regularization:

lsem <- smoothLasso(
  # pass the fitted lavaan model
  lavaanModel = lavaanModel,
  # names of the regularized parameters:
  regularized = paste0("l", 6:15),
  epsilon = 1e-10,
  tau = 1e-4,
  lambdas = seq(0, 1, length.out = 50))

# use the plot-function to plot the regularized parameters:
plot(lsem)

# the coefficients can be accessed with:
coef(lsem)

# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]

# AIC and BIC:
AIC(lsem)
BIC(lsem)

# The best parameters can also be extracted with:
coef(lsem, criterion = "AIC")
coef(lsem, criterion = "BIC")
Provides rudimentary stability selection for regularized SEM. Stability selection has been proposed by Meinshausen & Bühlmann (2010) and was extended to SEM by Li & Jacobucci (2021). The problem that stability selection tries to solve is the instability of regularization procedures: small changes in the data set may result in different parameters being selected. To address this issue, stability selection draws random subsamples from the initial data set and fits the model in each of them. For each parameter, we can then check how often it is included in the model for a given set of tuning parameters. Plotting these probabilities provides an overview of which parameters are often removed and which remain in the model most of the time. To get a final selection, a threshold t can be defined: if a parameter is in the model at least t% of the time, it is retained.
stabilitySelection(
  modelSpecification,
  subsampleSize,
  numberOfSubsamples = 100,
  threshold = 70,
  maxTries = 10 * numberOfSubsamples
)
modelSpecification |
a call to one of the penalty functions in lessSEM. See examples for details |
subsampleSize |
number of subjects in each subsample. Must be smaller than the number of subjects in the original data set |
numberOfSubsamples |
number of times the procedure should subsample and recompute the model. According to Meinshausen & Bühlmann (2010), 100 seems to work quite well and is also the default in regsem |
threshold |
percentage of models, where the parameter should be contained in order to be in the final model |
maxTries |
fitting models in a subset may fail. maxTries sets the maximal number of subsets to try. |
estimates for each subsample and aggregated percentages for each parameter
Li, X., & Jacobucci, R. (2021). Regularized structural equation modeling with stability selection. Psychological Methods, 27(4), 497–518. https://doi.org/10.1037/met0000389
Meinshausen, N., & Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417–473. https://doi.org/10.1111/j.1467-9868.2010.00740.x
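As a purely illustrative sketch of the threshold rule (not the package internals): with numberOfSubsamples = 100 and threshold = 70, a parameter is retained if it is estimated as non-zero in at least 70 of the 100 subsample fits. The data and parameter names below are made up:

# Toy illustration of the threshold rule with made-up selection indicators.
# Rows = subsamples, columns = parameters; TRUE = parameter non-zero in that fit.
set.seed(123)
selected <- matrix(runif(100 * 3) < rep(c(0.9, 0.5, 0.1), each = 100),
                   nrow = 100, ncol = 3,
                   dimnames = list(NULL, c("l6", "l7", "l8")))

percentSelected <- 100 * colMeans(selected)
percentSelected

# retained parameters with threshold = 70 (percent):
names(percentSelected)[percentSelected >= 70]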
library(lessSEM)

# Identical to regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.

dataset <- simulateExampleData()

lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
     l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
     l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"

lavaanModel <- lavaan::sem(lavaanSyntax,
                           data = dataset,
                           meanstructure = TRUE,
                           std.lv = TRUE)

# Stability selection
stabSel <- stabilitySelection(
  # IMPORTANT: Wrap your call to the penalty function in an rlang::expr-Block:
  modelSpecification = rlang::expr(
    lasso(
      # pass the fitted lavaan model
      lavaanModel = lavaanModel,
      # names of the regularized parameters:
      regularized = paste0("l", 6:15),
      # in case of lasso and adaptive lasso, we can specify the number of lambda
      # values to use. lessSEM will automatically find lambda_max and fit
      # models for nLambda values between 0 and lambda_max. For the other
      # penalty functions, lambdas must be specified explicitly
      nLambdas = 50)
  ),
  subsampleSize = 80,
  numberOfSubsamples = 5, # should be set to a much higher number (e.g., 100)
  threshold = 70
)
stabSel
plot(stabSel)
Class for stability selection
regularized
names of regularized parameters
tuningParameters
data.frame with tuning parameter values
stabilityPaths
matrix with percentage of parameters being non-zero averaged over all subsets for each setting of the tuning parameters
percentSelected
percentage with which a parameter was selected over all tuning parameter settings
selectedParameters
final selected parameters
settings
internal
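Since stabSel is an S4 class, its slots can be accessed with the "@" operator. A minimal sketch, assuming stabSel is the object returned by stabilitySelection in the example above:

# final selection and selection percentages:
stabSel@selectedParameters
stabSel@percentSelected

# selection paths across the tuning parameter settings:
stabSel@stabilityPaths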
cvRegularizedSEM
Summary method for objects of class cvRegularizedSEM.
## S4 method for signature 'cvRegularizedSEM' summary(object, ...)
object |
object of class cvRegularizedSEM |
... |
not used |
No return value, just prints estimates
summary
## S4 method for signature 'gpRegularized' summary(object, ...)
object |
object of class gpRegularized |
... |
not used |
No return value, just prints estimates
summary
## S4 method for signature 'regularizedSEM' summary(object, ...)
object |
object of class regularizedSEM |
... |
not used |
No return value, just prints estimates
summary
## S4 method for signature 'regularizedSEMMixedPenalty' summary(object, ...)
object |
object of class regularizedSEMMixedPenalty |
... |
not used |
No return value, just prints estimates
Extract the labels of all variances found in a lavaan model.
variances(lavaanModel)
lavaanModel |
fitted lavaan model |
vector with parameter labels
# The following is adapted from ?lavaan::sem
library(lessSEM)

model <- '
  # latent variable definitions
     ind60 =~ x1 + x2 + x3
     dem60 =~ y1 + a*y2 + b*y3 + c*y4
     dem65 =~ y5 + a*y6 + b*y7 + c*y8

  # regressions
    dem60 ~ ind60
    dem65 ~ ind60 + dem60

  # residual correlations
    y1 ~~ y5
    y2 ~~ y4 + y6
    y3 ~~ y7
    y4 ~~ y8
    y6 ~~ y8
'

fit <- sem(model, data = PoliticalDemocracy)

variances(fit)
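Building on the fitted model above, a small illustrative sketch of a typical use case: excluding variance parameters from the set of regularized parameters (getLavaanParameters is the helper referenced throughout this manual):

# labels of all parameters in the fitted lavaan model:
allParameters <- names(getLavaanParameters(fit))

# drop the variances so that only loadings and regressions are regularized:
regularized <- setdiff(allParameters, variances(fit))
regularized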