A risk-analytics team is retraining a credit-default logistic regression that contains more than 200 highly correlated applicant attributes.
With pure L2 (ridge) regularization, the model keeps almost every feature, making it hard to interpret. Switching to pure L1 (lasso) drives most coefficients to zero, but recall drops because only one variable from each correlated group survives. The team needs a regularization approach that (1) still shrinks coefficients to control variance, (2) can keep several correlated predictors that all carry signal, (3) can eliminate truly irrelevant variables, and (4) exposes a hyperparameter to dial the trade-off between these behaviors.
Which regularization technique best meets these requirements?
Increase the penalty term in ridge (L2) regression
Apply adaptive lasso so that weights guide variable selection
Use early stopping when the validation loss stops improving
Elastic net regularization with a tunable mixing parameter α
Elastic net includes both an L2 term (which shrinks correlated coefficients toward each other and stabilizes their estimates) and an L1 term (which can set uninformative coefficients exactly to zero). The mixing parameter α (0 ≤ α ≤ 1) lets practitioners tune the balance: α → 0 behaves like ridge, α → 1 behaves like lasso, and intermediate values retain multiple correlated features while still providing sparsity. Ridge alone never sets coefficients to zero; adaptive lasso still discards most members of a correlated group; early stopping is a training-time regularizer for iterative models but does not address coefficient shrinkage or feature selection in generalized linear models.
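A minimal sketch of how this could look in practice, using scikit-learn's LogisticRegression with penalty="elasticnet" (which requires the "saga" solver). The synthetic data from make_classification, the l1_ratio grid, and the C values are illustrative assumptions, not the team's actual pipeline; scikit-learn's l1_ratio plays the role of the mixing parameter α described above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the credit-default data: many correlated predictors.
X, y = make_classification(
    n_samples=5000, n_features=200, n_informative=30,
    n_redundant=60, random_state=0,
)

# l1_ratio corresponds to the mixing parameter alpha:
# 0.0 -> pure ridge (L2), 1.0 -> pure lasso (L1), values in between blend both.
pipe = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=1.0, max_iter=5000),
)

# Tune the mix and the overall penalty strength (C is the inverse strength)
# against the metric the team cares about, e.g. recall.
grid = GridSearchCV(
    pipe,
    param_grid={
        "logisticregression__l1_ratio": [0.1, 0.5, 0.9],
        "logisticregression__C": [0.01, 0.1, 1.0],
    },
    scoring="recall",
    cv=5,
)
grid.fit(X, y)

coefs = grid.best_estimator_[-1].coef_.ravel()
print("best params:", grid.best_params_)
print("non-zero coefficients kept:", np.count_nonzero(coefs))
</pre>
```

In this sketch, an intermediate l1_ratio lets correlated applicant attributes share the signal (as ridge would) while still zeroing out uninformative ones (as lasso would); counting the non-zero coefficients shows how much sparsity each setting produces.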