A data science team at a financial services company is developing a model to predict the probability of loan default. The dataset contains over 100 features, and exploratory data analysis reveals strong multicollinearity among several predictors, such as "debt-to-income ratio" and "credit utilization rate". The primary business objective is to create a predictive model that is also parsimonious, automatically performing feature selection to identify the most significant predictors of default. Which of the following models is most appropriate for this task?
Linear Discriminant Analysis (LDA)
Ordinary Least Squares (OLS) regression
Ridge regression
Least Absolute Shrinkage and Selection Operator (LASSO) regression
The correct answer is LASSO (Least Absolute Shrinkage and Selection Operator) regression. This model is the most suitable choice because it meets both of the scenario's requirements: handling multicollinearity and performing automatic feature selection. LASSO uses L1 regularization, which adds a penalty term proportional to the absolute value of the coefficients. A key feature of this penalty is its ability to shrink the coefficients of less important features to exactly zero, effectively removing them from the model. This results in a simpler, more interpretable (parsimonious) model that highlights the most significant predictors, which aligns perfectly with the business objective.
Ridge regression is incorrect because while it effectively handles multicollinearity by using L2 regularization to shrink coefficients, it does not perform feature selection. The coefficients are shrunk towards zero but never become exactly zero, meaning all features are retained in the final model.
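The contrast between the two penalties is easy to demonstrate. The sketch below (a minimal illustration on synthetic data, not the scenario's actual loan dataset) fits scikit-learn's `Lasso` and `Ridge` to features where only two are truly predictive and two are strongly collinear; the L1 model zeroes out the irrelevant coefficients while the L2 model retains every feature with a nonzero weight.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic stand-in data: 10 features, only 2 truly predictive,
# with one pair of strongly collinear columns (like debt-to-income
# ratio and credit utilization rate in the scenario).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)       # near-duplicate feature
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(size=n)  # only features 0 and 2 matter

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# The L1 penalty drives unimportant coefficients to exactly zero,
# performing feature selection automatically...
print("LASSO coefficients set to zero:", int(np.sum(lasso.coef_ == 0)))

# ...while the L2 penalty shrinks coefficients toward zero but keeps
# every feature in the model with a small nonzero weight.
print("Ridge coefficients set to zero:", int(np.sum(ridge.coef_ == 0)))
```

In practice the penalty strength `alpha` would be tuned by cross-validation (e.g. with `LassoCV`), but the qualitative behavior is the same: only the L1 penalty produces a sparse, parsimonious model.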
Ordinary Least Squares (OLS) regression is incorrect because it performs poorly in the presence of strong multicollinearity. Although only perfect collinearity violates a formal OLS assumption, strong multicollinearity inflates the variance of the coefficient estimates, making them unstable and unreliable for interpretation. OLS also applies no regularization and performs no feature selection, so all 100-plus features would remain in the model.
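That instability is simple to observe. The sketch below (an illustrative simulation, not taken from the scenario) refits OLS on bootstrap resamples of a dataset with two nearly identical predictors; the estimated coefficient on the first predictor swings wildly from refit to refit, exactly the high-variance behavior that makes OLS unreliable here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # nearly identical predictor
y = x1 + rng.normal(scale=0.5, size=n)       # only x1 actually drives y

# Refit OLS on bootstrap resamples and track the coefficient on x1.
coefs = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    A = np.column_stack([np.ones(n), x1[idx], x2[idx]])
    beta, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    coefs.append(beta[1])

# With collinear predictors, the individual coefficient estimates are
# highly unstable even though the fitted values remain reasonable.
print("std of x1 coefficient across refits:", np.std(coefs))
```

The standard deviation of the coefficient dwarfs the true value of 1, illustrating why individual OLS coefficients cannot be trusted when predictors are strongly correlated.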
Linear Discriminant Analysis (LDA) is incorrect because it is primarily a classification and dimensionality reduction technique, not a regression method designed for this type of feature selection problem. While LDA reduces dimensions by finding linear combinations of features that best separate classes, it does not provide a parsimonious regression model in the way that LASSO does.