A data scientist is developing a linear regression model to predict a company's sales based on its advertising expenditure. After fitting an initial Ordinary Least Squares (OLS) model, an analysis of the model's residuals reveals a distinct pattern: the variance of the residuals increases as the predicted sales figures get larger. This pattern suggests that the model's predictions are less reliable for higher sales volumes. Given this specific diagnostic finding, which of the following modeling adjustments is the most appropriate next step to improve the model's reliability?
Implement a Weighted Least Squares (WLS) regression.
Implement a LASSO regression for feature selection.
Continue using Ordinary Least Squares (OLS) as the estimates remain unbiased.
Implement a Ridge regression to penalize large coefficients.
The correct answer is to implement a Weighted Least Squares (WLS) regression. The scenario describes heteroscedasticity, a condition where the variance of the error term is not constant across observations. This violates a key assumption of Ordinary Least Squares (OLS) regression. Under heteroscedasticity the OLS coefficient estimates remain unbiased, but they are no longer efficient, and the usual standard errors are biased, so confidence intervals and hypothesis tests become unreliable. WLS addresses this by assigning a weight to each observation, typically inversely proportional to its error variance, so that observations with higher variance have less influence on the fit. With appropriate weights, this produces more efficient parameter estimates and valid standard errors.
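Continuing with OLS is not the best choice. Although its coefficient estimates remain unbiased, the assumption of constant variance (homoscedasticity) is violated, as the increasing variance of the residuals shows, so the estimates are inefficient and the usual standard errors cannot be trusted.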
Ridge and LASSO regressions are regularization techniques primarily used to address multicollinearity and prevent overfitting by penalizing large coefficients; they do not directly correct for heteroscedasticity.
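The weighting idea behind WLS can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not part of the question: the simulated advertising and sales figures, the helper name `fit_wls`, and above all the premise that the noise standard deviation is proportional to advertising spend, which is what makes weights of 1 / spend^2 appropriate. In practice the variance structure is unknown and has to be estimated or modeled.

```python
import numpy as np

def fit_wls(x, y, weights):
    """Fit y ~ b0 + b1 * x by minimizing sum(weights * residuals**2).

    Hypothetical helper for illustration; returns np.array([b0, b1]).
    """
    X = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(weights)
    # Scaling each row by sqrt(w_i) turns the weighted problem into an
    # ordinary least-squares problem on the rescaled data.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Simulated data (assumption): noise standard deviation grows with ad spend.
rng = np.random.default_rng(0)
ad_spend = rng.uniform(1.0, 100.0, size=500)
sales = 50.0 + 3.0 * ad_spend + rng.normal(0.0, 0.5 * ad_spend)

# Equal weights reproduce plain OLS.
ols_beta = fit_wls(ad_spend, sales, np.ones_like(ad_spend))

# Weights inversely proportional to the assumed error variance.
wls_beta = fit_wls(ad_spend, sales, 1.0 / ad_spend**2)

print("OLS intercept, slope:", ols_beta)
print("WLS intercept, slope:", wls_beta)
```

Both fits target the same true coefficients. The difference is that the weighted fit relies less on the noisy high-spend observations, which is where its efficiency gain comes from when the weights match the actual variance pattern.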