During a model-design iteration of a multiple linear regression that predicts daily building energy consumption, you review the four standard diagnostic plots. The Normal Q-Q plot of standardized residuals is nearly linear, but the Scale-Location (spread-location) plot displays a pronounced fan shape in which the spread of residuals widens as the fitted values increase. Based on this evidence, which action should you prioritize in the next model iteration?
Standard-scale all predictor variables to zero mean and unit variance and then refit the ordinary least-squares model.
Remove observations with large Cook's distance values to reduce leverage effects before refitting the same model.
Introduce higher-order polynomial terms for each predictor to capture possible non-linear relationships.
Refit the model using a variance-stabilizing transformation (such as a log or Box-Cox) or weighted least squares to make the residual variance approximately constant.
A fan-shaped Scale-Location plot indicates heteroscedasticity: the error variance grows with the fitted value. The most direct way to address this violation of the homoscedasticity assumption is to transform the response with a variance-stabilizing transformation (for example, a log or Box-Cox power transform) or to refit the model with weighted least squares so that observations with larger error variance receive less weight. Adding polynomial terms targets non-linearity rather than unequal variance, deleting high-leverage points addresses influence rather than heteroscedasticity, and standardizing predictors only rescales the coefficients, leaving the residual variance pattern unchanged. Therefore, the transformation or heteroscedasticity-aware re-estimation is the correct next step.
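The weighted least squares remedy can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the question: the data are simulated with a single predictor `x`, the error standard deviation is set proportional to `x`, and the weights `1 / x**2` presume that variance structure is known. In practice the weights would have to be estimated from the residuals. The helper `spread_trend` is a hypothetical numeric stand-in for reading the Scale-Location plot.

```python
import numpy as np

# Simulated data (illustrative assumption): error sd grows with the predictor,
# which produces the fan shape described in the question.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])            # intercept + one predictor
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)    # sd proportional to x


def fit_least_squares(design, response):
    """Return least-squares coefficients for the given design matrix."""
    beta, *_ = np.linalg.lstsq(design, response, rcond=None)
    return beta


def spread_trend(fitted, resid):
    """Correlation between fitted values and sqrt(|residual|).

    A clearly positive value mirrors an upward-sloping Scale-Location plot.
    """
    return np.corrcoef(fitted, np.sqrt(np.abs(resid)))[0, 1]


# Ordinary least squares: every observation gets equal weight.
beta_ols = fit_least_squares(X, y)
resid_ols = y - X @ beta_ols

# Weighted least squares: weights are inverse variances (assumed known here).
# WLS is equivalent to OLS after scaling each row by sqrt(weight).
w = 1.0 / x**2
sw = np.sqrt(w)
beta_wls = fit_least_squares(X * sw[:, None], y * sw)
resid_wls = sw * (y - X @ beta_wls)             # weighted residuals

trend_ols = spread_trend(X @ beta_ols, resid_ols)
trend_wls = spread_trend(X @ beta_wls, resid_wls)

print(f"OLS spread trend: {trend_ols:.2f}")
print(f"WLS spread trend: {trend_wls:.2f}")
```

In this sketch the OLS spread trend should come out clearly positive, while the trend of the weighted residuals should sit near zero, which is the numeric counterpart of the fan shape flattening out. A log or Box-Cox transform of the response would be checked the same way, by refitting and re-examining the residual spread.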