CompTIA DataX DY0-001 (V1) Practice Question

A data scientist has developed a multiple linear regression model to predict housing prices. After the initial training, the scientist examines the model's performance by creating a residual vs. fitted values plot. The plot reveals that the residuals are not randomly scattered around the zero line; instead, they form a distinct, parabolic (U-shaped) pattern. What is the most likely issue with the model, and what is the most appropriate next step in the model design iteration process?

The plot reveals multicollinearity among the predictor variables. The next step should be to calculate the Variance Inflation Factor (VIF) for each feature and consider removing highly correlated predictors.
The model exhibits non-linearity, indicating it fails to capture the underlying structure of the data. The next step should be to use feature engineering to create polynomial terms for the relevant predictors.
The model is likely overfitting the training data. The next step should be to increase the L2 regularization penalty (e.g., in a Ridge regression) to reduce the model's complexity.
The plot shows evidence of heteroscedasticity, meaning the variance of the errors is not constant. The next step should be to apply a Box-Cox transformation to the response variable to stabilize the variance.

Answer Description

The correct option identifies non-linearity as the issue and suggests creating polynomial features as the solution. A parabolic or U-shaped pattern in a residual vs. fitted values plot is a classic indicator that the linear model is failing to capture a non-linear relationship in the data: the average residual changes systematically with the fitted value instead of staying near zero. This is a form of underfitting, where the model is too simple for the data. The appropriate corrective action is to engineer new features that can account for this curvature, such as squared or cubic terms of the existing predictors (polynomial features), and then re-check the residual plot.
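The effect of adding a polynomial term can be sketched numerically. The example below is a minimal illustration, not part of the exam material: the synthetic data, the single predictor `size`, and the helpers `fit_ols` and `curvature_r2` are all assumptions made for the demonstration. It measures how much of the residual variance a quadratic in the fitted values can explain, before and after adding a squared feature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): price depends quadratically on size.
n = 500
size = rng.uniform(0.0, 10.0, n)
price = 50.0 + 4.0 * size + 3.0 * size**2 + rng.normal(0.0, 5.0, n)

def fit_ols(X, y):
    """Fit ordinary least squares with an intercept; return fitted values and residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    return fitted, y - fitted

def curvature_r2(fitted, resid):
    """Share of residual variance explained by a quadratic in the fitted values.

    Near 0: residuals look like random scatter. Near 1: strong U-shaped pattern.
    """
    trend = np.polyval(np.polyfit(fitted, resid, deg=2), fitted)
    return 1.0 - np.sum((resid - trend) ** 2) / np.sum(resid**2)

# Model 1: linear in size only (misses the curvature).
fitted_lin, resid_lin = fit_ols(size, price)
r2_linear = curvature_r2(fitted_lin, resid_lin)

# Model 2: add the engineered polynomial feature size**2.
X_poly = np.column_stack([size, size**2])
fitted_poly, resid_poly = fit_ols(X_poly, price)
r2_poly = curvature_r2(fitted_poly, resid_poly)

print(f"curvature in residuals, linear model:     {r2_linear:.3f}")
print(f"curvature in residuals, polynomial model: {r2_poly:.3f}")
```

In this setup the first value should be close to 1 and the second close to 0, which is the numeric counterpart of the U-shape disappearing from the plot once the squared term is included.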

The option suggesting heteroscedasticity is incorrect because heteroscedasticity typically appears as a cone or fan shape in the residual plot, where the spread of the residuals changes as the fitted values increase or decrease while their average stays near zero. The U-shaped pattern described here is a shift in the average of the residuals, not in their spread. While a Box-Cox transformation is a valid technique for addressing non-constant variance, it is not the primary remedy for a U-shaped pattern.
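The difference between a fan shape and a U-shape can be made concrete by summarizing residuals in bins of the fitted values. The sketch below uses assumed synthetic data with a truly linear mean and noise that grows with the predictor; `binned_summary` is a hypothetical helper, not a library function. Under heteroscedasticity the per-bin means stay near zero while the per-bin standard deviations grow.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (assumed): linear mean, noise standard deviation grows with x.
n = 1000
x = rng.uniform(0.0, 10.0, n)
noise_sd = 0.5 + x
y = 10.0 + 3.0 * x + rng.normal(0.0, 1.0, n) * noise_sd

# Ordinary least squares fit with an intercept.
A = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ coef
resid = y - fitted

def binned_summary(fitted, resid, bins=5):
    """Split residuals into equal-sized bins ordered by fitted value.

    Returns the mean and standard deviation of the residuals in each bin.
    """
    order = np.argsort(fitted)
    groups = np.array_split(resid[order], bins)
    means = np.array([g.mean() for g in groups])
    stds = np.array([g.std() for g in groups])
    return means, stds

means, stds = binned_summary(fitted, resid)
print("bin means:", np.round(means, 2))  # stay close to zero: no curvature
print("bin stds: ", np.round(stds, 2))   # increase left to right: fan shape
```

A U-shaped plot would show the opposite signature: bin means that are positive at both ends and negative in the middle, with roughly constant spread.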

The option suggesting multicollinearity is incorrect because multicollinearity, meaning strong correlation among the predictor variables, is not diagnosed using a residual vs. fitted plot. It inflates the variance of the coefficient estimates rather than producing a pattern in the residuals. It is typically identified using a correlation matrix or by calculating the Variance Inflation Factor (VIF).
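For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below computes it directly with NumPy on assumed synthetic predictors; the function `vif` and the column names are illustrative only, and a statistics library would normally be used instead.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic predictors (assumed): x2 is nearly a copy of x1, x3 is independent.
n = 500
x1 = rng.normal(0.0, 1.0, n)
x2 = x1 + rng.normal(0.0, 0.1, n)
x3 = rng.normal(0.0, 1.0, n)
X = np.column_stack([x1, x2, x3])

def vif(X):
    """Variance Inflation Factor for each column of X (X has no intercept column)."""
    n_rows, n_cols = X.shape
    out = np.empty(n_cols)
    for j in range(n_cols):
        target = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n_rows), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

vifs = vif(X)
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"VIF({name}) = {value:.1f}")
```

Note that the calculation never touches the response variable or the residuals, which is why a residual vs. fitted plot carries no information about multicollinearity.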

The option suggesting overfitting is incorrect. A U-shaped residual plot indicates underfitting (the model is too simple to capture the underlying pattern), not overfitting. Increasing the regularization penalty shrinks the coefficients and constrains the model further; it adds no capacity to represent curvature, so it would likely worsen the issue.
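A small experiment illustrates why a stronger L2 penalty does not help here. The sketch below assumes the same kind of synthetic quadratic data as an illustration and uses the closed-form one-feature ridge solution with an unpenalized intercept; `ridge_train_mse` is a hypothetical helper. As the penalty `alpha` grows, the training error of the straight-line model only increases.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data (assumed): quadratic relationship fitted with a straight line.
n = 500
size = rng.uniform(0.0, 10.0, n)
price = 50.0 + 4.0 * size + 3.0 * size**2 + rng.normal(0.0, 5.0, n)

def ridge_train_mse(x, y, alpha):
    """Training MSE of a one-feature ridge fit; the intercept is not penalized."""
    xc = x - x.mean()
    yc = y - y.mean()
    # Closed-form ridge slope for a single centered feature.
    slope = (xc @ yc) / (xc @ xc + alpha)
    resid = yc - slope * xc
    return float(np.mean(resid**2))

alphas = [0.0, 10.0, 100.0, 1000.0, 10000.0]
mses = [ridge_train_mse(size, price, a) for a in alphas]

for a, m in zip(alphas, mses):
    print(f"alpha = {a:>8.1f}  training MSE = {m:.1f}")
```

The fitted values remain a straight-line function of `size` for every `alpha`, so the curvature in the residuals persists; only a richer feature set, such as a squared term, removes it.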

CompTIA DataX DY0-001 (V1)

Modeling, Analysis, and Outcomes
