CompTIA DataX DY0-001 (V1) Practice Question

A data scientist is developing a regression model to predict housing prices. The initial model, a high-degree polynomial regression, achieves a near-perfect R-squared score on the training dataset but performs poorly on the validation dataset, where its Mean Squared Error (MSE) is significantly higher. This gap indicates that the model has high variance and is overfitting the training data. Which of the following strategies directly modifies the loss function and is the most effective way to reduce the model's variance and improve generalization?

  • Increase the polynomial degree of the model to better capture the complex underlying patterns in the data.

  • Apply k-means clustering to the feature set to generate a new categorical variable for the model.

  • Replace the Mean Squared Error (MSE) loss function with Mean Absolute Error (MAE) to reduce the impact of outliers.

  • Augment the Ordinary Least Squares (OLS) loss function with an L2 regularization term to penalize large coefficient values.
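
As an illustration of what an L2 penalty does to an overfit polynomial fit, here is a minimal sketch using NumPy only. It assumes synthetic data (a noisy sine curve standing in for housing prices), a degree-10 polynomial, and a penalty strength `lam = 1.0`; none of these values come from the question. The penalized loss being minimized is sum((y - Xw)^2) + lam * sum(w^2), with the intercept left unpenalized by centering the target.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Synthetic stand-in data (an assumption, not real housing prices)
    x = rng.uniform(0.0, 1.0, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, y

def poly_features(x, degree):
    # Columns x^1 .. x^degree; the intercept is handled by centering y
    return np.vander(x, degree + 1, increasing=True)[:, 1:]

def fit(X, y, lam):
    # Standardize features so the penalty treats every coefficient alike
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sigma
    y_mean = y.mean()
    p = Xs.shape[1]
    # Minimizing ||Xs w - y||^2 + lam * ||w||^2 is equivalent to ordinary
    # least squares on an augmented system; lam = 0 reduces to plain OLS
    A = np.vstack([Xs, np.sqrt(lam) * np.eye(p)])
    b = np.concatenate([y - y_mean, np.zeros(p)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w, mu, sigma, y_mean

def predict(model, X):
    w, mu, sigma, y_mean = model
    return ((X - mu) / sigma) @ w + y_mean

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

degree = 10   # hypothetical high degree chosen to provoke overfitting
lam = 1.0     # hypothetical penalty strength; tune it on validation data

x_train, y_train = make_data(20)
x_val, y_val = make_data(200)
X_train = poly_features(x_train, degree)
X_val = poly_features(x_val, degree)

ols = fit(X_train, y_train, lam=0.0)
ridge = fit(X_train, y_train, lam=lam)

w_ols, w_ridge = ols[0], ridge[0]
mse_train_ols = mse(y_train, predict(ols, X_train))
mse_train_ridge = mse(y_train, predict(ridge, X_train))
mse_val_ols = mse(y_val, predict(ols, X_val))
mse_val_ridge = mse(y_val, predict(ridge, X_val))

print("coefficient norm  OLS:", np.linalg.norm(w_ols), " ridge:", np.linalg.norm(w_ridge))
print("train MSE         OLS:", mse_train_ols, " ridge:", mse_train_ridge)
print("validation MSE    OLS:", mse_val_ols, " ridge:", mse_val_ridge)
```

The penalized fit always has coefficients of smaller or equal norm and a training error at least as high as the unpenalized fit. The validation error typically drops for a well-chosen `lam`, but that depends on the data and is not guaranteed for any single value.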
