CompTIA DataX DY0-001 (V1) Practice Question

A machine learning engineer is developing a regression model using a gradient boosting algorithm for a task with known non-linear patterns. To mitigate potential overfitting, the engineer configured the model with a high L2 regularization (lambda) value and constrained the tree max_depth to 2. The performance metrics are as follows:

  • Training Set Root Mean Squared Error (RMSE): 148.5
  • Validation Set Root Mean Squared Error (RMSE): 150.1

A simple baseline model that predicts the mean of the target variable has an RMSE of 155.0 on the same validation set. The training and validation errors are nearly identical, yet the validation RMSE is only about 3% better than the baseline (150.1 vs. 155.0), which indicates that the model is not capturing the underlying structure of the data. Which of the following is the most effective strategy to address this specific issue?

  • Implement k-fold cross-validation and increase the L2 regularization parameter.

  • Increase the max_depth of the trees and decrease the L2 regularization parameter.

  • Apply the Synthetic Minority Oversampling Technique (SMOTE) to the training data.

  • Add more features to the dataset while keeping the current max_depth and regularization.
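
To reason about the numbers in the question, it can help to compute two quantities: the relative gap between training and validation RMSE, and the relative improvement over the mean-prediction baseline. The sketch below does this in plain Python. The function name `diagnose_fit` and the thresholds `gap_tol` and `gain_tol` are hypothetical choices made for illustration only; they do not come from the question or from any library, and sensible cutoffs depend on the problem.

```python
def diagnose_fit(train_rmse, val_rmse, baseline_rmse, gap_tol=0.05, gain_tol=0.10):
    """Rough fit diagnosis from three RMSE values.

    gap_tol and gain_tol are illustrative thresholds (assumptions),
    not standard values.
    """
    # Relative train/validation gap: large values suggest high variance.
    gap = (val_rmse - train_rmse) / train_rmse
    # Relative improvement over the baseline: small values suggest high bias.
    gain = (baseline_rmse - val_rmse) / baseline_rmse

    if gap >= gap_tol:
        label = "overfitting (high variance)"
    elif gain < gain_tol:
        label = "underfitting (high bias)"
    else:
        label = "reasonable fit"
    return gap, gain, label


# Metrics taken from the question.
gap, gain, label = diagnose_fit(train_rmse=148.5, val_rmse=150.1, baseline_rmse=155.0)
print(f"train/validation gap:      {gap:.1%}")
print(f"improvement over baseline: {gain:.1%}")
print(f"diagnosis:                 {label}")
```

With the question's metrics, the gap is roughly 1% and the improvement over the baseline roughly 3%. That combination of a small gap and a small gain is the pattern usually associated with a model that is too constrained rather than one that has memorized the training data.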

Machine Learning