CompTIA DataX DY0-001 (V1) Practice Question

A data science team is developing a fraud detection model using a Gradient Boosting Machine (GBM) on a large dataset with thousands of features. After training, the model achieves 99.8% accuracy on the training set but only 85% accuracy on a held-out validation set. The training loss is near zero, while the validation loss is substantially higher and begins to increase after a certain number of boosting rounds. Given this performance gap, which of the following BEST describes the phenomenon the model is exhibiting and the most effective initial step to address it?

  • The model is overfitting to the training data. The most effective initial step is to apply regularization techniques, such as increasing the reg_lambda or reg_alpha hyperparameters, or to reduce the complexity of the model by limiting the maximum tree depth.

  • The model is suffering from data leakage. The team should re-evaluate the feature engineering and data splitting process to ensure a strict separation of data before any transformations are applied.

  • The model is underfitting the data. The best course of action is to increase the model's complexity by adding more estimators (trees) or allowing for deeper trees to better capture the data's patterns.

  • The validation set is exhibiting concept drift. The team should acquire more recent data for validation and consider implementing a drift detection mechanism before retraining.
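
The symptoms in the scenario (a large gap between training and validation accuracy, and a validation loss that starts rising after some number of boosting rounds) can be checked from per-round loss curves. The following is a minimal sketch in plain Python, not tied to any particular GBM library: the helper names `find_best_round` and `accuracy_gap`, the `patience` parameter, and the loss values are all hypothetical and made up for illustration. Only the two accuracy figures come from the question.

```python
def find_best_round(val_losses, patience=3):
    """Return the 0-based round with the lowest validation loss.

    The scan stops once the loss has failed to improve for `patience`
    consecutive rounds, which mirrors a simple early-stopping rule.
    """
    best_idx = 0
    best_loss = float("inf")
    rounds_without_improvement = 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_idx = i
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                break
    return best_idx


def accuracy_gap(train_acc, val_acc):
    """Difference between training and validation accuracy."""
    return train_acc - val_acc


# Made-up per-round losses shaped like the scenario: training loss keeps
# falling toward zero while validation loss bottoms out and then climbs.
train_losses = [0.60, 0.40, 0.25, 0.12, 0.05, 0.02, 0.01]
val_losses = [0.62, 0.48, 0.41, 0.39, 0.43, 0.47, 0.52]

best = find_best_round(val_losses, patience=3)
gap = accuracy_gap(0.998, 0.85)  # accuracies quoted in the question

print(f"validation loss is lowest at round index {best}")
print(f"train/validation accuracy gap: {gap:.3f}")
```

Boosting rounds added after the lowest point of the validation curve lower the training loss without improving performance on held-out data.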
