CompTIA DataX DY0-001 (V1) Practice Question

A machine learning engineer is tasked with building a model and estimating its generalization error. They use a single loop of k-fold cross-validation. In each fold, they run a grid search over hyperparameters, select the parameters that score best on that fold's validation set, and then evaluate the model with those parameters on the same validation set. The final performance is reported as the average score across all folds. The resulting model performs exceptionally well during this cross-validation procedure but fails to generalize to new production data. Which of the following is the most likely cause of this discrepancy?

  • Standard k-fold cross-validation is only appropriate for regression models, and a stratified approach should have been used for this classification task.

  • The process causes information leakage, leading to an optimistic performance estimate because the validation data influences both hyperparameter selection and performance evaluation.

  • The model is underfitting due to the reduced size of the training partition created in each fold of the cross-validation process.

  • The grid search for hyperparameter tuning is computationally inefficient and likely resulted in a globally suboptimal model.
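The leakage described in the second option is conventionally avoided with nested cross-validation: an inner loop selects hyperparameters, while an outer loop, whose folds are never seen during tuning, estimates generalization error. A minimal sketch using scikit-learn (the classifier, grid, and synthetic data are illustrative assumptions, not part of the question):

```python
# Nested cross-validation sketch: inner loop tunes, outer loop evaluates.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# Synthetic classification data stands in for the real dataset.
X, y = make_classification(n_samples=200, random_state=0)

# Inner loop: grid search chooses hyperparameters on inner folds only.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=inner_cv)

# Outer loop: each outer validation fold is untouched by tuning,
# so the averaged score is an unbiased generalization estimate.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(grid, X, y, cv=outer_cv)
print(f"nested CV estimate: {scores.mean():.3f}")
```

Because the outer folds play no role in hyperparameter selection, the averaged outer score does not carry the optimistic bias of the single-loop procedure in the question.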

Machine Learning