CompTIA DataX DY0-001 (V1) Practice Question

A data-science team is developing a binary classifier that predicts equipment failure seven days ahead from two years of hourly sensor readings. The engineer follows this workflow:

(1) remove rows that contain any null sensor value;
(2) compute a 24-hour rolling mean for every sensor and append it as a new feature;
(3) randomly split the resulting data into 80% training and 20% test sets;
(4) fit a StandardScaler on the training split and apply the scaler to both splits;
(5) train a gradient-boosting classifier;
(6) evaluate accuracy on the test split.

The offline test accuracy is 0.93, but the model's accuracy on live streaming data drops to 0.64.

Which single step in this workflow is the most likely source of the data leakage that explains the performance drop, and why?

  • Step (2) - Computing the 24-hour rolling mean before the split; the feature engineering leaks test values into training features and inflates accuracy.

  • Step (4) - Scaling the data with StandardScaler fitted on the training split; this is the correct way to scale and does not cause leakage.

  • Step (3) - Randomly splitting time-stamped data; this puts future observations in the training set and lets the model learn about events that occur after some test instances, creating temporal data leakage.

  • Step (1) - Eliminating rows with missing readings; this reduces sample size but does not provide the model with information about future failures.
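
The mechanism described in the step (3) option, future observations ending up in the training set, can be illustrated with a small sketch. It uses only timestamps as a hypothetical stand-in for the two years of hourly readings; the start date and the helper names `random_split`, `chronological_split`, and `train_rows_after_test_start` are assumptions made for this illustration, not part of the question or of any library.

```python
import random
from datetime import datetime, timedelta

# Hypothetical stand-in for the dataset: two years of hourly timestamps.
start = datetime(2022, 1, 1)  # assumed start date, for illustration only
timestamps = [start + timedelta(hours=h) for h in range(2 * 365 * 24)]


def random_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows, then cut: mirrors a random 80/20 split."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def chronological_split(rows, test_fraction=0.2):
    """Sort by time, then cut: the test set is the most recent slice."""
    ordered = sorted(rows)
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]


def train_rows_after_test_start(train, test):
    """Count training rows time-stamped later than the earliest test row."""
    first_test = min(test)
    return sum(1 for t in train if t > first_test)


rand_train, rand_test = random_split(timestamps)
chron_train, chron_test = chronological_split(timestamps)

# A random split interleaves the two sets in time; a chronological one does not.
print("random split:", train_rows_after_test_start(rand_train, rand_test))
print("chronological split:", train_rows_after_test_start(chron_train, chron_test))
```

With a chronological cut, no training row is later than any test row, so the count is zero; with a shuffled cut, the count is large because the two sets cover the same period.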

Machine Learning