CompTIA DataX DY0-001 (V1) Practice Question

A data scientist is preparing a manufacturing data set for a k-nearest neighbors (k-NN) model that uses Euclidean distance. The data contain two continuous variables: AnnualEnergy_kWh, with a range of 0 to 12,000, and MaintenanceDowntime_min, with a range of 0 to 7,200.

During pilot runs, AnnualEnergy_kWh dominates the distance metric, causing records with high downtime to be misclassified. Following best practice for normalization, which preprocessing step should the data scientist apply before training so that both variables contribute proportionally to the distance calculation?

  • Apply a log1p transform, ln(1 + x), to every value in both features.

  • Standardize each feature to zero mean and unit variance (z-score).

  • Generate polynomial cross-terms between the two features and include them in the model.

  • Rescale each feature to the 0-1 interval using min-max normalization.
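
To make the scenario concrete, the sketch below measures how much of a squared Euclidean distance each feature contributes, first on raw values and then after the two rescaling options listed above. It is an illustration only and does not indicate which option the exam expects. The five records are made up (only the ranges come from the question), and the helper names min_max, z_score, scale, and squared_share are hypothetical.

```python
import math

# Hypothetical records: (AnnualEnergy_kWh, MaintenanceDowntime_min).
# Values are invented for illustration; only the ranges match the question.
records = [
    (0.0, 0.0),
    (6000.0, 3600.0),
    (12000.0, 7200.0),
    (11000.0, 300.0),
    (1000.0, 6900.0),
]


def min_max(column):
    """Rescale one feature column to the 0-1 interval."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def z_score(column):
    """Standardize one feature column (population standard deviation)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


def scale(rows, scaler):
    """Apply a per-column scaler and return the rows in their original order."""
    columns = [scaler(list(col)) for col in zip(*rows)]
    return list(zip(*columns))


def squared_share(a, b):
    """Fraction of the squared Euclidean distance owed to each feature."""
    squares = [(x - y) ** 2 for x, y in zip(a, b)]
    total = sum(squares)
    return [s / total for s in squares]


# Compare the two extreme records, which differ by the full range of each feature.
i, j = 0, 2
raw = squared_share(records[i], records[j])
mm_rows = scale(records, min_max)
zs_rows = scale(records, z_score)
mm = squared_share(mm_rows[i], mm_rows[j])
zs = squared_share(zs_rows[i], zs_rows[j])

print("raw     (energy, downtime):", raw)
print("min-max (energy, downtime):", mm)
print("z-score (energy, downtime):", zs)
```

With these made-up records, AnnualEnergy_kWh accounts for roughly 74% of the raw squared distance between the two extreme rows, purely because its range is wider. After either rescaling, the split between the two features is close to even. Results on real data would depend on the actual distributions and on any outliers.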
