A data scientist is preparing a manufacturing data set for a k-nearest neighbors (k-NN) model that uses Euclidean distance. The data contain two continuous variables: AnnualEnergy_kWh, with a range of 0 to 12,000, and MaintenanceDowntime_min, with a range of 0 to 7,200.
During pilot runs, the distance metric is dominated by AnnualEnergy_kWh, causing records with high downtime to be misclassified. According to best practice for normalization, which preprocessing step should the data scientist apply before training so that both variables contribute proportionally to the distance calculation?
Generate polynomial cross-terms between the two features and include them in the model.
Rescale each feature to the 0-1 interval using min-max normalization.
Standardize each feature to zero mean and unit variance (z-score).
Apply a natural logarithm transform (log1p) to every value in both features.
The k-nearest neighbors (k-NN) algorithm computes Euclidean distances directly on feature values, so a feature with a larger numeric range will dominate the distance metric. Min-max normalization addresses this by rescaling each continuous feature to a common bounded interval (typically 0 to 1), giving AnnualEnergy_kWh and MaintenanceDowntime_min equal potential influence on the distance calculation. Standardization (z-score) is another common scaling technique, but it only centers each feature at a mean of 0 with a standard deviation of 1; it does not enforce a strict bounded range, so the two features could still span different intervals. Min-max scaling can itself be distorted by extreme outliers, but the fixed ranges given here (0 to 12,000 and 0 to 7,200) make that less of a concern. A log transform (log1p) changes a feature's distribution shape to reduce skewness and is not designed to equalize features of different magnitudes. Creating polynomial cross-terms is a feature-engineering technique that introduces new, unscaled variables and would likely worsen the imbalance. Therefore, min-max normalization is the most appropriate preprocessing step in this scenario.
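As a minimal sketch of the idea (assuming scikit-learn and NumPy are available; the two sample records are hypothetical), the snippet below shows how min-max rescaling balances the two features' contributions to a Euclidean distance:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two hypothetical records: [AnnualEnergy_kWh, MaintenanceDowntime_min].
# Each differs from the origin by 50% of its feature's range, so
# intuitively both features should contribute equally to the distance.
X = np.array([
    [0.0,    0.0],
    [6000.0, 3600.0],
])

# On the raw scale, the energy term (6000^2) supplies ~74% of the
# squared Euclidean distance simply because kWh spans a wider range.
raw_dist = np.linalg.norm(X[0] - X[1])

# Min-max normalization: x' = (x - min) / (max - min).
# Fit on the known feature ranges from the scenario (0-12,000 and 0-7,200).
scaler = MinMaxScaler().fit([[0.0, 0.0], [12000.0, 7200.0]])
X_scaled = scaler.transform(X)

# After rescaling, both differences are 0.5, so each feature
# contributes exactly half of the squared distance.
scaled_dist = np.linalg.norm(X_scaled[0] - X_scaled[1])
print(f"raw distance:    {raw_dist:,.0f}")    # ~6,997, dominated by kWh
print(f"scaled distance: {scaled_dist:.3f}")  # ~0.707, balanced
```

The same rescaling would be fit on the training data and applied to any new record before querying the k-NN model, so that neighbors are found in the normalized space.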