CompTIA DataX DY0-001 (V1) Practice Question

A data scientist is preparing a dataset for a K-Means clustering algorithm, which uses Euclidean distance to group data points. The dataset includes customer_age (ranging from 18 to 85) and annual_income (ranging from 25,000 to 250,000 in USD). If the data scientist proceeds without applying any feature scaling, what is the most likely impact on the model's performance?

  • The Euclidean distance metric will automatically normalize the features, resulting in balanced clusters.

  • The clustering algorithm will fail to execute because the features have different units and scales.

  • The annual_income feature will disproportionately influence the distance calculations, minimizing the effect of customer_age.

  • The customer_age feature will dominate the clustering process because of its smaller numerical range.
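
To make the distance arithmetic behind this question concrete, here is a minimal sketch in plain Python. The feature ranges come from the question. The two customers and the helper names (`euclidean`, `min_max_scale`) are hypothetical, chosen only for illustration. Min-max scaling is used as one common form of feature scaling, not as the only option.

```python
import math

# Feature ranges as stated in the question.
AGE_MIN, AGE_MAX = 18, 85
INCOME_MIN, INCOME_MAX = 25_000, 250_000


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(point):
    """Rescale (age, income) to [0, 1] using the ranges above."""
    age, income = point
    return (
        (age - AGE_MIN) / (AGE_MAX - AGE_MIN),
        (income - INCOME_MIN) / (INCOME_MAX - INCOME_MIN),
    )


# Hypothetical customers: 40 years apart in age, 1,000 USD apart in income.
customer_a = (25, 50_000)
customer_b = (65, 51_000)

raw_distance = euclidean(customer_a, customer_b)
scaled_a, scaled_b = min_max_scale(customer_a), min_max_scale(customer_b)
scaled_distance = euclidean(scaled_a, scaled_b)

# Fraction of the squared distance contributed by the income feature.
income_share_raw = (customer_a[1] - customer_b[1]) ** 2 / raw_distance ** 2
income_share_scaled = (scaled_a[1] - scaled_b[1]) ** 2 / scaled_distance ** 2

print(f"raw distance:    {raw_distance:.1f} (income share {income_share_raw:.4f})")
print(f"scaled distance: {scaled_distance:.3f} (income share {income_share_scaled:.6f})")
```

With these values the raw distance is about 1,000.8 and comes almost entirely from the 1,000 USD income gap, even though the customers are 40 years apart in age. After scaling, the distance is about 0.6 and comes almost entirely from the age gap.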

Mathematics and Statistics