CompTIA Data+ DA0-002 (V2) Practice Question

While preparing a customer-segmentation dataset for a k-means clustering project, a data analyst notices that the variable annual_spend ranges from 0 to 250,000 dollars, whereas survey_score ranges only from 1 to 5. To keep the larger-scale variable from dominating the Euclidean distance calculation, the analyst wants to re-express each value as the number of standard deviations it lies from that variable's mean. Which data-transformation technique should be applied before running the algorithm?

  • Convert every numerical feature to its z-score by subtracting the mean and dividing by the standard deviation.

  • Apply a base-10 logarithm to each feature to compress its scale.

  • Rescale each feature to the interval 0-1 using its minimum and maximum values.

  • Replace extreme values in each feature with the 5th- and 95th-percentile values.
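
The transformation described in the question stem (each value re-expressed as its distance from the variable's mean, in standard deviations) can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the sample values and the helper name `z_scores` are hypothetical, and using the population standard deviation (`pstdev`) rather than the sample version is an assumption.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Re-express each value as its distance from the mean, in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data spanning the ranges given in the question.
annual_spend = [0, 50_000, 100_000, 150_000, 250_000]
survey_score = [1, 2, 3, 4, 5]

spend_z = z_scores(annual_spend)
score_z = z_scores(survey_score)

# After the transformation both features have mean 0 and standard deviation 1,
# so neither dominates a Euclidean distance purely because of its units.
print(spend_z)
print(score_z)
```

Once both features are on this common scale, a 1-unit difference means "one standard deviation" for either variable, which is the property the analyst in the question is asking for.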

Data Acquisition and Preparation