CompTIA DataX DY0-001 (V1) Practice Question

A data scientist is analyzing a large dataset of customer order transactions for an e-commerce company. They identify a single transaction with a 'quantity ordered' value that is several orders of magnitude higher than any other transaction in the dataset. This value significantly skews the distribution. Which of the following is the most appropriate initial step to determine if this outlier is a valid data point or an error?

  • Calculate the Z-score for all 'quantity ordered' values and immediately remove any data point with a score greater than 3, as it is a statistical outlier.

  • Conclude it is a data entry error and replace the value using median imputation to normalize the distribution.

  • Apply Winsorization to the 'quantity ordered' column, capping the extreme value at the 99th percentile to reduce its influence on subsequent analysis.

  • Cross-reference the transaction ID with related datasets, such as inventory logs or customer purchase history, to verify the order's legitimacy.
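
The cross-referencing check described in the last option can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the transaction IDs, field names (`transaction_id`, `quantity_ordered`), the inventory log, and the "100 times the median" flagging threshold are all hypothetical and do not come from the question.

```python
from statistics import median

# Hypothetical sample data; IDs, field names, and values are made up for illustration.
transactions = [
    {"transaction_id": "T1001", "quantity_ordered": 2},
    {"transaction_id": "T1002", "quantity_ordered": 1},
    {"transaction_id": "T1003", "quantity_ordered": 5},
    {"transaction_id": "T1004", "quantity_ordered": 3},
    {"transaction_id": "T1005", "quantity_ordered": 50000},
]

# Hypothetical inventory log: units actually shipped per transaction ID.
inventory_log = {"T1001": 2, "T1002": 1, "T1003": 5, "T1004": 3, "T1005": 5}


def flag_suspects(rows, factor=100):
    """Flag rows whose quantity exceeds `factor` times the median (arbitrary threshold)."""
    typical = median(r["quantity_ordered"] for r in rows)
    return [r for r in rows if r["quantity_ordered"] > factor * typical]


def verify(row, log):
    """Compare a flagged row against the related dataset instead of altering it."""
    shipped = log.get(row["transaction_id"])
    if shipped is None:
        return "no matching record"
    if shipped == row["quantity_ordered"]:
        return "confirmed"
    return "mismatch"


suspects = flag_suspects(transactions)
results = {r["transaction_id"]: verify(r, inventory_log) for r in suspects}
print(results)
```

Here the flagged value is neither removed nor modified; it is only labeled according to what the related record shows, so the decision to correct, keep, or drop it can be made afterward with evidence.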

Operations and Processes