CompTIA DataX DY0-001 (V1) Practice Question

A data scientist is analyzing a large dataset of customer order transactions for an e-commerce company. They identify a single transaction with a 'quantity ordered' value that is several orders of magnitude higher than any other transaction in the dataset. This value significantly skews the distribution. Which of the following is the most appropriate initial step to determine if this outlier is a valid data point or an error?

A. Conclude it is a data entry error and replace the value using median imputation to normalize the distribution.
B. Apply Winsorization to the 'quantity ordered' column, capping the extreme value at the 99th percentile to reduce its influence on subsequent analysis.
C. Calculate the Z-score for all 'quantity ordered' values and immediately remove any data point with a score greater than 3, as it is a statistical outlier.
D. Cross-reference the transaction ID with related datasets, such as inventory logs or customer purchase history, to verify the order's legitimacy.
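
To make the difference between flagging a value and validating it concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the order quantities, the transaction IDs, the `inventory_log` mapping, and the helper names `flag_outliers` and `cross_reference` are all hypothetical and do not come from any real dataset or tool. The Z-score step only marks a value as statistically unusual; the lookup against a second source is what indicates whether the recorded quantity is corroborated.

```python
from statistics import mean, pstdev

# Hypothetical order table: transaction ID -> quantity ordered (made-up data).
quantities = [1, 2, 1, 3, 2, 1, 4, 2, 3, 1, 2, 1, 3, 2, 1, 2, 4, 1, 2, 50000]
orders = {f"T{i:03d}": q for i, q in enumerate(quantities, start=1)}

# Hypothetical inventory log: transaction ID -> units actually shipped.
# For T020 the log disagrees with the order table (assumed for illustration).
inventory_log = dict(orders)
inventory_log["T020"] = 5


def flag_outliers(data, threshold=3.0):
    """Return IDs whose absolute Z-score exceeds the threshold.

    This only flags candidates; it says nothing about validity.
    """
    values = list(data.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [tid for tid, v in data.items() if abs(v - mu) / sigma > threshold]


def cross_reference(tid, order_table, log):
    """Compare an order's quantity against a second, independent record."""
    if tid not in log:
        return "missing"      # no corroborating record found
    if log[tid] == order_table[tid]:
        return "verified"     # both sources agree
    return "mismatch"         # sources disagree, investigate further


for tid in flag_outliers(orders):
    status = cross_reference(tid, orders, inventory_log)
    print(tid, orders[tid], status)
```

In this sketch nothing is removed, capped, or imputed until the lookup result is known; a "verified" result would suggest a genuine bulk order, while "mismatch" or "missing" would point toward an error worth investigating.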

CompTIA DataX DY0-001 (V1)

Operations and Processes
