A retail analyst is preparing a report on customer spending behavior. The transaction table contains a numeric column, AmountSpent, which is right-skewed because a small number of customers make very large purchases. Roughly 8% of the rows have NULL values in that column, and the analyst must replace the NULLs before calculating the median and quartiles for an upcoming dashboard. The business wants a quick, single-value imputation that minimizes the bias introduced by the skew while still reflecting the variable's central tendency. Which imputation strategy best meets these requirements?
Replace missing values with the column mean calculated from the non-NULL data.
Replace missing values with the column median calculated from the non-NULL data.
Use forward fill (last observation carried forward) based on transaction date.
Replace missing values with zero to avoid overstating revenue.
Median imputation is generally preferred for a right-skewed numeric variable because the median is resistant to extreme values in the long tail. The mean, by contrast, is pulled upward by the few very large purchases, so replacing NULLs with it would insert values well above what a typical customer spends and artificially raise the typical spending level. Inserting zeros would severely distort both the center and the spread of the data and understate revenue. Forward fill (last observation carried forward) is designed for ordered time-series data: it assumes that the previous transaction's amount carries over unchanged, it is not a single-value imputation, and it does not address the underlying skewness. Therefore, using the column median calculated from the non-NULL observations is the most appropriate single-value imputation in this scenario.
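The comparison above can be sketched in a few lines of Python. This is an illustration only: the `amount_spent` values are made up for the example (they do not come from the question), `None` stands in for NULL, and the `impute` helper is a hypothetical name. It uses only the standard-library `statistics` module.

```python
from statistics import mean, median

# Hypothetical AmountSpent values (assumed for illustration); None represents NULL.
# One very large purchase (950.0) creates the right skew.
amount_spent = [12.0, 15.0, 18.0, None, 20.0, 22.0, 25.0, None, 30.0, 950.0]

# Fill values are computed from the non-NULL observations only.
observed = [v for v in amount_spent if v is not None]
fill_median = median(observed)  # resistant to the 950.0 outlier
fill_mean = mean(observed)      # pulled upward by the 950.0 outlier


def impute(values, fill):
    """Return a copy of values with every None replaced by fill."""
    return [fill if v is None else v for v in values]


median_imputed = impute(amount_spent, fill_median)
mean_imputed = impute(amount_spent, fill_mean)
zero_imputed = impute(amount_spent, 0.0)

print("fill (median):", fill_median)
print("fill (mean):  ", fill_mean)
print("median after median imputation:", median(median_imputed))
print("median after mean imputation:  ", median(mean_imputed))
print("median after zero imputation:  ", median(zero_imputed))
```

With this sample data, the median fill value is 21.0 while the mean fill value is 136.5, far above what most of these customers spend. Median imputation leaves the column median at 21.0, mean imputation shifts it up to 23.5, and zero imputation shifts it down to 19.0.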
CompTIA Data+ DA0-002 (V2)
Data Acquisition and Preparation