You are exploring a numeric feature that records the total dollar value of every day's online sales (n ≈ 250 000 observations). A quick look at the raw values shows a strongly right-skewed distribution with several extreme revenue spikes. In your first round of univariate analysis you have two goals:
Draw a graphic that instantly highlights the extreme days without making distributional assumptions.
Report a single number that summarizes spread but will not be pulled upward by those spikes.
Which pair of techniques satisfies both goals most appropriately?
Kernel density estimate accompanied by the coefficient of variation
Q-Q plot against the normal distribution and the sample variance
Histogram with Scott's-rule binning and the sample standard deviation
Box-and-whisker plot together with the interquartile range (IQR)
A box-and-whisker plot is a non-parametric graphic built on the five-number summary. Under the usual Tukey convention the whiskers extend no further than 1.5 × IQR beyond the quartiles, and any observation past them is drawn as an individual point, making extreme values immediately visible. The interquartile range (IQR) is the difference between the 75th and 25th percentiles, so it ignores the highest and lowest 25 % of observations and therefore resists the influence of outliers. Together they meet both stated goals.
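As a rough illustration of what the box plot does behind the scenes, the sketch below computes the IQR and flags points outside Tukey's fences using only the Python standard library. The function name `iqr_and_outliers`, the multiplier `k=1.5`, and the sales figures are assumptions made up for this example; quartile definitions also differ slightly between tools, so other software may report a marginally different IQR.

```python
import statistics

def iqr_and_outliers(values, k=1.5):
    """Return (iqr, outliers) using Tukey's fences with multiplier k.

    Hypothetical helper for illustration; k=1.5 is the common convention,
    not something the question itself specifies.
    """
    # Quartiles by linear interpolation over the observed data range.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Points beyond the fences are what a box plot draws individually.
    outliers = [v for v in values if v < lower or v > upper]
    return iqr, outliers

# Made-up daily sales totals with one extreme revenue spike.
daily_sales = [120, 135, 128, 142, 131, 125, 138, 2500]
iqr, outliers = iqr_and_outliers(daily_sales)
print(f"IQR = {iqr}, flagged days = {outliers}")
```

The spike is flagged without any assumption about the shape of the distribution, because the fences depend only on the middle half of the data.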
The histogram/standard-deviation option can illustrate shape, but the choice of bin width can hide or exaggerate outliers, and the standard deviation is highly sensitive to extreme values. A Q-Q plot plus variance focuses on assessing normality, not on quickly flagging outliers, and the variance, being the square of the standard deviation, is inflated even more by a single spike. A kernel density estimate with the coefficient of variation provides no built-in outlier markers, and the CV inherits the outlier-sensitivity of the mean and standard deviation.
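The difference in sensitivity can be seen with a small experiment: compute each spread measure on a sample, add one spike, and compare. This is only a sketch on made-up numbers; `spread_summary` is a hypothetical helper, not part of any library.

```python
import statistics

def spread_summary(values):
    """Return several spread measures for one sample (illustrative helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    sd = statistics.stdev(values)
    return {
        "stdev": sd,
        "variance": statistics.variance(values),
        "cv": sd / statistics.mean(values),
        "iqr": q3 - q1,
    }

# Made-up daily sales: a typical week, then the same week plus one spike.
typical = [120, 125, 128, 131, 135, 138, 142]
with_spike = typical + [2500]

before = spread_summary(typical)
after = spread_summary(with_spike)

for name in before:
    # Ratio shows how strongly each measure reacts to the single spike.
    print(f"{name:>8}: x{after[name] / before[name]:.1f}")
```

On data like this the standard deviation, variance, and CV all grow by large factors, while the IQR changes only slightly.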