During an A/B experiment, a data scientist measures the time (in seconds) that users spend on a web page. The control group (n = 45) and the treatment group with a new layout (n = 38) are independent samples. A Shapiro-Wilk normality test returns p-values of 0.12 (control) and 0.09 (treatment). Levene's test for equality of variances returns a p-value of 0.015. At α = 0.05, which hypothesis test should the data scientist choose to determine whether the mean time on page differs between the two groups?
Welch's two-sample t-test (unequal variances)
The Shapiro-Wilk results (p > 0.05 for both groups) provide no evidence against normality, so a parametric test that compares means is appropriate. However, Levene's test indicates unequal variances (p = 0.015 < 0.05). When the normality assumption holds but the equal-variance assumption does not, the most powerful parametric choice is Welch's two-sample t-test, which does not pool variances and adjusts the degrees of freedom. A pooled Student's t-test would violate the variance assumption, a paired t-test is only valid for dependent samples, and the Mann-Whitney U test is less powerful than Welch's t-test when normality is satisfied.
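The workflow above can be sketched in Python with `scipy.stats`. The original measurements are not given, so the data here are simulated with `numpy` (means, standard deviations, and the random seed are illustrative assumptions); only the sample sizes match the question.

```python
import numpy as np
from scipy import stats

# Hypothetical data: the question gives only n = 45 and n = 38, so we
# simulate two normal groups with deliberately unequal spread.
rng = np.random.default_rng(42)
control = rng.normal(loc=60, scale=8, size=45)
treatment = rng.normal(loc=55, scale=14, size=38)

# Step 1: Shapiro-Wilk normality check on each group.
_, p_ctrl = stats.shapiro(control)
_, p_trt = stats.shapiro(treatment)

# Step 2: Levene's test for equality of variances.
_, p_lev = stats.levene(control, treatment)

# Step 3: normality holds but variances differ, so run Welch's t-test.
# equal_var=False skips pooling and uses the Welch-Satterthwaite
# degrees-of-freedom adjustment.
t_stat, p_welch = stats.ttest_ind(control, treatment, equal_var=False)

print(f"Shapiro p-values: control={p_ctrl:.3f}, treatment={p_trt:.3f}")
print(f"Levene p-value: {p_lev:.3f}")
print(f"Welch t = {t_stat:.3f}, p = {p_welch:.4f}")
```

Note that `equal_var=False` is the only change from a pooled Student's t-test in `scipy`; defaulting to Welch's test is a common recommendation precisely because it remains valid whether or not the variances are equal.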