A data scientist is analyzing the performance of a new machine learning model designed to optimize ad spend. Due to budget constraints, they could only run a pilot test on 15 randomly selected advertising campaigns. The scientist calculates the mean improvement in return on ad spend (ROAS) for this sample and notes that the population standard deviation of ROAS improvement is unknown. To construct a 95% confidence interval for the true mean ROAS improvement, which of the following statements most accurately describes the appropriate distribution to use and its justification?
A Chi-squared distribution is appropriate because the goal is to test the goodness-of-fit of the observed ROAS improvements against an expected distribution.
A Binomial distribution is appropriate by classifying each campaign as a 'success' or 'failure' to model the probability of a positive ROAS improvement.
A standard normal (Z) distribution is appropriate because the Central Limit Theorem ensures the sampling distribution of the mean is approximately normal.
A t-distribution with 14 degrees of freedom is appropriate because the population standard deviation is unknown and must be estimated from a small sample, which introduces additional uncertainty.
The correct answer is to use a t-distribution with 14 degrees of freedom. The Student's t-distribution is the appropriate distribution for estimating a population mean when the sample size is small (by a common rule of thumb, n < 30) and the population standard deviation is unknown, so the sample standard deviation s must stand in for it. The degrees of freedom for a single-sample confidence interval are n - 1, which is 15 - 1 = 14 in this scenario, and the interval takes the form: sample mean ± t* × s / √n, where t* is the critical value for 14 degrees of freedom. The t-distribution has heavier tails than the standard normal distribution, which accounts for the additional uncertainty introduced by estimating the population standard deviation from the sample data. The procedure also assumes that the ROAS improvements are approximately normally distributed, an assumption that matters most when the sample is this small.
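As a rough illustration of the t-based interval described above, the following Python sketch computes a 95% interval for a sample of 15 values and compares it with the interval a Z critical value would give. The ROAS figures are hypothetical and invented purely for illustration, the helper name `mean_confidence_interval` is an assumption, and the t critical value (2.145 for 14 degrees of freedom) is taken from a standard t-table rather than computed.

```python
import math
import statistics

# Hypothetical ROAS improvements for 15 pilot campaigns (illustrative values only,
# not data from the question).
roas_improvement = [0.12, 0.08, 0.15, 0.03, 0.10, 0.07, 0.20, 0.05,
                    0.11, 0.09, 0.14, 0.02, 0.13, 0.06, 0.10]


def mean_confidence_interval(sample, critical_value):
    """Return (lower, upper) for: mean +/- critical_value * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    margin = critical_value * s / math.sqrt(n)
    return mean - margin, mean + margin


n = len(roas_improvement)
df = n - 1  # degrees of freedom: 15 - 1 = 14

# Two-sided 95% critical value for a t-distribution with 14 degrees of freedom,
# copied from a standard t-table (rounded to three decimals).
T_CRIT_95_DF14 = 2.145

# Two-sided 95% critical value for the standard normal, for comparison only.
Z_CRIT_95 = statistics.NormalDist().inv_cdf(0.975)

t_low, t_high = mean_confidence_interval(roas_improvement, T_CRIT_95_DF14)
z_low, z_high = mean_confidence_interval(roas_improvement, Z_CRIT_95)

print(f"degrees of freedom: {df}")
print(f"t interval: ({t_low:.4f}, {t_high:.4f})")
print(f"z interval: ({z_low:.4f}, {z_high:.4f})")
```

Because the t critical value is larger than the Z critical value, the t interval comes out wider on the same data, which is the heavier-tails effect in practice. If SciPy is available, `scipy.stats.t.ppf(0.975, 14)` could replace the hardcoded table value.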
The standard normal (Z) distribution is incorrect because its use requires either a known population standard deviation or a large sample (by the usual rule of thumb, n ≥ 30), and neither applies here. The Central Limit Theorem concerns the shape of the sampling distribution of the mean; it does not remove the extra uncertainty that comes from estimating the standard deviation from only 15 observations. The Chi-squared distribution is incorrect because it is primarily used for tests involving categorical data (such as goodness-of-fit or independence) or for inferences about a population variance, not for constructing a confidence interval for a mean. The Binomial distribution is also incorrect because it models discrete data, namely the number of successes in a fixed number of independent trials, whereas ROAS improvement is a continuous variable.