A data scientist at a financial firm is analyzing daily transaction values. The underlying distribution of individual transaction values is known to be heavily skewed, but it has a finite mean and variance. To perform risk analysis, the scientist repeatedly draws random samples of 500 transactions and calculates the mean for each sample. They observe that the distribution of these sample means closely approximates a normal distribution. Which theorem provides the theoretical foundation for this observation?
The correct answer is the Central Limit Theorem (CLT). The CLT states that when independent observations are drawn from a population with a finite mean and variance, the sampling distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the shape of the population distribution. In this scenario, the population distribution is heavily skewed, but each sample is large (n = 500) and the population variance is finite. Therefore, the CLT explains why the distribution of the sample means is approximately normal.
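The effect described above can be checked with a small simulation. The sketch below is illustrative only: the lognormal distribution and its parameters (MU, SIGMA) are assumptions standing in for the skewed transaction values, and the number of repeated samples (1,000) is arbitrary. It compares the skewness of individual draws with the skewness of means of samples of 500.

```python
import random
import statistics

# Assumed stand-in for skewed transaction values: a lognormal distribution.
# MU and SIGMA are illustrative choices, not values from the scenario.
MU, SIGMA = 3.0, 1.0
SAMPLE_SIZE = 500      # transactions per sample, as in the question
NUM_SAMPLES = 1_000    # number of repeated samples (arbitrary)

rng = random.Random(42)  # fixed seed so the run is repeatable


def draw_transaction():
    """Return one simulated transaction value from the skewed population."""
    return rng.lognormvariate(MU, SIGMA)


def skewness(values):
    """Population-style skewness: mean of cubed standardized values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


# Skewness of individual transactions (strongly right-skewed).
raw_values = [draw_transaction() for _ in range(100_000)]
raw_skew = skewness(raw_values)

# Skewness of the means of many samples of size 500 (close to symmetric).
sample_means = [
    statistics.fmean([draw_transaction() for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]
means_skew = skewness(sample_means)

print(f"skewness of individual values: {raw_skew:.2f}")
print(f"skewness of sample means:      {means_skew:.2f}")
```

A skewness near zero for the sample means is consistent with an approximately normal shape, while the individual values remain strongly skewed. Skewness is only one rough check; a histogram or a normal quantile plot of the sample means would give a fuller picture.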
The Law of Large Numbers (LLN) is incorrect. It states that as the sample size grows, the sample mean converges to the population mean. It describes where the sample mean ends up, not the shape of its sampling distribution.
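To make the distinction concrete, here is a minimal sketch of what the LLN does describe. It assumes the same kind of illustrative lognormal population (the parameters MU and SIGMA are hypothetical), for which the true mean is exp(MU + SIGMA^2 / 2). The sample mean is computed over increasingly large prefixes of one long sequence of draws.

```python
import math
import random
import statistics

# Illustrative lognormal population; MU and SIGMA are assumed values.
MU, SIGMA = 3.0, 1.0
true_mean = math.exp(MU + SIGMA**2 / 2)  # mean of a lognormal distribution

rng = random.Random(7)  # fixed seed so the run is repeatable
draws = [rng.lognormvariate(MU, SIGMA) for _ in range(200_000)]

# Sample mean of the first n draws, for growing n.
running_means = {}
for n in (100, 10_000, 200_000):
    running_means[n] = statistics.fmean(draws[:n])
    print(f"n = {n:>7}: sample mean = {running_means[n]:.3f}")

print(f"population mean = {true_mean:.3f}")
```

The output shows a single number settling toward the population mean as n grows. Nothing in it says how sample means are distributed around that value for a fixed n, which is the question the CLT answers.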
Bayes' Rule is incorrect. It is a mathematical formula for determining conditional probability, allowing one to update the probability of a hypothesis based on new evidence. It is not related to the shape of sampling distributions.
The principle of homoskedasticity is incorrect. It is an assumption in linear regression stating that the variance of the errors is constant across all levels of the independent variables. It is not relevant to describing the distribution of sample means from a single population.