Sample Size, Variance and Standard Deviation

A student organization plans to ask 100 randomly selected students how much they spent
on textbooks last semester. You argue for a sample of 900 students instead of 100. You
know the standard deviation of the sample mean x̄ of the amounts spent will differ. How
will the standard deviation of the larger sample compare with that of the smaller sample?

To compare the two, we need the relationship between sample size and the standard deviation of the sample mean, which rests on the definitions of variance and standard deviation.

Variance is a measure of how spread out the data values are from the mean. It quantifies the average squared difference between each data point and the mean. A larger variance indicates a greater spread of the data.

The standard deviation is the square root of the variance. It can be read as a typical distance between a data point and the mean (strictly, the root-mean-square distance rather than the plain average distance). It measures the spread of the data in the same units as the original data.
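As a small illustration of the two definitions above, the following sketch computes the variance and standard deviation by hand. The five spending amounts are made up for the example and are not data from the question; the population form (dividing by the number of values) is used.

```python
import math
import statistics

# Hypothetical textbook spending in dollars for five students (illustrative only).
amounts = [200, 250, 300, 350, 400]

mean = statistics.fmean(amounts)

# Variance: the average of the squared differences from the mean
# (population form, dividing by the number of values).
variance = sum((x - mean) ** 2 for x in amounts) / len(amounts)

# Standard deviation: the square root of the variance, back in dollars.
std_dev = math.sqrt(variance)

print(f"mean = {mean}, variance = {variance}, standard deviation = {std_dev:.2f}")
```

The variance comes out in squared dollars, while the standard deviation is in dollars, which is why the latter is usually the one reported.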

Sample size directly affects the standard deviation of the sample mean. For a simple random sample of n independent observations, the standard deviation of the sample mean (usually written σx̄, shown here as σx) equals the standard deviation of the population (σ) divided by the square root of the sample size (n).

σx = σ / √n
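The formula can be checked with a rough simulation. The sketch below assumes a normally distributed population with a made-up mean of 300 and standard deviation of 50 (neither value comes from the question), draws many samples of size n, and compares the observed spread of the sample means with σ / √n. The function names are hypothetical helpers for this illustration.

```python
import math
import random
import statistics

def simulated_sd_of_mean(n, sigma=50.0, mu=300.0, trials=1000, seed=0):
    """Estimate the standard deviation of the sample mean by repeated sampling.

    mu and sigma are assumed population values chosen only for illustration.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

def theoretical_sd_of_mean(n, sigma=50.0):
    """Standard deviation of the sample mean from the formula sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

for n in (100, 900):
    print(n, round(simulated_sd_of_mean(n), 3), round(theoretical_sd_of_mean(n), 3))
```

The simulated values will not match the formula exactly, since they are themselves estimates, but they should land close to σ / 10 and σ / 30 respectively.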

In this case, you are arguing for a larger sample size of 900 students instead of 100. Let's compare the standard deviations of these two sample sizes using the formula above.

For the smaller sample of 100 students:
σx_small = σ / √100 = σ / 10

For the larger sample of 900 students:
σx_large = σ / √900 = σ / 30

These equations show that increasing the sample size from 100 to 900 reduces the standard deviation of the sample mean to one-third of its original value, because √900 = 30 is three times √100 = 10. Note that a ninefold increase in sample size yields only a threefold reduction in the standard deviation.
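The arithmetic can be confirmed in a few lines. In this sketch σ is set to 1.0 as a placeholder, since the ratio does not depend on its value, and `required_sample_size` is a hypothetical helper showing that cutting the standard deviation of the mean by a factor k requires k² times as many observations.

```python
import math

def sd_of_mean(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def required_sample_size(current_n, reduction_factor):
    """Sample size needed to shrink the SD of the mean by reduction_factor."""
    return current_n * reduction_factor ** 2

sigma = 1.0  # placeholder; the ratio below is the same for any sigma > 0

ratio = sd_of_mean(sigma, 900) / sd_of_mean(sigma, 100)
print(f"ratio of standard deviations (n=900 vs n=100): {ratio:.4f}")
print(f"sample size needed to cut the SD by 3 from n=100: {required_sample_size(100, 3)}")
```

The square-root relationship is why precision gets expensive: halving the standard deviation of the mean takes four times the sample, not twice.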

Therefore, the standard deviation of the sample mean for the larger sample (σx_large) will be one-third of that for the smaller sample (σx_small). This means that the larger sample will provide a more precise estimate of the mean expenditure on textbooks, because sample means from samples of 900 are more tightly clustered around the true population mean than sample means from samples of 100. The spread of the individual amounts spent does not change; it is still described by σ.