# statistics


On Wednesday afternoons last year a teacher regularly took her class out to study traffic flow through a small village. Her studies established that lorries pass through the village at a rate of 1.5 per 5-minute period on Wednesday afternoons.

This year she had to take her class out on Thursday morning. In a half-hour period the class observed 15 lorries passing through the village.

Stating your hypotheses clearly and using a 5% level of significance, test whether or not there is evidence that the rate of lorries passing through on Thursday mornings is greater than the rate on Wednesday afternoons.

• statistics - ,

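The null hypothesis (H0) is that the rate of lorries passing through the village on Thursday mornings is the same as on Wednesday afternoons: 1.5 every five minutes or, equivalently, 9 per half hour. The alternative hypothesis (H1) is that the Thursday-morning rate is greater than 9 per half hour.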

We've just observed 15 lorries passing through the village in a half-hour period, and we want to know whether it is credible that the rate is really 9 per half-hour in the light of that evidence.

If the rate of lorries passing through the village is really 9 per half-hour, then the probability of getting 0, 1, 2, 3 etc lorries passing through within a half-hour period is given by the Poisson distribution with parameter m = 9. The probability of getting exactly k lorries within a half-hour period is then m^k x exp(-m) / k!. If you have access to a spreadsheet you can work out the probability of getting any specific value of k. For example, the probabilities of getting k = 0, 1, 2, 3 and 4 are 0.0001, 0.0011, 0.0050, 0.0150 and 0.0337 respectively, and it'll take you just a few minutes to tot up all the probabilities up to and including k = 14, in which case you should get 0.9585. For your observed result of 15 lorries to be statistically significant, the total probability of observing 15 lorries OR MORE (don't forget that bit) would need to be less than 0.05. The figure you want is just one minus the figure calculated above, which is 1 - 0.9585 = 0.0415. That is less than 0.05, so it's a significant result (but only just).
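If you'd rather use a few lines of code than a spreadsheet, the same tail probability can be sketched in Python using only the standard library. This is a minimal illustration, not the only way to do it: the helper names `poisson_pmf` and `poisson_upper_tail` are made up for this example, and the values m = 9 and 15 observed lorries come straight from the question.

```python
import math

def poisson_pmf(k, m):
    """P(X = k) for a Poisson distribution with mean m."""
    return m ** k * math.exp(-m) / math.factorial(k)

def poisson_upper_tail(k, m):
    """P(X >= k), computed as 1 minus the sum of P(X = 0), ..., P(X = k - 1)."""
    return 1.0 - sum(poisson_pmf(i, m) for i in range(k))

m = 9          # expected lorries per half-hour under the null hypothesis
observed = 15  # lorries counted in the Thursday half-hour

p_value = poisson_upper_tail(observed, m)
print(f"P(X <= {observed - 1}) = {1 - p_value:.4f}")
print(f"P(X >= {observed}) = {p_value:.4f}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```

The tail probability it prints should come out a little under 0.05, which is why the result counts as significant for a one-sided test at the 5% level.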

If you DON'T have access to a spreadsheet, you'd need to use the Normal approximation to the Poisson distribution to get at the probability you want. The variance of the Poisson distribution is equal to the mean, so under the null hypothesis both the mean and the variance are 9, and the standard deviation is sqrt(9) = 3. So how many standard deviations above the mean is 15 lorries? Your data is discrete, however, not continuous, so you should apply a continuity correction: calculate the probability of getting more than 14.5 lorries rather than more than 15 lorries. Your test statistic then is Z = (14.5 - 9) / sqrt(9) = 1.833, and if you look that up in a set of Normal tables, you'll find that the area to the right of 1.833 is about 1 - 0.967 = 0.033, which again is a significant result.
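The Normal approximation can be checked in the same way. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the variable names are just for illustration.

```python
from statistics import NormalDist

m = 9          # mean (and variance) of the Poisson distribution under the null hypothesis
observed = 15  # lorries counted in the Thursday half-hour

# Continuity correction: P(X >= 15) for the discrete count is
# approximated by P(Y > 14.5) for a Normal variable Y with mean m, variance m.
z = (observed - 0.5 - m) / m ** 0.5
p_approx = 1 - NormalDist().cdf(z)

print(f"Z = {z:.3f}")
print(f"P(X >= {observed}) is approximately {p_approx:.4f}")
print("significant at 5%" if p_approx < 0.05 else "not significant at 5%")
```

The approximate tail area is somewhat smaller than the exact Poisson figure, which is to be expected with a mean as small as 9, but both fall below 0.05.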

Either way, you're concluding that at the 5% level of significance, your evidence is just sufficient to allow you to conclude that the rate on Thursday mornings is greater than the rate on Wednesday afternoons. (Note that this is a one-sided test: you're only testing the alternative hypothesis that the number of lorries passing through per half-hour is GREATER than 9. That's why you don't have to worry about the corresponding lower tail of the distribution - it's only the upper tail you need.)