Carol has data on the causes of violent outbursts among kindergarten children. She used a "shot-gun" approach to hypothesis testing by asking the computer to produce p-values for 300 quickly chosen hypothesis tests. She shared her glee with fellow researchers, noting that she had found 15 statistically significant results. She went so far as to brag that she had made some discoveries that defied all previously held notions about the wild behavior of this population of terrifying tots. She used the .05 level of significance. Explain why she actually found nothing.

.05 * 300 = 15

This means that even if every one of the 300 null hypotheses were true, you would expect about 15 tests to come out significant solely by chance.
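A quick simulation makes this concrete (a sketch using numpy; the seed and variable names are mine): draw 300 p-values under true null hypotheses and count how many fall below .05.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Under a true null hypothesis, a p-value is uniformly distributed on [0, 1].
# Simulate 300 tests where every null hypothesis is true.
p_values = rng.uniform(0.0, 1.0, size=300)

# Count how many are "significant" at the .05 level purely by chance.
false_positives = int(np.sum(p_values < 0.05))
print(false_positives)  # on average about 15
```

Rerunning with different seeds gives counts scattered around 15, mirroring Carol's "discoveries."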

Carol's "shot-gun" approach to hypothesis testing is neither reliable nor valid. By running a large number of tests without a strong theoretical basis or any control for multiple testing, she is almost guaranteed to turn up false positives: results that appear statistically significant purely by chance.

In hypothesis testing, the level of significance (denoted α) is the probability of rejecting the null hypothesis when it is actually true, i.e., the probability of a Type I error. The commonly used significance level is .05: if the p-value of a test is less than .05, the result is considered statistically significant and the null hypothesis is rejected.

However, when conducting many hypothesis tests, the chances of finding statistically significant results purely by chance accumulate. This is known as the problem of multiple comparisons: the more tests you run, the higher the probability of obtaining at least one significant result even if the null hypothesis is true for every test.
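The effect compounds quickly. Assuming the 300 tests are independent (an assumption I am making for this back-of-the-envelope calculation), the familywise error rate, the probability of at least one false positive, is 1 − (1 − α)^m:

```python
# Familywise error rate for m independent tests at level alpha:
# P(at least one false positive) = 1 - (1 - alpha)**m
alpha = 0.05
m = 300
fwer = 1 - (1 - alpha) ** m
print(fwer)  # about 0.9999998: a false positive is essentially certain
```

With 300 tests, obtaining no false positives at all would be the surprising outcome.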

In Carol's case, with 300 tests and a .05 cutoff, it is highly likely that the statistically significant results she obtained are due to random chance. If the null hypothesis were true for all 300 tests, we would still expect about 5% of them, or roughly 15 tests, to produce p-values below .05 by chance alone, which is exactly the number she found.

Therefore, her claim of groundbreaking discoveries that defy previously held notions about kindergarten children's behavior is unfounded. Without controlling for multiple testing, and without tests motivated by clear hypotheses and theoretical foundations, her 15 "significant" findings are exactly what chance alone would produce and cannot be trusted.

To guard against the multiple-comparisons problem, researchers typically use techniques such as the Bonferroni correction or false discovery rate (FDR) control to adjust the significance level or the p-values accordingly. These techniques reduce the likelihood of false positives and support a more rigorous interpretation of the results.
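As a sketch of the simplest such adjustment, the Bonferroni correction divides α by the number of tests, so each individual test must clear a much stricter threshold (the p-values below are made up purely for illustration):

```python
# Bonferroni correction: compare each p-value against alpha / m
alpha = 0.05
m = 300
bonferroni_threshold = alpha / m  # 0.05 / 300 ~= 0.000167

# Hypothetical p-values from three of Carol's tests (illustrative only).
p_values = [0.04, 0.0001, 0.02]
significant = [p < bonferroni_threshold for p in p_values]
print(significant)  # [False, True, False]
```

Note that p = .04 and p = .02, which Carol would have celebrated, no longer qualify. Libraries such as statsmodels provide a `multipletests` function implementing Bonferroni, Benjamini-Hochberg (FDR), and related methods.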

In conclusion, Carol's approach lacks validity and her claimed discoveries are likely to be spurious. It is important to carefully plan hypothesis tests, use appropriate statistical methods, and control for multiple testing when conducting research to ensure reliable and meaningful results.