The exercise below provides practice in statistical test selection. The examples illustrate how a researcher might go about selecting a statistical test, and they show that the best test for an analysis can change depending on the study design and the distribution of the data.
| Study Design | Groups | Time points | Data Distribution | Variable Type | Test |
| --- | --- | --- | --- | --- | --- |
| Pretest/Posttest: Studying an eight-week tutoring component at an after-school program. Assessing the satisfaction of 40 student participants. Comparing the same students' satisfaction using a pretest before the tutoring begins and a posttest after it ends. | One | Two | Normal | Ordinal: 1 = Very satisfied, 2 = Satisfied, 3 = Not at all satisfied | Paired t-test |
| Same as above | One | Two | Non-normal | Same as above | Wilcoxon signed-rank test |
| Pretest, Posttest, and Control Group: Comparing the satisfaction of two groups of students in different after-school programs. Each group has 25 participants. Comparing the satisfaction scores using a pretest before the intervention and a posttest after the intervention. | Two | Two | Non-normal | Same as above | Wilcoxon rank-sum test |
| Pretest/Posttest: Assessing weight loss after a nutrition intervention among a single group of 50 students who receive the intervention. The goal is to determine whether there is a relationship between participation in the intervention and weight loss. Weight is measured before and after the intervention. | One | Two | Normal | Continuous (ratio): weight in pounds | Paired t-test |
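The table above can be sketched in code. The following is an illustrative example (the data are made up, not from the exercise) showing how each of the three tests could be run with `scipy.stats`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Row 1: one group measured at two time points -> paired t-test.
pre = rng.normal(2.0, 0.5, 40)   # hypothetical pretest scores, n = 40
post = rng.normal(1.7, 0.5, 40)  # hypothetical posttest scores
t_stat, t_p = stats.ttest_rel(pre, post)

# Row 2: same design, but non-normal data -> Wilcoxon signed-rank test.
w_stat, w_p = stats.wilcoxon(pre, post)

# Row 3: two independent groups, non-normal data -> Wilcoxon rank-sum test.
group_a = rng.normal(2.0, 0.5, 25)  # hypothetical program A scores, n = 25
group_b = rng.normal(1.8, 0.5, 25)  # hypothetical program B scores
r_stat, r_p = stats.ranksums(group_a, group_b)

print(f"paired t-test p = {t_p:.3f}")
print(f"signed-rank   p = {w_p:.3f}")
print(f"rank-sum      p = {r_p:.3f}")
```

Note the pattern: the paired tests (`ttest_rel`, `wilcoxon`) require the same subjects at both time points, while the rank-sum test compares two independent groups.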
These statistics can show whether the results and relationships observed are real or due to chance. Each statistic has an associated p-value that indicates whether a result is statistically significant: if a test is statistically significant, it is unlikely that the results are due to random chance.
For many statistical tests, results are considered significant if the p-value is 0.05 or less, which corresponds to a 95% confidence level. This cut-off of 0.05 was set during the early development of statistical methods and is somewhat arbitrary.

It is also important to understand that a statistically significant result does not necessarily reflect a meaningful effect. Small effects can reach statistical significance simply because a very large number of people participated in the research. To account for this, researchers often calculate the effect size.

The effect size is a measure of magnitude or strength (whereas the p-value tells us whether a relationship is likely due to chance), and it tells researchers whether the results are meaningful or merely a byproduct of the large number of people who participated in the research.
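The distinction between significance and effect size can be made concrete with a hypothetical example (the numbers below are invented for illustration): with a very large sample, a tiny average weight change can produce a significant p-value while a standardized effect size such as Cohen's d stays small.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical scenario: 100,000 participants with a true average
# weight change of only 0.2 lb against person-to-person noise of 10 lb.
n = 100_000
before = rng.normal(180.0, 10.0, n)
after = before - rng.normal(0.2, 10.0, n)

diff = before - after
t_stat, p_value = stats.ttest_rel(before, after)

# Cohen's d for paired data: mean difference / SD of the differences.
cohens_d = diff.mean() / diff.std(ddof=1)

print(f"p = {p_value:.4f}, Cohen's d = {cohens_d:.3f}")
```

Here the huge sample drives the p-value below 0.05 even though the standardized effect is negligible, which is exactly why effect sizes are reported alongside p-values.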