Statistical Test Examples

The examples below provide practice in statistical test selection. They illustrate how a researcher might go about choosing a statistical test and show that, depending on the study design and the distribution of the data, the best test for the analysis can change.

Example 1
Study design: Pretest/Posttest. Studying an eight-week tutoring component at an after-school program. Assessing the satisfaction of 40 student participants. Comparing the same students’ satisfaction using a pretest before the tutoring begins and a posttest after the tutoring component ends.
Groups: One
Time points: Two
Data distribution: Normal
Variable type: Ordinal (1 = Very satisfied, 2 = Satisfied, 3 = Not at all satisfied)
Test: Paired t-test

Example 2
Study design: Same as Example 1.
Groups: One
Time points: Two
Data distribution: Non-normal
Variable type: Same as Example 1
Test: Wilcoxon signed-rank test

Example 3
Study design: Pretest, Posttest, and Control Group. Comparing the satisfaction of two groups of students in different after-school programs. Each group has 25 participants. Comparing the satisfaction scores using a pretest before the intervention and a posttest after the intervention.
Groups: Two
Time points: Two
Data distribution: Non-normal
Variable type: Same as Example 1
Test: Wilcoxon rank-sum test

Example 4
Study design: Pretest/Posttest. Assessing weight loss after a nutrition intervention among one group of 50 students who receive the intervention. The goal is to determine whether there is a relationship between participation in the intervention and weight loss. Weight is measured before and after the intervention.
Groups: One
Time points: Two
Data distribution: Normal
Variable type: Continuous (ratio): weight in pounds
Test: Paired t-test
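
The sketch below shows one way the tests named in these examples could be run in Python with scipy.stats (ttest_rel, wilcoxon, and ranksums). The satisfaction scores and weights are made-up illustrative numbers, not data from the studies described.

```python
# Minimal sketch of the tests from the examples above, using scipy.stats.
# All data here are randomly generated stand-ins for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Example 1: paired t-test on pre/post satisfaction scores for one group of 40.
pre = rng.integers(1, 4, size=40)    # hypothetical pretest scores on the 1-3 scale
post = rng.integers(1, 4, size=40)   # hypothetical posttest scores
t_stat, p_paired = stats.ttest_rel(pre, post)

# Example 2: Wilcoxon signed-rank test, the non-parametric alternative
# when the paired differences are not normally distributed.
w_stat, p_signed_rank = stats.wilcoxon(pre, post)

# Example 3: Wilcoxon rank-sum test comparing two independent groups of 25
# (e.g., students in two different after-school programs).
group_a = rng.integers(1, 4, size=25)
group_b = rng.integers(1, 4, size=25)
z_stat, p_rank_sum = stats.ranksums(group_a, group_b)

# Example 4: paired t-test on weight (in pounds) measured before and after
# a nutrition intervention for one group of 50 students.
weight_before = rng.normal(160, 20, size=50)
weight_after = weight_before - rng.normal(2, 3, size=50)
t_stat_wt, p_weight = stats.ttest_rel(weight_before, weight_after)

print(p_paired, p_signed_rank, p_rank_sum, p_weight)
```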

These statistical tests can show whether the observed results and relationships are real or simply due to chance. Associated with each test statistic is a p-value that indicates whether a result is statistically significant. When someone says a test was statistically significant, they mean it is unlikely that the results are due to random chance alone.

For many statistical tests, results are considered significant if the p-value is 0.05 or less, which corresponds to a 95% confidence level. This cut-off, or criterion, of 0.05 was set during the early development of statistical methods and is somewhat arbitrary, which is why a result can sometimes be statistically significant without necessarily being meaningful.
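
As a small illustration of the decision rule described above, the significance threshold (often called alpha) is 1 minus the confidence level, so 95% confidence corresponds to alpha = 0.05. The helper function below is a hypothetical convenience for illustration, not part of any library.

```python
# Conventional (and somewhat arbitrary) significance cutoff: alpha = 1 - 0.95.
ALPHA = 0.05

def is_significant(p_value: float, alpha: float = ALPHA) -> bool:
    """Return True when the p-value meets the chosen significance threshold."""
    return p_value <= alpha

print(is_significant(0.03))  # True: unlikely to be due to chance alone
print(is_significant(0.20))  # False: consistent with random chance
```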

It is also important to understand that a statistically significant result does not necessarily reflect a meaningful effect. Small effects can be statistically significant simply because a very large number of people participated in the research. To account for this, researchers often calculate the effect size.

The effect size is a measure of magnitude or strength (in contrast to the p-value, which indicates whether a relationship is likely due to chance), and it tells researchers whether the results are meaningful or merely a consequence of the large number of people who participated in the research.
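
One common effect-size measure for pre/post comparisons is Cohen's d for paired data: the mean difference divided by the standard deviation of the differences. The sketch below computes it for made-up weight data; the benchmark labels in the comment are rough conventions, not fixed rules.

```python
# Cohen's d for paired data, computed on hypothetical before/after weights.
import numpy as np

rng = np.random.default_rng(1)
weight_before = rng.normal(160, 20, size=50)             # hypothetical pounds
weight_after = weight_before - rng.normal(2, 3, size=50)

diff = weight_before - weight_after
cohens_d = diff.mean() / diff.std(ddof=1)

# Rough conventional benchmarks: about 0.2 small, 0.5 medium, 0.8 large.
print(f"Cohen's d = {cohens_d:.2f}")
```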