Overview
The within-subjects (also called repeated-measures or paired-samples) t-test is a very common statistical method for comparing the means of two dependent groups. It differs from the between-subjects t-test in that the same individuals are in both comparison groups, for example, students' math achievement measured before and after an intervention. If the same individuals are not in both groups, you need a between-subjects t-test instead, for example, to compare math achievement scores between men and women. In this tutorial, we’ll be looking at pre- and post- oral health data from the University of Michigan (click here for the data set).
Rather than bucketing people into group A and group B and looking for differences between the groups (like a between-subjects t-test), the within-subjects t-test computes a difference score for each individual between their pre- and post- measures, and then tests whether the average of those difference scores is different from zero. If it is, we conclude that scores changed between the pre- and post- measurements. This is usually more statistically powerful than a between-subjects test, because each individual serves as their own baseline or control. Very cool!
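To make the difference-score idea concrete, here is a minimal Python sketch using SciPy. The six pre/post scores are made up for illustration and are not the oral health data. It shows that a paired-samples t-test gives the same result as a one-sample t-test of the difference scores against zero.

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post scores for six people (made up for illustration)
pre = np.array([5.0, 7.0, 6.0, 8.0, 4.0, 9.0])
post = np.array([8.0, 9.0, 7.0, 12.0, 6.0, 10.0])

# One difference score per person (pre minus post)
diff = pre - post

# Paired-samples t-test on the two columns
paired = stats.ttest_rel(pre, post)

# One-sample t-test asking whether the mean difference is zero
one_sample = stats.ttest_1samp(diff, popmean=0.0)

print("paired:     t =", paired.statistic, "p =", paired.pvalue)
print("one-sample: t =", one_sample.statistic, "p =", one_sample.pvalue)
```

Both lines should print the same t and p, which is the sense in which the paired test is "really" a test on the difference scores.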
When to use the within-subjects t-test
Use it when you have a continuous dependent (Y) variable and a categorical, paired (X) variable with exactly 2 levels, such as pre and post. If you’re dealing with 1 paired X variable with more than 2 levels, you will need to run a within-subjects ANOVA instead. If you’re dealing with unpaired data (different people in each group), you would want to look at a between-subjects t-test instead.
Within-subjects t-test assumptions
- The dependent variable is continuous (interval or ratio).
- Outcome scores are related to/correlated with/dependent on each other in some way. This is really the feature that distinguishes a within-subjects t-test from a between-subjects one. A typical example is pre- and post- data from the same people, because the outcome scores depend on who the subject is. Put another way, the same person’s pre- score will be highly related to their post- score, since, after all, they’re the same person at time 1 and time 2.
- The difference scores are approximately normally distributed. This is an interesting assumption because it reveals a lot about what the within-subjects t-test is actually doing: the test is run on the difference scores, not on the raw pre- and post- scores. This is not a hard-and-fast assumption, though; the t-test is fairly robust to modest departures from normality, especially with larger samples.
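One way to check the normality assumption is to run a Shapiro-Wilk test on the difference scores. The sketch below assumes Python with SciPy and uses made-up scores, not the oral health data. With small samples this test has little power, so a histogram or Q-Q plot of the differences is a useful companion.

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post scores (made up for illustration)
pre = np.array([5.0, 7.0, 6.0, 8.0, 4.0, 9.0, 6.0, 7.0])
post = np.array([8.0, 9.0, 7.0, 12.0, 6.0, 10.0, 9.0, 8.0])

# The assumption is about the differences, not the raw scores
diff = pre - post

# Shapiro-Wilk: a small p-value suggests the differences are not normal
result = stats.shapiro(diff)
print("W =", result.statistic, "p =", result.pvalue)
```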
Running a within-subjects t-test
Click Analyze, then Compare Means
Click Paired-Samples T-Test…
In the dialog box that opens, move the pre- and post- scores into the paired variables section. In this case, TOTALCIN is the before measure and TOTALCW6 is the after (6 weeks later) measure of oral health.
Click OK
The output is then given below…
Interpreting the within-subjects t-test output:
- A. Mean – the mean of the paired differences. This can be a little confusing to read because it’s calculated here as the BEFORE – AFTER score. So, if you take BEFORE – AFTER and get a negative number, it means the after score is greater. If you want to know which mean (BEFORE or AFTER) is greater, it’s often easier to just look at the first table of the output. TOTALCW6 = 9.48 and TOTALCIN = 6.48, meaning that the AFTER score is greater than the BEFORE score. But we need to look at the Sig. (2-tailed) to know whether this difference is statistically reliable or just noise.
- B. Std. Deviation – the standard deviation of the paired differences, i.e., how much the individual difference scores vary around the mean difference.
- C. 95% Confidence Interval of the Difference – this is the confidence region. Put simply, we can be 95% confident that the true average difference in before and after scores lies within this range (-4.59 to -1.41). Because both of these values are negative, the interval does not contain 0.00, so the result is statistically significant at the .05 level. Think about it! A 95% confidence interval that excludes 0.00 corresponds to a two-tailed p-value below .05 for the null hypothesis that the true difference is 0.00.
- D. t – the test statistic. It’s literally the number of standard errors that the sample mean difference is from the null value of zero.
- E. df – the degrees of freedom. In a paired-samples t-test this is the number of pairs – 1.
- F. Sig. (2-tailed) – this is the real meat of the analysis and the thing we care about the most. This tells us whether the difference in pre- and post- scores is statistically significant. In simplest terms, p = .001 means that if the true mean difference were zero, there would be only about a .1% chance of seeing a difference at least this large in a sample, which is very significant!
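The pieces of the output fit together arithmetically, and a short Python sketch can reproduce them from the summary statistics alone. It uses the mean difference (-3.00) and standard deviation (3.680) reported in this tutorial, and takes the number of pairs to be 23 because df = 22. The use of SciPy is an assumption of this sketch, and small discrepancies in the last decimal come from rounding in the reported values.

```python
import math
from scipy import stats

# Summary statistics taken from the output described above
mean_diff = -3.00   # A. mean of the paired differences (BEFORE - AFTER)
sd_diff = 3.680     # B. standard deviation of the paired differences
n_pairs = 23        # inferred from df = 22

# Standard error of the mean difference
se = sd_diff / math.sqrt(n_pairs)

# D. t statistic and E. degrees of freedom
t_stat = mean_diff / se
df = n_pairs - 1

# C. 95% confidence interval of the difference
t_crit = stats.t.ppf(0.975, df)
ci_low = mean_diff - t_crit * se
ci_high = mean_diff + t_crit * se

# F. two-tailed p-value
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"t({df}) = {t_stat:.3f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"p = {p_value:.4f}")
```

The t statistic comes out at about -3.91 and the interval at about (-4.59, -1.41), in line with the values in the output.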
Reporting the results
A within-subjects t-test indicated that oral health scores were significantly lower in the pre condition than in the post condition (M difference = -3.00, SD difference = 3.680), t(22) = -3.909, p = .001. Thus, participants’ oral health scores increased over time.