# Difference between aov() and ezANOVA when using a subset of DataFrame in repeated measures ANOVA

I ran into something I do not understand when conducting a repeated measures ANOVA.

Short description: I'm using a dataset included in the "ez" package. When I conduct a repeated measures ANOVA on the full dataset, ezANOVA and aov() give equivalent results. However, once I restrict the data to a subset (here, only the reaction times from trials without an error), the results of ezANOVA and aov() differ.

The longer story: The dataset contains a subject column (subnum), two within-subject factors (cue, flank), and the dependent variable (rt). The dataset can be loaded via:

```
library('ez')
data(ANT)
df <- ANT
df$cue <- as.factor(df$cue)
df$flank <- as.factor(df$flank)
df$subnum <- as.factor(df$subnum)

# First rows of df (row names omitted):
subnum group block trial cue flank location direction rt error
1 Treatment 1 1 None Neutral up left 398.6773 0
1 Treatment 1 2 Center Neutral up left 389.1822 0
1 Treatment 1 3 Double Neutral up left 333.2186 0
1 Treatment 1 4 Spatial Neutral up left 419.7640 0
1 Treatment 1 5 None Congruent up left 446.4754 0
1 Treatment 1 6 Center Congruent up left 338.9766 0
1 Treatment 1 7 Double Congruent up left 399.3715 0
```
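To make the design concrete, here is a quick sketch that counts the trials in each subject-by-cue-by-flank cell of the full data. It only uses base R `table()`; `cell_counts` is a name I made up for this check.

```r
library(ez)
data(ANT)

# Number of trials in every subject x cue x flank cell of the full data
cell_counts <- table(ANT$subnum, ANT$cue, ANT$flank)

dim(cell_counts)    # subjects x cue levels x flank levels
range(cell_counts)  # identical min and max would mean a balanced design
```

If the minimum and maximum agree, every subject contributes the same number of trials to every cell, which is the situation in which I would expect aov() with an Error() term to behave cleanly.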

Now, when I perform a repeated measures ANOVA on the full dataset with both ezANOVA and aov(), the results are the same.

```
ezANOVA(
  data = df,
  dv = rt,
  wid = subnum,
  within = .(cue, flank)
)
$ANOVA
Effect DFn DFd F p p<.05 ges
2 cue 3 57 540.862407 7.988172e-42 * 0.87793881
3 flank 2 38 1066.037656 4.196305e-34 * 0.91110583
4 cue:flank 6 114 4.357093 5.356773e-04 * 0.09416982
$`Mauchly's Test for Sphericity`
Effect W p p<.05
2 cue 0.8431739 0.69690404
3 flank 0.7999302 0.13411237
4 cue:flank 0.1378186 0.03419366 *
$`Sphericity Corrections`
Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
2 cue 0.9016877 6.126025e-38 * 1.0657965 7.988172e-42 *
3 flank 0.8332849 8.590878e-29 * 0.9037852 4.869100e-31 *
4 cue:flank 0.5956263 4.652864e-03 * 0.7506166 2.015937e-03 *
```

For aov():

```
mod123 <- aov(rt ~ (cue*flank) + Error(subnum/(cue*flank)), data = df)
summary(mod123)
Error: subnum
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 19 85489 4499
Error: subnum:cue
Df Sum Sq Mean Sq F value Pr(>F)
cue 3 5523668 1841223 540.9 <2e-16 ***
Residuals 57 194041 3404
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: subnum:flank
Df Sum Sq Mean Sq F value Pr(>F)
flank 2 7871119 3935559 1066 <2e-16 ***
Residuals 38 140287 3692
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: subnum:cue:flank
Df Sum Sq Mean Sq F value Pr(>F)
cue:flank 6 79837 13306 4.357 0.000536 ***
Residuals 114 348147 3054
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 5520 14422221 2613
```

So far, everything works as expected. But if I keep only the observations where error equals zero, the results of the two approaches differ:

```
ezANOVA(
  data = df[df$error == 0, ],
  dv = rt,
  wid = subnum,
  within = .(cue, flank)
)
$ANOVA
Effect DFn DFd F p p<.05 ges
2 cue 3 57 477.564650 2.435084e-40 * 0.86387868
3 flank 2 38 958.640865 3.040261e-33 * 0.90297213
4 cue:flank 6 114 4.047785 1.026734e-03 * 0.08633287
$`Mauchly's Test for Sphericity`
Effect W p p<.05
2 cue 0.8670854 0.77271988
3 flank 0.9088146 0.42293876
4 cue:flank 0.1506008 0.04917243 *
$`Sphericity Corrections`
Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
2 cue 0.9165014 3.647676e-37 * 1.086943 2.435084e-40 *
3 flank 0.9164345 1.182224e-30 * 1.009411 3.040261e-33 *
4 cue:flank 0.6261487 6.059761e-03 * 0.799682 2.641207e-03 *
# The same model fitted with aov() on the error-free subset:
mod123 <- aov(rt ~ (cue*flank) + Error(subnum/(cue*flank)), data = df[df$error == 0, ])
summary(mod123)
Error: subnum
Df Sum Sq Mean Sq F value Pr(>F)
cue 3 22044 7348 2.037 0.187
flank 2 31873 15936 4.418 0.051 .
cue:flank 6 12677 2113 0.586 0.735
Residuals 8 28858 3607
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: subnum:cue
Df Sum Sq Mean Sq F value Pr(>F)
cue 3 4818840 1606280 445.910 <2e-16 ***
flank 2 3512 1756 0.487 0.617
cue:flank 6 24626 4104 1.139 0.354
Residuals 49 176510 3602
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: subnum:flank
Df Sum Sq Mean Sq F value Pr(>F)
flank 2 7195298 3597649 936.928 <2e-16 ***
cue:flank 6 17408 2901 0.756 0.61
Residuals 32 122875 3840
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: subnum:cue:flank
Df Sum Sq Mean Sq F value Pr(>F)
cue:flank 6 73202 12200 4.096 0.000928 ***
Residuals 114 339584 2979
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Error: Within
Df Sum Sq Mean Sq F value Pr(>F)
Residuals 4951 12871045 2600
```
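One difference I can think of is that removing the error trials leaves unequal numbers of trials per cell. Below is a sketch of a check, under the assumption (which I have not verified against the ez source) that ezANOVA averages the trials within each subject-by-cue-by-flank cell before fitting, a step my aov() call skips. It collapses the error-free data to cell means with base R `aggregate()` and refits the same aov() model. The names `ok`, `cell_means`, and `mod_means` are made up for this sketch.

```r
library(ez)
data(ANT)

# Keep error-free trials only
ok <- ANT[ANT$error == 0, ]

# Trials per cell after subsetting (unequal min and max means unbalanced)
range(table(ok$subnum, ok$cue, ok$flank))

# Collapse to one mean RT per subject x cue x flank cell
cell_means <- aggregate(rt ~ subnum + cue + flank, data = ok, FUN = mean)

# Same model as before, now fitted to the cell means
mod_means <- aov(rt ~ cue * flank + Error(subnum / (cue * flank)),
                 data = cell_means)
summary(mod_means)
```

If the assumption holds, the F values from `mod_means` should line up with the ezANOVA output for the subset. If they do not, the explanation presumably lies elsewhere.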

What am I missing here? Many thanks in advance.