# Difference between aov() and ezANOVA() when using a subset of a data frame in repeated measures ANOVA

I ran into something I do not understand when conducting a repeated measures ANOVA.

Short description: I'm using a dataset included in the "ez" package. When I conduct a repeated measures ANOVA on the full dataset, the results of ezANOVA and aov() are equivalent. However, once I take only a subset (in this case: only reaction times for trials without error) the results with ezANOVA and aov() differ.

The longer story: the dataset contains a subject column (subnum), two within-subject factors (cue, flank), and the dependent variable (rt). It can be loaded via:

```
library('ez')
data(ANT)
df <- ANT
df$cue <- as.factor(df$cue)
df$flank <- as.factor(df$flank)
df$subnum <- as.factor(df$subnum)

# First rows of the data:
# subnum group     block trial cue     flank     location direction rt       error
# 1      Treatment 1     1     None    Neutral   up       left      398.6773 0
# 1      Treatment 1     2     Center  Neutral   up       left      389.1822 0
# 1      Treatment 1     3     Double  Neutral   up       left      333.2186 0
# 1      Treatment 1     4     Spatial Neutral   up       left      419.7640 0
# 1      Treatment 1     5     None    Congruent up       left      446.4754 0
# 1      Treatment 1     6     Center  Congruent up       left      338.9766 0
# 1      Treatment 1     7     Double  Congruent up       left      399.3715 0
```
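Because the data contain a trial column, each subject contributes several observations per condition, so it can help to count how many observations fall into each subject × cue × flank cell before and after any filtering. The sketch below does this on a small made-up data frame (`toy`, with hypothetical values and illustrative level names) rather than on the real ANT data; only the column names are taken from the dataset above.

```r
# Toy stand-in for the ANT data (hypothetical values, same column names).
set.seed(1)
toy <- expand.grid(
  trial  = 1:4,
  cue    = c("None", "Center", "Double", "Spatial"),
  flank  = c("Neutral", "Congruent", "Incongruent"),
  subnum = factor(1:5)
)
toy$rt    <- rnorm(nrow(toy), mean = 400, sd = 50)
toy$error <- 0
toy$error[c(1, 2, 10)] <- 1   # mark a few trials as errors (arbitrary choice)

# Trials per subject x cue x flank cell, before and after dropping error trials.
n_full    <- xtabs(~ subnum + cue + flank, data = toy)
n_correct <- xtabs(~ subnum + cue + flank, data = toy[toy$error == 0, ])

range(n_full)     # equal min and max: every cell has the same number of trials
range(n_correct)  # unequal min and max: cell sizes differ after filtering
```

With the real data, the same two `xtabs()` calls could be run on `df` and on `df[df$error == 0, ]`.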

Now, when I run a repeated measures ANOVA on the full dataset with both ezANOVA and aov(), the results are the same.

```
ezANOVA(
  data = df,
  dv = rt,
  wid = subnum,
  within = .(cue, flank)
)

$ANOVA
     Effect DFn DFd           F            p p<.05        ges
2       cue   3  57  540.862407 7.988172e-42     * 0.87793881
3     flank   2  38 1066.037656 4.196305e-34     * 0.91110583
4 cue:flank   6 114    4.357093 5.356773e-04     * 0.09416982

$`Mauchly's Test for Sphericity`
     Effect         W          p p<.05
2       cue 0.8431739 0.69690404
3     flank 0.7999302 0.13411237
4 cue:flank 0.1378186 0.03419366     *

$`Sphericity Corrections`
     Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF] p[HF]<.05
2       cue 0.9016877 6.126025e-38         * 1.0657965 7.988172e-42         *
3     flank 0.8332849 8.590878e-29         * 0.9037852 4.869100e-31         *
4 cue:flank 0.5956263 4.652864e-03         * 0.7506166 2.015937e-03         *
```

For aov():

```
mod123 <- aov(rt ~ (cue*flank) + Error(subnum/(cue*flank)), data = df)
summary(mod123)

Error: subnum
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 19  85489    4499

Error: subnum:cue
          Df  Sum Sq Mean Sq F value Pr(>F)
cue        3 5523668 1841223   540.9 <2e-16 ***
Residuals 57  194041    3404
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subnum:flank
          Df  Sum Sq Mean Sq F value Pr(>F)
flank      2 7871119 3935559    1066 <2e-16 ***
Residuals 38  140287    3692
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subnum:cue:flank
           Df Sum Sq Mean Sq F value   Pr(>F)
cue:flank   6  79837   13306   4.357 0.000536 ***
Residuals 114 348147    3054
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Within
            Df   Sum Sq Mean Sq F value Pr(>F)
Residuals 5520 14422221    2613
```

So far, everything works as expected. But if I select only those observations where error equals zero, the two approaches give different results:

```
ezANOVA(
  data = df[df$error == 0, ],
  dv = rt,
  wid = subnum,
  within = .(cue, flank)
)

$ANOVA
     Effect DFn DFd          F            p p<.05        ges
2       cue   3  57 477.564650 2.435084e-40     * 0.86387868
3     flank   2  38 958.640865 3.040261e-33     * 0.90297213
4 cue:flank   6 114   4.047785 1.026734e-03     * 0.08633287

$`Mauchly's Test for Sphericity`
     Effect         W          p p<.05
2       cue 0.8670854 0.77271988
3     flank 0.9088146 0.42293876
4 cue:flank 0.1506008 0.04917243     *

$`Sphericity Corrections`
     Effect       GGe        p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
2       cue 0.9165014 3.647676e-37         * 1.086943 2.435084e-40         *
3     flank 0.9164345 1.182224e-30         * 1.009411 3.040261e-33         *
4 cue:flank 0.6261487 6.059761e-03         * 0.799682 2.641207e-03         *

mod123 <- aov(rt ~ (cue*flank) + Error(subnum/(cue*flank)), data = df[df$error == 0, ])
summary(mod123)

Error: subnum
          Df Sum Sq Mean Sq F value Pr(>F)
cue        3  22044    7348   2.037  0.187
flank      2  31873   15936   4.418  0.051 .
cue:flank  6  12677    2113   0.586  0.735
Residuals  8  28858    3607
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subnum:cue
          Df  Sum Sq Mean Sq F value Pr(>F)
cue        3 4818840 1606280 445.910 <2e-16 ***
flank      2    3512    1756   0.487  0.617
cue:flank  6   24626    4104   1.139  0.354
Residuals 49  176510    3602
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subnum:flank
          Df  Sum Sq Mean Sq F value Pr(>F)
flank      2 7195298 3597649 936.928 <2e-16 ***
cue:flank  6   17408    2901   0.756   0.61
Residuals 32  122875    3840
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: subnum:cue:flank
           Df Sum Sq Mean Sq F value   Pr(>F)
cue:flank   6  73202   12200   4.096 0.000928 ***
Residuals 114 339584    2979
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Within
            Df   Sum Sq Mean Sq F value Pr(>F)
Residuals 4951 12871045    2600
```
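One check that might help narrow this down, offered as a sketch rather than a confirmed explanation: collapse the error-free trials to one mean per subject × cue × flank cell before calling aov(), so that the model is fitted to a table with exactly one value per cell. The example uses a made-up data frame (`toy`, with hypothetical values and illustrative level names), not the ANT data. Whether this brings aov() in line with ezANOVA on the real data is an assumption that would still need to be verified.

```r
# Toy stand-in for the ANT data (hypothetical values, same column names).
set.seed(1)
toy <- expand.grid(
  trial  = 1:4,
  cue    = c("None", "Center", "Double", "Spatial"),
  flank  = c("Neutral", "Congruent", "Incongruent"),
  subnum = factor(1:5)
)
toy$rt    <- rnorm(nrow(toy), mean = 400, sd = 50)
toy$error <- 0
toy$error[c(1, 2, 10)] <- 1   # a few error trials make the cell sizes unequal

# Keep error-free trials, then average within each subject x cue x flank cell.
correct    <- toy[toy$error == 0, ]
cell_means <- aggregate(rt ~ subnum + cue + flank, data = correct, FUN = mean)

# After aggregation there is exactly one row per cell.
cells_per_combo <- xtabs(~ subnum + cue + flank, data = cell_means)

# Same model structure as in the aov() calls above, fitted to the cell means.
fit_means <- aov(rt ~ cue * flank + Error(subnum / (cue * flank)),
                 data = cell_means)
summary(fit_means)
```

On the real data, `correct` would be `df[df$error == 0, ]`. If each effect then shows up in a single error stratum only, that would point to the unequal cell sizes as the source of the discrepancy.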

What am I missing here? Many thanks in advance.