Anova

Anova performs multi-factor between-subjects, within-subjects, and mixed analyses of variance. Within-subjects and mixed analyses include corrections for violations of sphericity (Huynh-Feldt, Greenhouse-Geisser, Box). Anova objects also have a convenience method for creating interaction plots once an analysis has been run.

If you would like to run multiple-factor ANOVAs without the hassle of coding your analyses, you might want to check out GUANO (http://sourceforge.net/projects/guano/). It uses pyvttbl and Anova behind the scenes but provides a point-and-click interface.

Running an ANOVA

The recommended way to run an ANOVA is to first load data into a DataFrame and use DataFrame.anova to run the analysis. Calling DataFrame.anova will return an Anova object.

Example Within-Subjects ANOVA

>>> from __future__ import print_function
>>> from pyvttbl import DataFrame
>>>
>>> df = DataFrame()
>>> fname = 'error~subjectXtimeofdayXcourseXmodel.csv'
>>> df.read_tbl(fname)
>>> aov = df.anova('ERROR',
...                wfactors=['TIMEOFDAY','COURSE','MODEL'])
>>> print(type(aov))
<class 'anova.Anova'>

Example Between-Subjects ANOVA

>>> # Anova must be imported first; the import path depends on the pyvttbl version
>>> df = DataFrame()
>>> fname = 'words~ageXcondition.csv'
>>> df.read_tbl(fname)
>>> aov = Anova()
>>> aov.run(df, 'WORDS', bfactors=['AGE','CONDITION'])

Notice in this case we are not using DataFrame.anova. We are initializing an Anova object and passing a DataFrame instance to it.

Example Mixed-Subjects ANOVA

To run a mixed ANOVA we just have to specify both wfactors and bfactors.

Note

Both wfactors and bfactors must be iterable (lists of labels).

>>> df = DataFrame()
>>> fname = 'suppression~subjectXgroupXcycleXphase.csv'
>>> df.read_tbl(fname)
>>> aov = df.anova('SUPPRESSION',
...                wfactors=['CYCLE','PHASE'],
...                bfactors=['GROUP'])
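
The note about passing lists matters because a bare Python string is itself iterable, character by character. How pyvttbl reacts to a bare string is not stated here, so the sketch below only shows the Python behavior and one defensive pattern. The helper as_factor_list is hypothetical and not part of pyvttbl.

```python
def as_factor_list(factors):
    """Return factors as a list of labels.

    Hypothetical helper for illustration; not part of pyvttbl.
    """
    if factors is None:
        return []
    if isinstance(factors, str):
        # A single label: wrap it instead of iterating over its characters.
        return [factors]
    return list(factors)


# Iterating over a bare string yields single characters, not one label.
print(list('GROUP'))

# Wrapping keeps the label intact; lists pass through unchanged.
print(as_factor_list('GROUP'))
print(as_factor_list(['CYCLE', 'PHASE']))
```

Writing bfactors=['GROUP'] explicitly, as in the example above, avoids the ambiguity altogether.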

Specifying a Data Transformation

The analysis can also apply log10, square-root, arc-sine, reciprocal, and Winsorizing (spelled "windsor" in the keyword names) data transforms. These are specified with the transform keyword.

transform keyword          Transform             Comments
=========================================================
''                         X                     default
'log' or 'log10'           numpy.log10(X)
'reciprocal' or 'inverse'  1/X
'square-root' or 'sqrt'    numpy.sqrt(X)
'arcsine' or 'arcsin'      numpy.arcsin(X)
'windsor01'                anova.windsor(X, 1)   1% trim
'windsor05'                anova.windsor(X, 5)   5% trim
'windsor10'                anova.windsor(X, 10)  10% trim
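
To make the transforms concrete, the sketch below applies several of them to a small list using only the standard library. It is an illustration rather than pyvttbl's code: the library uses numpy, and winsorize here is a hypothetical helper that clamps each tail to the nearest retained value. anova.windsor may choose its cutoffs differently.

```python
import math


def winsorize(values, percent):
    """Clamp the lowest and highest `percent` of values to the nearest kept value.

    Hypothetical helper for illustration; not pyvttbl's anova.windsor.
    """
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * percent / 100.0)  # number of values to clamp in each tail
    if k == 0:
        return list(values)
    low, high = ordered[k], ordered[n - k - 1]
    return [min(max(v, low), high) for v in values]


# Stand-in lookup keyed by some of the transform keywords listed above.
TRANSFORMS = {
    '': lambda xs: list(xs),
    'log10': lambda xs: [math.log10(x) for x in xs],
    'reciprocal': lambda xs: [1.0 / x for x in xs],
    'sqrt': lambda xs: [math.sqrt(x) for x in xs],
    'arcsin': lambda xs: [math.asin(x) for x in xs],
    'windsor10': lambda xs: winsorize(xs, 10),
}

print(TRANSFORMS['log10']([1.0, 10.0, 100.0]))
print(TRANSFORMS['windsor10'](list(range(1, 11))))
```

Note that the log, reciprocal, square-root, and arc-sine transforms are only defined for part of the real line (for example, arc-sine needs values between -1 and 1), so the data should be checked before a transform is chosen.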

Printing a Summary

Once an analysis has been run, the results can be viewed by printing the object. By default, estimated marginal means are also provided.

>>> from __future__ import print_function
>>>
>>> df = DataFrame()
>>> fname = 'error~subjectXtimeofdayXcourseXmodel.csv'
>>> df.read_tbl(fname)
>>> aov = df.anova('ERROR',
...                wfactors=['TIMEOFDAY','COURSE','MODEL'])
>>> print(aov)
ERROR ~ TIMEOFDAY * COURSE * MODEL

TESTS OF WITHIN SUBJECTS EFFECTS

Measure: ERROR
     Source                              Type III    eps     df       MS         F         Sig.      et2_G   Obs.    SE     95% CI    lambda    Obs.
                                            SS                                                                                                  Power
=====================================================================================================================================================
TIMEOFDAY           Sphericity Assumed    140.167       -       1   140.167    120.143       0.008   3.391     27   0.456    0.894   1621.929       1
                    Greenhouse-Geisser    140.167       1       1   140.167    120.143       0.008   3.391     27   0.456    0.894   1621.929       1
                    Huynh-Feldt           140.167       1       1   140.167    120.143       0.008   3.391     27   0.456    0.894   1621.929       1
                    Box                   140.167       1       1   140.167    120.143       0.008   3.391     27   0.456    0.894   1621.929       1
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(TIMEOFDAY)    Sphericity Assumed      2.333       -       2     1.167
                    Greenhouse-Geisser      2.333       1       2     1.167
                    Huynh-Feldt             2.333       1       2     1.167
                    Box                     2.333       1       2     1.167
-----------------------------------------------------------------------------------------------------------------------------------------------------
COURSE              Sphericity Assumed     56.778       -       2    28.389   1022.000   3.815e-06   1.374     18   0.056    0.109   9198.000       1
                    Greenhouse-Geisser     56.778   0.501   1.002    56.667   1022.000   9.664e-04   1.374     18   0.056    0.109   9198.000       1
                    Huynh-Feldt            56.778   0.504   1.008    56.336   1022.000   9.349e-04   1.374     18   0.056    0.109   9198.000       1
                    Box                    56.778   0.500       1    56.778   1022.000   9.770e-04   1.374     18   0.056    0.109   9198.000       1
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(COURSE)       Sphericity Assumed      0.111       -       4     0.028
                    Greenhouse-Geisser      0.111   0.501   2.004     0.055
                    Huynh-Feldt             0.111   0.504   2.016     0.055
                    Box                     0.111   0.500       2     0.056
-----------------------------------------------------------------------------------------------------------------------------------------------------
MODEL               Sphericity Assumed     51.444       -       2    25.722     92.600   4.470e-04   1.245     18   0.176    0.345    833.400       1
                    Greenhouse-Geisser     51.444   0.507   1.013    50.770     92.600       0.010   1.245     18   0.176    0.345    833.400   1.000
                    Huynh-Feldt            51.444   0.527   1.054    48.817     92.600       0.009   1.245     18   0.176    0.345    833.400   1.000
                    Box                    51.444   0.500       1    51.444     92.600       0.011   1.245     18   0.176    0.345    833.400   1.000
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(MODEL)        Sphericity Assumed      1.111       -       4     0.278
                    Greenhouse-Geisser      1.111   0.507   2.027     0.548
                    Huynh-Feldt             1.111   0.527   2.108     0.527
                    Box                     1.111   0.500       2     0.556
-----------------------------------------------------------------------------------------------------------------------------------------------------
TIMEOFDAY *         Sphericity Assumed      5.444       -       2     2.722      2.085       0.240   0.132      9   0.540    1.057      9.383   0.446
COURSE              Greenhouse-Geisser      5.444   0.814   1.628     3.345      2.085       0.255   0.132      9   0.540    1.057      9.383   0.373
                    Huynh-Feldt             5.444       1       2     2.722      2.085       0.240   0.132      9   0.540    1.057      9.383   0.446
                    Box                     5.444   0.500       1     5.444      2.085       0.286   0.132      9   0.540    1.057      9.383   0.244
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(TIMEOFDAY *   Sphericity Assumed      5.222       -       4     1.306
COURSE)             Greenhouse-Geisser      5.222   0.814   3.255     1.604
                    Huynh-Feldt             5.222       1       4     1.306
                    Box                     5.222   0.500       2     2.611
-----------------------------------------------------------------------------------------------------------------------------------------------------
TIMEOFDAY *         Sphericity Assumed     16.778       -       2     8.389     37.750       0.003   0.406      9   0.223    0.436    169.875   1.000
MODEL               Greenhouse-Geisser     16.778   0.540   1.079    15.545     37.750       0.021   0.406      9   0.223    0.436    169.875   0.993
                    Huynh-Feldt            16.778   0.571   1.142    14.697     37.750       0.018   0.406      9   0.223    0.436    169.875   0.996
                    Box                    16.778   0.500       1    16.778     37.750       0.025   0.406      9   0.223    0.436    169.875   0.985
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(TIMEOFDAY *   Sphericity Assumed      0.889       -       4     0.222
MODEL)              Greenhouse-Geisser      0.889   0.540   2.159     0.412
                    Huynh-Feldt             0.889   0.571   2.283     0.389
                    Box                     0.889   0.500       2     0.444
-----------------------------------------------------------------------------------------------------------------------------------------------------
COURSE *            Sphericity Assumed      8.778       -       4     2.194      3.762       0.052   0.212      6   0.367    0.719     11.286   0.504
MODEL               Greenhouse-Geisser      8.778   0.354   1.415     6.204      3.762       0.157   0.212      6   0.367    0.719     11.286   0.223
                    Huynh-Feldt             8.778   0.354   1.415     6.204      3.762       0.157   0.212      6   0.367    0.719     11.286   0.223
                    Box                     8.778   0.500       2     4.389      3.762       0.120   0.212      6   0.367    0.719     11.286   0.292
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(COURSE *      Sphericity Assumed      4.667       -       8     0.583
MODEL)              Greenhouse-Geisser      4.667   0.354   2.830     1.649
                    Huynh-Feldt             4.667   0.354   2.830     1.649
                    Box                     4.667   0.500       4     1.167
-----------------------------------------------------------------------------------------------------------------------------------------------------
TIMEOFDAY *         Sphericity Assumed      2.778       -       4     0.694      1.923       0.200   0.067      3   0.408    0.800      2.885   0.152
COURSE *            Greenhouse-Geisser      2.778   0.290   1.159     2.397      1.923       0.293   0.067      3   0.408    0.800      2.885   0.087
MODEL               Huynh-Feldt             2.778   0.290   1.159     2.397      1.923       0.293   0.067      3   0.408    0.800      2.885   0.087
                    Box                     2.778   0.500       2     1.389      1.923       0.260   0.067      3   0.408    0.800      2.885   0.109
-----------------------------------------------------------------------------------------------------------------------------------------------------
Error(TIMEOFDAY *   Sphericity Assumed      2.889       -       8     0.361
COURSE *            Greenhouse-Geisser      2.889   0.290   2.318     1.246
MODEL)              Huynh-Feldt             2.889   0.290   2.318     1.246
                    Box                     2.889   0.500       4     0.722

TABLES OF ESTIMATED MARGINAL MEANS

Estimated Marginal Means for TIMEOFDAY
TIMEOFDAY   Mean    Std. Error   95% Lower Bound   95% Upper Bound
==================================================================
T1          5.778        0.457             4.882             6.674
T2          2.556        0.229             2.108             3.003

Estimated Marginal Means for COURSE
COURSE   Mean    Std. Error   95% Lower Bound   95% Upper Bound
===============================================================
C1       5.222        0.608             4.031             6.414
C2       4.500        0.562             3.399             5.601
C3       2.778        0.432             1.931             3.625

Estimated Marginal Means for MODEL
MODEL   Mean    Std. Error   95% Lower Bound   95% Upper Bound
==============================================================
M1      5.333        0.686             3.989             6.678
M2      4.222        0.558             3.129             5.315
M3      2.944        0.328             2.301             3.588

Estimated Marginal Means for TIMEOFDAY * COURSE
TIMEOFDAY   COURSE   Mean    Std. Error   95% Lower Bound   95% Upper Bound
===========================================================================
T1          C1       7.222        0.641             5.966             8.478
T1          C2       6.111        0.790             4.564             7.659
T1          C3           4        0.577             2.868             5.132
T2          C1       3.222        0.401             2.437             4.007
T2          C2       2.889        0.261             2.378             3.400
T2          C3       1.556        0.294             0.979             2.132

Estimated Marginal Means for TIMEOFDAY * MODEL
TIMEOFDAY   MODEL   Mean    Std. Error   95% Lower Bound   95% Upper Bound
==========================================================================
T1          M1      7.444        0.835             5.807             9.081
T1          M2      6.111        0.512             5.107             7.115
T1          M3      3.778        0.465             2.867             4.689
T2          M1      3.222        0.434             2.372             4.073
T2          M2      2.333        0.408             1.533             3.133
T2          M3      2.111        0.261             1.600             2.622

Estimated Marginal Means for COURSE * MODEL
COURSE   MODEL   Mean    Std. Error   95% Lower Bound   95% Upper Bound
=======================================================================
C1       M1      6.667        1.085             4.540             8.794
C1       M2      5.167        1.195             2.825             7.509
C1       M3      3.833        0.601             2.656             5.011
C2       M1      6.167        1.195             3.825             8.509
C2       M2      4.167        0.792             2.614             5.720
C2       M3      3.167        0.477             2.231             4.102
C3       M1      3.167        0.872             1.457             4.877
C3       M2      3.333        0.882             1.605             5.062
C3       M3      1.833        0.307             1.231             2.436

Estimated Marginal Means for TIMEOFDAY * COURSE * MODEL
TIMEOFDAY   COURSE   MODEL   Mean    Std. Error   95% Lower Bound   95% Upper Bound
===================================================================================
T1          C1       M1          9        0.577             7.868            10.132
T1          C1       M2      7.667        0.333             7.013             8.320
T1          C1       M3          5        0.577             3.868             6.132
T1          C2       M1      8.667        0.882             6.938            10.395
T1          C2       M2      5.667        0.882             3.938             7.395
T1          C2       M3          4        0.577             2.868             5.132
T1          C3       M1      4.667        1.202             2.311             7.022
T1          C3       M2          5        0.577             3.868             6.132
T1          C3       M3      2.333        0.333             1.680             2.987
T2          C1       M1      4.333        0.333             3.680             4.987
T2          C1       M2      2.667        0.882             0.938             4.395
T2          C1       M3      2.667        0.333             2.013             3.320
T2          C2       M1      3.667        0.333             3.013             4.320
T2          C2       M2      2.667        0.333             2.013             3.320
T2          C2       M3      2.333        0.333             1.680             2.987
T2          C3       M1      1.667        0.333             1.013             2.320
T2          C3       M2      1.667        0.882            -0.062             3.395
T2          C3       M3      1.333        0.333             0.680             1.987

Writing Summary to File

If you are familiar with Python, it should be clear that the class has a __str__() method that generates the output above. If you are less familiar with Python, all you need to know is that converting the object to a string (via str(aov)) yields the whole summary as one string. This makes writing the summary to a file straightforward.

>>> with open('output.txt','w') as f:
...     f.write(str(aov))
...
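
The mechanism can be tried without pyvttbl installed. The sketch below uses a hypothetical stand-in class, Summary, that defines __str__ in the same spirit; it is not the Anova class, and its layout is invented for illustration. The file is written to a temporary directory so the example leaves nothing behind in the working directory.

```python
import os
import tempfile


class Summary(object):
    """Hypothetical stand-in for an analysis object that defines __str__."""

    def __init__(self, title, rows):
        self.title = title
        self.rows = rows

    def __str__(self):
        # Build the report line by line and return it as one string.
        lines = [self.title]
        lines.extend('%-10s %8.3f' % (name, value) for name, value in self.rows)
        return '\n'.join(lines)


summary = Summary('Measure: ERROR', [('T1', 5.778), ('T2', 2.556)])

# print() and str() both go through __str__, so the file receives
# exactly the text that printing the object would show.
path = os.path.join(tempfile.mkdtemp(), 'output.txt')
with open(path, 'w') as f:
    f.write(str(summary))

with open(path) as f:
    written = f.read()

print(written == str(summary))
```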

Working with Anova Objects (Advanced)

If you wish to perform additional operations on the results, the data for the main effects and interactions can be accessed directly. Anova objects are dictionaries whose keys correspond to the main effects and interactions.

>>> for d in aov:
...     print(d)
...
('TIMEOFDAY',)
('COURSE',)
('MODEL',)
('TIMEOFDAY', 'COURSE')
('TIMEOFDAY', 'MODEL')
('COURSE', 'MODEL')
('TIMEOFDAY', 'COURSE', 'MODEL')
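
Because each key is a tuple of factor names, its length tells you the order of the effect. The sketch below works on a plain list of tuples copied from the output above, standing in for the keys of an Anova object, and separates main effects from interactions. The " * " label format mirrors the summary table.

```python
# Stand-in for the keys of an Anova object, copied from the output above.
effects = [
    ('TIMEOFDAY',), ('COURSE',), ('MODEL',),
    ('TIMEOFDAY', 'COURSE'), ('TIMEOFDAY', 'MODEL'), ('COURSE', 'MODEL'),
    ('TIMEOFDAY', 'COURSE', 'MODEL'),
]

# One factor name means a main effect; more than one means an interaction.
main_effects = [e for e in effects if len(e) == 1]
interactions = [e for e in effects if len(e) > 1]

# Join the factor names to get the labels used in the summary table.
labels = [' * '.join(e) for e in effects]

print(main_effects)
print(interactions)
print(labels[-1])
```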

The values are dictionaries of the various values pertaining to the effect.

>>> from pprint import pprint as pp
>>> pp(aov[('TIMEOFDAY', 'COURSE', 'MODEL')])
{'F': 1.9230769230769222,
 'F_gg': 1.923076923076922,
 'F_hf': 1.923076923076922,
 'F_lb': 1.9230769230769222,
 'ci': 0.80005506708787966,
 'ci_gg': 0.80005506708787966,
 'ci_hf': 0.80005506708787966,
 'ci_lb': 0.80005506708787966,
 'critT': 2.3060041350333709,
 'critT_gg': 2.3060041350333709,
 'critT_hf': 2.3060041350333709,
 'critT_lb': 2.3060041350333709,
 'df': 4,
 'df_gg': 1.1590909090909087,
 'df_hf': 1.1590909090909087,
 'df_lb': 2.0,
 'dfe': 8.0,
 'dfe_gg': 2.3181818181818175,
 'dfe_hf': 2.3181818181818175,
 'dfe_lb': 4.0,
 'eps_gg': 0.28977272727272718,
 'eps_hf': 0.28977272727272718,
 'eps_lb': 0.5,
 'eta': 0.067204301075268813,
 'lambda': 2.8846153846153828,
 'lambda_gg': 2.8846153846153828,
 'lambda_hf': 2.8846153846153828,
 'lambda_lb': 2.8846153846153828,
 'mse': 0.36111111111111122,
 'mse_gg': 1.2461873638344234,
 'mse_hf': 1.2461873638344234,
 'mse_lb': 0.72222222222222243,
 'mss': 0.69444444444444431,
 'mss_gg': 2.3965141612200438,
 'mss_hf': 2.3965141612200438,
 'mss_lb': 1.3888888888888886,
 'obs': 3.0,
 'obs_gg': 3.0,
 'obs_hf': 3.0,
 'obs_lb': 3.0,
 'p': 0.19999514760153031,
 'p_gg': 0.2930377602206829,
 'p_hf': 0.2930377602206829,
 'p_lb': 0.2599000384467513,
 'power': 0.15191672432754222,
 'power_gg': 0.087331953378021465,
 'power_hf': 0.087331953378021465,
 'power_lb': 0.10875612151993008,
 'se': 0.40819136075912227,
 'se_gg': 0.40819136075912227,
 'se_hf': 0.40819136075912227,
 'se_lb': 0.40819136075912227,
 'ss': 2.7777777777777772,
 'sse': 2.8888888888888897,
 'y2': array([ 9.        ,  7.66666667,  5.        ,  8.66666667,  5.66666667,
        4.        ,  4.66666667,  5.        ,  2.33333333,  4.33333333,
        2.66666667,  2.66666667,  3.66666667,  2.66666667,  2.33333333,
        1.66666667,  1.66666667,  1.33333333])}
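
The entries in this dictionary are related in the usual ANOVA way, which can be checked by hand. The sketch below copies a few values from the output above into a plain dict (a stand-in for aov[('TIMEOFDAY', 'COURSE', 'MODEL')]) and recomputes the mean squares, the F ratio, and the Greenhouse-Geisser corrected degrees of freedom. These relationships are inferred from the numbers shown, not taken from pyvttbl's source.

```python
import math

# Values copied from the pprint output above (three-way interaction).
effect = {
    'ss': 2.7777777777777772, 'sse': 2.8888888888888897,
    'df': 4, 'dfe': 8.0,
    'mss': 0.69444444444444431, 'mse': 0.36111111111111122,
    'F': 1.9230769230769222,
    'eps_gg': 0.28977272727272718,
    'df_gg': 1.1590909090909087, 'dfe_gg': 2.3181818181818175,
    'mss_gg': 2.3965141612200438, 'mse_gg': 1.2461873638344234,
    'F_gg': 1.923076923076922,
}

# Uncorrected test: mean square = sum of squares / degrees of freedom.
mss = effect['ss'] / effect['df']
mse = effect['sse'] / effect['dfe']

# Greenhouse-Geisser: both degrees of freedom are scaled by epsilon,
# which changes the mean squares but leaves the F ratio unchanged.
df_gg = effect['eps_gg'] * effect['df']
dfe_gg = effect['eps_gg'] * effect['dfe']
mss_gg = effect['ss'] / df_gg
mse_gg = effect['sse'] / dfe_gg

derived = {
    'mss': mss, 'mse': mse, 'F': mss / mse,
    'df_gg': df_gg, 'dfe_gg': dfe_gg,
    'mss_gg': mss_gg, 'mse_gg': mse_gg, 'F_gg': mss_gg / mse_gg,
}

# Compare each recomputed value with the one reported in the output.
for key in sorted(derived):
    match = math.isclose(derived[key], effect[key], rel_tol=1e-9)
    print('%-7s %.6f %s' % (key, derived[key], 'ok' if match else 'MISMATCH'))
```

Since the correction multiplies both degrees of freedom by the same epsilon, only the p-value (and the power) of the test changes, which is why the F column is identical across the four rows of each effect in the summary table.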

Hopefully that is enough to get you started.