ChiSquare1way

ChiSquare1way conducts a chi-squared test on a list of frequencies.

A simple example

>>> from pyvttbl.stats import ChiSquare1way
>>> x2 = ChiSquare1way()
>>> x2.run([17, 19, 18, 20, 32, 20])
>>> print(x2)
Chi-Square: Single Factor

SUMMARY
           A    B    C    D    E    F
======================================
Observed   17   19   18   20   32   20
Expected   21   21   21   21   21   21

CHI-SQUARE TESTS
                     Value   df     P
=======================================
Pearson Chi-Square   7.238    5   0.204
Likelihood Ratio     6.517    5   0.259
Observations           126

POST-HOC POWER
       Measure
==============================
Effect size w            0.240
Non-centrality lambda    7.238
Critical Chi-Square     11.070
Power                    0.516

If only one argument is provided the expected counts are assumed to be evenly distributed amongst the possible outcomes. Unequal expected outcomes can be specified using the expected keyword.

>>> x2.run([17, 19, 18, 20, 32, 20], expected=[10,10,10,10,10,102])
>>> print(x2)
Chi-Square: Single Factor

SUMMARY
           A    B    C    D    E     F
=======================================
Observed   17   19   18   20   32    20
Expected   10   10   10   10   10   102

CHI-SQUARE TESTS
                      Value    df   P
=====================================
Pearson Chi-Square   143.722    5   0
Likelihood Ratio     100.590    5   0
Observations             126

POST-HOC POWER
       Measure
===============================
Effect size w             1.068
Non-centrality lambda   143.722
Critical Chi-Square      11.070
Power                         1
>>>

Meaningful column labels can also be supplied to the analysis.

>>> x2.run([17, 19, 18, 20, 32, 20], conditions_list=[1,2,3,4,5,6])
>>> print(x2)

Chi-Square: Single Factor

SUMMARY

1 2 3 4 5 6

CHI-SQUARE TESTS

Value df P

Observations 126

POST-HOC POWER

Measure

Critical Chi-Square 11.070 Power 0.516 >>>

Running analysis from a DataFrame

DataFrame has a wrapper method for conducting a one-way chi-squared analysis. It assumes the data is denote individual categorical events and will calculate the frequencies for the user.

>>> from pyvttbl import DataFrame
>>> from random import randint # simulate a 6-sided die
>>> df = DataFrame()
>>> for i in _xrange(1000):
    df.insert([('roll',i), ('outcome', randint(1,6))])
...
SyntaxError: invalid syntax
>>> for i in _xrange(1000):
    df.insert([('roll',i), ('outcome', randint(1,6))])

>>> print(df.descriptives('outcome'))
Descriptive Statistics
  outcome
==========================
 count        1000.000
 mean            3.536
 mode            5.000
 var             2.960
 stdev           1.720
 sem             0.054
 rms             3.932
 min             1.000
 Q1              2.000
 median          4.000
 Q3              5.000
 max             6.000
 range           5.000
 95ci_lower      3.429
 95ci_upper      3.643
>>> x2 = df.chisquare1way('outcome')
>>> print(x2)
Chi-Square: Single Factor

SUMMARY
              1         2         3         4         5         6
====================================================================
Observed       164       172       147       169       177       171
Expected   166.667   166.667   166.667   166.667   166.667   166.667

CHI-SQUARE TESTS
                     Value   df     P
=======================================
Pearson Chi-Square   3.320    5   0.651
Likelihood Ratio     3.402    5   0.638
Observations          1000

POST-HOC POWER
       Measure
==============================
Effect size w            0.058
Non-centrality lambda    3.320
Critical Chi-Square     11.070
Power                    0.244

And we fail to reject the null.

Direct access to results

Like many of the pyvttbl.stats objects ChiSquare1way inherents an OrderedDict.

ChiSquare1way([('chisq', 7.238095238095238),
               ('p', 0.20352651555710358),
               ('df', 5),
               ('lnchisq', 6.517343547697185),
               ('lnp', 0.25907994276152624),
               ('lndf', 5),
               ('N', 126),
               ('w', 0.23967728365938887),
               ('lambda', 7.238095238095237),
               ('crit_chi2', 11.070497693516351),
               ('power', 0.51617215660330651)],
               conditions_list=[1, 2, 3, 4, 5, 6])