pyvttbl.stats API


pyvttbl.stats contains a collection of classes for conducting descriptive and inferential analyses.

Statistics Classes

class pyvttbl.stats.Anova(*args, **kwds)

__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._anova', '__init__': <function Anova.__init__>, 'run': <function>, '_between': <function Anova._between>, '_mixed': <function Anova._mixed>, '_within': <function Anova._within>, '_num2binvec': <function Anova._num2binvec>, '_between_html': <function Anova._between_html>, '_mixed_html': <function Anova._mixed_html>, '_within_html': <function Anova._within_html>, '_summary_html': <function Anova._summary_html>, '__str__': <function Anova.__str__>, '_between_str': <function Anova._between_str>, '_mixed_str': <function Anova._mixed_str>, '_within_str': <function Anova._within_str>, '_summary_str': <function Anova._summary_str>, 'plot': <function Anova.plot>, '__repr__': <function Anova.__repr__>, '__doc__': None, '__annotations__': {}})
__init__(*args, **kwds)
__module__ = 'pyvttbl.stats._anova'
_num2binvec(d, p=0)

Sub-function to code all main effects/interactions

_summary_html(html, factors)
plot(val, xaxis, seplines=None, sepxplots=None, sepyplots=None, xmin='AUTO', xmax='AUTO', ymin='AUTO', ymax='AUTO', fname=None, quality='low', errorbars='ci', output_dir='')

This function is basically a wrapper around the plot function from the dataframe module. It attempts to find the appropriate error bar term, creates a filename if necessary, and calls plot.

run(dataframe, dv, wfactors=None, bfactors=None, sub='SUBJECT', measure='', transform='', alpha=0.05)

The linear algebra is adapted from a MATLAB script by R. Henson, 17/3/03.

class pyvttbl.stats.Anova1way(*args, **kwds)

__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._anova1way', '__init__': <function Anova1way.__init__>, 'run': <function>, '_tukey': <function Anova1way._tukey>, '_snk': <function Anova1way._snk>, '__str__': <function Anova1way.__str__>, '__repr__': <function Anova1way.__repr__>, '__doc__': None, '__annotations__': {}})
__init__(*args, **kwds)
__module__ = 'pyvttbl.stats._anova1way'
run(list_of_lists, val='Measure', factor='Factor', conditions_list=None, posthoc='tukey', alpha=0.05)

Performs a one-way analysis of variance on the data in list_of_lists. Each sub-list is treated as a group. factor is a label for the independent variable, and conditions_list is a list of labels for the different treatment groups.
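To make the "each sub-list is a group" convention concrete, the following is a minimal sketch of the F statistic for a one-way ANOVA using only the standard library. It does not call pyvttbl; the helper name one_way_f is hypothetical, and post-hoc tests such as Tukey or SNK are not covered.

```python
from statistics import mean


def one_way_f(list_of_lists):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Hypothetical helper for illustration only; not part of pyvttbl.
    Each sub-list is treated as one treatment group.
    """
    groups = [list(g) for g in list_of_lists]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


f, df_b, df_w = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f, df_b, df_w)
```

With three groups of three observations, the degrees of freedom are 2 (between) and 6 (within).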

class pyvttbl.stats.ChiSquare1way(*args, **kwds)

1-way Chi-Square Test


__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._chisquare1way', '__doc__': '1-way Chi-Square Test', '__init__': <function ChiSquare1way.__init__>, 'run': <function>, '__str__': <function ChiSquare1way.__str__>, '__repr__': <function ChiSquare1way.__repr__>, '__annotations__': {}})
__init__(*args, **kwds)
__module__ = 'pyvttbl.stats._chisquare1way'
run(observed, expected=None, conditions_list=None, measure='Measure', alpha=0.05)
class pyvttbl.stats.ChiSquare2way(*args, **kwds)

__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._chisquare2way', '__init__': <function ChiSquare2way.__init__>, 'run': <function>, '__str__': <function ChiSquare2way.__str__>, '__repr__': <function ChiSquare2way.__repr__>, '__doc__': None, '__annotations__': {}})
__init__(*args, **kwds)
Returns a human-readable string representation of ChiSquare2way.

run(row_factor, col_factor, alpha=0.05)

Runs a 2-way chi-square test on the matched data in row_factor and col_factor.
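As an illustration of what "matched data" means here, the sketch below builds a contingency table from two equal-length sequences and computes the uncorrected Pearson chi-square statistic. It uses only the standard library and does not call pyvttbl; the function name chi_square_2way is hypothetical, and whether pyvttbl applies a continuity correction is not stated in this document.

```python
from collections import Counter


def chi_square_2way(row_factor, col_factor):
    """Return (chi2, df) for matched categorical observations.

    Hypothetical helper for illustration only; not part of pyvttbl.
    row_factor[i] and col_factor[i] describe the same observation.
    """
    if len(row_factor) != len(col_factor):
        raise ValueError("row_factor and col_factor must have equal length")

    n = len(row_factor)
    cell_counts = Counter(zip(row_factor, col_factor))
    row_totals = Counter(row_factor)
    col_totals = Counter(col_factor)

    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n   # count expected under independence
            observed = cell_counts.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = chi_square_2way(["a", "a", "b", "b"], ["x", "x", "y", "y"])
print(chi2, df)
```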

class pyvttbl.stats.Correlation(*args, **kwds)

bivariate correlation matrix


__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._correlation', '__doc__': 'bivariate correlation matrix', '__init__': <function Correlation.__init__>, 'run': <function>, 'lm_significance_testing': <function Correlation.lm_significance_testing>, '__str__': <function Correlation.__str__>, '__repr__': <function Correlation.__repr__>, '__annotations__': {}})
__init__(*args, **kwds)
__module__ = 'pyvttbl.stats._correlation'
Performs Larzelere and Mulaik Significance Testing on the paired correlations in self.

The testing follows a stepdown procedure similar to the Holm procedure for multiple comparisons. The absolute r values are arranged in decreasing order, and the significance level is adjusted according to alpha/(k-i+1), where k is the total number of tests and i is the index of the current pair.
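The stepdown adjustment described above can be sketched in plain Python as follows. This is an illustration rather than pyvttbl's implementation: the function name lm_stepdown and the (label, r, p) input format are hypothetical, and stopping at the first non-significant pair is an assumption borrowed from Holm's procedure.

```python
def lm_stepdown(pairs, alpha=0.05):
    """Apply a Holm-like stepdown adjustment to paired correlations.

    Hypothetical helper for illustration only; not part of pyvttbl.
    pairs: list of (label, r, p) tuples.
    Returns a list of (label, r, p, adjusted_alpha, significant) tuples,
    ordered by decreasing absolute r.
    """
    ordered = sorted(pairs, key=lambda item: abs(item[1]), reverse=True)
    k = len(ordered)
    results = []
    still_rejecting = True
    for i, (label, r, p) in enumerate(ordered, start=1):
        adjusted_alpha = alpha / (k - i + 1)
        significant = still_rejecting and p < adjusted_alpha
        if not significant:
            # Assumption: once one pair fails, all weaker pairs fail too.
            still_rejecting = False
        results.append((label, r, p, adjusted_alpha, significant))
    return results


pairs = [("A-B", 0.2, 0.04), ("A-C", -0.8, 0.001), ("B-C", 0.5, 0.03)]
for row in lm_stepdown(pairs):
    print(row)
```

Note that the weakest pair is compared against the unadjusted alpha (k - i + 1 = 1), while the strongest pair faces the strictest threshold, alpha/k.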

run(list_of_lists, conditions_list=None, coefficient='pearson', alpha=0.05)
class pyvttbl.stats.Descriptives(*args, **kwds)

__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._descriptives', '__init__': <function Descriptives.__init__>, 'run': <function>, '__str__': <function Descriptives.__str__>, '__repr__': <function Descriptives.__repr__>, '__doc__': None, '__annotations__': {}})
__init__(*args, **kwds)
A Python friendly representation of the analysis

A human friendly representation of the analysis

run(V, cname=None)

Conducts a descriptive statistical analysis of the data in V


V: an iterable containing numerical data


cname: a string to label the data



class pyvttbl.stats.Histogram(*args, **kwds)

__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._histogram', '__init__': <function Histogram.__init__>, 'run': <function>, '__str__': <function Histogram.__str__>, '__repr__': <function Histogram.__repr__>, '__doc__': None, '__annotations__': {}})
__init__(*args, **kwds)
Return str(self).

run(V, cname=None, bins=10, range=None, density=False, cumulative=False)

Generates and stores histogram data for the numerical data in V.

class pyvttbl.stats.Marginals(*args, **kwds)

Calculates means, counts, standard errors, and confidence intervals for the marginal conditions of the factorial combinations specified in the factors list.


key: column label (of the dependent variable)


factors: list of column labels to segregate data

where: criterion to apply to table before running analysis


a pyvttbl.stats.Marginals object


__init__(*args, **kwds)
Returns a human-readable string representation of Marginals.

run(df, val, factors, where=None)

Generates and stores marginal data from the DataFrame df and the column labels in factors.

class pyvttbl.stats.Ttest(*args, **kwds)

Student’s t-test


__dict__ = mappingproxy({'__module__': 'pyvttbl.stats._ttest', '__doc__': "Student's t-test", '__init__': <function Ttest.__init__>, 'run': <function>, '__str__': <function Ttest.__str__>, '__repr__': <function Ttest.__repr__>, '__annotations__': {}})
__init__(*args, **kwds)
Return str(self).

run(A, B=None, pop_mean=None, paired=False, equal_variance=True, alpha=0.05, aname=None, bname=None)

Compares the data in A to the data in B. If A or B are multidimensional they are flattened before testing.

When paired is True, the equal_variance parameter has no effect, and an exception is raised if A and B are not of equal length.

\[ t = \frac{\overline{X}_D - \mu_0}{s_D / \sqrt{n}} \]

where \(\overline{X}_D\) is the difference of the averages and \(s_D\) is the standard deviation of the differences.

\[ \mathrm{d.f.} = n_1 - 1 \]
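The paired statistic can be checked by hand with a few lines of standard-library Python. This is a minimal sketch, not pyvttbl's code: the function name paired_t and its mu_0 argument are hypothetical, and it returns only t and the degrees of freedom, not a p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b, mu_0=0.0):
    """Return (t, df) for a paired t-test.

    Hypothetical helper for illustration only; not part of pyvttbl.
    """
    if len(a) != len(b):
        raise ValueError("A and B must be of equal length for a paired test")

    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    t = (mean(diffs) - mu_0) / (stdev(diffs) / sqrt(n))
    return t, n - 1


t, df = paired_t([2, 4, 6], [1, 2, 3])
print(t, df)
```

Here the differences are 1, 2, 3, so the mean difference is 2, the standard deviation is 1, and t equals 2 times the square root of 3 with 2 degrees of freedom.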

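When paired is False and equal_variance is True:

\[ t = \frac{\bar{X}_1 - \bar{X}_2}{S_{X_1X_2} \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

where \(S_{X_1X_2}\) is the pooled standard deviation, computed as:

\[ S_{X_1X_2} = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \]

and \(s_1^2\) and \(s_2^2\) are the unbiased variance estimates of the two samples.

\[ \mathrm{d.f.} = n_1 + n_2 - 2 \]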
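For the equal-variance case, a minimal standard-library sketch is given below. It assumes the usual textbook definition of the pooled standard deviation (a weighted average of the two unbiased sample variances); the function name pooled_t is hypothetical and the code does not call pyvttbl.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, df) for an independent-samples t-test with pooled variance.

    Hypothetical helper for illustration only; not part of pyvttbl.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # variance() is the unbiased sample variance (n - 1 in the denominator).
    pooled_sd = sqrt(((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df)
    t = (mean(a) - mean(b)) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, df


t, df = pooled_t([1, 2, 3], [3, 4, 5])
print(t, df)
```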

When paired is False and equal_variance is False:

\[ t = \frac{\overline{X}_1 - \overline{X}_2}{s_{\overline{X}_1 - \overline{X}_2}} \]

where

\[ s_{\overline{X}_1 - \overline{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]

and \(s_1^2\) and \(s_2^2\) are the unbiased variance estimates.

\[ \mathrm{d.f.} = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)} \]
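The unequal-variance statistic and its approximate degrees of freedom can be sketched with the standard library as follows. The function name welch_t is hypothetical, and this is an illustration of the formulas rather than pyvttbl's implementation. The degrees of freedom are generally not an integer.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for an unequal-variance (Welch) t-test.

    Hypothetical helper for illustration only; not part of pyvttbl.
    """
    n1, n2 = len(a), len(b)
    # Squared standard errors of each sample mean, from unbiased variances.
    v1 = variance(a) / n1
    v2 = variance(b) / n2

    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


t, df = welch_t([1, 2, 3], [3, 4, 5])
print(t, df)
```

When both samples have the same size and variance, as in this usage example, the result coincides with the pooled test: 4 degrees of freedom.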

