Descriptives

This class calculates, reports, and stores summary statistics.

Using class directly

Here we examine data sampled from a normal distribution with a mean of 0 and a standard deviation of 1.

>>> from pyvttbl.stats import Descriptives
>>> from random import normalvariate
>>> desc = Descriptives()
>>> desc.run([normalvariate(mu=0,sigma=1) for i in xrange(1000)])
>>> print(desc)
Descriptive Statistics

==========================
 count        1000.000
 mean            0.025
 mode           -0.182
 var             0.934
 stdev           0.967
 sem             0.031
 rms             0.966
 min            -2.863
 Q1             -0.589
 median          0.004
 Q3              0.681
 max             3.467
 range           6.330
 95ci_lower     -0.035
 95ci_upper      0.085

Descriptives objects inherent collections. OrderedDict

>>> desc
Descriptives([('count', 1000.0),
              ('mean', 0.025036481568892106),
              ('mode', -0.18188273915666869),
              ('var', 0.93438245182138646),
              ('stdev', 0.9666346009849774),
              ('sem', 0.030567670042405695),
              ('rms', 0.9664755013857896),
              ('min', -2.8632575029784033),
              ('Q1', -0.58880378505312103),
              ('median', 0.0040778734181358472),
              ('Q3', 0.68105047745497083),
              ('max', 3.4671371053896305),
              ('range', 6.3303946083680334),
              ('95ci_lower', -0.034876151714223057),
              ('95ci_upper', 0.084949114852007263)],
              cname='')

This means data can be accessed as if the descriptive statistics were stored in a dict.

>>> desc['var']
0.93438245182138646
>>>

Using DataFrame wrapper

>>> df = DataFrame()
>>> df.read_tbl('data/error~subjectXtimeofdayXcourseXmodel_MISSING.csv')
>>> desc = df.descriptives('ERROR')
>>> print(desc)
Descriptive Statistics
  ERROR
==========================
 count        48.000
 mean          3.896
 mode          3.000
 var           5.797
 stdev         2.408
 sem           0.348
 rms           4.567
 min           0.000
 Q1            2.000
 median        3.000
 Q3            5.000
 max          10.000
 range        10.000
 95ci_lower    3.215
 95ci_upper    4.577