Python for Bioinformatics: One more thing

Tuesday, November 9, 2010

One more thing

One thing I forgot to do is test the two_sample_t function. Here we draw samples of different sizes (m=2, n=4) from distributions A and B, both normal with the same mean and standard deviation. Since A and B have the same mean, the expected difference is 0, and any significant result is a false positive. We record how frequently p < 0.05 and it is, as expected, about 5%. The second number on each line is the largest t-statistic seen so far among the significant results. It settles at -2.133, which is also expected: the critical value for a one-tailed t-test at alpha = 0.05 with m + n - 2 = 4 degrees of freedom is -2.132.

output:

0.0491 -2.133
0.0516 -2.133
0.0524 -2.133
0.0506 -2.133
0.0502 -2.133

code listing:

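from __future__ import division, print_function
import numpy as np
from t_test import two_sample_t

mu,sigma = 10,3     # both distributions share these parameters
m,n = 2,4           # sample sizes for A and B
alpha = 0.05

N = 10000           # trials per run
tmax = -N           # largest significant t seen so far, across all runs
for i in range(5):
    counter = 0     # number of trials with p < alpha in this run
    for j in range(N):
        A = np.random.normal(mu,sigma,m)
        B = np.random.normal(mu,sigma,n)
        t,p = two_sample_t(A,B)
        if p < alpha:
            counter += 1
            if t > tmax:
                tmax = t
    # fraction of false positives, then the running tmax
    print('%3.4f' % (counter/N), round(tmax,3))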
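
The t_test module isn't shown in this post, so here is a rough standalone sketch of the same check that needs only the standard library. It assumes two_sample_t is a pooled-variance (equal variance) test applied one-tailed, which is what 4 degrees of freedom for m=2, n=4 suggests. The names pooled_t, rejection_rate and T_CRIT are made up for this illustration, and instead of computing a p-value it compares t directly against the tabulated critical value.

```python
import math
import random

# Tabulated one-tailed critical value for alpha = 0.05 with 4 degrees of
# freedom (hard-coded here as an assumption rather than computed).
T_CRIT = 2.132

def pooled_t(a, b):
    """t-statistic for two samples, assuming equal variances (hypothetical helper)."""
    m, n = len(a), len(b)
    mean_a = sum(a) / m
    mean_b = sum(b) / n
    # sums of squared deviations from each sample mean
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (m + n - 2)
    se = math.sqrt(pooled_var * (1.0 / m + 1.0 / n))
    return (mean_a - mean_b) / se

def rejection_rate(trials, m=2, n=4, mu=10, sigma=3, seed=0):
    """Fraction of trials where t falls below -T_CRIT when both means are equal."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(mu, sigma) for _ in range(m)]
        b = [rng.gauss(mu, sigma) for _ in range(n)]
        if pooled_t(a, b) < -T_CRIT:
            hits += 1
    return hits / trials
```

Calling rejection_rate(20000) should give a value close to 0.05, in line with the output above. Changing m or n changes the degrees of freedom, so T_CRIT would have to be looked up again.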