One Sample t-test
This is the fourth in a series of posts about Student's t test. The previous posts are here, here, and here.
There are several different versions of the t-test, but the simplest is the one sample test. We will have a single sample with a (relatively) small number of values. We calculate the mean and the variance and then the standard deviation of the sample values. Importantly, the variance and standard deviation are the unbiased versions, in which the sum of squares is divided by n-1 rather than n.
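A minimal sketch of that computation, using made-up sample values for illustration (the post's actual data is in the code listing at the end):

```python
import numpy as np

# Made-up sample values for illustration only
x = np.array([3.2, 3.6, 3.3, 3.8, 3.5, 3.9])

n = len(x)
mean = x.sum() / n
# Unbiased variance: divide the sum of squared deviations by n - 1, not n
var = ((x - mean) ** 2).sum() / (n - 1)
sd = np.sqrt(var)
```

Dividing by n - 1 compensates for the fact that the deviations are measured from the sample mean rather than the true mean, which on average makes the raw sum of squares too small.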
It's easy to see that we should not use numpy's var with its default settings for this test. As help(np.var) indicates, np.var divides the sum of squares by n by default (ddof=0), which gives the biased estimate; passing ddof=1 makes the divisor n-1 instead. In the code below we do:
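To see the difference concretely, here is a small check with toy values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])  # toy values, n = 4; sum of squares = 5.0

# By default np.var divides the sum of squares by n (ddof=0): biased
biased = np.var(x)            # 5.0 / 4 = 1.25
# ddof=1 divides by n - 1 instead: the unbiased estimate the t-test needs
unbiased = np.var(x, ddof=1)  # 5.0 / 3 ≈ 1.667
```

With only a handful of values, as in a typical t-test, the difference between dividing by n and n - 1 is far from negligible.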
So, at least conceptually, at this point we have a z-score for the sample mean. With a standard z-score, we would compare it to the normal distribution. Here we do two things differently: we multiply by √n (the square root of the sample size), and we look up the resulting t statistic in the t distribution with n-1 degrees of freedom.
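Putting those pieces together, a sketch of the full calculation (the sample values are made up; the hypothesized mean of 3 matches the post):

```python
import numpy as np
from scipy import stats

x = np.array([3.2, 3.6, 3.3, 3.8, 3.5, 3.9])  # made-up sample
mu0 = 3.0                                      # hypothesized mean

n = len(x)
m = x.mean()
s = x.std(ddof=1)   # unbiased standard deviation

# z-like score for the sample mean, scaled by sqrt(n)
t_stat = (m - mu0) / s * np.sqrt(n)
# one-sided p-value from the t distribution with n - 1 degrees of freedom
p = stats.t.sf(t_stat, df=n - 1)  # sf(x) = 1 - cdf(x)
```

The survival function stats.t.sf gives the upper-tail probability directly, which is the one-sided p-value when the statistic is positive.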
The result indicates that, given a prior choice of a significance threshold of 0.05 (for this one-sided test), the null hypothesis that the mean of the sample values is equal to 3 is rejected (just barely).
You can read much more about the background of the test here.
The second set of values in the code below gives:
It's reassuring that a One Sample t-test in R gives a similar result. I'll have more to say about "tails" and sidedness in another post.
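A similar sanity check is available without leaving Python: scipy's built-in ttest_1samp should agree with the manual calculation (made-up values again; note that scipy reports a two-sided p-value by default, so it is halved here for the one-sided test):

```python
import numpy as np
from scipy import stats

x = np.array([3.2, 3.6, 3.3, 3.8, 3.5, 3.9])  # made-up sample

result = stats.ttest_1samp(x, popmean=3.0)
# scipy's p-value is two-sided; halve it for the one-sided test
# (valid here because the statistic has the expected positive sign)
one_sided_p = result.pvalue / 2
```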
Obviously, we could use a lot more testing.
code listing:
utils.py