## Thursday, August 13, 2009

### Student's t test 2

In this post, I want to work through a manual calculation of the t test.

 `set.seed(157)v = rnorm(10000,5,1)e = sample(v,9)round(e,2)mean(e)sd(e)`

 `> round(e,2) [1] 5.34 4.80 5.56 4.96 4.73 [6] 3.24 4.86 5.14 5.66> mean(e)[1] 4.923185> sd(e)[1] 0.7122304`

Remember that for sd (and var), R uses the sample standard deviation with its correction of n / n-1.

From these values, we can get the standard error of the mean, sem = sd(e)/ √n. Since n = 9, we calculate sem = 0.71/3 = 0.2374

If the sample size were not an issue, we would calculate a confidence interval for a two-sided test of:

 `> x = 1.96 * 0.2374> x[1] 0.465304> 4.923 - x[1] 4.457696> 4.923 + x[1] 5.388304`

And for the one-sided test, since we specified that the null hypothesis has μ >= 6, we multiply by the value (from the last post) using p = 95 %:

 `> x = 1.64 * 0.2374> x[1] 0.389336> 4.923 + x[1] 5.312336`

The small sample correction for the t test means that instead of using values from the standard normal distribution in the calculation above, we get them from the t-distribution, in this case for df (degrees of freedom) = n-1 = 9.

So, instead of using 1.96 (or 1.64) as the multiplier, we use 2.26 (two-sided test, df = 8).

 `> x = 2.30*0.2374> x[1] 0.54602> 4.923 - x[1] 4.37698> 4.923 + x[1] 5.46902`

or, for a one-sided test we multiply by the 95 % level of the t-distribution with df = 8 (1.85):

 `> x = 1.85*0.2374> x[1] 0.43919> 4.923 + x[1] 5.36219`

Since 5.36 is definitely < 6, we can say the result is significant at p = 0.05.

R gives this. Notice they find the p-value at which the distribution becomes > 6. I'm not sure this is kosher, but you see it a lot:

 `v = c(5.34,4.80,5.56,4.96,4.73, 3.24,4.86,5.14,5.66)t.test(v,alt='less',mu=6)`

 `> t.test(v,alt='less',mu=6) One Sample t-testdata: v t = -4.5433, df = 8, p-value = 0.0009454alternative hypothesis: true mean is less than 6 95 percent confidence interval: -Inf 5.362691 sample estimates:mean of x 4.921111`