Saturday, July 25, 2009

Bayes 9: mean with unknown population variance

Before I solve Chuck's problem from last time, I should show the case where we are trying to estimate the population mean but we do not know the population variance. In that situation, it makes sense to calculate the sample variance:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

and use that as an estimate of the population variance. And, as you probably know if you are reading this, the population variance is now itself uncertain, so when we use these results to estimate the mean we need to widen the credible interval by using Student's t table instead of the standard normal table.

To continue with the previous example, Arnie had a normal(30, 4²) prior, that is, mean 30 and standard deviation 4. Suppose we have only 5 observations: 31.1, 28.2, 34.2, 35, 31.5.

We calculate the sample variance:

In R:
 `v = c(31.1, 28.2, 34.2, 35, 31.5)
 m = mean(v)
 m               # 32
 w = (v - m)^2   # squared deviations from the mean
 sum(w) / 4      # 7.335 (divide by n - 1)
 var(v)          # 7.335 (built-in sample variance)`

In Python:
 `L = [31.1, 28.2, 34.2, 35, 31.5]
 def mean(L):
     return sum(L) * 1.0 / len(L)
 m = mean(L)
 sq = [(x - m)**2 for x in L]   # squared deviations from the mean
 print(round(sum(sq) / 4, 3))   # 7.335 (divide by n - 1)`

The precisions are:

 `prior precision = 1/4^2 = 0.0625
 observation precision = 5/7.335 = 0.6817
 posterior precision = 0.0625 + 0.6817 = 0.744
 posterior variance = 1/precision = 1/0.744 = 1.344
 posterior st dev = sqrt(variance) = 1.16`

The weights are:

 `prior        0.0625/0.744 = 0.084
 observation  0.6817/0.744 = 0.916`

The posterior mean is then:

 `mean = 0.084*30 + 0.916*32 = 31.83`
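The hand calculation above can be sketched as one small Python function. This is only a sketch of the arithmetic in this post, assuming the sample variance stands in for the population variance; the name `posterior_normal` and its parameters are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def posterior_normal(obs, prior_mean, prior_sd):
    """Normal prior, normal data, sample variance used as a plug-in estimate.

    Hypothetical helper for illustration; it follows the steps in the post.
    """
    n = len(obs)
    s2 = variance(obs)                 # sample variance (divides by n - 1)
    prior_prec = 1 / prior_sd**2       # precision = 1 / variance
    obs_prec = n / s2                  # precision of the sample mean
    post_prec = prior_prec + obs_prec  # precisions add
    # Posterior mean is the precision-weighted average of prior mean and sample mean.
    post_mean = (prior_prec * prior_mean + obs_prec * mean(obs)) / post_prec
    post_sd = sqrt(1 / post_prec)
    return post_mean, post_sd

obs = [31.1, 28.2, 34.2, 35, 31.5]
post_mean, post_sd = posterior_normal(obs, prior_mean=30, prior_sd=4)
print(round(post_mean, 2), round(post_sd, 2))
```

Working with unrounded intermediate values, this should agree with the rounded figures above (31.83 and 1.16) to the digits shown.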

We can check this in R with the Bolstad package:

 `library(Bolstad)
 v = c(31.1, 28.2, 34.2, 35, 31.5)
 normnp(v, 30, 4, ret=T)`

 `> normnp(v,30,4,ret=T)
 Standard deviation of the residuals :2.708
 Posterior mean : 31.8320261
 Posterior std. deviation : 1.1592201`
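The post says at the outset that the credible interval should be widened with Student's t, but it never computes one. Here is a rough Python sketch of that last step. The 95% level and the choice of n - 1 = 4 degrees of freedom are my assumptions, not something stated above, and the t critical value 2.776 is copied from a standard t table rather than computed.

```python
from statistics import NormalDist

post_mean = 31.832   # posterior mean from the worked example
post_sd = 1.159      # posterior standard deviation from the worked example

z_crit = NormalDist().inv_cdf(0.975)  # standard normal critical value, about 1.96
t_crit = 2.776                        # t table, 4 degrees of freedom, 95% two-sided (assumed)

def interval(center, sd, crit):
    """Symmetric interval: center plus or minus crit standard deviations."""
    return (center - crit * sd, center + crit * sd)

normal_ci = interval(post_mean, post_sd, z_crit)
t_ci = interval(post_mean, post_sd, t_crit)

print("normal:", [round(x, 2) for x in normal_ci])
print("t(4):  ", [round(x, 2) for x in t_ci])
```

With only 5 observations the t-based interval is noticeably wider than the normal one, which is the point of switching tables.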