## Sunday, July 26, 2009

### Statistical doodling: variance

In Bolstad, Chapter 5, there is a proof of the following statement about the variance of independent random variables X and Y.

 `Var(X + Y) = Var(X) + Var(Y)`

There is a lot more discussion here. The post calls this the "Pythagorean Theorem of Statistics", since an equivalent formulation is:

 `SD^2(X + Y) = SD^2(X) + SD^2(Y)`

I don't want to detail the proof, but I did fool around a bit in R to explore this:

    set.seed(1357)
    u = rnorm(10000,5,1)
    var(u)
    var(u + 7)
    var(u-250)
    var(3*u)
    var(u/5)

Here is what it prints:

    > var(u)
    [1] 0.9962947
    > var(u + 7)
    [1] 0.9962947
    > var(u-250)
    [1] 0.9962947
    > var(3*u)
    [1] 8.966652
    > var(u/5)
    [1] 0.03985179

So, if we add or subtract a constant C, the variance is unchanged. But if we multiply by C, the variance is multiplied by C^2; and if we divide by C, the variance is divided by C^2.
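
The shift and scale rules can also be checked directly rather than by eye. This is a minimal sketch: the constant `k` is an arbitrary choice for illustration, and `all.equal` is used so that floating-point rounding does not matter.

```r
# Check the shift and scale rules on a sample like the one above.
# k is an arbitrary constant chosen for illustration.
set.seed(1357)
u <- rnorm(10000, mean = 5, sd = 1)
k <- 3

# Adding a constant leaves the variance unchanged.
shift_ok <- isTRUE(all.equal(var(u + k), var(u)))

# Multiplying by k scales the variance by k^2.
scale_ok <- isTRUE(all.equal(var(k * u), k^2 * var(u)))

# Dividing by k scales the variance by 1 / k^2.
divide_ok <- isTRUE(all.equal(var(u / k), var(u) / k^2))

print(c(shift = shift_ok, scale = scale_ok, divide = divide_ok))
```

These identities hold exactly for the sample variance, not just on average, which is why the printed values above agree to every digit.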

Now consider two vectors of numbers from rnorm. The first has a mean of 5 and sd of 2 (variance of 4), while the second has a mean of 4 and sd of 3 (variance of 9).

    u = rnorm(1000,5,2)
    v = rnorm(1000,4,3)
    var(u)
    var(v)
    var(u+v)
    var(u-v)

    > u = rnorm(1000,5,2)
    > v = rnorm(1000,4,3)
    > var(u)
    [1] 3.99337
    > var(v)
    [1] 9.470766
    > var(u+v)
    [1] 12.76349
    > var(u-v)
    [1] 13.65514

Both var(u+v) and var(u-v) come out near 4 + 9 = 13, so the simulation is consistent with the rule that the variances add, for differences as well as sums.
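
The simulated values never land exactly on the sum of the two variances. One way to see why is the general identity Var(X ± Y) = Var(X) + Var(Y) ± 2 Cov(X, Y), which also holds exactly for sample variances and covariances. The sketch below uses an arbitrary seed of my own choosing, so its numbers will differ from the output above; the names `sum_id` and `diff_id` are just illustrative.

```r
set.seed(2468)  # arbitrary seed, not the one used for the output above
u <- rnorm(1000, mean = 5, sd = 2)
v <- rnorm(1000, mean = 4, sd = 3)

# For independent draws the sample covariance is small but not exactly zero.
print(cov(u, v))

# The covariance term accounts for the whole gap between var(u + v)
# (or var(u - v)) and var(u) + var(v).
sum_id  <- var(u) + var(v) + 2 * cov(u, v)
diff_id <- var(u) + var(v) - 2 * cov(u, v)

print(c(var_sum = var(u + v), identity = sum_id))
print(c(var_diff = var(u - v), identity = diff_id))
```

As the sample grows, the sample covariance of independent vectors shrinks toward zero and both quantities settle on Var(X) + Var(Y).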

And finally, look at multiplication:

    u = rnorm(1000,0,1)
    v = rnorm(1000,0,1)
    var(u*v)
    var((u+1)*v)
    var((u+2)*v)
    var((u+3)*v)
    var((u+2)*(v+2))

The variance of a product depends on the means of the two distributions. Here, the variances of u and v (and of the shifted versions u + 1, u + 2, u + 3 and v + 2) are always 1. For each pair of means, the simulated variance is:

    0,0: var = 1.0
    1,0: var = 2.2
    2,0: var = 5.5
    3,0: var = 10.9
    2,2: var = 9.4

I found an expression here, which holds for independent X and Y:

 `Var(XY) = Var(X)*Var(Y) + Var(X)*E[Y]^2 + E[X]^2*Var(Y)`

That is, writing v for a variance and m for a mean:

 `v(X*Y) = vX*vY + vX*mY^2 + mX^2*vY`
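
For what it's worth, the expression can be recovered in a few lines. This is a sketch that assumes X and Y are independent, so that E[XY] = E[X]E[Y] and E[X^2 Y^2] = E[X^2]E[Y^2], and it uses E[X^2] = Var(X) + E[X]^2:

$$
\begin{aligned}
\mathrm{Var}(XY) &= E[X^2 Y^2] - \left(E[XY]\right)^2 \\
&= E[X^2]\,E[Y^2] - E[X]^2\,E[Y]^2 \\
&= \left(\mathrm{Var}(X) + E[X]^2\right)\left(\mathrm{Var}(Y) + E[Y]^2\right) - E[X]^2\,E[Y]^2 \\
&= \mathrm{Var}(X)\,\mathrm{Var}(Y) + \mathrm{Var}(X)\,E[Y]^2 + E[X]^2\,\mathrm{Var}(Y)
\end{aligned}
$$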

In the cases above (both variances are unchanged and equal to 1, and the means are listed as mX,mY) we have:

    0,0: v(X*Y) = 1*1 + 1*0^2 + 0^2*1 = 1
    1,0: v(X*Y) = 1*1 + 1*0^2 + 1^2*1 = 2
    2,0: v(X*Y) = 1*1 + 1*0^2 + 2^2*1 = 5
    3,0: v(X*Y) = 1*1 + 1*0^2 + 3^2*1 = 10
    2,2: v(X*Y) = 1*1 + 1*2^2 + 2^2*1 = 9

Looks correct.
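
To make the comparison less manual, the formula can be wrapped in a small function and set beside the simulation. This is only a sketch: `var_product` is a hypothetical helper name of mine, the seed is arbitrary, and the sample size is larger than in the runs above so that sampling noise is smaller.

```r
# Hypothetical helper (not from the original post): theoretical variance of
# the product of two independent variables, from their means and variances.
var_product <- function(mX, vX, mY, vY) {
  vX * vY + vX * mY^2 + mX^2 * vY
}

set.seed(1357)  # arbitrary seed
n <- 100000     # larger than the 1000 used above, to reduce sampling noise
u <- rnorm(n, 0, 1)
v <- rnorm(n, 0, 1)

# Pairs of means (mX, mY) matching the five cases above.
shifts <- list(c(0, 0), c(1, 0), c(2, 0), c(3, 0), c(2, 2))

# One row per case: the means, the simulated variance, and the formula.
results <- t(sapply(shifts, function(s) {
  c(mX = s[1], mY = s[2],
    simulated = var((u + s[1]) * (v + s[2])),
    theory    = var_product(s[1], 1, s[2], 1))
}))

print(round(results, 2))
```

The `theory` column reproduces 1, 2, 5, 10 and 9; the `simulated` column should sit close to it, with the gap shrinking as `n` grows.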