*sum of the squares* of k independent, normally distributed random variables with mean 0 and variance 1. For example, in a bivariate distribution where the two variables are independent and each is standard normal (mean 0, variance 1), the square of the distance from the origin should be distributed in this way with df = 2.
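A quick way to check the df = 2 example is to simulate it. This is only a sketch: the seed, the sample size, and the variable names are my own choices.

```r
# Squared distance from the origin for two independent N(0, 1) variables
set.seed(1)          # arbitrary seed, only for reproducibility
n <- 100000          # arbitrary sample size
x <- rnorm(n)
y <- rnorm(n)
d2 <- x^2 + y^2

# A chi-squared distribution with df = 2 has mean 2 and variance 4
mean(d2)
var(d2)

# Compare a few simulated quantiles with the theoretical ones
probs <- c(0.25, 0.5, 0.75, 0.95)
rbind(simulated   = quantile(d2, probs),
      theoretical = qchisq(probs, df = 2))
```

The two rows of the final table should agree closely if the squared distance really follows a chi-squared distribution with df = 2.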

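I found that I could also get something that looks like the chi-squared distribution by multiplying two random variables (with mean not equal to 0) together, and it seems that in this case the degrees of freedom is equal to the product of the means. I have not checked this carefully, and I don't think it holds exactly in general. If the two variables are independent, their product has mean equal to the product of the means, which matches the mean k of a chi-squared distribution with k degrees of freedom, but the variances need not agree.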

Let's explore how to use R to solve the grade-distribution problem from the last post. We have males and females with grades A, B, D, F, in that order, stored in a matrix M:
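Here is a minimal sketch of that setup. The counts are an assumption on my part: I take them to be the ones in the Grinstead & Snell grade table (rows are female and male, columns are the four grade categories in order), so check them against the last post before relying on them.

```r
# Assumed counts (Grinstead & Snell grade example);
# rows: female, male; columns: the four grade categories in order
M <- matrix(c(37, 63, 47, 5,
              56, 60, 43, 8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("female", "male"), NULL))

r <- chisq.test(M)
r$statistic   # chi-squared statistic, roughly 4.13 with these counts
r$parameter   # degrees of freedom: (2 - 1) * (4 - 1) = 3
r$p.value     # well above 0.05 with these counts
```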

The chi-squared statistic matches the value given in Grinstead & Snell. We can explore the contributions of the individual categories to the statistic as follows.
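One way to get those contributions is to compute (observed - expected)^2 / expected for each cell, using the `observed` and `expected` components that `chisq.test` returns. The counts in `M` are again assumed to be those of the Grinstead & Snell table, and `contrib` is just a name I picked.

```r
# Assumed counts; rows: female, male; column 1 holds the A's
M <- matrix(c(37, 63, 47, 5,
              56, 60, 43, 8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("female", "male"), NULL))
r <- chisq.test(M)

# Per-cell contribution to the chi-squared statistic
contrib <- (r$observed - r$expected)^2 / r$expected
round(contrib, 3)

colSums(contrib)   # contribution of each grade category
sum(contrib)       # adds back up to the statistic itself
```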

We see that the disparity in A's was certainly higher than for the other categories, but the p-value (above) is not significant.

We see that the 95th percentile of the chi-squared distribution for df=3 is just a bit larger than 7.8:
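The quantile function `qchisq` gives this critical value directly, and `pchisq` goes the other way. A short sketch (`crit` is my own variable name):

```r
# 95th percentile of the chi-squared distribution with df = 3
crit <- qchisq(0.95, df = 3)
crit

# Going back the other way recovers the probability
pchisq(crit, df = 3)

# 7.8 itself sits just below the 95th percentile
pchisq(7.8, df = 3)
```

A statistic would have to exceed `crit` to be significant at the 5% level with 3 degrees of freedom.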

We can do a Monte Carlo simulation (I actually do not know yet how the preceding function works, but the simulation should be just like what we did yesterday):
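A sketch of two ways to do this, assuming the same Grinstead & Snell counts as before. The first uses the `simulate.p.value` option of `chisq.test`. The second is a hand-rolled version built on `r2dtable`, which draws random tables with the observed row and column totals. It is my guess at what such a simulation looks like, not necessarily what we did yesterday.

```r
set.seed(1)   # arbitrary seed, only for reproducibility

# Assumed counts; rows: female, male
M <- matrix(c(37, 63, 47, 5,
              56, 60, 43, 8),
            nrow = 2, byrow = TRUE)

# Built-in Monte Carlo p-value with B simulated tables
mc <- chisq.test(M, simulate.p.value = TRUE, B = 10000)
mc$p.value

# Hand-rolled: random tables with the same margins as M
r <- chisq.test(M)
obs <- unname(r$statistic)
E <- r$expected   # identical for every table because the margins are fixed
sims <- sapply(r2dtable(10000, rowSums(M), colSums(M)),
               function(tab) sum((tab - E)^2 / E))
mean(sims >= obs)   # fraction of simulated statistics at least as large
```

The two p-values should be close to each other and to the one from the chi-squared approximation.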

And Fisher's exact test (also on my study list for the future):
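A minimal sketch with `fisher.test`, again on the assumed Grinstead & Snell counts. For a table larger than 2 x 2 it reports only a p-value, with no odds ratio. The larger `workspace` is a precaution I added, in case the default is too small for a table of this size.

```r
# Assumed counts; rows: female, male
M <- matrix(c(37, 63, 47, 5,
              56, 60, 43, 8),
            nrow = 2, byrow = TRUE)

# Exact test of independence for the 2 x 4 table
f <- fisher.test(M, workspace = 2e6)
f$p.value
```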