So let's think about this famous gathering, and ask the question: if the individual chance that Albert and Max share the same birthday is 1/365, what is the probability that there are at least two people in this group that do share the same special day?
The key, as you have probably guessed, is that this is a combinations problem. Consider a group of 5 individuals (A-E). If F walks up to the group and introduces herself, there are 5 introductions to make.
If we think about building up a group one person at a time, then for n individuals the number of introductions that have been made is:
1 + 2 + ... + (n-1) = n * (n-1) / 2
This is the problem supposedly solved by Gauss as a young boy.
Another way to think about it is to consider that if each person in the group of n people shakes hands with every other person, there are n * (n-1) hands involved in handshakes. We must then divide by two to get the unique interactions, since each handshake has been counted once for each of the two partners.
In general, we have the formula for combinations: C(n,k) = n! / ((n-k)! k!), which for k = 2 gives C(n,2) = n * (n-1) / 2.
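As a quick sanity check on the counting argument, here is a minimal Python sketch. The function name `count_pairs` and the six-letter group are made up for illustration; the sketch compares the step-by-step sum against an explicit list of pairs and against `math.comb` (Python 3.8 or later).

```python
from itertools import combinations
from math import comb


def count_pairs(n):
    """Introductions made while building a group of n: 0 + 1 + ... + (n-1)."""
    return sum(range(n))


# Illustrative group: A-E plus the newcomer F.
people = "ABCDEF"
pairs = list(combinations(people, 2))

# The explicit pair list, the running sum, and C(n,2) should all agree.
print(len(pairs), count_pairs(len(people)), comb(len(people), 2))
```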
We can solve the birthday problem in a couple of ways. The first is an approximation. For each pair, the probability that they do not share a birthday is 364/365. If we treat the C(n,2) pairs as independent (they are not quite, but it is a good approximation), the probability that no pair shares a birthday is (364/365)**C(n,2). The probability of the complementary event, that at least one pair does share a birthday, is 1 - (364/365)**C(n,2).
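The pairwise formula translates almost directly into Python. This is only a sketch: the name `p_shared_approx` and the `days` parameter are my own, and it assumes 365 equally likely birthdays with no leap years.

```python
from math import comb


def p_shared_approx(n, days=365):
    """Pairwise estimate of P(at least one shared birthday) among n people.

    Treats every pair as an independent trial, so this is an
    approximation rather than the exact probability.
    """
    return 1 - ((days - 1) / days) ** comb(n, 2)


for n in (10, 20, 23, 30, 40):
    print(n, round(p_shared_approx(n), 4))
```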
The second approach is exact (for 365 equally likely birthdays). Start with a group of 2 people, with P = 364/365 that they do not share a birthday. If a new person walks up to the group, there are 363 birthdays which would preserve the "no shared birthday" criterion, so we multiply by 363/365, and so on for each newcomer. The probability of the desired event is then 1 - (364/365 * 363/365 * ...), where the product has n-1 factors.
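The step-by-step product can be sketched as a short loop. Again, the name `p_shared_exact` and the `days` parameter are illustrative, and the sketch assumes 365 equally likely birthdays with no leap years.

```python
def p_shared_exact(n, days=365):
    """P(at least one shared birthday) among n people, built up one person at a time."""
    p_none = 1.0
    # The k-th newcomer (k = 1 .. n-1) must avoid the k birthdays already taken.
    for k in range(1, n):
        p_none *= (days - k) / days
    return 1 - p_none


# Smallest group for which a shared birthday is more likely than not.
critical = next(n for n in range(1, 366) if p_shared_exact(n) > 0.5)
print(critical, round(p_shared_exact(critical), 4))
```

The loop runs for n-1 steps, one per person who joins after the first, which matches the product described above.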
This is easy to program, and I won't bore you with the details, but I will show this pretty plot I made using R:
Now, to the point of the post. I thought about finding a group near the critical size and testing it for the birthday criterion. I'm not such a big sports fan anymore, unless you consider politics a sport. How about the Presidents of the United States? Barack Obama is #44. You can get their vitals from Wikipedia, but I found a text version on the web here.
The data needs just a bit of cleanup. One date lacks the comma, one date is listed as April 28th. And one entry has two tabs separating the name and the birthday. And why, exactly, are we presented with Obama's middle name? ("I got my middle name from someone who obviously, never thought I'd be running for President")-video.
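For what it's worth, the kind of cleanup described above could also be done in code rather than by hand. The sketch below is not the code used for this post. The sample lines are made-up placeholders in an assumed "name, tab, birthday" layout, and the helper names `parse_line` and `shared_birthdays` are hypothetical. It tolerates a doubled tab, a missing comma, and an ordinal suffix like "28th", then groups names by (month, day).

```python
import re
from collections import defaultdict
from datetime import datetime

# Placeholder records, NOT the real data; the layout is an assumption.
SAMPLE = [
    "Person A\tMarch 3, 1801",
    "Person B\t\tApril 28th, 1758",  # two tabs and an ordinal suffix
    "Person C\tMarch 3 1850",        # missing comma
    "Person D\tJuly 9, 1790",
]


def parse_line(line):
    """Return (name, (month, day)) from one 'name<TAB>birthday' line."""
    # Split on one or more tabs so a doubled tab does not create an empty field.
    name, raw = re.split(r"\t+", line.strip(), maxsplit=1)
    # "28th" -> "28", then drop commas and collapse whitespace.
    raw = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", raw)
    raw = " ".join(raw.replace(",", " ").split())
    # %B expects English month names under the default locale.
    date = datetime.strptime(raw, "%B %d %Y")
    return name, (date.month, date.day)


def shared_birthdays(lines):
    """Map each (month, day) that occurs more than once to the names sharing it."""
    by_day = defaultdict(list)
    for line in lines:
        name, day = parse_line(line)
        by_day[day].append(name)
    return {day: names for day, names in by_day.items() if len(names) > 1}


print(shared_birthdays(SAMPLE))
```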
Here are the results:
And here is the code: