Problem description: take a list of 10 elements and run
random.shuffle
on them in Python. There are a total of 10! permutations. Shuffle 5000 times, recording a copy of the list each time.Then to be safe, check for duplicates among the results. Use set to check for dups, but it only works if the items are immutable (so substitute tuples for lists). There are 3 duplicates!
The number of permutations of 10 elements is much larger than 5000.
Confirm this by using another method in the random module:
Get a rough estimate of how likely this is:
Analysis:
One approach is to say that the probability that any two permutations are not the same is:
p = 1.0*3628799/3628800
Very large, but definitely not certainty (p = 1). The number of combinations of 2 elements drawn from 5000 is also large.
The probability that at least l combination is the same (shares a birthday) is:
The other way is to consider a group of two (lists of 10 elements). The probability that they are not the same is (again)
p = 1.0*3628799/3628800
Now, consider a third list. There are only 3628798/3628800 orders available that would not be a duplication, so the total probability for all three is the product of
p = (1.0*3628799/3628800) * (1.0*3628798/3628800)
And the probability of at least one dup is 1 - p. We extend this for a total of N-2 steps:
Close enough.