When you look at the data, it appears to be fairly irregular. You should expect around 16-17% for each face of the die, and with one of his sets, he gets a percentage of 22%! I won't get too much into the meat and potatoes of the statistics here, but I took the original poster's data and ran it through a chi-square goodness of fit test (by the way, for all you haters, this sample size is plenty big). Assuming the dice are fair to begin with (a standard assumption with any hypothesis test), there is a 12.13% chance that you would get similar results or a more extreme result. This means that perfectly fair dice would look at least this lopsided roughly one time in eight! It's only for very large sample sizes (read: infinite sample sizes) that you expect to have exactly 16.67% proportions for each face of the die.
Just for fun, I did another goodness of fit test with the OP's second set of dice. Instead of asking "what's the probability that I get a more extreme result" like I did last time, I asked "what's the probability of getting a more consistent result." Our assumption that the dice are fair remains the same. Guess what? The probability that we get this result or a more consistent one (that is, proportions closer to 16.67%) is only 8.78%. It's very rare to take a finite sample and get perfect proportions.
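For anyone who wants to poke at this themselves, here is a minimal Python sketch of the idea behind both tests. The counts in it are made up for illustration and are not the original poster's data, and the function names are mine. It also estimates the two tail probabilities by simulating fair dice instead of looking them up from the chi-square distribution, so the numbers will wobble a bit with a different seed or simulation count.

```python
import random
from collections import Counter


def chi_square_stat(counts):
    """Pearson chi-square statistic for counts against a fair die."""
    total = sum(counts)
    expected = total / len(counts)  # a fair die expects the same count per face
    return sum((c - expected) ** 2 / expected for c in counts)


def simulated_tails(counts, n_sims=2000, seed=0):
    """Estimate P(stat >= observed) and P(stat <= observed) for fair dice.

    Rolls a simulated fair six-sided die as many times as the observed
    sample, n_sims times over, and compares each simulated statistic
    with the observed one.
    """
    rng = random.Random(seed)
    observed = chi_square_stat(counts)
    total = sum(counts)
    more_extreme = more_consistent = 0
    for _ in range(n_sims):
        rolls = Counter(rng.randrange(6) for _ in range(total))
        stat = chi_square_stat([rolls[face] for face in range(6)])
        if stat >= observed:
            more_extreme += 1
        if stat <= observed:
            more_consistent += 1
    return more_extreme / n_sims, more_consistent / n_sims


if __name__ == "__main__":
    # Hypothetical counts for faces 1..6 (NOT the original poster's data).
    made_up_counts = [22, 15, 17, 14, 16, 16]
    upper, lower = simulated_tails(made_up_counts)
    print("chi-square statistic:", round(chi_square_stat(made_up_counts), 3))
    print("share of fair-dice runs at least as extreme:", upper)
    print("share of fair-dice runs at least as consistent:", lower)
```

The first number returned answers the "more extreme" question and the second answers the "more consistent" question, both under the assumption that the dice are fair.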
To illustrate this point, I'd like to bring up a childhood anecdote. On my TI-83 Plus from high school, I got an application called "Probability Simulator." When I was bored in class, I would call up this application and run simulations. Simulations with thousands of trials. Despite this, I never got a perfectly uniform bar graph. Why? Because probabilities address only long-term proportions. And long term does not mean a thousand trials. It doesn't even mean a million trials. Long term means an infinite number of trials.
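If you don't have a TI-83 lying around, a few lines of Python can stand in for it. This is only a rough sketch of the same experiment, not the calculator app itself; the function name and the roll counts are my own choices.

```python
import random
from collections import Counter


def face_proportions(n_rolls, seed=None):
    """Roll a simulated fair die n_rolls times; return the share of each face."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return [counts[face] / n_rolls for face in range(1, 7)]


if __name__ == "__main__":
    # Print the six proportions for increasingly large numbers of rolls.
    for n in (60, 6_000, 600_000):
        props = face_proportions(n, seed=1)
        print(f"{n:>7} rolls:", " ".join(f"{p:.4f}" for p in props))
```

The proportions should drift toward 0.1667 as the number of rolls grows, but all six landing exactly on 1/6 stays unlikely. With a roll count that isn't divisible by 6 (1,000, say), it isn't even possible.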
Can you test for this? Of course not! That would be silly! Nearly as silly as this xkcd. (The "p" in the comic is the probability of seeing an increase in acne at least that large, assuming that jelly beans have nothing to do with acne. .05 means 5%. There. Now you can read it and understand!)
Of course, the joke here is that they tested 20 different jelly bean colors and one of those tests was significant. 1/20 is 5%. If you run an experiment enough times, you're bound to get a significant result sooner or later. That's the nature of the game. And this would directly apply to a tournament testing dice. If I have 64 people in a tournament and I'm testing at a 5% level, I would, on average, throw out about 3 of them (64 × 0.05 = 3.2) for having unfair dice, even if they all had fair dice.
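The arithmetic behind both the comic and the tournament is short enough to check by hand, but here it is as a small Python sketch. It assumes every test is independent and that every die (or jelly bean) is actually innocent; the function names are mine.

```python
def expected_false_positives(n_tests, alpha):
    """Average number of fair dice flagged as unfair across n_tests tests."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance that at least one of n_tests independent tests is 'significant'."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    # 64 tournament players, each tested at the 5% level.
    print(expected_false_positives(64, 0.05))
    # 20 jelly bean colors, each tested at the 5% level.
    print(round(prob_at_least_one_false_positive(20, 0.05), 3))
    # Chance that at least one of the 64 players gets flagged.
    print(round(prob_at_least_one_false_positive(64, 0.05), 3))
```

Under those assumptions, 20 tests at the 5% level turn up at least one "significant" result about 64% of the time, and a 64-player tournament flags at least one innocent player about 96% of the time.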
There's one last topic I'd like to address that was brought up in the thread. That's the idea of "precision dice." Why do people sell "precision dice" if they're just as good as normal, cheapo dice? Well, probably because people want to buy into it. It's been shown time after time that tap water is at least as safe as bottled water (and/or that bottled water comes from a city's municipal water supply... in layman's terms, tap water), yet the bottled water industry is wildly successful. It's all about how you sell your product, and Penn and Teller do a marvelous job of showing it.