And yet the mathematics in the article seems bogus to me, too. Can anyone figure out what calculation Colmez is doing?
I'm not a statistician, but if you toss a fair coin 20 times, there is about a 0.1% chance of getting exactly 17 heads. To figure out the probability that the coin is fair given this data, it seems you need Bayes' theorem, which requires a prior probability on the coin being fair.
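To make that concrete, here's a minimal sketch of the inversion. The 0.1% figure is just the binomial term; to get a posterior you have to invent both a prior and an alternative hypothesis. The 99% prior and the biased coin with P(heads) = 0.8 below are entirely made-up numbers, chosen only to show the mechanics:

```python
from math import comb

# P(exactly 17 heads in 20 flips | fair coin): the binomial term
p_data_fair = comb(20, 17) * 0.5**20  # ~0.0011, i.e. the "0.1%" above

# Bayes' theorem needs ASSUMPTIONS the data alone can't supply:
# a prior on fairness (here 99%) and an alternative hypothesis
# (here a single biased coin with P(heads) = 0.8). Both are made up.
prior_fair = 0.99
p_data_biased = comb(20, 17) * 0.8**17 * 0.2**3

posterior_fair = p_data_fair * prior_fair / (
    p_data_fair * prior_fair + p_data_biased * (1 - prior_fair)
)
# Even starting 99% sure the coin is fair, 17 heads drags the
# posterior down to roughly 34% under these assumed hypotheses.
```

Different (equally defensible) choices of prior or alternative give very different posteriors, which is exactly why the prior can't be swept under the rug.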
Confusing "the odds of this occurring with a fair coin" with "the odds that the coin is fair" is by far the most frequent statistical error I see, and by far the least often corrected. It's mildly terrifying.
Reading some other comments in this thread, I feel that I really ought to have included some more stuff here. There isn't actually such a thing as "the odds that the coin is fair." Either it's fair or it's not. What we can talk about is what probability we should ascribe to it being fair given what we know. Even a single coin flip will have a single result, uniquely determined by the way in which it is launched into the air and caught. Probabilities only exist in the presence of our ignorance of the actual facts, and some people consider probabilities to be themselves a measure of our ignorance of the world.
A normal significance test tells you P(observation|H0). To apply Bayes' rule you also need P(observation|!H0) and, crucially, a prior P(H0); nothing in the test itself tells you what P(H0) is.
You are exactly right. Saying "there is only an 8% chance of this happening with a fair coin" is something completely different from saying "there is a 92% chance the coin is biased". The author is utterly clueless when it comes to probability.
This is how frequentist statistics works. You ask the wrong question ("the chance of the data occurring given the assumption") and use clean, rigorous, impeccable math to get an answer. Bayesian statistics is (usually) the opposite: you ask the correct question ("the chance of the assumption being correct") but find there is no way to get to the answer without making some big assumptions.
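A quick sketch of why those assumptions matter: hold the data fixed (17 heads in 20 flips, with a hypothetical biased alternative of P(heads) = 0.8) and vary only the prior. The priors below are arbitrary illustrations:

```python
from math import comb

# Same data, same likelihoods; only the (assumed) prior on "fair" changes.
lik_fair = comb(20, 17) * 0.5**20             # P(17 heads | fair)
lik_biased = comb(20, 17) * 0.8**17 * 0.2**3  # P(17 heads | biased at 0.8)

posteriors = {}
for prior in (0.5, 0.9, 0.999):
    posteriors[prior] = lik_fair * prior / (
        lik_fair * prior + lik_biased * (1 - prior)
    )
# The posterior that the coin is fair ranges from well under 1%
# (prior 0.5) to over 80% (prior 0.999) on identical data.
```

The frequentist answer ("0.1% chance under H0") never moves; the Bayesian answer swings wildly with the prior, which is the "big assumption" in question.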
Here's some good (possibly more "fair" than I've been above) discussion if anyone wants to read/think about this more:
You know, I didn't understand that comic when it was posted (despite feeling like I have an understanding of Bayesian vs. frequentist statistics) and I still don't. So, I looked it up and apparently I'm not the only one.
It seems to me, and to the commenters on stats.stackexchange [1], that this comic both misinterprets frequentist statistics and misrepresents Bayesian statistics. I realize that XKCD is a nerdy comic meant to be entertaining; I just wanted to leave this discussion here in case anyone else is confused. I think this is an important distinction, and one most people interested in statistics should spend some time thinking about.
I don't know the particular DNA test used in this case, but let's assume it gave a certainty of 92% that the DNA isolated was from AK. This means that the particular sequences of DNA identified could have come from another person with a probability of 0.08 (i.e. a one-in-12.5 chance, which is not particularly low in a case like this). It does not mean that the DNA is correctly characterized with a probability of 92%.
For a repeated test to give a different probability, the identity of one or more of the sequences isolated from the sample would have to have been incorrect in one of the assays, i.e. there is a procedural error.
It is not at all like tossing coins. An analogy would be getting someone's eye color as blue the first time and brown the second.
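The same frequentist-vs-posterior trap from the coin discussion shows up here. A sketch with entirely hypothetical numbers: the 1-in-12.5 random-match figure is P(match | other person), not the probability the identification is wrong, and the posterior depends heavily on how many plausible alternative sources you assume:

```python
# All numbers hypothetical, for illustration only.
p_match_given_other = 0.08   # the 1-in-12.5 random-match chance above
p_match_given_source = 1.0   # assume the test always matches the true source

pool = 1000                    # assumed count of alternative candidates
prior_source = 1 / (pool + 1)  # flat prior over the pool plus the suspect

posterior_source = p_match_given_source * prior_source / (
    p_match_given_source * prior_source
    + p_match_given_other * (1 - prior_source)
)
# With a pool of 1000 the posterior is about 1.2%: roughly 80 of the
# 1000 alternatives would also "match", so a match alone proves little.
```

Shrink the assumed pool (say, to people with access to the scene) and the posterior climbs sharply; the test statistic itself never changes, only the prior does.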