## Friday, March 18, 2011

### The Fallacy of Probability Inversion (Part 3 of 4)

[Image: Abe Lincoln in the Illinois State Legislature, age 28]
A suspect is being tried for burglary. At the scene of the crime, police found a strand of the burglar's hair. Forensic tests showed that it matched the suspect's own hair. The forensic scientist testified that the probability of a randomly chosen hair producing such a match is 1/2000. The prosecutor's fallacy is to conclude that the probability of the suspect being guilty must therefore be 1999/2000, damning evidence indeed.

This is certainly incorrect. In a city of 6,000,000 people, the number of people with matching hair samples will be 1/2000 times 6,000,000 = 3000. On the basis of this evidence alone, the probability of the suspect being guilty is a mere 1/3000.
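The arithmetic above can be sketched in a few lines of Python, using the figures from the text:

```python
# Prosecutor's fallacy, numerically. Figures are from the text:
# a 1-in-2000 random match rate in a city of 6,000,000 people.
match_rate = 1 / 2000      # P(a random hair matches the sample)
population = 6_000_000     # city population

# Expected number of people in the city whose hair would also match
expected_matches = match_rate * population

# On the hair evidence alone, each matching person is equally likely
# to be the burglar, so the suspect's probability of guilt is:
p_guilt = 1 / expected_matches

print(expected_matches)  # 3000.0
print(p_guilt)           # roughly 0.00033, i.e. 1/3000
```

Note how far 1/3000 is from the prosecutor's claimed 1999/2000.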

The term "prosecutor's fallacy" was coined by William Thompson and Edward Schumann. In their 1987 article "Interpretation of Statistical Evidence in Criminal Trials", they documented how readily people made this mistake, including at least one professional prosecuting attorney.

Thompson and Schumann also considered an opposite mistake to the prosecutor's fallacy, which they dubbed the defense attorney's fallacy.

In the above example, the defense attorney might argue that the hair-match evidence is worthless, since it raises the probability of the defendant's guilt only to a tiny 1/3000.

If the hair is the only evidence against the suspect, then the pool of potential suspects before the forensic evidence is taken into account is the entire population of the city, 6,000,000, which the new evidence reduces by a factor of 2000, to 3000.

However, one would expect that this is not the only evidence, in which case the initial pool of potential suspects will be much smaller. If it is 4000, say, then the forensic evidence may reduce this by 2000, to 2, increasing the probability of guilt from 1/4000 to 1/2. This is valuable evidence.
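The dependence of the evidence's value on the initial pool of suspects can be made concrete. A minimal sketch, using the text's own approximation that a pool of size N shrinks to N × (1/2000) matching candidates:

```python
# Value of the same hair-match evidence under different prior pools
# of suspects, following the approximation used in the text.
def posterior_guilt(pool_size, match_rate=1/2000):
    """Probability of guilt given a hair match, assuming the culprit
    is somewhere in the pool and others match at `match_rate`."""
    expected_matches = pool_size * match_rate
    return 1 / expected_matches

print(posterior_guilt(6_000_000))  # 1/3000 -- nearly worthless alone
print(posterior_guilt(4_000))      # 1/2    -- valuable evidence
```

The same 1-in-2000 match rate moves the probability of guilt barely at all in one case, and to even odds in the other.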

The phenomena of false positives and the prosecutor's fallacy may seem surprising.

Mathematically speaking, the two situations are similar: the ultimate fallacy in both cases is that of confusing P(X|Y) with P(Y|X).

If P(X|Y) is very high, people commonly assume that P(Y|X) must be high too. In the medical test example above, P(positive result|disease) = 0.99 while P(disease|positive result) = 0.056. These examples show how wrong this assumption can be.
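Bayes' theorem makes the gap between the two conditional probabilities explicit. In the sketch below, the sensitivity 0.99 comes from the text, while the prevalence (0.3%) and false-positive rate (5%) are illustrative assumptions chosen to reproduce the quoted value of 0.056:

```python
# P(positive | disease) vs P(disease | positive) via Bayes' theorem.
# Only the 0.99 sensitivity is from the text; prevalence and
# false-positive rate are assumed values consistent with P = 0.056.
sensitivity = 0.99        # P(positive | disease), from the text
prevalence = 0.003        # assumed P(disease)
false_positive = 0.05     # assumed P(positive | no disease)

# Total probability of testing positive
p_positive = sensitivity * prevalence + false_positive * (1 - prevalence)

# Bayes' theorem: P(disease | positive)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(round(p_disease_given_positive, 3))  # 0.056
```

A test that is right 99% of the time on the sick can still leave a positive result meaning disease only about 6% of the time, because the healthy vastly outnumber the sick.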

This fallacy is widespread (even in doctors' surgeries and law courts). Some have argued that it is not merely a common mathematical mistake, but an ingrained cognitive bias within human nature. Either way, an appreciation of this issue is essential for making sense of statistical data in the real world.

From: Mathematics 1001, by Dr. Richard Elwes