Thread: OT-Bad Maths?
Posted to uk.d-i-y
Big Les Wade

tim...... posted
So according to:

http://www.bbc.co.uk/news/magazine-22310186

"Mathematician" Coralie Colmez claims that if you do a DNA analysis on
an unreliably small piece of evidence and get the same result as the
first time, this strengthens the case that the original result is
correct and uses some coin-tossing explanation to "prove" her case.

I disagree, her corroborating example repeats the same tests on
*different* data whereas doing the DNA test again would be repeating it
on the *same* data (and as such, is completely worthless extra
information).

I think that she has fundamentally missed the point about the
unreliability of the source data here.


The statistics of DNA analysis resemble Colmez' coin-tossing experiment
in some ways, but there are important differences. Whether the coin
analogy is appropriate or not depends on exactly what was wrong with the
scene-of-crime DNA sample. All we are told is that "the DNA sample was
tiny", but that doesn't tell us what kind of defect the sample had.

The way it works is this: An individual's DNA fingerprint consists of a
certain number of parameters (loci) each of which can take several
different values (alleles). You might think of it a bit like someone's
name: for example my name might be expressed as
b i g l e s w a d e
where the allele at locus number 2 is 'i', the allele at locus 7 is 'w',
and so on.

So if you find a scene-of-crime DNA sample that is "complete" - i.e. you
can read all of the alleles at all of the loci - you can compare it with
that of your suspect. If they all match, then it's not looking good for
him.

But the trouble with SOC-DNA samples is that they don't always contain a
record of all the alleles at all the loci - or if they do contain it, it
might be smothered by contamination from another source. The smaller the
amount of tissue present at the SOC, the more likely this is to happen.

The equivalent with my personal name analogy is where the police find
the corpse with a knife in his stomach, and just before he died he wrote
(in blood, on the wall next to him), "I was murdered by big**s**de",
where the asterisks are illegible splodges.

Clearly, I might be the murderer. But equally, the victim could have
been trying to name Big Al Spode, a well known local gangster. And it
doesn't matter how many times you examine this writing, you can't decide
whether the real killer is me or Al, simply because that information is
not present in the sample. It just isn't there, and you can't use
statistics to extract information that isn't there.
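
For anyone who prefers to see that spelled out, here is a minimal sketch
of the name analogy in Python. The function name `consistent` and the
lower-case ten-letter "profiles" are my own assumptions for
illustration; this has nothing to do with how real DNA software works.
It just checks a candidate against the legible positions and ignores
the splodges.

```python
def consistent(partial, candidate, unknown="*"):
    """Return True if candidate agrees with partial at every legible position."""
    if len(partial) != len(candidate):
        return False
    # An illegible position ("*") is compatible with any letter.
    return all(p == unknown or p == c for p, c in zip(partial, candidate))


# The writing on the wall, with the splodges as asterisks.
evidence = "big**s**de"

# Hypothetical "profiles" for the two suspects in the analogy.
suspects = {
    "Big Les Wade": "bigleswade",
    "Big Al Spode": "bigalspode",
}

for name, profile in suspects.items():
    print(name, consistent(evidence, profile))

# Reading the same smudged writing again gives the same answer every time:
# the set of distinct outcomes over repeated reads has exactly one element.
outcomes = {
    tuple(consistent(evidence, p) for p in suspects.values())
    for _ in range(5)
}
print(len(outcomes))
```

Both suspects pass the check, and repeating it adds nothing, because
the letters that would tell them apart are exactly the ones under the
splodges.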

I'm not saying that this *is* what was wrong with the Kercher evidence,
just that it might be. And if it is, then Colmez' analogy is suspect.

And in fact it's more complicated than that, because certain DNA
analysis techniques can sometimes give you a probabilistic result for a
certain locus. E.g. in my analogy, that the missing fourth letter has a
90 per cent probability of being an "a", which would rule me out and put
Big Al in the frame.
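
A rough sketch of what a probabilistic reading of one locus might look
like, again in the name analogy. The 90 per cent figure for "a" is the
one from my example; the 5 per cent for "l" and the helper name
`weight` are pure assumptions made up for illustration, and real low
copy number interpretation is a great deal more involved than this.

```python
# Hypothetical probabilities for the illegible fourth letter.
# 0.90 for "a" is from the example; 0.05 for "l" is an assumed value,
# with the remaining 0.05 spread over all other letters.
locus4 = {"a": 0.90, "l": 0.05}


def weight(candidate, position, probs):
    """Probability assigned to the candidate's letter at a 1-based position."""
    return probs.get(candidate[position - 1], 0.0)


w_les = weight("bigleswade", 4, locus4)  # fourth letter is "l"
w_al = weight("bigalspode", 4, locus4)   # fourth letter is "a"

# How much more strongly this one locus points at Al than at Les,
# under the assumed numbers above.
ratio = w_al / w_les
print(w_les, w_al, ratio)
```

Under those made-up numbers the fourth letter favours Al over me by a
factor of about 18, but everything hinges on how far you trust the
probabilities that go in.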

However, such inferences (using what's called low copy number analysis)
are still very controversial, and are at the very least a great deal
more indirect than those from simple full-length DNA matching. Which
itself isn't perfect either, despite what they tell you.

--
Les