John Harshman
 
Some Thought On Intelligent Design - WAS: OT Is George Bush Drinking?

Fletis Humplebacker wrote:

John Harshman wrote:

Fletis Humplebacker wrote:



It's time to whittle the posts down for the sake of brevity,
you are already at 43k.


I was recently trying to figure out why you never responded to my
evidence for human evolution, so I looked back in the thread. The reason
is that you deleted the whole thing without comment, even though you in
fact asked me to give you that evidence. I know this was a mere
oversight on your part, and I have thoughtfully restored it below:

[You need to view this in a font in which all the characters take up the
same amount of room. If you view it in a proportionally-spaced font,
both the tree and the DNA sequence will fail to line up properly.]

Evidence for human relationships to the other apes.

But first, a primer on DNA and how it can be used to understand
phylogenetic relationships. If you understand
this already, skip ahead to "Here is a set of DNA sequences" below the
dotted line.

DNA is a double helix, each half being a twisted string of chemicals,
called bases or nucleotides, on a backbone. The bases come in four
flavors, each with a slightly different chemical formula, which can be
represented as single letters: A, C, G, or T, from the first letters of
each chemical's name. Because each of the two strings completely
determines the other one, we can ignore one of them, and because of
DNA's beads-on-a-string structure, we can completely describe a given
gene by a linear sequence of the four bases. So if I tell you that the
DNA sequence in some gene in some species is AAGAAGCTAGTGTAAGA, I have
completely described that particular part of the DNA molecule.
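To illustrate why one strand is enough: a minimal sketch, assuming the standard base pairing (A with T, C with G), that derives the partner strand from the example sequence above. The names `PAIRS` and `complement` are made up for this illustration.

```python
# Standard base pairing: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base partner of `strand`, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)

example = "AAGAAGCTAGTGTAAGA"   # the sequence quoted in the text
partner = complement(example)
print(partner)
# Pairing is its own inverse, so the partner strand carries no extra information.
print(complement(partner) == example)
```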

Different species have slightly different sequences, and when we line up
the corresponding sequences from different species, the patterns of
bases (letters) at each position (or site) in the sequence can tell us
about their relationships. Consider a set of 5 species. At any
particular position in the sequence each species has either A, C, G, or
T. For my purposes I don't care about the particular bases, only about
the patterns of similarity, so I'm going to use a different symbolism to
describe those patterns. I'll use lower case letters to represent
identical bases. So if I say a position has pattern xxxyy, I mean that
the first three species have one base and the last two have another. The
real bases could be TTTCC, GGGAA, or any other combination. There
are many possible patterns: xxxxx, xyzyz, xyxyy, etc. But only a few of
them can be used to determine relationships. It should be obvious that
xxxxx, all bases the same, tells us nothing. If only one base differs,
such as xyxxx, that also tells us nothing except that one species is
different from all the rest; but we already knew it was a separate
species. The only patterns that make a claim of relationships are those
in which two species have one base, and the other three have another:
xxyyy, xyxyx, xxyxy, and so on. (Actually, patterns like xxyzz tell us
something too, just not enough for my current purposes.) Why is this?
Because such patterns split the species into two groups, implying a tree
that looks something like this:

y           x     If all the species on the left have state y, and
 \         /      all the species on the right have state x, then
  \       /       somewhere in the middle (the branch marked *),
y__\_____/        there must have been a change in that site --
   /  *  \        a mutation -- either from y to x or x to y
  /       \       (we can't tell which from this information).
 /         \
y           x

A little further note: the patterns that I represent in rows above
(xxyyy, etc.) are shown in columns in the DNA sequences below. That is,
in the sequences below, you read across to find the sequence in a single
species, but you read down to read the contents of a single site in five
species. So the first column of the sequence, reading down, would be
AAGAG, which is an xxyxy pattern.
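The translation from real bases to the lower-case patterns can be written down mechanically. This is only a sketch of the bookkeeping described above; the function name `site_pattern` is invented here, and it assumes the column contains nothing but A, C, G, and T.

```python
def site_pattern(column: str) -> str:
    """Relabel the bases at one site as x, y, z, w in order of first appearance.

    `column` holds one base per species, read down the alignment.
    """
    labels = "xyzw"          # at most four distinct bases can occur
    seen = {}
    out = []
    for base in column:
        if base not in seen:
            seen[base] = labels[len(seen)]
        out.append(seen[base])
    return "".join(out)

print(site_pattern("AAGAG"))   # first column of the alignment
print(site_pattern("TTTCC"))   # same pattern as the next line
print(site_pattern("GGGAA"))
```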

-------------------------------------------------------

Here is a set of DNA sequences. They come from two genes named
ND4 and ND5. If you put them together, they total 694 nucleotides. But
most of those nucleotides either are identical among all the species
here, or they differ in only one species. Those are uninformative about
relationships, so I have removed them, leaving 76 nucleotides that make
some claim. I'll let you look at them for a while.

[                  10         20         30         40         50]
[                   .          .          .          .          .]
            + 1 2++   3  11 +4 3   ++  52+1     2615+4 14+ 3 3+6+
gibbon     ACCGCCCCCA TCCCCTCCCT CAAGTCCTAT CCAATCTACT GTACTTTGCC
orangutan  ACCACTCCCA CCCTTCCTCC TAAGACTCAC ACAACTCGCC ACACCTCGTC
human      GTCATCATCC TTCTTTTTTT AGGAATTTCC TCTCTCCGTC ACGCTCTACT
chimpanzee ATTACCATTC CTTTTTTCCC CGGATTCTCC CTTCTTCATT ATGTCTCATT
gorilla    GTTGTTATTA CCTCCCTTTC AAGAACCCCT TTCACCTATC GCGTCCCACT

[                  60         70       ]
[                   .          .       ]
             +++ +++1 + +    2 + +++
gibbon     CCTACAGCCC AGCCAAACGA CACTAA
orangutan  CCTACCGCCT AGCCATTTCA CACTAA
human      CCCCTTATTT TCTTGTCCGG TGACCG
chimpanzee TTCCTCATTT TCTTACTCAG TGACCG
gorilla    TTCCTTATTC TTTCGCCTAG TGATTA

I've marked with a plus sign all those sites at which gibbon and
orangutan match each other, and the three African apes (including
humans) have a different base but match each other. (That's the xxyyy
pattern mentioned above.) These sites all support a relationship among
the African apes, exclusive of gibbon and orangutan. You will note there
are quite a lot of them, 23 to be exact. The sites I have marked with
numbers from 1-6 contradict this relationship. (Sites without numbers
don't have anything to say about this particular question.) We expect a
certain amount of this because sometimes the same mutation will happen
twice in different lineages; we call that homoplasy. However, you will
note that there are fewer of these sites, only 22 of them, and more
importantly they contradict each other. Each number stands for a
different hypothesis of relationships; for example, number one is for
sites that support a relationship between gibbons and gorillas, and
number two is for sites that support a relationship between orangutans
and gorillas (all exclusive of the rest). One and two can't be true at
the same time. So we have to consider each competing hypothesis
separately. If you do that it comes out this way:

hypothesis              sites supporting   pattern
African apes (+)               23          xxyyy
gibbon+gorilla (1)              6          xyyyx
orangutan+gorilla (2)           4          xyxxy
gibbon+human (3)                4          xyxyy
gibbon+chimp (4)                3          xyyxy
orangutan+human (5)             2          xyyxx
orangutan+chimp (6)             2          xyxyx

I think we can see that the African ape hypothesis is way out front, and
the others can be attributed to random homoplasy. This result would be
very difficult to explain by chance.
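The count of "+" sites can be checked by machine. A rough sketch, assuming the five sequences exactly as typed in the alignment above (spaces removed) and the species order gibbon, orangutan, human, chimpanzee, gorilla; `ALIGNMENT` and `site_pattern` are names made up for this example. It counts only the xxyyy sites.

```python
ALIGNMENT = {
    "gibbon":     "ACCGCCCCCA TCCCCTCCCT CAAGTCCTAT CCAATCTACT GTACTTTGCC "
                  "CCTACAGCCC AGCCAAACGA CACTAA",
    "orangutan":  "ACCACTCCCA CCCTTCCTCC TAAGACTCAC ACAACTCGCC ACACCTCGTC "
                  "CCTACCGCCT AGCCATTTCA CACTAA",
    "human":      "GTCATCATCC TTCTTTTTTT AGGAATTTCC TCTCTCCGTC ACGCTCTACT "
                  "CCCCTTATTT TCTTGTCCGG TGACCG",
    "chimpanzee": "ATTACCATTC CTTTTTTCCC CGGATTCTCC CTTCTTCATT ATGTCTCATT "
                  "TTCCTCATTT TCTTACTCAG TGACCG",
    "gorilla":    "GTTGTTATTA CCTCCCTTTC AAGAACCCCT TTCACCTATC GCGTCCCACT "
                  "TTCCTTATTC TTTCGCCTAG TGATTA",
}

def site_pattern(column: str) -> str:
    """Relabel the bases at one site as x, y, z, w in order of first appearance."""
    seen = {}
    return "".join(seen.setdefault(base, "xyzw"[len(seen)]) for base in column)

# Strip the spacing, then read the alignment column by column.
rows = [seq.replace(" ", "") for seq in ALIGNMENT.values()]
columns = ["".join(bases) for bases in zip(*rows)]

# xxyyy: gibbon and orangutan share one base, the three African apes another.
african_ape_sites = sum(1 for col in columns if site_pattern(col) == "xxyyy")
print(len(columns), african_ape_sites)   # 76 23
```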

Let's try a statistical test just to be sure. Let's suppose, as our null
hypothesis, that the sequences are randomized with respect to phylogeny
(perhaps because there is no phylogeny) and that apparent support for
African apes is merely a chance fluctuation. And let's try a chi-square
test. (I'm not going to explain chi-square tests here; just understand
that it's a statistical test that tells us the probability that we would
see the patterns we see if sequence differences were random.) Here it is:

hypothesis              obs.    exp.   (obs.-exp.)^2/exp.
African apes (+)         23     6.29        44.4
gibbon+gorilla (1)        6     6.29         0.0
orangutan+gorilla (2)     4     6.29         0.8
gibbon+human (3)          4     6.29         0.8
gibbon+chimp (4)          3     6.29         1.7
orangutan+human (5)       2     6.29         2.9
orangutan+chimp (6)       2     6.29         2.9
sum                      44    44           53.7*

(*This column is rounded, so it doesn't quite add up here.)

These are all the possible hypotheses of relationship, and the observed
number of sites supporting them. Under the null hypothesis the expected
values are all equal: the sum divided by 7, or 44/7 = 6.29. The important
column is the last one, which is a measure of the
"strain" between the observed and expected values. The larger the sum of
this column ("the sum of squares"), the greater the strain. There are 6
degrees of freedom (meaning that if we know 6 of the observations, we
automatically know the 7th), and the sum of squares is 53.7. That last
number gets compared to a chi-square distribution to come up with a P value.

It happens that P, or the probability of this amount of asymmetry in the
distribution arising by chance, is very low. When I tried it in Excel, I
got P=8.55*10^-10, or 0.000000000855. That's pretty close to zero, and
chance can be ruled out with great confidence.
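For anyone without Excel, the same arithmetic can be sketched in a few lines. This is not what I actually ran; it is a stand-in that uses the observed counts from the table and, as an assumption of this sketch, the closed-form upper-tail probability of the chi-square distribution that holds when the degrees of freedom are even. The helper name `chi2_sf_even_df` is made up.

```python
import math

observed = [23, 6, 4, 4, 3, 2, 2]            # counts from the table above
expected = sum(observed) / len(observed)      # equal expectation: 44/7
statistic = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1                        # 6 degrees of freedom

def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for chi-square with an even df.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

p_value = chi2_sf_even_df(statistic, df)
print(round(expected, 2), round(statistic, 1), p_value)
```

The printed statistic rounds to 53.7, and the P value comes out at about 8.55*10^-10, in line with the Excel figure.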

Having ruled out chance, now the question is how you account for the
pattern we see. I account for it by supposing that the null hypothesis
is just plain wrong, and that there is a phylogeny, and that the
phylogeny involves the African apes, including humans, being related by
a common ancestor more recent than their common ancestor with orangutans
or gibbons. How about you?

By itself, this is pretty good evidence for the African ape connection.
But if I did this little exercise with any other gene I would get the
same result too. (If you don't believe me I would be glad to do that.)
Why? I say it's because all the genes evolved on the same tree, the true
tree of evolutionary relationships. That's the multiple nested hierarchy
for you.

So what's your alternative explanation for all this? You say...what?
It's because of a necessary similarity between similar organisms? But
out of these 76 sites with informative differences, only 18 involve
differences that change the amino acid composition of the protein; the
rest can have no effect on phenotype. Further, many of those amino acid
changes are to similar amino acids that have no real effect on protein
function. In fact, ND4 and ND5 do exactly the same thing in all
organisms. These nested similarities have nothing to do with function,
so similar design is not a credible explanation.

God did it that way because he felt like it? Fine, but this explains any
possible result. It's not science. We have to ask why god just happened
to feel like doing it in a way that matches the unique expectations of
common descent.

By the way, if you want to see the full data set I pulled this from, go
here:

http://www.treebase.org/treebase/console.html

Then search on Author, keyword Hayasaka. Click Submit. You will find
Hayasaka, Kenji. Then click on Search. This brings up one study, in the
frame at middle left. Click on Matrix Fig. 1 to download the sequences.
You can also use this site to view their tree. The publication from
which all this was drawn is Hayasaka, K., T. Gojobori, and S. Horai.
1988. Molecular phylogeny and evolution of primate mitochondrial DNA.
Mol. Biol. Evol., 5:626-644.