We have been gibbering for many, many pages, at least nominally on the topic ‘Common Design vs Common Descent’. I’d like here to discuss an interesting fact I discovered during the course of this. I mentioned it twice, but readers seemed underwhelmed by what, to me, looks like a genuine scientific discovery (I haven’t searched exhaustively for priority). More importantly for present purposes, it provides a useful test-bed for several concepts that regularly do the rounds.
Sal Cordova had latterly been pursuing an attempt to demonstrate using freely available phylogenetic tools that tetrapods are not descended from aquatic, gilled vertebrates – what we conventionally call ‘fish’. In the process, Sal chose to use gene sequences of what turned out to be cytochrome oxidase (COX) – a component of the electron transport chain in mitochondria and in prokaryote plasma membranes. This molecule is ultimately what that breathing thing is all about – the molecular oxygen we inhale, in diatomic clumps, is held together by electrons shared between the atoms; COX donates electrons passed along from cytochrome c which enable the oxygen atoms to be separated; these react with protons (a different reaction for each of the two atoms) to form water. In the process, additional protons move across the membrane to generate an energetic imbalance that can be tapped, much like a pumped-storage hydro-electric scheme, to make ATP.
Sal thought he was looking at cytochrome c – this molecule is actually one step ‘upstream’ from COX, supplying it with the electrons which the latter passes on to oxygen. In eukaryotes, it so happens that the cytochrome c gene is in nuclear DNA, while COX is in mitochondrial DNA. This creates a significantly different evolutionary environment for the two genes, due to the proximity to reactive molecular species in the latter and the different dynamics resulting from recombination in the former. It would be unwise to draw too many evolutionary conclusions without an awareness of this distinction.
During the course of this discussion, Rumraket took some sequences of the actual cytochrome c, the molecule Sal thought he was looking at, and ran them through a UniProt tool to generate a phylogeny. In an idle moment, I thought it might be interesting to explore the taxonomic neighbourhood a bit. It seems to me fundamentally inconsistent to dismiss phylogenetic inference on the grand scale while accepting it more locally. Of course it’s not clear whether any Creationists accept it locally either! I’ve tried repeatedly to get someone to say at what taxonomic level they think Common Descent stops being a valid inference; no luck so far.
So here’s what I did. I was interested to see what the cytochrome c’s of close relatives of the salmon looked like, rather than the grand-scheme analysis being pursued by the ‘I-ain’t-no-fish’ brigade. So I took Rumraket’s sequence for the Atlantic Salmon (Salmo salar) and ran a BLAST search with it. The BLAST came up with several paralogs – gene duplicates – in the salmon, and orthologs – the corresponding genes in separate species – with >90% sequence identity. I say this a lot, but the only process I am aware of that leads to sequence identities in anything like that ballpark is nucleic acid polymerisation. This mostly happens during cell replication (DNA-DNA) and gene transcription (DNA-RNA), somewhat less often by reverse transcription from RNA to DNA. So, absent another process that generates such extensive identity, common descent (at sequence level) has to be a front runner for the reason for any such highly alignable sequences.
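For anyone wondering what a figure like ‘>90% sequence identity’ means mechanically, here is a minimal sketch in Python. It assumes the two sequences have already been aligned, and it uses one common convention for gaps; BLAST’s own reported figure is computed over the region it aligned, so the numbers need not match exactly. The function name and the toy sequences are made up for illustration and are not real cytochrome c data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, '-' marking gaps).
    Columns where both sequences have a gap are skipped; a gap in only
    one sequence counts as a mismatch. This is one convention among several.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy strings, chosen only to show the arithmetic.
    query = "ACDEFGHIKL"
    hit = "ACDEFGHIRL"
    print(f"{percent_identity(query, hit):.1f}% identity")
```

One difference in ten positions gives 90%, so at the identities discussed here only a handful of residues differ between the query and each hit.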
So, here are the top few hits for Salmon cytochrome c sequence B5XFR7 (link here but be quick; I don’t think these are archived for long):
Rainbow Trout (Oncorhynchus mykiss) C1BFD3 99%
Salmon paralog B5DFW1 99%
Salmon paralog B9EMZ7 98.1%
Rainbow Trout paralog C1BGL1 98.1%
Rainbow Trout paralog C1BG90 95.2%
Salmon paralog B9EMJ0 94.2%
Rainbow Trout paralog C1BFB8 94.2%
Large Yellow Croaker (Larimichthys crocea) 94.2%
Sea louse (Caligus rogercresseyi) C1BPA2 93.3%
Rainbow smelt (Osmerus mordax) C1BKE6 93.3%
Whoa, back up a bit … the sea louse? That’s a crustacean. Now, I look up this organism and, it turns out, this is a parasite of both Rainbow Trout and Atlantic Salmon, particularly in Chile (yes, they farm Atlantic salmon in the Pacific). So, pending more work, a viable explanation for the high sequence identity with Salmon and Rainbow Trout is horizontal gene transfer (HGT) from one or the other (hard to tell which). If not HGT, it is a remarkable coincidence that the words ‘salmon’ and ‘trout’ should leap off the page when I looked up the organism.
A naive user of free phylogenetic software might try and argue that this anomaly completely destroys the use of such software. After all, a phylogeny based only on cytochrome c might perch this crustacean slap dab in plaice in the middle of the Salmonids (it could also be used to show that salmon are more closely related to trout than they are to their own genes, in clumsy hands …). This is obviously a bit fishy, because we know (ask yourself: how do we know?) that it doesn’t belong there.
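One way to make ‘how do we know?’ concrete is to set the single-gene result beside what is already known about the organisms. The sketch below is only an illustration: the identities and accessions are copied from the hit list above, while the `group` column is something I have added from standard taxonomy (four ray-finned fish species and one copepod crustacean). The names `Hit`, `HITS` and `flag_outliers` are hypothetical, and a real check would compare many genes rather than tally labels.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    species: str
    accession: Optional[str]  # None where the post gives no accession
    identity: float           # percent identity to salmon query B5XFR7
    group: str                # broad taxonomic group (added here, not from BLAST)


HITS = [
    Hit("Oncorhynchus mykiss", "C1BFD3", 99.0, "ray-finned fish"),
    Hit("Salmo salar", "B5DFW1", 99.0, "ray-finned fish"),
    Hit("Salmo salar", "B9EMZ7", 98.1, "ray-finned fish"),
    Hit("Oncorhynchus mykiss", "C1BGL1", 98.1, "ray-finned fish"),
    Hit("Oncorhynchus mykiss", "C1BG90", 95.2, "ray-finned fish"),
    Hit("Salmo salar", "B9EMJ0", 94.2, "ray-finned fish"),
    Hit("Oncorhynchus mykiss", "C1BFB8", 94.2, "ray-finned fish"),
    Hit("Larimichthys crocea", None, 94.2, "ray-finned fish"),
    Hit("Caligus rogercresseyi", "C1BPA2", 93.3, "crustacean"),
    Hit("Osmerus mordax", "C1BKE6", 93.3, "ray-finned fish"),
]


def flag_outliers(hits):
    """Return the hits whose broad group differs from the most common group."""
    majority, _ = Counter(hit.group for hit in hits).most_common(1)[0]
    return [hit for hit in hits if hit.group != majority]


if __name__ == "__main__":
    for hit in flag_outliers(HITS):
        print(f"{hit.species} ({hit.accession}): {hit.identity}% identity, "
              f"but classified as {hit.group}")
```

On this list the only hit flagged is the sea louse. The taxonomic label comes from evidence independent of this one gene, which is the point: the anomaly is visible only because everything else about the animal says ‘crustacean’.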
So, here are a few questions for Creationists to consider.
1) Do you accept that high sequence identity is reasonably inferred to indicate an origin in a single ancestral DNA sequence?
2) If so, do you agree that the paralogs – multiple copies in the same species – likely share common ancestry with an original single sequence?
3) How about within the family Salmonidae, to which both Atlantic Salmon and Rainbow trout belong (the trout is actually in the same genus as the various Pacific salmon)? Do you think sequence commonality highly likely indicates a genetic relationship here too?
4) Do you agree that the likeliest cause of high sequence identity in the sea louse version is gene transfer from salmon or trout, given its intimate association with these species?
5) What makes us think that the sequence in sea lice is anomalous, rather than indicating an actual whole-genome pattern, without even looking at the genome?
6) For broader genomic comparisons, collecting together multiple gene trees, where in the standard taxonomy would you place the division between the region that is reasonably inferred to result from common descent, and that which is not?
Another issue relates to function, and the need or otherwise for functional variation among species or among paralogs. As discussed, the task of cytochrome c and COX is chemically quite low-level – to pass electrons on down a chain. On a design paradigm, the fact that this function needs to vary at all is rather curious, unless we allow that variation can occur through largely neutral mutation. Even stranger is the fact that variation within a single species among its various paralogs is of the order of that between some orthologs in separate genera of the Salmonidae. If one answered ‘yes’ to 2 but ‘no’ to 3, one is saying that the same amount of sequence identity is diagnostic of sequence relationship + mutation in one case, but not the other. One might appeal to evolutionary unattainability of a transition, but it doesn’t seem to me that the gap between salmon and trout is unbridgeable, starting from a common ancestor. After all, as noted, Pacific ‘salmon’ are more closely related to trout than they are to Atlantic Salmon (I’m curious that no other Oncorhynchus species came up, but this may be due to lack of sequences). So, it seems plausible to me that all these gene variants are commonly descended, through fully connected paths of nucleic acid polymerisation.
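The claim that within-species paralog variation is of the same order as between-genus ortholog variation can be checked against the figures in the hit list. This is a rough sketch with one loud caveat: every identity is measured against the single salmon query B5XFR7, not between all pairs of sequences, so it indicates the spread rather than giving a full distance matrix. The dictionary and function names are made up for illustration.

```python
# Percent identity to the salmon query B5XFR7, copied from the hit list.
SALMON_PARALOGS = {"B5DFW1": 99.0, "B9EMZ7": 98.1, "B9EMJ0": 94.2}
TROUT_HITS = {"C1BFD3": 99.0, "C1BGL1": 98.1, "C1BG90": 95.2, "C1BFB8": 94.2}


def identity_range(hits):
    """Return (lowest, highest) percent identity in a group of hits."""
    values = list(hits.values())
    return min(values), max(values)


def ranges_overlap(range_a, range_b):
    """True if two (low, high) intervals share at least one value."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


if __name__ == "__main__":
    salmon = identity_range(SALMON_PARALOGS)
    trout = identity_range(TROUT_HITS)
    print(f"Salmon paralogs vs query: {salmon[0]}% to {salmon[1]}%")
    print(f"Trout hits vs query:      {trout[0]}% to {trout[1]}%")
    print("Ranges overlap:", ranges_overlap(salmon, trout))
```

Both groups span 94.2% to 99% against the query, so on these figures there is no identity threshold that would accept the salmon paralogs as related while excluding the trout sequences.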
This pattern of otherwise inexplicable variation of non-morphological genes is repeated throughout the genome and throughout nature. ‘Common-Designists’ would have it that there’s a Psilocybe way of dehydrogenating lactate, and a yeast way, and a bottlenosed dolphin way. And there’s a dandelion way of passing electrons on, and a squirrel way, and an E. coli way, and a sea louse way … but hang on; the sea louse way is pretty close to the salmon way, in this gene copy at least. If this is an HGT event, it indicates that there is much less species-specificity in function than some would argue. If HGT, this is a natural experiment, placing a variant in a distant relative and seeing how it fares. Assuming it’s not a pseudogene, the sea louse is apparently able to use the cytochrome c of a very distant relative in addition to its existing repertoire, with no obvious adverse consequences. More work would establish which of its isoforms tend to be preferred, but nonetheless the bare fact rather goes against the expectation that these proteins are highly clade-specific for design reasons. And indeed it does seem much more likely that low-level function can be passed around without excessive penalty than more taxon-specific genes – the variants are mutational fluff, not critically functional. This is one reason why aminoacyl-tRNA synthetase phylogenies are actually quite poor – extensive HGT is possible, because really, how many different ways do you need to stick an amino acid on the CCA 3′ end of a tRNA?