Fallacy of the Phylogenetic Signal: Nucleotide Level

Background

For the past month or so I’ve been investigating the claim that the phylogenetic signal is evidence that a dataset shares common descent.

Supposedly, the phylogenetic signal is one of, if not the, strongest pieces of evidence for common descent. It is one of the first of the 29+ evidences for evolution offered over at Talk Origins (TO).

Quoting from the article:

The degree to which a given phylogeny displays a unique, well-supported, objective nested hierarchy can be rigorously quantified. Several different statistical tests have been developed for determining whether a phylogeny has a subjective or objective nested hierarchy, or whether a given nested hierarchy could have been generated by a chance process instead of a genealogical process (Swofford 1996, p. 504). These tests measure the degree of “cladistic hierarchical structure” (also known as the “phylogenetic signal”) in a phylogeny, and phylogenies based upon true genealogical processes give high values of hierarchical structure, whereas subjective phylogenies that have only apparent hierarchical structure (like a phylogeny of cars, for example) give low values (Archie 1989; Faith and Cranston 1991; Farris 1989; Felsenstein 1985; Hillis 1991; Hillis and Huelsenbeck 1992; Huelsenbeck et al. 2001; Klassen et al. 1991).

http://www.talkorigins.org/faqs/comdesc/section1.html#nested_hierarchy

I’ve been skeptical of this claim. A tree is just one kind of directed acyclic graph (DAG), and my hunch was that many other kinds of DAG would also score highly on metrics of phylogenetic signal. I picked one metric, the consistency index (CI), which according to Klassen 1991 is the most widely used. It is also the metric featured in the TO article above, and it is very simple to calculate. So, I’ve focused my efforts on the CI metric.
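To make the metric concrete, here is a minimal sketch of how an ensemble CI can be computed for a fixed tree. It uses the standard definition CI = M / S, where M is the minimum number of changes the characters could possibly require (observed states minus one, summed over characters) and S is the number of changes actually required on the tree, counted here with Fitch parsimony. The nested-tuple tree representation, the function names, and the toy matrix are assumptions made for illustration; this is not PAUP's implementation, and it counts every character, including uninformative ones.

```python
from typing import Dict, Tuple, Union

# A rooted binary tree: either a leaf name or a (left, right) pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_steps(tree: Tree, states: Dict[str, str]) -> int:
    """Minimum number of state changes for one character on the tree (Fitch)."""
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):
            return {states[node]}
        left, right = visit(node[0]), visit(node[1])
        common = left & right
        if common:
            return common
        steps += 1  # the two subtrees disagree, so one change is needed here
        return left | right

    visit(tree)
    return steps


def consistency_index(tree: Tree, matrix: Dict[str, str]) -> float:
    """Ensemble CI = (sum of minimum steps) / (sum of steps on the tree)."""
    n_chars = len(next(iter(matrix.values())))
    min_steps = 0
    tree_steps = 0
    for i in range(n_chars):
        column = {taxon: seq[i] for taxon, seq in matrix.items()}
        min_steps += len(set(column.values())) - 1
        tree_steps += fitch_steps(tree, column)
    return min_steps / tree_steps if tree_steps else 1.0


# Toy example (hypothetical data): four taxa, three binary characters.
toy_tree = (("A", "B"), ("C", "D"))
toy_matrix = {"A": "110", "B": "100", "C": "010", "D": "001"}
toy_ci = consistency_index(toy_tree, toy_matrix)
```

In the toy matrix the second character needs two changes on the tree where one is the minimum, so M = 3, S = 4 and the CI is 0.75. A matrix in which every character fits the tree without extra changes scores exactly 1.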

Result

What I have found is that my hunch is correct. It is simple to create a DAG that scores highly in CI, well within the range of published CI scores for real datasets.

Consequently, it is incorrect to say the phylogenetic signal is strong evidence for evolution. In particular, this claim is provably false (as I have proven here):

Phylogenies based upon true genealogical processes give high values of hierarchical structure, whereas subjective phylogenies that have only apparent hierarchical structure give low values.

http://www.talkorigins.org/faqs/comdesc/section1.html#nested_hierarchy

How have I proven it false? I generate DNA sequences from directed acyclic graphs, and the trees derived from these sequences using well-established methods produce CI scores well within published ranges. Here are two such experiments plotted on the chart from Klassen 1991:

Overplot of two DAG experiments (red asterisks) on the Klassen 1991 chart of CI values from over 70 studies. The DAG points are CI scores for synthetic data generated from a non-treelike directed acyclic graph (DAG).

This is a phylogeny with a very high value of hierarchical structure that was not generated by a true genealogical process.

Methods

To reproduce my results you can run the DAG dataset generator here: https://repl.it/@EricHolloway/Phylogenetic-Signal-Fallacy-Nucleotide-Level

Take the generated DNA sequences, which are in FASTA format, and paste them into the ClustalW online tool.

Take the results of the ClustalW tool, and use the PAUP software to generate trees and measure CI scores. You’ll need to fiddle with the NEX file format, so to save you the trouble, I’ve included an already created NEX file that I’ve generated from the aforementioned process, which you can pop into PAUP.
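If you would rather build the NEX file yourself, the conversion can be sketched roughly as follows. This assumes the alignment was exported from ClustalW in FASTA format (gaps written as `-`); the function names `parse_fasta` and `to_nexus` are hypothetical helpers, not part of ClustalW or PAUP, and the output is a plain NEXUS DATA block rather than a copy of the file I used.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Read FASTA text into {name: sequence}, joining wrapped sequence lines."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the first word of the header
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def to_nexus(aligned: Dict[str, str]) -> str:
    """Write aligned sequences as a minimal NEXUS DATA block."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    nchar = lengths.pop()
    width = max(len(name) for name in aligned)
    rows = "\n".join(
        f"    {name.ljust(width)}  {seq}" for name, seq in aligned.items()
    )
    return (
        "#NEXUS\n"
        "BEGIN DATA;\n"
        f"  DIMENSIONS NTAX={len(aligned)} NCHAR={nchar};\n"
        "  FORMAT DATATYPE=DNA MISSING=? GAP=-;\n"
        "  MATRIX\n"
        f"{rows}\n"
        "  ;\n"
        "END;\n"
    )


# Hypothetical two-taxon alignment, just to show the round trip.
example_fasta = ">t1\nAC-T\n>t2\nACGT\n"
example_nexus = to_nexus(parse_fasta(example_fasta))
```

The equal-length check is there because an unaligned FASTA file pasted straight from the generator would otherwise produce a matrix PAUP cannot read.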

Once you load a NEX file into PAUP, here are the steps to generate trees, and then measure CI.

  1. Press “Generate Trees” in the “Trees” menu.
  2. Press the “OK” button.
  3. Press “Describe Trees” in the “Trees” menu.
  4. Press the “Describe” button.
  5. You will see something like the following:
Example analysis from PAUP software of phylogenetic tree created from aligned DAG dataset.

You will find the 27 taxa in the file will generate CI scores in the range of 0.48-0.53. If you look at the Klassen 1991 chart, you will see this is well within published scores for that number of taxa.

Conclusion

So, what is my takeaway from this?

Basically, highly statistically significant CI scores do not indicate common descent. They can just as easily be generated by a DAG. Therefore, we cannot infer common descent from high CI.

Furthermore, insofar as CI is representative of the state of phylogenetic signal measurement, my result undermines the more general claim that phylogenetic signal indicates common descent.

As such, the Talk Origins claim that the nested hierarchy of species is well attested by the data is highly questionable, if not outright false, and should be retracted as evidence for evolution until a much more rigorous analysis with DAG-eliminating controls is established.

Addendum

To visually illustrate what I mean by a DAG generating the DNA sequences, here is a graph of one such DAG. Each colored/numbered box represents a gene, which is replaced in post-processing by a unique, randomly generated (uniform over ‘GATC’) DNA sequence 20-30 letters long. Arrows indicate when ‘ancestor’ gene sets are combined into larger gene sets. If you look closely, you will see each gene set contains the union of all incoming gene sets, plus one new gene. As you can see, this looks nothing at all like an evolutionary process, yet it produces very high phylogenetic signal as measured by the consistency index (CI) metric.
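The generating rule described above can be sketched in a few lines. This is an illustrative reconstruction from the description only, not the code at the repl.it link: the `parents` dictionary format, the function names, the example DAG, and the choice to concatenate genes in order of gene number are all assumptions.

```python
import random
from typing import Dict, List, Set


def random_gene(rng: random.Random, used: Set[str]) -> str:
    """Draw a DNA string of 20-30 letters, uniform over GATC, not seen before."""
    while True:
        gene = "".join(rng.choice("GATC") for _ in range(rng.randint(20, 30)))
        if gene not in used:
            used.add(gene)
            return gene


def generate_dag_sequences(parents: Dict[int, List[int]], seed: int = 0) -> Dict[int, str]:
    """Each node gets the union of its parents' gene sets plus one new gene.

    `parents` maps node -> list of parent nodes and must be listed in
    topological order (every parent appears before its children).
    """
    rng = random.Random(seed)
    used: Set[str] = set()
    genes: Dict[int, str] = {}          # gene id (same as node id) -> DNA string
    gene_sets: Dict[int, List[int]] = {}
    for node, node_parents in parents.items():
        inherited: Set[int] = set()
        for parent in node_parents:
            inherited.update(gene_sets[parent])
        genes[node] = random_gene(rng, used)
        gene_sets[node] = sorted(inherited) + [node]
    # Assumption: genes are concatenated in order of gene id.
    return {n: "".join(genes[g] for g in ids) for n, ids in gene_sets.items()}


def to_fasta(sequences: Dict[int, str]) -> str:
    """Format the generated sequences as FASTA text."""
    return "".join(f">node{n}\n{seq}\n" for n, seq in sequences.items())


# Hypothetical DAG: node 3 has two parents, so the structure is not a tree.
example_dag = {0: [], 1: [0], 2: [0], 3: [1, 2]}
example_seqs = generate_dag_sequences(example_dag, seed=1)
example_fasta_out = to_fasta(example_seqs)
```

Node 3 ends up carrying the genes of nodes 0, 1 and 2 plus its own new gene, which is the "union of all incoming gene sets, plus one new gene" rule from the figure.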


131 thoughts on “Fallacy of the Phylogenetic Signal: Nucleotide Level”

  1. Alan Fox,

    That’s some mighty strong gibberish. Is that where he posts the stuff he’s ashamed to post here, or what?

  2. John Harshman:
    Alan Fox,

    That’s some mighty strong gibberish. Is that where he posts the stuff he’s ashamed to post here, or what?

    The site list of authors should give us a hint. I get the impression there’s an effort to present a more coherent and intellectual front for “Intelligent Design”*. Whether the effort is working is another question.

    Why Eric posts here? It gives him an opportunity for feedback that echo chambers don’t provide without too much exposure if things go a bit belly-up.

    ETA * than Evolution News & Views or Uncommon Descent.

  3. dazz: Too bad nobody took me up on my bet. LOL
    Holloway, do you think your god approves of your blatant dishonesty?

    I just had the brainwave that the reason I’m able to get these high scores with DAGs is because of what the article explains, all the CI scores are actually indicative of large amounts of homoplasy (1-CI = homoplasy index), which can most easily be described by a DAG. And since in the higher taxa counts the HI >> CI, this means a DAG will fit the data much better than a tree.

    The key point is that a perfect tree scores a 1 on the CI metric, and almost none of the studies show perfect tree scores. If it is not a perfect tree, then it fits one of the non-tree DAGs I’ve been mentioning. If the CI is just a little less than 1, then the DAG is probably not too far from a perfect tree. But most of the studies with large numbers of taxa have a CI much, much less than 1, so the DAG is very different from a perfect tree, and is more analogous to the haphazard graph you’d get from human invention, or for similar reasons, the library dependency graph in computer code.

  4. John Harshman: Sorry, but that graph can’t possibly go along with that data set. If each character is a “gene”, which is what you should have, there are way too many genes to be accounted for by the graph. Once again you seem to be doing something quite different from what you imagine.

    It really is important to impress upon you one thing: you are not competent to do what you want to do here.

    If you refer to the couple explanations I’ve posted, the colored boxes are ‘modules’ which can contain multiple characters (either 0/1 presence/absence of property, or GATC for a base letter). So, to avoid the whole alignment discussion, I went back to the binary situation and just used 0/1 and no gaps. The modules will contain multiple positions with 0 or 1, hence there are fewer modules than characters in the matrix.

    It probably would be best if I wrote an article explaining exactly what is going on here.

    Tom English: It may well be that Eric used the term simulation just to make his data generation sound science-y. But I’d like to see how he goes about justifying his use of the term.

    We can call it a generator instead. There is some element of simulation going on, since there is an accumulation of items as one follows a path through the DAG, supposed to be analogous to how, say, human inventions will accumulate features.

    Anyways, as someone else mentioned, the key point here is the structure, in particular that the generating structure is not a tree.

  5. EricMH: We can call it a generator instead. There is some element of simulation going on, since there is an accumulation of items as one follows a path through the DAG, supposed to be analogous to how, say, human inventions will accumulate features.

    Eric, you are not answering Tom’s questions:

    1. give a clear specification of the process you are simulating, and
    2. describe how to validate your simulation of it?

    If you are simulating how human inventions accumulate features, why then apply it to sequence data?

  6. EricMH: It probably would be best if I wrote an article explaining exactly what is going on here.

    That would be difficult, as you clearly don’t understand what is going on. You clearly have no idea what a presence/absence character is, what a 0 or 1 is supposed to indicate, or what, apparently, a taxon is. I’m sure, however, that you will continue to forge ahead in the supreme confidence that you know perfectly well what you’re doing. Your confidence seems impenetrable despite the clear history in this very thread of your multiple failures to do what you imagine you are doing.

  7. Corneel: If you are simulating how human inventions accumulate features, why then apply it to sequence data?

    Tom is right that it should not be called a simulator. Let’s call it a generator. Or a somethingnator.

    The point is the graph structure.

    Still relevant to sequence data. E.g. computer code is sequence data, and accumulates features, which show up as indel events and character substitutions in the git history.

  8. John Harshman: That would be difficult, as you clearly don’t understand what is going on. You clearly have no idea what a presence/absence character is, what a 0 or 1 is supposed to indicate, or what, apparently, a taxon is. I’m sure, however, that you will continue to forge ahead in the supreme confidence that you know perfectly well what you’re doing. Your confidence seems impenetrable despite the clear history in this very thread of your multiple failures to do what you imagine you are doing.

    I would say it probably doesn’t matter too much the level of detail of my understanding. The important thing is I can create a dataset from a non tree structure that scores highly on these metrics, hence showing we cannot infer a tree structure from high scores. That’s about all that I need to do here. Unless you can explain with specifics where you see a particular problem instead of vague obscurantism. At this point I believe I’ve explained what is happening in sufficient detail you can identify any problematic portion if you are actually able to find any issues. I’m doubtful you can, which is why you are just resorting to “you don’t know what you are doing” obscurantism.

  9. EricMH: At this point I believe I’ve explained what is happening in sufficient detail you can identify any problematic portion if you are actually able to find any issues.

    That you think you’ve explained it is another indicator of how lost you are. Your last iteration was entirely undecipherable; looking at the alignment it was impossible to determine whether it actually was an alignment or reflected the process you claimed, which was itself not adequately explained.

    The last comprehensible result you produced was due to bad alignment in which some bits were perfectly aligned (zero homoplasy) and some bits were poorly aligned (lots of homoplasy), the combination producing an intermediate value. But that told us nothing of value, only that mixing two sorts of data gives something in the middle.

  10. EricMH,

    I would say it probably doesn’t matter too much the level of detail of my understanding. The important thing is I can create a dataset from a non tree structure that scores highly on these metrics, hence showing we cannot infer a tree structure from high scores

    Speaking as a strictly amateur sideline sniper, I’d say that if you have any common descent at all, you will generate something with more ‘treeness’ than something in which it is absent, because descent gives trees. ‘Perfect’ descent would give a ‘perfect’ tree, but partial situations with confounding factors – eg homoplasy or HGT – are also possible. So you need a model that excludes any descent processes.

  11. And another week goes by. Apparently Eric’s only response will be to assert that it’s not important that he know what he’s doing.

  12. John Harshman: And another week goes by. Apparently Eric’s only response will be to assert that it’s not important that he know what he’s doing.

    He was probably eyeballing it.

    On the bright side, he’ll probably publish in BioComplexity and you can respond there. 🙂

  13. Mung: He was probably eyeballing it.

    You could always write a program that declares Eric the winner. Programs are the gold standard, right?

  14. Allan Miller: You could always write a program that declares Eric the winner. Programs are the gold standard, right?

    Yup. Just ask Richard Dawkins. I could indeed write a “Weasel” program that declares Eric the winner.

    You would probably declare “garbage in garbage out.” Or not.

    What do you really think about programs which produce the intended outcome?

    Designed?

  15. Mung: Yup. Just ask Richard Dawkins. I could indeed write a “Weasel” program that declares Eric the winner.

    That’s not what ‘Weasel’ does, as you well know.

    What do you really think about programs which produce the intended outcome?

    Designed?

    All programs are designed, as far as I am aware.

  16. Mung: What do you really think about programs which produce the intended outcome?

    Do you think that programs that are carefully designed to mimic the behavior of natural processes are somehow thereby misleading us?

  17. I see that the troll is getting plenty of nourishment right now. Oddly, when fed, trolls only get hungrier.

  18. Joe Felsenstein: Do you think that programs that are carefully designed to mimic the behavior of natural processes are somehow thereby misleading us?

    Certainly not! So long as we agree natural processes are designed to produce certain outcomes.

  19. John Harshman: I see that the troll is getting plenty of nourishment right now. Oddly, when fed, trolls only get hungrier.

    phoodoo: Certainly not! So long as we agree natural processes are designed to produce certain outcomes.

    Troll hungry.

  20. phoodoo: So long as we agree natural processes are designed to produce certain outcomes.

    I’m going to guess that your ‘job’ does not involve logic?

    phoodoo: I would venture to say that there is almost no one on the planet who has more knowledge of this subject than I. Do you know my job?

    Arrogant much?

  21. John Harshman: Apparently Eric’s only response will be to assert that it’s not important that he know what he’s doing.

    Anything further you want on my end? I offered an article explaining my method in more detail, but you said that is unnecessary. I don’t see anything further to do here.

  22. EricMH: Anything further you want on my end? I offered an article explaining my method in more detail, but you said that is unnecessary. I don’t see anything further to do here

    Another ID-Creationist declares victory and runs for the door after getting his latest brain farting nonsense destroyed. What a surprise. 🙂

  23. Mung:
    What do you really think about programs which produce the intended outcome?

    Depends on what this means. For example, if the program is intended to simulate random mutations and selection, the intended outcome could be: that the program really uses some randomizer to produce mutations and then selects and reproduces the best fitted things. In that case the program does what it should do. If the intended outcome is supposed to be: let’s see how far it can go, then the intended outcome would be whatever comes out, which might not necessarily be what we’d expect or like it to be.

    Designed? The program/simulation is designed by definition. Its outcomes, the things we’re asking the simulation about, are what we’re after. Personally, I do not like simulations. Too easy to make fools out of ourselves. They have to be checked very carefully. However, sometimes there’s little more that can be done.

    In Eric’s case here, the problems are built in from the very premises and presumed foundations, not just of the simulation, but of the strawman he’s built to then burn.

  24. Adapa: Another ID-Creationist declares victory and runs for the door after getting his latest brain farting nonsense destroyed. What a surprise.

    Can you explain the specific technical flaw you see in this particular work?

  25. EricMH: Anything further you want on my end?

    Yes. Genuine engagement with the problems in your “simulation” that have been pointed out here. An acknowledgment that it would indeed be useful, indeed important, for you to have a clear idea of what you were doing. An explanation of what your simulation is proposed to simulate.

    A further description of your method would not be very useful, as the result clearly fails to do anything like what you’ve already claimed for it. Even your attempt at simple, binary presence/absence characters isn’t what you think it is.

  26. EricMH: Can you explain the specific technical flaw you see in this particular work?

    See many previous comments here, none acknowledged in any substantive way.
