John Sanford’s NIH Presentation, Kirk Durston’s Paper

On Thursday 10/18/18 John Sanford presented at the prestigious Mazur Auditorium of the National Institutes of Health:

https://calendar.nih.gov/app/MCalInfoView.aspx?EvtID=36417


Recordings are being edited, formatted and finalized and will be, God willing, released in due time.

The next day, 10/19/18, John and I conferenced with Kirk Durston, the lead author of the following paper, which lays some of the groundwork for one of the purposes of the nested hierarchy in biology: it makes discovery of 3D protein structure possible because of the patterns of sequence diversity and similarity.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3524763/

Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families.
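
As a rough illustration of what clustering aligned sites can look like, here is a small Python sketch that groups the columns of a multiple sequence alignment around "mode" columns. It is not the code from the paper. The normalized mutual-information score, the function names, the toy four-sequence alignment, and the choice to start from the first k sites are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def redundancy(col_a, col_b):
    """Mutual information of two columns divided by their joint entropy (0..1)."""
    h_joint = entropy(list(zip(col_a, col_b)))
    if h_joint == 0:
        return 0.0  # both columns fully conserved: nothing co-varies
    mi = entropy(col_a) + entropy(col_b) - h_joint
    return max(0.0, mi) / h_joint


def cluster_sites(alignment, k, max_iter=20):
    """Group alignment columns (sites) into k clusters around 'mode' sites."""
    columns = list(zip(*alignment))
    n = len(columns)
    r = [[redundancy(columns[i], columns[j]) for j in range(n)] for i in range(n)]
    modes = list(range(k))  # assumption: start from the first k sites
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [m] for m in modes}  # each mode anchors its own cluster
        for i in range(n):
            if i not in clusters:
                best = max(modes, key=lambda m: r[i][m])
                clusters[best].append(i)
        # New mode: the member most interdependent with the rest of its cluster.
        new_modes = [
            max(members, key=lambda c: sum(r[c][o] for o in members if o != c))
            for members in clusters.values()
        ]
        if set(new_modes) == set(modes):
            break
        modes = new_modes
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy alignment: sites 0 and 1 co-vary, as do sites 2 and 3.
toy_alignment = ["ALKD", "ALRE", "GVKD", "GVRE"]
print(cluster_sites(toy_alignment, k=2))
```

On real data the alignment would have hundreds or thousands of sequences, and the resulting site clusters would then be compared against a solved 3D structure, as the abstract describes.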

84 thoughts on “John Sanford’s NIH Presentation, Kirk Durston’s Paper”

  1. On Thursday 10/18/18 John Sanford presented at the prestigious Mazur Auditorium of the National Institutes of Health:

    Suzan Mazur? That would be hilarious.

  2. John Sanford presented at the prestigious Mazur Auditorium of the National Institutes of Health

    How extremely Sal Cordova is it that the auditorium has to be referred to as “prestigious”?

  3. one of the purposes of the nested hierarchy in biology: it makes discovery of 3D protein structure possible because of the patterns of sequence diversity and similarity.

    How do you know that’s one of the purposes? What are the other purposes? What are the purposes for having the same nested hierarchy in different proteins, in introns, and in junk DNA in general?

  4. John Harshman: How extremely Sal Cordova is it that the auditorium has to be referred to as “prestigious”?

    The Masur Auditorium is usually offered to distinguished speakers. The day before, it hosted Nobel Prize winner Michael Young.

  5. John Harshman: How do you know that’s one of the purposes? What are the other purposes? What are the purposes for having the same nested hierarchy in different proteins, in introns, and in junk DNA in general?

    You’ve been asking, it seems to me, why a designer would make a nested hierarchy. Offering a POSSIBLE rational answer shows there could be in principle other answers as to why God did something a certain way, and that is sufficient to show there is no need to assume common descent because we can’t think of reasons why God would make a nested-hierarchy.

    Your question assumes we need to know for sure why there is nested hierarchy in order to dismiss common descent. We don’t. Unevolvability is a good enough reason. You responded with miracles of creation along the way to make common descent possible and overcome problems with unevolvability, but then kept saying you need an explanation as to why God would make the nested hierarchy. So one is provided. It won’t satisfy you, obviously. Why are there statues on Easter Island? Would you accept design in that case without knowing the purpose for the design?

    But, what Kirk has shown is that the nested hierarchy, like fine tuning of the universe, is a feature that helps scientific discovery. That is to say, God makes it possible to understand humans by studying creatures that are not human. Kirk was able to make inferences by scanning ubiquitins from 1000 species. In the process he found an algorithm that creates road signs and navigation aids to understanding protein function and folding that is a few steps above the old-style study of “conserved” regions. He thus has biochemical predictions that can be tested.

    I’ll provide a relevant quote by Dembski in the next comment.

  6. Dembski foresaw the possibility of things like Kirk Durston’s work:

    10. Steganography
    Finally, we come to the research theme that I find most intriguing. Steganography, if you look in the dictionary, is an archaism that was subsequently replaced by the term “cryptography.” Steganography literally means “covered writing.” With the rise of digital computing, however, the term has taken on a new life. Steganography belongs to the field of digital data embedding technologies (DDET), which also include information hiding, steganalysis, watermarking, embedded data extraction, and digital data forensics. Steganography seeks efficient (that is, high data rate) and robust (that is, insensitive to common distortions) algorithms that can embed a high volume of hidden message bits within a cover message (typically imagery, video, or audio) without their presence being detected. Conversely, steganalysis seeks statistical tests that will detect the presence of steganography in a cover message.

    Consider now the following possibility: What if organisms instantiate designs that have no functional significance but that nonetheless give biological investigators insight into functional aspects of organisms. Such second-order designs would serve essentially as an “operating manual,” of no use to the organism as such but of use to scientists investigating the organism. Granted, this is a speculative possibility, but there are some preliminary results from the bioinformatics literature that bear it out in relation to the protein-folding problem (such second-order designs appear to be embedded not in a single genome but in a database of homologous genomes from related organisms).

    While it makes perfect sense for a designer to throw in an “operating manual” (much as automobile manufacturers include operating manuals with the cars they make), this possibility makes no sense for blind material mechanisms, which cannot anticipate scientific investigators. Research in this area would consist in constructing statistical tests to detect such second-order designs (in other words, steganalysis). Should such second order designs be discovered, the next step would be to seek algorithms for embedding these second-order designs in the organisms. My suspicion is that biological systems do steganography much better than we, and that steganographers will learn a thing or two from biology — though not because natural selection is so clever, but because the designer of these systems is so adept at steganography.

    Such second-order steganography would, in my view, provide decisive confirmation for ID. Yet even if it doesn’t pan out, first-order steganography (i.e., the embedding of functional information useful to the organism rather than to a scientific investigator) could also provide strong evidence for ID. For years now evolutionary biologists have told us that the bulk of genomes is junk and that this is due to the sloppiness of the evolutionary process. That is now changing. For instance, Amy Pasquinelli at UCSD, in commenting on long stretches of seemingly barren DNA sequences, asks us to “reconsider the contents of such junk DNA sequences in the light of recent reports that a new class of non-coding RNA genes are scattered, perhaps densely, throughout these animal genomes.” (“MicroRNAs: Deviants no Longer.” Trends in Genetics 18(4) (4 April 2002): 171-3.) ID theorists should be at the forefront in unpacking the information contained within biological systems. If these systems are designed, we can expect the information to be densely packed and multi-layered (save where natural forces have attenuated the information). Dense, multi-layered embedding of information is a prediction of ID.

  7. stcordova: Offering a POSSIBLE rational answer shows there could be in principle other answers as to why God did something a certain way, and that is sufficient to show there is no need to assume common descent because we can’t think of reasons why God would make a nested-hierarchy.

    No, I’m afraid you really do need actual evidence to substantiate why one imaginary possibility is to be believed over another. I can conjecture reasons why invisible pixies, or unicorns, or teletubbies, or extragalactic alien civilizations would want to create nested hierarchies, but that doesn’t mean they are to be believed for that reason alone.

    You don’t “defend” a belief you already hold against contradictory evidence by merely coming up with ad-hoc conjectures that explain the evidence away. There isn’t any fact of reality that can’t be defended in that way.

    There isn’t some fact, real or imaginary, for which it isn’t possible to rationalize why some mysterious designer would want to create it that way. Here’s one that fits anything you can imagine: S/he gets off on it.

    Or we can do the same for non-design explanations: The system is fundamentally indeterministic and it just happened to be that way by chance.

    Or how about: The system is completely deterministic, and the initial conditions were simply such that the presently observed phenomenon was an unavoidable consequence of the rules playing themselves out.

    None of these assertions, whether provided in defense of design or non-design, should count as a successful defense against evidence that contradicts a hypothesis. Unsubstantiated ad-hoc rationalizations aren’t a valid form of counter-argument. Learn to recognize even in your own thinking when you start to engage in ad-hoc reasoning.

  8. stcordova: Dembski foresaw the possibility of things like Kirk Durston’s work

    I like how in fact Dembski didn’t “foresee” this, but is merely piggybacking on evolutionary biology and comparative biochemistry where it has been known for like a century that comparative biology gives useful insights in how living organisms function. But I see he managed to fool you by speaking vaguely enough that he can pretend to have “predicted” something by never actually making a specific prediction, nor explaining why it is a necessary consequence of intelligent design. This is textbook hucksterism. Foresee something already known, but vaguely, and don’t give dates or quantities.

    Nobody predicted nested hierarchies before Linnaeus found them and evolutionary biologists explained them, but now creationist propagandists are tripping over each other trying to claim they knew it all along, approaching 300 years after the fact. Hilarious.

  9. I have spoken in the Throne Room of the Hofburg Palace in Vienna. Is that “prestigious” enough? (There was no throne there when I was there).

    Actually, that Throne Room was not completed until 1923, by which time the Austro-Hungarian monarchy had disappeared. The slightly-smaller room down the hall was the actual Throne Room when Emperor Franz Joseph ruled, when my grandfather and grandmother were his subjects. I didn’t speak in that room, but I did use the bathroom across the hall from it, which is presumably the place where Franz Joseph “sat on the throne” (I asked one of the locals and he said that yes, they have the same idiom in German). Very prestigious.

  10. To echo Rumraket, based on the abstract Kirk Durston’s talk seems to be standard phylogeny-based comparative biology. Nothing wrong with it, unless you imply that Durston invented this approach.

  11. I’ve sat in the Royal Danish Air Force helicopter reserved exclusively for transport of the Danish royal family, in the specific seat used by her royal majesty Margrethe II of Denmark, when I was a conscript in the military back in 2004. She is a smoker, and has a custom made ashtray welded to the inside of the fuselage next to her seat. Can I claim to be important now?

  12. It’s an elaborate strategy, to provide two indistinguishable causes of the nested hierarchy. Because there must be some level at which it’s all due to common descent.

  13. John Harshman: How extremely Sal Cordova is it that the auditorium has to be referred to as “prestigious”?

    My exact reaction. I wonder if Sanford might, in some way, be eminent?

  14. Durston used an algorithm found in the insurance industry. It wasn’t a phylogeny program, and it doesn’t assume common ancestry except maybe for some of the alignment. It might even be better to do alignments that can drop the common ancestry assumption!

  15. Allan Miller:
    This would be better as 2 OPs btw.

    Since my OP’s go off topic anyway, I figured I’d get it off to a faster start this time.

    But, a tad more seriously, we’re in post production on the video with the slide to be integrated in. John wants the youtube released and we can discuss his presentation in detail then. I just wanted to announce the good news, but until the video is out, the substantive thing that is presently available is Durston’s work.

    Whether it satisfies John Harshman’s demand for an explanation for why God made nested hierarchies is not the main point. It’s a good enough POSSIBLE explanation for the patterns of similarity and diversity, namely it makes scientific discovery possible. The progression of simple organisms, like bacteria, to complex organisms like humans is a very nice progression in understanding ourselves. Ah, such a beautiful anthropocentric perspective on things to boot.

  16. As far as Kirk’s paper, it was his PhD dissertation, and I managed to talk him into re-visiting it. I felt this would be good stuff for structural biologists whatever one felt about ID.

    My purpose in bringing it forward was to get it a little more vetted and re-worked to be more conceptually accessible. We may do a re-write of the software to a more modern language in the future.

    Since the crew here at TSZ is so motivated to discredit anything we say, we get a steady flow of free-of-charge peer-review and consultation. 🙂

  17. Rumraket:

    Nobody predicted nested hierarchies before Linnaeus

    I would agree as far as saying ID doesn’t predict nested hierarchies, nor did it predict functionality in much of DNA.

    That said, Evolutionary theory doesn’t predict nested hierarchies defined by Taxonomically Restricted Genes and features that are un-evolvable either. So take that!

  18. Since the NIH has lots of medical doctors, and since lots of medical doctors are creationists and/or ID sympathetic, John received a warm welcome there. After all, the topic of genetic loss is relevant to the mission of the NIH.

  19. Just as an FYI, I pointed out to Kirk and John Sanford that, in the course of my journey, I found out the Histone 3 protein is almost 100% identical in all mammals, 99% identical between humans and Arabidopsis thaliana (a plant), and 90% similar to yeast. Where the heck did evolutionary theory predict that?

  20. Now, something that Kirk found out after his dissertation.

    If the cross species protein alignments didn’t result in clustering, it turned out these were disordered proteins (yes, not all proteins have stable folds!).

    Now look at the histone 3 protein. Right away, there is no real clustering to speak of, or not much of it (in a sense the whole thing is clustered). Well, well, the important thing about histone 3 is not its folding but its linear UNFOLDED tail!

    So where does evolutionary theory predict that unclusterable sequences will be disordered proteins and clusterable sequences will be ordered? Bwahaha!

  21. stcordova: The Masur Auditorium is usually offered to distinguished speakers. The day before, it hosted Nobel Prize winner Michael Young.

    So?

  22. stcordova: You’ve been asking, it seems to me, why a designer would make a nested hierarchy. Offering a POSSIBLE rational answer shows there could be in principle other answers as to why God did something a certain way, and that is sufficient to show there is no need to assume common descent because we can’t think of reasons why God would make a nested-hierarchy.

    I notice you didn’t answer any of the questions I asked, as is typical of you. Sorry, but that isn’t actually a rational answer to the question you just asked yourself. Faking evidence of common descent just so we can research protein function is absurd, not rational. Nor does the supposed fine-tuning of the universe do anything to make scientific research easier.

  23. John Harshman:

    I notice you didn’t answer any of the questions I asked, as is typical of you.

    Yeah, I suppose I find your input less and less valuable and interesting. Nothing personal, but Captain Kirk is going where no man has gone before.

  24. Rumraket: No I’m afraid you really do need actual evidence to substantiate why one imaginary possibility is to be believed over another.

    I am going to remember this Rumraket. 🙂

    No matter how improbable it is that an X could have arisen from a Y in a single step, it is always possible to conceive of a series of infinitesimally graded intermediates between them.

    – Richard Dawkins

    Imaginary possibilities indeed.

  25. stcordova: Nothing personal, but Captain Kirk is going where no man has gone before.

    Is he? What makes it different from any other exercise in comparative biology?

  26. Durston’s thesis committee had high praise for his work. This isn’t traditional comparative anatomy.

    Cross species comparisons have never been used to predict disordered vs. folded proteins for starters! Histone 3 is a disordered protein, yet it has near 100% conservation across animals and plants. Ok, so you’ll say the conservation is due to selection. But then the protein is disordered in 3D!

    Whereas stable folds are shown by patterns of diversity sprinkled with some conservation. The 3D fold is stable, but the sequences are variable.

    But there are chemical predictions that are not standard comparative anatomy. You’ll have to read the paper through the lens of trying to inform 3D protein structure based on primary structure (sequence comparison).

  27. From wiki on disordered proteins:

    An intrinsically disordered protein (IDP) is a protein that lacks a fixed or ordered three-dimensional structure.[2][3][4] IDPs cover a spectrum of states from fully unstructured to partially structured and include random coils, (pre-)molten globules, and large multi-domain proteins connected by flexible linkers. They constitute one of the main types of protein (alongside globular, fibrous and membrane proteins).[5]

    Intrinsic disorder is particularly enriched in proteins implicated in cell signaling, transcription and chromatin remodeling functions.[14][15]

  28. stcordova:
    Durston used an algorithm found in the insurance industry. It wasn’t a phylogeny program, and it doesn’t assume common ancestry except maybe for some of the alignment. It might even be better to do alignments that can drop the common ancestry assumption!

    Then I stand corrected. By failing to use the information about the phylogeny, Durston’s inference of clustering would lose considerable statistical power. We’ve been over this in the computational molecular biology literature of the past 25 years, and see that statistical inferences made from sequences that are on a phylogeny become invalid if you treat the species as if they were independent. In general the statistical significance of your conclusions is exaggerated if you don’t take the phylogeny into account.

    At least it’s possible that Durston’s sequence alignments were inferred in a way that takes phylogenies into account — many of the most popular multiple-species alignment algorithms do that by default.

  29. Joe Felsenstein:

    Durston’s inference of clustering would lose considerable statistical power.

    Thanks for the comment.

    It may lose statistical power if Kirk’s goal was to resolve a phylogeny rather than find aspects of a protein 3D structure and chemical prediction.

    In any case, the proof of Kirk’s method will be future chemical tests and 3D structure prediction. No need to settle the argument today.

  30. stcordova: It may lose statistical power if Kirk’s goal was to resolve a phylogeny rather than find aspects of a protein 3D structure and chemical prediction.

    No, the statistical misbehavior (exaggeration of results) would occur for any inference about the proteins. Not just for inferring phylogenies (where it would not occur because failing to take phylogeny into account when inferring phylogenies is, well, kind of impossible by definition).

  31. Mung: No matter how improbable it is that an X could have arisen from a Y in a single step, it is always possible to conceive of a series of infinitesimally graded intermediates between them.

    – Richard Dawkins

    No one believes in that caricature of evolution. I know because Allan is always telling us that. And Rumraket. And Alan, and KN, and Jock, and Neil, and….

    In fact, if you can conceive of how it’s supposed to work, apparently it must be a caricature.

  32. phoodoo: No one believes in that caricature of evolution. I know because Allan is always telling us that. And Rumraket. And Alan, and KN, and Jock, and Neil, and….

    In fact, if you can conceive of how it’s supposed to work, apparently it must be a caricature.

    And the award for #1 missing the point goes to…

  33. Ha ha. Sal Cordova is lecturing Joe Felsenstein on significance testing in comparative biology.

  34. John Harshman:
    Ha ha. Sal Cordova is lecturing Joe Felsenstein on significance testing in comparative biology.

    So where does comparative biology predict 100% sequence conserved protein will be disordered in 3D like Kirk’s program does?

    Dismiss away, but we know who might have the last laugh.

  35. stcordova: So where does comparative biology predict 100% sequence conserved protein will be disordered in 3D like Kirk’s program does?

    Dismiss away, but we know who might have the last laugh.

    … or not.

    For that matter, where does comparative biology predict that the white-crowned sparrow does not have a white “bib” marking on its throat? Where does it predict that the American Robin will have a reddish breast?

    If we know the phylogeny and the phenotypes of relatives, we can make a rough prediction from these, and even get a rough idea of how uncertain we are about that. But of course prediction will not be precise.

    But where does creationism predict, well, anything? Sure, it predicts that elephants will be big and gray and lumber about the savannah, browsing on bushes and trees. Alas, it also predicts, with equal certainty, that elephants are small, pink, and flit from flower to flower, collecting nectar and pollinating them.

  36. To be fair, Durston’s use of a set of aligned sequences to investigate correlations, to cluster sites, to draw various conclusions, is making a vaguely valid inference. But with faulty statistics if the statistical methods used treat the sequences for each species as an independent observation.

    They aren’t independent — that is the whole point of a phylogeny. So any statistical inferences made from the independence assumption will be off, and generally will exaggerate the statistical significance of the results.
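
    The non-independence point can be made concrete with a toy calculation. The Python sketch below is not Durston’s analysis or any published comparative method. The two-clade scenario, the function name, and the crude "one observation per clade" correction are assumptions chosen only to show the direction of the bias.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f., no continuity correction)
    for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical scenario: two sites each changed state exactly once, and both
# changes happened on the single branch leading to clade 2. All 10 species in
# clade 1 inherit states (0, 0); all 10 species in clade 2 inherit (1, 1).
CRITICAL_05 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05

# Naive analysis: every species counted as an independent observation.
naive = chi_square_2x2(10, 0, 0, 10)

# Crude correction (an assumption for illustration): one observation per
# clade, because within a clade the states are copies of the ancestor's.
collapsed = chi_square_2x2(1, 0, 0, 1)

print(f"naive chi-square:     {naive:.1f}")
print(f"collapsed chi-square: {collapsed:.1f}")
print(f"5% critical value:    {CRITICAL_05}")
```

    Counting all 20 species gives a statistic well past the 5% threshold, yet the underlying evidence is a single coincidence on one branch. Two clade-level observations are far too few for the test to mean anything, which is the point: the apparent sample size of 20 is an artifact of shared ancestry.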

  37. phoodoo: No one believes in that caricature of evolution. I know because Allan is always telling us that. And Rumraket. And Alan, and KN, and Jock, and Neil, and….

    In fact, if you can conceive of how it’s supposed to work, apparently it must be a caricature.

    I’m not sure what position you are trying to ascribe to me here. I certainly haven’t found you particularly impressive in your understanding of evolution, if that’s what you mean.

  38. I recall a discussion with Durston where he seemed to be treating a group of actual proteins in organisms as a set of independent draws from the entirety of protein space. My understanding of stats is more qualitative than quantitative, but even a mathematical nonentity like myself found that debatable, if there is any genetic relationship in the data.

  39. Allan Miller:
    I recall a discussion with Durston where he seemed to be treating a group of actual proteins in organisms as a set of independent draws from the entirety of protein space. My understanding of stats is more qualitative than quantitative, but even a mathematical nonentity like myself found that debatable, if there is any genetic relationship in the data.

    There is stuff about this that doesn’t have to be that mathy.

    If a protein is near 100% conserved, it is disordered, according to Kirk’s hypothesis. This is a testable hypothesis, and something that common descent and phylogeny don’t predict about 3D structure based on “conservation.”

    If the 3D structure of proteins across species is the same, but not 100% conserved, but only partially conserved, and the protein is an enzyme, the “conserved” regions are catalytically significant. Now, that’s been known for a while and a phylogenist will say “conservation” is by selection. Well then, what about a 99% conserved protein like Histone 3?

    If you look at Figure 3 in Kirk’s paper you’ll see that a proximity prediction was fulfilled where 3 residues are in 3D proximity. Where does phylogeny predict that relationship?

    https://www.ncbi.nlm.nih.gov/core/lw/2.0/html/tileshop_pmc/tileshop_pmc_inline.html?title=Click%20on%20image%20to%20zoom&p=PMC3&id=3524763_1687-4153-2012-8-3.jpg

    FWIW, this paper was Kirk’s dissertation and he had structural biologists on his committee who saw the value of what he did. He had internal review from his 3 co-authors, and external review from the journal he published in. So, notwithstanding John Harshman’s glib dismissals, other people, not just myself, believe Kirk has something to contribute.

  40. Allan Miller: My understanding of stats is more qualitative than quantitative, but even a mathematical nonentity like myself found that debatable, if there is any genetic relationship in the data.

    Wouldn’t you be a member of the mathematical nonentity set?

  41. stcordova: There is stuff about this that doesn’t have to be that mathy.

    LOL

    If a protein is near 100% conserved, it is disordered, according to Kirk’s hypothesis. This is a testable hypothesis, and something that common descent and phylogeny don’t predict about 3D structure based on “conservation.”

    Are you sure that this is Kirk’s hypothesis? I merely ask since it is rather obviously wrong. It appears that you, Sal, have misunderstood Kirk’s hypothesis.

    If the 3D structure of proteins across species is the same, but not 100% conserved, but only partially conserved, and the protein is an enzyme, the “conserved” regions are catalytically significant. Now, that’s been known for a while and a phylogenist will say “conservation” is by selection. Well then, what about a 99% conserved protein like Histone 3?

    They too are under powerful selection. This is freshman biochemistry, circa 1980.

    If you look at Figure 3 in Kirk’s paper you’ll see that a proximity prediction was fulfilled where 3 residues are in 3D proximity. Where does phylogeny predict that relationship?

    Careful you don’t hurt yourself, there. That’s ubiquitin, not a histone, and AIUI it is the existence of a clustering that differs from the expected phylogeny that leads to the prediction of an interaction.

    FWIW, this paper was Kirk’s dissertation and he had structural biologists on his committee who saw the value of what he did. He had internal review from his 3 co-authors, and external review from the journal he published in. So, notwithstanding John Harshman’s glib dismissals, other people, not just myself, believe Kirk has something to contribute.

    Kirk’s work is cute, but it is not terribly novel, and most importantly it does not have the import that you think it does. Entertainingly, the interactions that Kirk predicts demonstrate that the “bit-counting” FIASCO methods of kairosfocus, gpuccio et al are completely bogus. Kirk should let them know.

  42. stcordova,

    Well then, what about a 99% conserved protein like Histone 3?

    I recall being told about histone’s high degree of conservation back at uni in the 1970’s. It appears to be a general case that structural proteins are more highly conserved than enzymes – and of course there is a mechanistic reason for this. Much of an enzyme’s sequence is taken up with folding and orientation, or small-molecule binding, and there are huge numbers of ways to skin those particular cats. Structural proteins generally interact with other proteins, or nucleic acids, and typically in a much more specific way without many degrees of freedom. Their sequence is pinned in place by their interactions.

    Where does phylogeny predict that relationship?

    I don’t know why you think phylogeny should predict it, or why anyone would care that it didn’t. What it does predict, trivially, is: if a sequence has limited latitude of amendment, then one would expect to see it amended less during evolution.

  43. Allan Miller:
    stcordova,

    I recall being told about histone’s high degree of conservation back at uni in the 1970’s. It appears to be a general case that structural proteins are more highly conserved than enzymes – and of course there is a mechanistic reason for this. Much of an enzyme’s sequence is taken up with folding and orientation, or small-molecule binding, and there are huge numbers of ways to skin those particular cats. Structural proteins generally interact with other proteins, or nucleic acids, and typically in a much more specific way without many degrees of freedom. Their sequence is pinned in place by their interactions.

    I don’t know why you think phylogeny should predict it, or why anyone would care that it didn’t. What it does predict, trivially, is: if a sequence has limited latitude of amendment, then one would expect to see it amended less during evolution.

    Good answer. Your responses are so much better than John Harshman’s.

  44. DNA_Jock,

    You only think I don’t know what Kirk’s work implies. I had 3 hours of conversation with him about this since it is relevant to work on some structural biology and chemical prediction and even promiscuous domain stuff I’m working on.

    I suppose it’s fair game to insult my understanding, but not that of Kirk’s dissertation committee, his co-authors, and the journal that published the net result of his dissertation.
