Is it easy to get a new protein? A reply to Ann Gauger

In a podcast on the show, ID the Future (March 14, 2017), Dr. Ann Gauger criticized a popular argument that purports to show how easy it is to get new proteins: namely, the evolution, over a relatively short 40-year period, of nylonase. (Nylonase is an enzyme that utilizes waste chemicals derived from the manufacture of nylon, a man-made substance that was not invented until 1935.) While Dr. Gauger made some factual observations that were mostly correct, her interpretation of these observations fails to support the claim made by Intelligent Design proponents, that the odds of getting a new functional protein fold are astronomically low, and that it’s actually very, very hard for new proteins to evolve. Let’s call this claim the “Hard-to-Get-a-Protein” hypothesis (HGP for short).

To help readers see what’s wrong with Dr. Gauger’s argument, I would like to begin by pointing out that for HGP to be true, two underlying claims also need to be correct:

1. Functional sequences are RARE.
2. New functions are ISOLATED in sequence space.

In her podcast, Dr. Gauger cites the work of Dr. Douglas Axe to support claim #1, when she declares that the odds of getting a new functional protein fold are on the order of 1 in 10^77 (an assertion debunked here). Dr. Gauger says little about claim #2; nevertheless, it is vital to her argument. For even if functional sequences are rare, they may be clustered together – in which case, getting from one functional protein to the next won’t be so hard, after all.
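
To get a rough sense of the scale involved, the arithmetic behind claim #1 can be sketched in a few lines of Python. This is only an illustration, resting on two assumptions that are not stated in the podcast: a protein length of 150 residues and an alphabet of the 20 standard amino acids. The 1-in-10^77 figure is taken at face value purely for the sake of the calculation.

```python
import math

# Assumptions for illustration only: a 150-residue protein built from
# the 20 standard amino acids, and the 1-in-10^77 prevalence figure
# taken at face value.
PROTEIN_LENGTH = 150
ALPHABET_SIZE = 20
LOG10_PREVALENCE = -77


def log10_sequence_space(length: int, alphabet: int) -> float:
    """Return log10 of the number of possible sequences (alphabet ** length)."""
    return length * math.log10(alphabet)


log10_space = log10_sequence_space(PROTEIN_LENGTH, ALPHABET_SIZE)
# Multiplying by the prevalence is the same as adding the exponents.
log10_functional = log10_space + LOG10_PREVALENCE

print(f"possible sequences   ~ 10^{log10_space:.0f}")
print(f"functional sequences ~ 10^{log10_functional:.0f}")
```

On these assumptions, a sequence can be rare as a fraction of the whole space while the absolute count is still very large. Whether such sequences lie close to one another is the separate question raised by claim #2.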

If claims #1 and #2 are both correct, then getting new functions should not be possible by step-wise changes. Remarkably, however, Dr. Gauger concedes in her podcast that a new function arose in precisely this way, as we’ll see below.

Scientific explanations for the origin of nylonase: a short history

Let me begin by providing my readers with a little background information, from a 2009 article in New Scientist magazine by Michael Le Page:

Nylon was first made in 1935. Just 40 years later, in 1975, a bacterium was discovered that is able to digest and live off not nylon itself, but waste chemicals from its manufacture – chemicals that had not existed before nylon production began.

It was later shown this bacterium, now known as Arthrobacter KI72, has evolved several types of enzymes capable of utilising these waste products. One type, 6-aminohexanoic acid hydrolase, encoded by genes called nylBs, has become known popularly as “nylonase”.

So, how did nylonase evolve? Back in 1983, a team of Japanese scientists proposed that nylonase evolved through a gene duplication and frame shift, caused by the insertion of a single base. Shortly afterwards, Dr. Susumu Ohno defended the “frame shift” hypothesis for the origin of nylonase in PNAS, in a now-famous 1984 paper.
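
To make the “frame shift” idea concrete, here is a minimal Python sketch of what a single-base insertion does to the codons downstream of it. The DNA sequence is a made-up toy example, not the real nylB sequence, and the helper names (translate, insert_base) are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate every complete codon in frame 0; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_base(dna: str, position: int, base: str) -> str:
    """Return a copy of dna with one base inserted before index `position`."""
    return dna[:position] + base + dna[position:]


original = "ATGGCTGAAACCGGTAAA"          # toy sequence (hypothetical)
shifted = insert_base(original, 3, "T")  # one extra base after the first codon

print(translate(original))
print(translate(shifted))
```

Every codon after the insertion point is read differently, which is why a frame shift yields an entirely different amino-acid sequence downstream, often with premature stop codons.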

However, a team led by Seiji Negoro of the University of Hyogo, Japan, came up with a different explanation in 2005: they claimed that nylonase actually resulted from two point mutations in the active site of an existing carboxyl esterase enzyme. It turned out that nylonase is very similar to a common class of enzymes, known as beta-lactamases, which break down natural antibiotics produced by many different kinds of organisms. Two amino-acid changes – two mutations, in other words – were sufficient to change the beta-lactamase binding-site to a site which was capable of binding a by-product of nylon.

That makes sense. As David Wynick of the University of Bristol points out, it is generally easier to adapt an existing protein to a new function than to wait for a de novo mutation that will generate a completely new reading frame.

However, even if the frame shift hypothesis isn’t required to explain the evolution of nylonase, scientists have other well-established examples of de novo genes from alternate open reading frames. Here are two recent papers on the subject: Evolution: Dynamics of De Novo Gene Emergence (Current Biology, Volume 24, Issue 6, pR238–R240, 17 March 2014) and Concomitant emergence of the antisense protein gene of HIV-1 and of the pandemic (Proceedings of the National Academy of Sciences U.S.A., 2016 Oct 11; 113(41): 11537–11542). I wonder what Dr. Gauger would have to say about these cases.

Dr. Gauger’s puzzling admission – and the problems it generates

And now, here’s how Dr. Gauger responded to interviewer Sarah Chaffee’s question, “So, does nylonase show that purely natural processes can create a new protein?”, in her podcast:

No. I would say, ‘Of course not,’ because we have an explanation for how it came about, through just natural modifications and selection from a starting protein – a starting enzyme, carboxyl esterase. Step-wise path, there you go. But not a frame shift. A frame shift would create a completely novel sequence, a completely novel fold, if it folded at all. And so we have no need of that hypothesis.

I note in passing that Dr. Gauger rejects an explanation for the origin of nylonase that was formerly accepted by Dr. William Dembski back in 2005, when he wrote: “Nylonase appears to have arisen from a frame-shift in another protein.” I also note that Dr. Gauger refers to “a kind of bacterium capable of digesting nylon” (2:35) and to “a whole new enzyme capable of degrading nylon” (2:44), when in fact, what the bacteria digest and live off is not nylon itself, but waste chemicals derived from its manufacture. But let us continue.

In the passage above, Dr. Gauger makes the damaging admission that natural processes can generate a new protein by a step-wise process from a starting protein, carboxyl esterase (also known as nylB-prime). That tells against claim #2 above, that new functions are ISOLATED in sequence space. For if functions really are isolated in sequence space (as Gauger, Axe and Meyer have argued previously), then how is it possible that the researchers easily found two mutations to switch nylB-prime to nylonase? That should be impossible.

In her podcast, Dr. Gauger makes much of the fact that nylonase has two open reading frames (ORFs) going in a forward direction, plus a third open reading frame going in the reverse direction. But if functional proteins are so rare, as she alleges, then how is it even possible to have overlapping ORFs? This type of bi-directional reading of information is ubiquitous in DNA, but it is much more difficult in English text. What this bi-directional reading indicates is that proteins are not so rare, after all – which undermines claim #1 above.

It should also be pointed out that the proteins described by Dr. Gauger in her podcast actually contain a high number of repeats – a point acknowledged by Dr. William Dembski back in 2005, when he described the DNA sequence of nylonase as “a very repetitive sequence.” In other words, the DNA from which this protein is generated is low-information DNA. Again: how is it possible that repetitive, low-information DNA can give rise to a new protein? Shouldn’t this be impossible? Aren’t proteins supposed to be rich in information?

Here’s another awkward question: why does nylB-prime (the protein from which nylonase evolved) appear to have no similar-looking homologs? Dr. Gauger thinks it’s because this protein is ancient. If that were the case, then we should expect to see evidence of its history in a large number of sequence-similar proteins, but we do not see this. Where did all those proteins go? What happened to them?

Finally, how is it that two genes with no homology can sometimes possess nearly the same structure and function? This shows that there are multiple sequence families that are capable of producing the same function, in the same way – which surely counts against the claim that functional sequences are RARE.

Some brief comments on Sal Cordova’s post

I mentioned above that several different kinds of enzymes have evolved in recent decades, which are capable of utilizing nylon waste products. These enzymes are known as NylA, NylB and NylC. However, none of them appear to have significant sequence homology. In a recent post, Sal Cordova puts forward a two-fold argument:

1. NylA, B, C have no homology, so they had to have had different evolutionary origins.
2. We do not know for sure if nylonase existed in nature before, because we aren’t able to go back in time.

Point #1 simply shows that evolving new proteins is easy. The fact that it has happened multiple times is not evidence against evolution, but rather, evidence that evolving new functions is a relatively straightforward matter.

Regarding #2: we know that nylon is a man-made material which does not exist in nature. It would be exceedingly strange if nylonase existed before nylon. That’s why most scientists think that these are new genes.

Finally, I’m not sure I understand the alternative model that Sal is proposing. Does he really believe that these three enzymes were hanging around for thousands of years with no function? Or does he believe that they had a hitherto-unknown function, which science will uncover?

In fact, scientists have frozen bacteria samples from before 1935 – and it isn’t even hard to obtain them. Thus it should be possible to check for the presence of nylonase in those samples. If Sal managed to identify a naturally produced nylon, then that would certainly explain why nylonase had to exist before synthetic nylon, and would suggest that these three enzymes (NylA, NylB and NylC) are not newly functional proteins, but ancient ones. However, he needs to go out and find that evidence. At the present time, scientists are not aware of any natural source of nylon.

Conclusion

Neither Dr. Gauger’s podcast nor Sal Cordova’s post substantiates the HGP hypothesis. (NOTE: As Sal correctly notes below [here and here], his post was never intended as an argument for the HGP hypothesis.) For now, it appears that the evolution of new proteins in Nature is – at least sometimes – fairly easy, after all.

What do readers think?

UPDATE: John Harshman has kindly posted a plausible sequence of events for the evolution of nylonase, here, from an earlier comment on Sal Cordova’s thread. As he correctly points out, “the frame shift hypothesis, gene duplication hypothesis, and the two-mutation hypothesis are not mutually exclusive.” Harshman considers it most likely that “all three of these occurred some time in the evolution of NylB.”

89 thoughts on “Is it easy to get a new protein? A reply to Ann Gauger”

  1. Rumraket

    Mung: Better according to Rumraket.

    No, better at addressing the question Axe is trying to get at.

    The experiment he did does not actually inform the question. Experiments have been done that inform the question, I document them in that large op I did. They produce entirely different numbers than Axe. They’re not lenient at all, they’re just actually addressing the question.

    Someone has repeated Axe’s work with even more lenient conditions?

    There’d be no reason to repeat his work at all since the question he’s trying to answer can’t be addressed with the particular type of experiment he does. He’s answering a different question with that experiment, then pretending he’s answering another one. But that’s good enough for ID-proponents who lack critical thinking skills on subjects relevant to ID, where their overwhelming cognitive biases take over.

  2. Rumraket

    brucefast: So do you not get Axe/Gauger’s position, or are you just misrepresenting them.

    Your list of options is not exhaustive.

    I both get them and do not misrepresent them. I actually understand what they’re doing, unlike you.

    Behe’s entire “edge of evolution” thesis is that any single mutation that produces benefit is within the bounds of reasonable evolutionary possibility. He contends, if I understand correctly, that if two mutations are called for, neither on their own being beneficial nor harmful, then there is a rare outside chance of the mutations occurring (and especially in rapidly breeding organisms like bacteria.)

    Nobody cares about Behe in the context of addressing Axe’s claim. You can’t just start bringing in completely unrelated claims.

    And you don’t understand him. What you just said is not what Behe says. What Behe says is that if two mutations have to happen simultaneously to be beneficial, since each one would be deleterious on its own, then this is the “edge” of evolution.

    Now, as far as “genes with new functions happen all the time with as little as 1 or 2 mutations” I think you exaggerate your point.

    Then you’re wrong. Manifestly, provably, wrong. Wrong because it happened in the real world, 140 million years ago, by a frameshift mutation. A single mutation.

    A single mutation created the very type of enzyme that Axe believes is so rare in protein sequence space that it is virtually impossible to find it by “chance”. A beta-lactamase.

    What’s even more interesting is that, that beta-lactamase has a chemically unrelated function nearby in sequence space. It is a carboxylic ester hydrolase, but it can also catalyze amide hydrolysis in nylon waste oligomers.

    And it takes merely 2 amino acid substitutions to increase the rate of amide hydrolysis 200-fold.

    Genes with new abilities do happen with 1 or 2 mutations, but the abilities are what you might call neighbor abilities. In the example of nylonase being discussed in this post, an enzyme that digested other stuff now could digest nylon.

    It’s not a “neighbor” ability, that’s just some stupid term you made up on the spot. Your term there is nothing but a sort of rhetorical device you bring up to try to diminish the concrete real-world effect of the evidence before you.

    You don’t get a gene that produces a protein that is a muscle (retracts under electrical signal) to suddenly digest something.

    Why not? What the hell do you actually know about that? Nothing. You just made that up too. You’re more transparent than intergalactic vacuum in the Eridanus Supervoid is to neutrino radiation.

    Put it into more human terms, if you have a screw that doesn’t work for a certain situation because it is too large, it is quite within evolutionary capacity to mutate a smaller screw that does work. However, try to mutate, oh, a soap, a surfactant, from the same screw and, well, you’re screwed.

    Well it’s nice that you can sit there and hallucinate up these nice analogies that are supposed to show how things that happened in the real world can’t take place.

    The very type of transformation you insist can’t happen (transformation from an enzyme into a complex structural protein with an entirely unrelated function), did:
    Evolution of an ancient protein function involved in organized multicellularity in animals.

    To form and maintain organized tissues, cells must coordinate how they divide relative to the position of their neighbours. One important aspect of this process is orientation of the mitotic spindle, a structure inside the dividing cell that distributes the chromosomes —and the genetic material they carry — between the daughter cells. When the spindle is not oriented properly, malformed tissues and cancer can result. In a diverse range of animals, the orientation of the spindle is controlled by an ancient scaffolding protein that links the spindle to “marker” proteins on the edge of the cell.

    Anderson et al. have now used a technique called ancestral protein reconstruction to investigate how this molecular complex evolved its ability to position the spindle. First, the amino acid sequences of the scaffolding protein’s ancient progenitors, which existed before the origin of the most primitive animals on Earth, were determined. Anderson et al. did this by computationally retracing the evolution of large numbers of present-day scaffolding protein sequences down the tree of life, into the deep past. Living cells were then made to produce the ancient proteins, allowing their properties to be experimentally examined.

    By experimentally dissecting successive ancestral versions of the scaffolding protein, Anderson et al. deduced how the molecular complex that it anchors came to control spindle orientation. This new ability evolved by a number of “molecular exploitation” events, which repurposed parts of the protein for new roles. The progenitor of the scaffolding protein was actually an enzyme, but the evolution of its spindle-orienting ability can be recapitulated by introducing a single amino acid change that happened many hundreds of millions of years ago.

    How could a single mutation have conferred such a dramatically new function? Anderson et al. found that the ancient scaffolding protein uses the same part of its surface to bind to the spindle-orienting molecular marker as the ancient enzyme used to bind to its target substrate molecule, and the two partner molecules happen to share certain key chemical properties. This fortuitous resemblance between two unrelated molecules thus set the stage for the simple evolution of a function that is now essential to the complexity of multicellular animals

    (…)

    We found that Anc-gkdup is an active guanylate kinase enzyme, with a Michaelis constant (KM) comparable to that of the human enzyme, albeit with a slower kcat (Figure 2A). It displays no measurable Pins binding and failed to orient the mitotic spindle in living cells (Figure 2B–E). These data indicate that enzyme activity is, as predicted, the ancestral function of the family; further, the scaffolding functions associated with spindle orientation were not yet present, even in suboptimal form, when duplication of the gk enzyme gene gave rise to the locus leading to GKPIDs.

    By the time of the Filozoan ancestor, however, the evolving GKPID had lost the ancestral enzyme activity entirely and gained de novo spindle-orienting functions. Specifically, we found that Anc-GK1PID has no detectable guanylate kinase activity, but it binds Pins with moderate affinity and is highly effective in orienting the mitotic spindle in cell culture (Figure 2A,B,F). We also reconstructed Anc-GK2PID – the more recent progenitor of all Dlg proteins in the ancestral animal – and found that it too orients the mitotic spindle and binds Pins with even higher affinity, suggesting a subsequent fine-tuning of Pins-binding capacity (Figure 2—figure supplement 1).

    (…)

    Of the amino acid changes that occurred in these regions of the protein during the interval between Anc-gkdup and Anc-GK1PID, only five are conserved among descendant GKPIDs (Figure 5B,C). To test these substitutions’ functional importance, we introduced the derived states individually into Anc-gkdup and characterized their effects on guanylate kinase activity, Pins binding, and spindle orientation. Remarkably, we found that either of two amino acid changes in the hinge is sufficient to confer the protein-binding function. Substitution s36P, located where the hinge meets the binding lobe, virtually abolished the catalytic activity of Anc-gkdup and established moderate-affinity Pins binding (Figure 6A,B; lower and upper case residue symbols denote ancestral and derived states, respectively).

    This can’t happen. That can’t happen. It’s too unlikely, a single mutation will not be enough, you can’t convert a screw into soap bla bla bla I sit here and declare with staunch conviction how the world just can’t work because… well because it just can’t, okay?

    But it does.

  3. Rumraket

    colewd: Hi William

    All folded proteins are temperature-sensitive. That is, as temperature is increased, they all unfold at some point (called the melting temperature). The melting temperature is a measure of how stable the folded structure is, more stable structures withstanding higher temperatures.

    I set out to measure how restrictive the sequence constraints are for achieving a folded protein structure. Before I started, I had to decide whether I wanted to measure this for a normal “wild-type” structure or for a marginally stable structure.

    What Art Hunt didn’t get, and everyone else who has relied on his confused analysis of my paper doesn’t get, is that THE CONSTRAINTS ARE HIGHER FOR MORE STABLE STRUCTURES (not meaning to yell here, but this is the simple point that I’ve repeated more times than I care to count).

    If I had done my experiments with the wild-type enzyme, I would have used a much higher ampicillin concentration and a higher temperature (so that the wild-type enzyme could barely pass the test). That way I would be requiring mutants to be as good as the wild-type. If I had done this, the fraction of mutants that pass would have been much lower, and in that case I’m sure the evolutionists would have complained that it was unfair of me to expect a newly evolved enzyme to be as stable as a highly perfected wild-type enzyme.

    So I didn’t do that.

    Instead I started with a marginally stable enzyme and measured the fraction of mutants with equivalent marginal stability. This resulted in a substantially higher fraction of passing mutants, but as you know, the fraction still ends up being extremely low. It stands to reason that it would have been far lower had I done it the other way.

    This is absolute fucking obfuscation on the part of Axe. The problem here is that Axe is conflating what he is trying to conclude with his experiment, with what his experimental results actually show.

    Here Axe pretends that he’s only trying to elucidate “how restrictive the sequence constraints are for achieving a folded protein structure.”

    But what Axe concludes from his experiment is that “Thus, the probability of finding a functional protein among the possible amino acid sequences corresponding to a 150-residue protein is similarly 1 in 10^77.”

    The exact words from his 2004 paper are: “Combined with the estimated prevalence of plausible hydropathic patterns (for any fold) and of relevant folds for particular functions, this implies the overall prevalence of sequences performing a specific function by any domain-sized fold may be as low as 1 in 10^77”.

    This is what ID-creationists cite Axe for, that very specific and particular claim about the prevalence of functional folds in protein sequence space. This is what Meyer says in Darwin’s Doubt that Axe demonstrated, it’s what they push on EN&W.

    In fact, they write:

    This paper is interesting because it relates to the work of Douglas Axe that resulted in a paper in the Journal of Molecular Biology in 2004. Axe answered questions about this paper earlier this year, and also mentioned it in his recent book Undeniable (p. 54). In the paper, Axe estimated the prevalence of sequences that could fold into a functional shape by random combinations. It was already known that the functional space was a small fraction of sequence space, but Axe put a number on it based on his experience with random changes to an enzyme. He estimated that one in 10^74 sequences of 150 amino acids could fold and thereby perform some function — any function.

    Yet here above, when an aspect of Axe’s work is criticized, Axe now alters his claim. Now he’s no longer trying to estimate “the overall prevalence of sequences performing a specific function by any domain-sized fold”, he’s now addressing “how restrictive the sequence constraints are for achieving a folded protein structure.”

    That’s an entirely different thing. How many functional folds there are, and how closely they are together, is not at all addressed by trying to determine how tolerant some particular fold is to mutation. What’s worse, if you’re trying to determine how tolerant some particular fold is to mutation, setting the deck so the fold you test is made DELIBERATELY fragile is to further stack the deck.

    Christ, I forgot how utterly fucking disingenuous these pieces of shit are. What a fucking fraud.

    Bill, Axe doesn’t have the truth as his interest. He’s after your money and he doesn’t give a fuck. Stop drinking his Kool-Aid. Stop buying his obfuscatory smokescreens. You quoted it, but clearly you didn’t understand it. Otherwise, I hope for your sanity’s sake, you wouldn’t have quoted it. Stop brainlessly peddling these people’s lies, they don’t have your best interest at heart.

  4. Allan Miller

    Alternative splicing. How does that work? No, I don’t mean look at the ‘wonderfully complex Rube Goldberg spliceosome’, I mean think – really think – about the implications of the fact that inclusion or omission of entire peptide segments fails to exit viable protein space. To the extent that it happens and has biological significance, the isoforms are crucially functional.

  5. Mung

    Rumraket: He’s answering a different question with that experiment, then pretending he’s answering another one.

    So he is at least an IDer doing science. That’s progress. Right?

    What question does his experiment answer, and how did it get past peer review?

  6. stcordova

    VJ Torley’s edit:

    it appears that the evolution of new proteins in Nature is – at least sometimes – fairly easy, after all.

    YES!

    Easy-to-evolve proteins are easy to evolve. The issue is if there are hard-to-evolve proteins from pre-existing proteins, or better yet, integrated systems involving several new or radically changed proteins.

    Still a lot for all parties to discover. Many open questions.

    If I had been Axe, I might have focused on Eukaryotic-only proteins involved in transcription/translation evolving from prokaryotic pre-cursors.

  7. colewd

    Allan Miller,

    Alternative splicing. How does that work? No, I don’t mean look at the ‘wonderfully complex Rube Goldberg spliceosome’, I mean think – really think – about the implications of the fact that inclusion or omission of entire peptide segments fails to exit viable protein space. To the extent that it happens and has biological significance, the isoforms are crucially functional.

    Exactly. How did we go from a world of uncomplicated DNA to chromosomes? How did we go from open DNA to a nucleus with gates that control the movement of molecules in and out of the nucleus? How did we go from DNA with mostly coding regions to introns that need to be spliced out? At the same time a Rube Goldberg spliceosome appears to solve a major problem these darn introns create.

  8. Rumraket

    Mung: What question does his experiment answer, and how did it get past peer review?

    It addresses the question: “How rare is this particular function (carboxylic ester hydrolysis of ampicillin) for this particular fold (engineered temperature sensitive variant of beta-lactamase), under these particular conditions (temperature, pressure, pH and so on), within the immediately surrounding sequence-space?”.

    That is the actual question that the experiment addresses. That’s the question you can empirically and logically address by doing what he did. No more, no less.

    As to how it got past peer review, it doesn’t matter. Not everything that goes through peer review is unassailable fact, as no doubt all ID proponents would have to agree; otherwise they would have to contend with the millions of publications that say naturalistic evolution is a fact and lots of other published things they disagree with. Going down that “it went through peer review” road is not going to do you any favors.

  9. stcordova

    FWIW,

    Since the OP references my discussion on Nylonases, here was a DRAFT abstract for a paper I was considering circulating in whatever venue would accept it.

    Challenges in tracing the flash evolution of Nylonases NylA, NylB, NylC

    In vitro experiments have demonstrated the ability of mutation and selection to evolve nylon digestion in bacterial strains like Pseudomonas in a matter of 3 months from strains that previously lacked that ability. The word “flash” has been used to refer to such rapid evolution of function relative to geological time. Experiments in evolving nylon digestion in Pseudomonas, as well as Lenski’s LTEE experiments in evolving citrate mutants in E. coli, are gold standard protocols for tracing the steps to evolve different functions in bacteria, since the ancestral state of the bacterial lines before evolution is readily available to test evolutionary hypotheses. In contrast, evolution of nylon digestion in the wild for other bacterial lines like Arthrobacteria (formerly called Flavobacteria sp., formerly Achromobacter guttatus) cannot be as clearly characterized, given that data is almost non-existent for characterizing the ancestral state of the genes that evolved NylA, NylB and NylC nylonases in Arthrobacteria.

    Nylon was presented to the world in 1935, and in 1977 a report was published that identified the first bacteria known to digest nylon. At the time it was called Achromobacter guttatus. For 40 years since that time, many hypotheses have been put forward to explain the emergence of nylonase capability in the Arthrobacterial line. This review evaluates 40 years’ worth of various hypotheses of the evolution of Nylonases NylA, NylB, NylC in Arthrobacteria, the challenges in tracing the evolutionary steps, and questions that remain open to this day. In this review, Ohno’s frameshift hypothesis is critically examined, as well as various gene duplication scenarios. Additionally, problems in constructing manual and computationally derived phylogenies for Arthrobacteria nylonases are examined. Potential future discoveries are suggested which may resolve present ambiguities in constructing the flash and/or long-term evolutionary history of these Arthrobacteria nylonases.

    Comments are welcome. I was coy about the main issue, namely that as far as nylonase evolution goes, we don’t have enough adequate data to make clear statements, say, relative to the LTEE. More research is needed.

  10. Rumraket

    Mung: So he is at least an IDer doing science. That’s progress. Right?

    I don’t believe he is. I think he’s a charlatan and a liar, and he’s deliberately setting up experiments that don’t lend themselves to the conclusions he draws from them and which he advertises to an ignorant and gullible ID crowd. He’s an employed member of a profoundly dishonest propaganda machine and I have no respect for him and his ilk.

  11. Rumraket

    stcordova:
    FWIW,

    Since the OP references my discussion on Nylonases, here was a DRAFT abstract for a paper I was considering circulating in whatever venue would accept it.

    Comments are welcome. I was coy about the main issue, namely that as far as nylonase evolution goes, we don’t have enough adequate data to make clear statements. More research is needed.

    Well I think one of the problems is that most of the evolution of the key nylonase (the one we actually know something about) NylB (and its homologues P-NylB and F-NylB’), took place long before it became what we know of as nylonase.

    As John Harshman pointed out in the other thread, the various scenarios are not mutually exclusive. And given the large sequence-divergence between F-NylB and P-NylB, this implies they evolved from a common ancestor some 140 million years ago.

    It was Ohno that first discovered the evidence for the frameshift mutation, but I think a mistake that has been circulating in the interpretation of that evidence (one that I made myself) is that it happened recently, after the invention of nylon manufacture.
    Instead, what the evidence implies is that the frameshift mutation happened some time prior but relatively close to those 140 million years ago when HGT of the pOAD2 plasmid put it in both Flavobacterium and Pseudomonas, or independently in both the Flavobacterium and Pseudomonas populations in identical locations. But of those two, the single frameshift mutation hypothesis is the more likely of course.
    The story goes something like this:
    140 million years ago, an ancestor of Flavobacterium carried a plasmid with a gene encoding a protein of unknown function, 472 amino acids long. A frameshift mutation created an alternate reading frame in this gene, resulting in a 392 amino acid protein, also of unknown function. This protein turned out to be beneficial to the organism, so it was retained.

    The plasmid carrying this protein was horizontally transferred to the Pseudomonas bacterium (or from Pseudomonas to Flavobacterium; which one had the original isn’t known and doesn’t actually matter). Then the “proto”-NylB gene diverged independently in both organisms. It was retained in both because it was apparently useful, but it also mutated a lot independently in both lineages, so much that the two versions are less than 40% similar in amino acid sequence today.

    Eventually over a hundred million years after the horizontal transfer, the proto-NylB gene in Flavobacterium gets duplicated so there is “proto”-NylB and “proto”-NylB’.
    Millions of years more pass, during which time proto-NylB and proto-NylB’ diverge by about 12% in amino acid sequence (that’s about 47 amino acids out of the 392 total).
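
    The 12% and 47-residue figures above are simple arithmetic, and a quick check may help. The sketch below is illustrative only: the function name `differing_residues` is made up for this example, and it assumes the divergence is counted as a plain fraction of the 392 aligned positions, with no gaps.

```python
def differing_residues(length: int, divergence: float) -> int:
    """Number of positions that differ, rounded to the nearest residue."""
    return round(length * divergence)


# Figures taken from the comment: a 392-residue protein, about 12% divergence.
NYLB_LENGTH = 392
diffs = differing_residues(NYLB_LENGTH, 0.12)

# The remaining positions are identical, which gives the percent identity.
identity_percent = round(100 * (NYLB_LENGTH - diffs) / NYLB_LENGTH)

print(diffs, identity_percent)
```

    Under those assumptions, the number of differing residues and the percent identity come out consistent with the roughly 47 substitutions and 88% similarity quoted in this timeline.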

    Eventually Flavobacterium finds itself in a waste-water pipe of the world’s first nylon factory in 1935. Coincidentally, the F-NylB enzyme, which is 88% similar to F-NylB’, can catalyze the breakdown of the nylon waste. Between 1935 and 1975, F-NylB takes on two additional amino acid substitutions, so it is now 200 times better at catalyzing nylon waste breakdown than it was in 1935. F-NylB’ is still 200 times worse at catalyzing nylon waste breakdown than F-NylB, which implies it is retained to carry out the “ancestral” function that preceded nylon metabolism.

    Between 1984 and 1992, scientists work out the above history for the NylB enzymes and their homologues. They still don’t know what the hell NylB did before nylon was invented. All they know is that it must have been useful to have been retained all the way from its origin in a frameshift mutation 140 million years ago; otherwise it should have been lost to the accumulation of deletion mutations.

    What was weird about it was that NylB belongs to a particular class of enzymes called amide hydrolases. The name describes the class of chemical reaction it catalyzes: the hydrolysis of amide bonds.
    Naturally, scientists reasoned it must have evolved from other amide hydrolases. They tested it on literally hundreds of natural amide compounds back in the 1990s, but it worked on none of them. Not even a little. As far as they could gather, it ONLY worked on the amide bond in nylon.

    Eventually, 2005 comes along and some scientists want to find out what the hell kind of enzyme NylB evolved from. So they work to resolve the structure of the enzyme, to see if maybe they can determine from its structure what it used to do before nylon was invented by humans. They find out it’s a beta-lactamase. It catalyzes the breakdown of ampicillin (a beta-lactam), and they confirm this by testing it on various compounds unrelated to amide hydrolysis, in particular on a type of chemical reaction called carboxylic ester hydrolysis, which means it hydrolyses ester bonds.

    I made a timeline of the enzyme’s evolution as I understand it now. Corrections welcome if I’ve got something wrong:

  12. keiths

    I made a timeline of the enzyme’s evolution as I understand it now. Corrections welcome if I’ve got something wrong:

    Those resistors are going to create an undesirable voltage difference between Proto P-NylB and Proto F-NylB and their successors. Otherwise it looks good. 🙂

    Seriously, thanks for the diagram.

  13. Rumraket

    And I’m also interested to know whether anything is known about the function today, if any, of the original 472 residue protein within which the NylB reading frame is situated.

    Is it sometimes expressed? Is it still expressed in both Arthrobacter and Pseudomonas, or has it been long dead ever since that original frameshift mutation?

  14. Allan Miller

    colewd,

    Exactly. How did we go from a world of uncomplicated DNA to chromosomes? How did we go from open DNA to a nucleus with gates that control the movement of molecules in and out of the nucleus? How did we go to DNA with mostly coding regions to introns that need to be spliced out?

    Exactly nothing. Take a point about the remarkable plasticity of proteins as directly demonstrated in living systems and use that as your starting point to talk about something else entirely …

    At the same time a Rube Goldberg spliceosome appears to solve a major problem these darn introns create.

    Wrong way round. An intron has no meaning without splicing. It’s the existence of splicing apparatus that causes introns. Including those described as ‘self-splicing’, which are obviously both.

    [eta – and, essentially, despite an explicit request to not get distracted by ‘the wonderful Rube Goldberg spliceosome’, the moth can only say ‘but it’s so preddy’ … fffft!]

  15. colewd

    Allan Miller,

    Wrong way round. An intron has no meaning without splicing. It’s the existence of splicing apparatus that causes introns. Including those described as ‘self-splicing’, which are obviously both.

    So you are not a supporter of the type two intron story?

  16. stcordova

    It was Ohno that first discovered the evidence for the frameshift mutation,

    I think Ohno was wrong and that seems to be the opinion of the actual crew who actually discovered nylonases and have been publishing on them for the last 40 years. The non-frameshift theorists seem to have the better theory.

    Royal Truman I think had a good critique of Ohno. The first problem is that homologues of the supposed ancestral sequence don’t seem to exist anywhere.

    http://creation.com/images/pdfs/tj/j29_2/j29_2_78-85.pdf

    Three decades ago Ohno proposed genes arose from simple oligomer repeats. He claimed that a ‘T’ insertion into gene PR.C consisting of multiple nt30 (CGACGCCGCT)3 produced the ancestor of a new enzyme, E-II, which hydrolyzes Ald, a synthetic material of recent origin. However, our Blastp searches against the NCBI database found no proteins related to the product from PR.C. Using EMBOSS Needle it was not possible to align nt30 unambiguously to any of forty sliding windows 40 nt long from positions 1 through 79 in PR.C. These nt30 alignments were no better than thirty randomly generated sequences also 40 nt long having similar nucleotide distribution as PR.C. The optimal number of nt30 which align over the entire PR.C did not match as well as a simple chain of pure CG dinucleotides. Furthermore, the polypeptide chain derived from nt30 did not align with the PR.C’s protein as well as a chain of pure arginines did. These and other tests reveal that the high C+G content of PR.C leads to statistical artefacts and no support for Ohno’s frame-shift theory.

  17. colewd

    Allan Miller,

    Exactly nothing. Take a point about the remarkable plasticity of proteins as directly demonstrated in living systems and use that as your starting point to talk about something else entirely …

    Are you making the argument that the “remarkable plasticity” feature helps support the random origin hypothesis including the multi layered sequencing feature of translation and alternative splicing?

  18. Rumraket

    stcordova: I think Ohno was wrong and that seems to be the opinion of the actual crew who actually discovered nylonases and have been publishing on them for the last 40 years.

    I can’t seem to extract that conclusion from anywhere. I think you are somehow thinking that, because they suggest the nylon oligomer hydrolase EII evolved from the common ancestor of EII and EII’, this means they are simultaneously saying no frameshift mutation ever happened. But they don’t say that in the 2005 paper at least.
    And even if they did, that would just make for a strange conclusion because again, the two scenarios are not mutually exclusive. So I don’t see why anyone would conclude that.

    Which brings us to the point about the actual evidence for the frameshift mutation.

    The non-frameshift theorists seem to have the better theory.

    They can’t have a “better” theory for the origin of NylB, when they have no theory for the origin of NylB. To them, NylB (or more correctly, a highly similar ancestor) effectively “just exists”. They basically just start from there and say then it got duplicated, the duplicates diverged a bit and then NylB adapted towards nylon oligomer hydrolysis with two amino acid substitutions.

    Royal Truman I think had a good critique of Ohno. The first problem is that homologues of the supposed ancestral sequence don’t seem to exist anywhere.

    Right, and that would be some evidence that the gene is not being expressed. Not that it never existed and a frameshift mutation didn’t take place.

    After all, if there really IS a longer reading frame bracketing the 392 residue NylB reading frame, downstream of enhancer and promoter regions, in close vicinity to a Shine-Dalgarno sequence, and if in fact a single thymine deletion within this longer reading frame creates a premature stop codon and an alternative start codon, how is this not evidence that there once used to be another functional protein sequence specified there?
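
    To make the mechanism concrete, here is a toy sketch of how deleting a single T can both introduce a premature stop codon in the original reading frame and create a new ATG start codon in another frame. The 18-nucleotide sequence is invented purely for illustration and is NOT the NylB or PR.C sequence; the helper names `translate` and `delete_base` are likewise made up for this example. Only the standard genetic code table is real.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, start: int = 0) -> str:
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(dna: str, index: int) -> str:
    """Return the sequence with the base at `index` removed."""
    return dna[:index] + dna[index + 1:]


# Hypothetical toy gene: ATG GCA TTG ACC GGA TAA
original = "ATGGCATTGACCGGATAA"
mutant = delete_base(original, 6)  # remove one T from the TTG codon

print("original frame:     ", translate(original))
print("mutant, old frame:  ", translate(mutant))
print("mutant, new ATG at 5:", translate(mutant, mutant.find("ATG", 1)))
```

    In this toy case the old frame is cut short right after the deletion, while an ATG that did not exist before the deletion opens a different, shorter reading frame. The real case would of course depend on the actual sequence and on the regulatory elements around it.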

  19. Rumraket

    Btw, now re-reading Ohno 1984, I discover the putative original reading frame is only supposed to be 427, not 472, amino acids long. In any case, he shows that both reading frames have a Shine-Dalgarno-like sequence in front of them.

    Ohno does suggest that the frameshift mutation happened recently, and independently in both NylB and NylB’, after 1935. I think it is that scenario which you are saying Negoro et al 2005 is rejecting? If so I would agree that seems less plausible.

  20. Tom English

    Rumraket: Christ, I forgot how utterly fucking disingenuous these pieces of shit are. What a fucking fraud.

    Having read the new book by Marks, Dembski, and Ewert, I’ve been thinking much the same. These people are not just wrong. They’re fucking psychopaths, on par with televangelists.

  21. Allan Miller

    colewd,

    So you are not a supporter of the type two intron story?

    For fuck’s sake! Again, I’m saying look at what alternative splicing tells us about protein space!

  22. Allan Miller

    colewd,

    Are you making the argument that the “remarkable plasticity” feature helps support the random origin hypothesis including the multi layered sequencing feature of translation and alternative splicing?

    No, I’m making the argument that functional isoforms directly contradict, with live data, the supposition that any given protein sequence is surrounded by non-function – is brittle to any and all amendment, and hence is non-evolvable. The fact that you can reach different function by module shuffling or inclusion/omission demonstrates this. And whatever you can do with alternative splicing, evolution can do with transposition (by which I DO NOT mean transposons) and indels.

    Evolution is not restricted to point mutation.

  23. Allan Miller

    stcordova,

    I’m quite prepared to believe that alternative splicing is largely an artifact. But equally, when one argues that it is widespread, important and pervasive, one is simultaneously (unknowingly, of course) undermining the ‘islands of function’ brigade. Such inconsistencies are part of the daily routine for the average Creationist, of course – one more certainly can’t hurt.

  24. johnnyb

    Allan Miller,

    Allan Miller:
    stcordova,
    I’m quite prepared to believe that alternative splicing is largely an artifact. But equally, when one argues that it is widespread, important and pervasive, one is simultaneously (unknowingly, of course) undermining the ‘islands of function’ brigade. Such inconsistencies are part of the daily routine for the average Creationist, of course – one more certainly can’t hurt.

    Not really. It makes the “island” a bit larger, but with steeper sides. Additionally, it is like having an island with bridges going over all of the ponds in the middle.

    I am not as well-versed in alternative splicing as I should be, but for another example with similar properties that I am aware of, think about VDJ recombination. Here, you have a whole series of gene parts that are roughly interchangeable. However, managing that interchangeability is quite difficult. You have to have recombination signals in precisely the correct places spread throughout the chromosome. If the recombination signals are misaligned, then the gene parts don’t work. Additionally, only the variable region can be manipulated in this way. The constant region (i.e., the part involved in signaling the rest of the immune system) is not open to this same level of variance. There is more than one possible constant region, but they are not near each other in sequence space the way the variable regions are. The genome is coded for that, and so, when it mutates to generate new antibodies, it specifically does not mutate the constant region of the gene.

    So, in short, while this makes for a quite large island of function, it also puts a very large moat around it. It requires, for a very long region of the chromosome, that recombination signals be very reliably placed, that non-coding DNA be very reliably placed for mutation in the right sector, that specific constant regions in space match, that the RAG1/RAG2 proteins match the recombination signals, etc.

    I don’t think anyone doubts that there are larger and smaller islands of function. But that doesn’t mean that they aren’t islands. Anytime you have function-generators (VDJ recombination, alternative splicing, etc.), it requires a stricter, tighter control on the function generation to prevent catastrophic loss. In fact, a major cause of cancer is the generation of false recombination signals.

    A great place to start on the general idea for this is Dembski’s Searching Large Spaces. If you want to see more specifically what creates islands of function, you should see my paper on what makes irreducibly complex things irreducibly complex.

  25. Allan Miller

    johnnyb,

    Not really. It makes the “island” a bit larger, but with steeper sides. Additionally, it is like having an island with bridges going over all of the ponds in the middle.

    It depends critically upon whether the isoforms have the same or different functions. Since there is not really a satisfactory operational definition of function, this is hard to establish to a satisfactory degree. But basically, anything alternative splicing can do, evolution can do. You can scuttle round with your bridges and moats, desperately trying to circumscribe the region of accessibility, but alternative splicing (to the extent it happens and is of functional significance) gives the lie to the ‘brittle island’ model of naive Creationist supposition, coming as it does from people with notions overly informed by English sentences or the ASCII sequences of computer programs.

  26. Allan Miller

    The challenge: what is it about proteins, given actual mechanisms of change, that persuades us they cannot access other viable neighbourhoods? The answer does not depend upon the persuasiveness of the imagery in one’s analogy.

  27. Allan Miller

    The simple islands argument:

    For any sequence A there is no functional neighbour B accessible by any viable transformation series from A.

    The sophisticated islands argument:

    For any sequence A where there is a B, there is no functional neighbour C …

    I suspect it’s my job to prove there is a C. I wonder what would come next?

  28. johnnyb

    I give a reason for this from computability theory in my paper on irreducible complexity. I show (a) what makes something “difficult”, (b) how this can be mitigated but only by having a source of information, (c) specific biological examples, and (d) using this to find known-designed sequences within AVIDA.

  29. Allan Miller

    johnnyb,

    I don’t quite see the relevance of irreducible complexity to the explorability of protein space. This depends upon mechanisms of transformation, local functional density, and paths of viability, not on any interdependence of parts.

    To relate back to my starting position, is any isoform more or less irreducibly complex than any other?

  30. Rumraket

    johnnyb: I give a reason for this from computability theory in my paper on irreducible complexity. I show (a) what makes something “difficult”, (b) how this can be mitigated but only by having a source of information, (c) specific biological examples, and (d) using this to find known-designed sequences within AVIDA.

    Bla bla bla obfuscatory gobbledygook with no relation to reality.
