Critical Analysis of a Paper on Evolution by Frameshift Translation

Posted on August 31, 2018 by stcordova

Ohno’s 1984 Proceedings of the National Academy of Sciences paper put forward the hypothesis that frameshift mutations can create novel proteins and illustrated his claim with the supposed evolution of nylon-eating bacteria. Several researchers cite Ohno’s 1984 paper favorably, including Dennis Venema, Ken Miller and our very own Arlin Stoltzfus of TSZ. Unfortunately, Ohno’s 1984 hypothesis, as far as nylon-eating bacteria are concerned, is dead wrong.

Here is another paper that also cited Ohno’s 1984 hypothesis favorably. This paper may or may not hold promise as it claims to have found 470 frameshift translations in the human genome.

Frequent appearance of novel protein-coding sequences by frameshift translation

Now, just going through the first few examples of frameshifts in the paper, when I actually went to the NIH GenBank to look up the examples, I got messages like this for the very first “example”:

NCBI Reference Sequence: NM_207478.1 (click to see this obsolete version)

Record removed. NM_207478.1 was permanently suppressed because currently there is insufficient support for the transcript and the protein.

and then the third “example”:

NCBI Reference Sequence: NM_022085.3 (click to see this obsolete version)

Record removed. NM_022085.3: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.

and then the fourth “example”

NCBI Reference Sequence: NM_023939.3 (click to see this obsolete version)

Record removed. NM_023939.3 was permanently suppressed because currently there is support for the transcript but not for the protein.

What the heck does this imply about the claims of the paper? Does it rely on dubious data? Is it possible after the paper was published, that these errors were found and corrected in the databases, hence the paper is making claims on now obsolete premises? I mean a lot can happen in 12 years since the paper’s publication in 2006.

If these “proteins” were only predicted rather than verified, then maybe nothing got frameshifted in the first place. The error would be in the gene prediction, rather than a real frameshift translation having actually been discovered.

That’s not to say there aren’t proteins generated from alternative reading frames. There are such proteins. But this paper claims some sort of gene duplication followed by an alternative reading frame translation. That’s not completely outrageous given we have proteins coded by alternative reading frames, but I just want to try to work through what this paper is saying vs. what the data is saying.

Thanks to all in advance who decide to participate.

76 thoughts on “Critical Analysis of a Paper on Evolution by Frameshift Translation”

John Harshman on September 6, 2018 at 3:06 pm said:

colewd:
John Harshman,

Yes.

The evidence suggests otherwise. Perhaps you are not the best authority on what you understand and don’t understand.
colewd on September 6, 2018 at 5:29 pm said:

Paul C,

From where I’m sitting it’s clear you do not understand more than a fraction of it.

Can you contribute to the conversation? Do you understand the WNT pathway and its regulation?
Rumraket on September 6, 2018 at 6:26 pm said:

I think the conversation is over as you seem to just be blustering.
colewd on September 6, 2018 at 6:55 pm said:

Rumraket,

I think the conversation is over as you seem to just be blustering.

Again, I am sorry for conflating a promoter for a transcription factor protein.
Corneel on September 7, 2018 at 1:01 pm said:

colewd: Again, I am sorry for conflating a promoter for a transcription factor protein.

Is beta-catenin a transcription factor? All the sources that I looked up refer to it as a transcriptional coactivator. I don’t believe beta-catenin has any DNA-binding activity itself.

I am asking you because you are the expert whereas we obviously need to invest a lot of study time.
Rumraket on September 7, 2018 at 1:28 pm said:

colewd:
Rumraket,

Again, I am sorry for conflating a promoter for a transcription factor protein.

Bill I’m not that much concerned that you made a mistake. We all make mistakes from time to time. There’s a deeper issue with your whole approach to these discussions. You skip so many attempts to try to get you to engage substantively rather than your usual habit of merely implying and speaking vaguely.
One can ask you for clarifications on multiple points and concepts you bring up, but you simply ignore those calls for clarification to pile more unsubstantiated generalizations and vague hints on top.

So we are caught up in this endless loop of asking you for substantiations and clarifications of a huge list of claims you’ve made previously, which you respond to by just making more and namedropping some technical irrelevancy which I have to say looks unmistakably like an attempt to give your case an appearance of credibility it simply doesn’t have.
Paul C on September 7, 2018 at 2:43 pm said:

colewd: Can you contribute to the conversation? Do you understand the WNT pathway and its regulation?

I do, but I don’t see what that has to do with my pointing out how little understanding of what you are being told you have. I’m telling you the impression I’m getting from your interactions with others here. That has nothing to do with my understanding or not. As it happens I’m perfectly familiar with this topic but any contribution I could make to the conversation would most certainly not involve you. There would seem to be little point in that, you’d just ignore 99% of what I had to say.

I’m reading this thread because I’m interested in the misrepresentations made by creationists in a realm I’m interested in. Taking part would be to give you a legitimacy you do not seem to deserve, from what I’ve read from you so far. I lurk.

Any more questions?
Paul C on September 7, 2018 at 2:48 pm said:

Rumraket: So we are caught up in this endless loop of asking you for substantiations and clarifications of a huge list of claims you’ve made previously, which you respond to by just making more and namedropping some technical irrelevancy which I have to say looks unmistakably like an attempt to give your case an appearance of credibility it simply doesn’t have.

I’ve noticed a similar pattern from the poster called “Sal”.
stcordova on September 7, 2018 at 3:09 pm said:

Paul C

I’ve noticed a similar pattern from the poster called “Sal”.

You mean, me? 🙂

As far as this OP goes, there wasn’t much of a claim of anything by me. I was simply asking and looking into claims about how much of the human genome originated by frameshift.

I just realized, supposedly two proteins from two separate protein phylogenies are related by frameshift in the human genome. There is a problem here. Are the authors IMPLICITLY and UNWITTINGLY saying, that one protein/gene got frameshifted, and then it ends up in the phylogenetic tree of another protein family??????!!!!!!!!!!!!!!!!!!!!!!!!!!!

Um, I found that same problem with Ohno’s frameshift claim.

Some protein supposedly got frameshifted and then ended up in the family of Beta Lactamase homologs. In fact, I looked at the Conserved Domain Database (CDD) entry for that protein (called NylB in Flavobacteria KI72), and the bit score for that event emerging from random chance was 1 in 2^130. That wasn’t my calculation, but the NIH CDD calculation. I wrote some in the ID community and said in effect, “hey we don’t even need Doug Axe’s calculations to make the argument about beta lactamase evolution as far as Ohno’s protein, the NIH provided numbers for us.”
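For readers who want to see where a figure like 2^130 comes from: under the standard Karlin-Altschul statistics used by BLAST-style searches, a bit score S gives an expected number of chance hits E = m × n × 2^(−S), where m is the query length and n is the size of the database searched. The sketch below is only an illustration of that formula. The query and database sizes in it are hypothetical placeholders, not values taken from the CDD entry.

```python
def expected_chance_hits(bit_score, query_len, db_len):
    """Karlin-Altschul expectation: E = m * n * 2**(-S) for bit score S."""
    return query_len * db_len * 2.0 ** (-bit_score)


# With a search space of 1 x 1, the expectation is just 2**(-S).
bare = expected_chance_hits(130, 1, 1)

# Hypothetical search space (assumed numbers, for illustration only):
# a 400-residue query against 10 million residues of database sequence.
scaled = expected_chance_hits(130, 400, 10_000_000)

print(f"2^-130 alone:             {bare:.3e}")
print(f"scaled by m*n (assumed):  {scaled:.3e}")
```

The search-space factor raises the expectation by many orders of magnitude, but for a bit score this large the result is still vanishingly small. Note that the number describes how often an alignment that good would be expected by chance in a search, which is not the same thing as the probability of an evolutionary pathway.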

Sooooo, this is something to look into.

I was mostly out of pocket this week, but Sal, Intelligent Designer willing, will return to these interesting discussions.

Paul C

I’ve noticed a similar pattern from the poster called “Sal”.

As far as the frameshift discussion, starting with Ohno, I backed my claim up, and even Rumraket eventually saw the light and conceded I was right.

As far as the other stuff, you’ll have to be more specific. But the other stuff is outside of the present OP.
Entropy on September 7, 2018 at 3:17 pm said:

stcordova:
I just realized, supposedly two proteins from two separate protein phylogenies are related by frameshift in the human genome. There is a problem here. Are the authors IMPLICITLY and UNWITTINGLY saying, that one protein/gene got frameshifted, and then it ends up in the phylogenetic tree of another protein family??????!!!!!!!!!!!!!!!!!!!!!!!!!!!

That looks as if you’re mistaken about too many concepts Sal. What the hell does this even mean? I know what phylogenies are. I know what frameshifts are. What I understand doesn’t make sense with what you’re saying here.
John Harshman on September 7, 2018 at 3:42 pm said:

Entropy: That looks as if you’re mistaken about too many concepts Sal. What the hell does this even mean? I know what phylogenies are. I know what frameshifts are. What I understand doesn’t make sense with what you’re saying here.

But look at the all caps and the exclamation points. IT MUST MAKE SENSE!!!!!
stcordova on September 7, 2018 at 3:58 pm said:

John Harshman: But look at the all caps and the exclamation points. IT MUST MAKE SENSE!!!!!

If the frame shifted proteins in their list should belong to phylogenetically established protein families (like say for eukaryotes or tetrapods or mammals), that should be a problem since the frameshifts they found are supposedly unique to humans.

That should be something to look into, at least.
stcordova on September 7, 2018 at 3:59 pm said:

Wouldn’t it be funny if we found promiscuous domains in these frameshifted proteins? Something else to look into.
John Harshman on September 7, 2018 at 5:24 pm said:

stcordova: If the frame shifted proteins in their list should belong to phylogenetically established protein families (like say for eukaryotes or tetrapods or mammals), that should be a problem since the frameshifts they found are supposedly unique to humans.

That should be something to look into, at least.

First step: explain what you’re talking about. Which two proteins?
Entropy on September 7, 2018 at 6:21 pm said:

stcordova:
If the frame shifted proteins in their list should belong to phylogenetically established protein families (like say for eukaryotes or tetrapods or mammals), that should be a problem since the frameshifts they found are supposedly unique to humans.

I’m trying my best to give you the benefit of the doubt Sal, so, please do pay attention, and read for comprehension.

1. A frameshifted gene would look similar, and belong with, its homologous, non-frameshifted, genes when looking at the DNA sequences of the genes. That’s how we’d be able to establish that a frameshift happened, by checking the DNA sequences and noticing the insertion/deletion of nucleotides that thus shift the frame. We would not expect that the protein would look similar to other proteins within the frameshifted region. That region would contain different amino-acids to the original, non-shifted, corresponding regions of other proteins. However, the non-frameshifted regions would still align.

Clear so far? Please be honest, and if something is not clear due to your lack of the proper background, it’s better to try and clarify, than making a fool out of yourself.

2. Frameshifts can happen more than once. So, a gene might have a frameshift and then have a compensatory one “downstream” and get back to the original sequence. That would allow the protein to align at both sides of the frameshift. Further analyses of the DNA sequence would reveal the double frameshift.

3. Before jumping to conclusions, make sure that they said that those frameshifts were unique to humans, and that they’re talking about the whole proteins, frameshifted parts and all, belonging to already established protein families, rather than the original encoded proteins, for example. I suspect that you’re misreading at several levels. Perhaps even reading your misconceptions into the articles (an eisegesis of sorts).

ETA: 4. When something looks as ridiculous as implying that a protein from a frameshifted gene would align all right with its non-frameshifted version, then, the most probable explanation is that you read that wrong. That you’re misunderstanding something. If the authors said that two protein families are “related by a frameshift,” it most probably means that once the frameshift was noticed, the authors were able to establish that one of the protein families arose from a frameshifted version of some gene coding for a protein of the other family, not that the proteins of the families align all right with the proteins of the other family within the frameshifted regions. Clear? When something sounds outrageous, check that you read it right! Look for the most sensical meaning, and read further before jumping to outrageous conclusions. If you don’t verify the meaning you risk ridiculing your own reading comprehension abilities, and, worse, you make people question your honesty.
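Points 1 and 2 above can be sketched with a toy example. The DNA sequence below is invented for illustration and is not from any real gene. The codon table is the standard genetic code. A single inserted base scrambles every amino acid downstream of it, and a later single-base deletion puts the reading frame back, so the two ends of the protein align with the original while the stretch in between does not.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from position 0, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Invented toy gene (no stop codon, to keep the output short).
original = "ATGGCTGAACGTCTGAAAGTTCCG"

# One-base insertion after the second codon shifts the frame.
shifted = original[:6] + "C" + original[6:]

# One-base deletion further downstream restores the original frame.
restored = shifted[:15] + shifted[16:]

print(translate(original))  # MAERLKVP
print(translate(shifted))   # MARTSESS
print(translate(restored))  # MARTSKVP
```

At the DNA level all three sequences are nearly identical, which is how the frameshift would be detected. At the protein level only the unshifted parts ("MA" at the start, and "KVP" at the end of the restored version) still match.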

stcordova:
Wouldn’t it be funny if we found promiscuous domains in these frameshifted proteins? Something else to look into.

It would be funny, but I would not be too surprised if this happened. I know! Scandalous expression in your face. “What a ridiculous thing to say Entropy!” But you’re forgetting that frameshifts can happen back and forth. A frameshift could change a resulting protein sequence into something else, but, later on, a secondary frameshift might reestablish a previously lost domain. Since promiscuous domains are, ahem, promiscuous, their abundance would make them a more probable case for such an occurrence (unless we’re defining “promiscuous” differently). I’d expect that retro-frameshifts should be more likely to be found for promiscuous domains than for rare ones.

I hope that helps.

(Now I’m seeking covering to protect myself from an avalanche of misunderstanding arrogantly presented with misquotes from poorly read articles, etc, etc, and then having to tell myself, I should not have given Sal the benefit of the doubt.)
colewd on September 7, 2018 at 6:27 pm said:

Paul C,

I’m reading this thread because I’m interested in the misrepresentations made by creationists in a realm I’m interested in. Taking part would be to give you a legitimacy you do not seem to deserve, from what I’ve read from you so far. I lurk.

So far there have been no big creationist claims other than the challenge of evolving a cellular regulatory system. There also have been no evolutionary solutions proposed here that I can see.

The evolutionists have helped with my mischaracterization of beta-catenin and offered proper vocabulary to use to describe the action of this protein.
Entropy on September 7, 2018 at 6:38 pm said:

John Harshman: But look at the all caps and the exclamation points. IT MUST MAKE SENSE!!!!!

That’s actually what almost convinced me.
stcordova on September 7, 2018 at 6:41 pm said:

John Harshman: First step: explain what you’re talking about. Which two proteins?

There are 470 pairs listed in Okamura’s paper for humans. 3 of the first 4 in the list had the problems I showed in the OP above, so I question whether they are even properly predicted proteins to begin with!

So there is still some work to do sorting out their supplementary material, which has almost 1000 entries for the human genome that they say can be pared down to 470 after duplicates are removed.

But, I brought up the point that I do think it would be funny if the SUPPOSEDLY frame shifted protein ended up in two separate protein families.

A more reasonable question, less open to debate: there are known proteins coded from alternative reading frames, especially in phages, with some now being found in higher eukaryotes. What if each of those alternatively read proteins fits in a separate, competing protein phylogeny?

That question is deviating from the OP, but I think my colleagues in the ID community would like that question to be explored.

One ID proponent, Siegfried Scherer, is doing work on alternative reading frames.
John Harshman on September 7, 2018 at 6:51 pm said:

stcordova: But, I brought up the point that I do think it would be funny if the SUPPOSEDLY frame shifted protein ended up in two separate protein families.

Ah, so this is a purely hypothetical objection based on something you merely wish would happen. Right?

Scherer isn’t “an ID proponent”; he’s a creationist. Of course you’re a creationist too. Most of the other IDers are trying to distance ID from creationism, so it’s good to see you affirming the relationship.
Entropy on September 7, 2018 at 6:53 pm said:

stcordova:
But, I brought up the point that I do think it would be funny if the SUPPOSEDLY frame shifted protein ended up in two separate protein families.

A protein deriving from a frameshifted gene would be expected to result, later on, in a separate protein family for obvious reasons. Nothing funny about this. What the hell are you thinking / assuming? Either you’re not saying something that you assume everybody to have in mind, or you really have no idea what you’re talking about.
stcordova on September 7, 2018 at 7:07 pm said:

Entropy:

A protein family deriving from a frameshifted gene would be expected to result in a separate protein family for obvious reasons.

But if that family already pre-exists, or the domains exist elsewhere, that’s a problem.

I just found one that is frame shifted and has the ankyrin domain, the p14ARF protein. This is its sequence from UNIPROT if you enter P14ARF and select human:
https://www.uniprot.org/uniprot/Q8N726

10 20 30 40 50
MVRRFLVTLR IRRACGPPRV RVFVVHIPRL TGEWAAPGAP AAVALVLMLL
60 70 80 90 100
RSQRLGQQPL PRRPGHDDGQ RPSGGAAAAP RRGAQLRRPR HSHPTRARRC
110 120 130
PGGLPGHAGG AAPGRGAAGR ARCLGPSARG PG

This is the frame shifted CDKN2A protein:
https://www.uniprot.org/uniprot/P42771

10 20 30 40 50
MEPAAGSSME PSADWLATAA ARGRVEEVRA LLEAGALPNA PNSYGRRPIQ
60 70 80 90 100
VMMMGSARVA ELLLLHGAEP NCADPATLTR PVHDAAREGF LDTLVVLHRA
110 120 130 140 150
GARLDVRDAW GRLPVDLAEE LGHRDVARYL RAAAGGTRGS NHARIDAAEG

PSDIPD

Btw, the two proteins have different Post Translation Modification profiles. Anyone here care to argue those are just noise? How did they co-evolve on the same gene strand?

Btw, p14Arf apparently has a species phylogeny. Did its frameshift partner co-evolve along with species, along with species specific PTMs that are likely tied to development and cell phase?
Entropy on September 7, 2018 at 7:09 pm said:

stcordova: But if that family already pre-exists, or the domains exist elsewhere, that’s a problem.

I already explained that it depends:

Entropy: It would be funny, but I would not be too surprised if this happened. I know! Scandalous expression in your face. “What a ridiculous thing to say Entropy!” But you’re forgetting that frameshifts can happen back and forth. A frameshift could change a resulting protein sequence into something else, but, later on, a secondary frameshift might reestablish a previously lost domain. Since promiscuous domains are, ahem, promiscuous, their abundance would make them a more probable case for such an occurrence (unless we’re defining “promiscuous” differently). I’d expect that retro-frameshifts should be more likely to be found for promiscuous domains than for rare ones.

Please Sal, don’t rush into answering mode. First carefully consider the explanations given to you.
Entropy on September 7, 2018 at 7:14 pm said:

stcordova: Btw, p14Arf apparently has a species phylogeny. Did its frameshift partner co-evolve along with species, along with species specific PTMs that are likely tied to development and cell phase?

Do you think it impossible for a frameshift to happen in a duplicate of the original gene? If a duplicate is frameshifted, you can have parallel phylogenetic histories for each gene, each with a different protein family, yet coexisting after the frameshift happened.
colewd on September 7, 2018 at 7:22 pm said:

Rumraket,

Bill I’m not that much concerned that you made a mistake. We all make mistakes from time to time. There’s a deeper issue with your whole approach to these discussions. You skip so many attempts to try to get you to engage substantively rather than your usual habit of merely implying and speaking vaguely.

I think this is a fair criticism Rum. This was my issue as I got thrown off because I did not have the proper biological description of a protein that we were discussing.

Please understand that Sal and I are often dealing with 5 or more opposing commenters.
colewd on September 7, 2018 at 7:24 pm said:

Corneel,

Is beta-catenin a transcription factor? All the sources that I looked up refer to it as a transcriptional coactivator. I don’t believe beta-catenin has any DNA-binding activity itself.

Probably the right description as it interacts with several other proteins as transcription factors. Thanks.
stcordova on October 5, 2018 at 6:07 pm said:

I have managed to contact Dr. Kohji Okamura who was very kind and gracious to respond.

He informed me that several people are aware of the changes in the RefSeq database in the 12 years since the publication of his paper and have been asking for follow-on research in light of these database revisions. He encouraged me to go ahead and explore and publish appropriate revisions in light of the changing databases.

I specifically mentioned the p14ARF gene/protein in my letter to him. My encounter with p14ARF and other similar alternatively frameable proteins has convinced me of the validity of his discovery of paralogs with alternative reading frames.