A prominent ID supporter at UD, gpuccio, has this to say:
My simple point is: reasoning in terms of design, intention and plans is a true science promoter which can help give new perspective to our approach to biology. Questions simply change. The question is no more:
how did this sequence evolve by some non existent neo darwinian mechanism giving reproductive advantage?
but rather:
why was this functional information introduced at this stage? what is the plan? what functions (even completely unrelated to sheer survival and reproduction) are being engineered here?
Gpuccio references actual biology in his writings and is one of the few at UD that do, and as such I’m prepared to take him at his word that the ID project is now ready to move from simply determining design to answering the questions he posed:
- why was this functional information introduced at this stage?
- what is the plan?
- what functions (even completely unrelated to sheer survival and reproduction) are being engineered here?
If any ID supporter would like to provide a specific example with answers for those 3 points for discussion that would be perfect.
Gpuccio’s OP concludes:
The transition to vertebrates was a highly engineered process. The necessary functional information was added by design.
In response, I simply repeat back the question: what is the plan?
I guess, but my silent ban coinciding with gpuccio’s immediate absence after talking to him for a while was a bit too much of a coincidence for me.
Oh well.
Rumraket,
I am with you so far 🙂
Could you also answer the questions I posed?
What does it mean to “mathematically reconcile” how a de novo gene is created?
Try to explain yourself:
What are you looking for?
What should be in the paper in your opinion?
Why should that be in the paper?
Poochie:
He couldn’t tell that colewd wrote the stupid part? Not surprising that he’d be impressed only by the ignorant junk.
Glen Davidson
Just want to add that I found that Nature Genetics review of orphan genes to be excellent. It answers many of colewd’s questions, or would if he would only read it.
Rumraket,
The papers that I have read give two proposals on de novo genes: 1. gene duplication, 2. non-coding RNA. One paper offers a vague proposal for how a gene, once it has been duplicated and has found a new function, then gets transcribed. The real challenge is finding new function through an almost infinite mathematical space. First a mechanism of change needs to be identified. Then a mathematical model is needed that can repeatably get you from functional space A to functional space B, which is the new gene as a result of gene duplication. The additional challenge here is that natural selection does not help until function is found and successfully transcribed.
While Lenski’s experiment showed through gene duplication and transcription of a duplicated gene how a new feature could evolve (the ability to consume citrate in an aerobic condition) how did the original 480AA enzyme sequence that breaks down the citrate molecule evolve?
colewd,
Paging Joe G. Cleanup on aisle ω0.
Can’t say I believe it was gp’s ideal, but it sucks being right all the time.
It’s useless to post at UD, because sanity always gets you banned.
Only if you think there is just one needle in the haystack. There is no reason to think proteins with function are rare. “Search” only has to find a functional protein, not the functional protein.
ROTFLMFAO!
If you are genuinely interested in understanding this topic, you should start by reading Wagner’s Arrival of the Fittest.
First job for anyone taking colewd’s argument seriously is to find a coding sequence in nature that has no alleles. No variant forms within a species, and none in the lineage.
colewd, on February 13th:
What happened, colewd? Did you get the rest of the way through Wagner’s book and decide that you didn’t want to discuss it after all?
Given the entirety of protein configuration space, what proportion of it is functional? If you can’t answer, how do you know it’s a problem?
And for bonus points, what ‘function’ are you talking about specifically?
Forgot Wagner’s book was the subject of a thread here. Some good comments.
I’ve had the function/search space conversation with a few people, a number of times on UD.
Here are my latest thoughts, as simply as I can put them. Feel free to poke holes or tell me your thoughts:
-Protein function is largely dependent on tertiary structure
-Tertiary structure is largely built by organizations of alpha-helices and beta-sheets
-These secondary structures are easily formed by the majority of sequences, save use of oddballs like proline
-The exception is the active sites themselves, which depend on primary structure, but relatively few amino acids are directly involved here
This means that only a very select number of amino acids would need to be found to produce low-level function (early in a protein’s history), while evolution can then optimize the rest of the protein (middle of a protein’s history), which would then be conserved (late in a protein’s history, and the majority of what we see today in proteins across species).
Alan Fox,
The problem with this argument is that many proteins need to work together in both shape and charge. Small mutational change can cause failure especially with nuclear proteins. Sequence specificity is critical for living organisms functions like cell division and DNA repair.
It is not just a question of whether you think of “one needle in the haystack”, because the haystack is almost infinitely large, so even if you have millions of needles in the haystack you will never find one.
If we look at the bacterial flagellum motor the first protein may have several sequences that work but as you add proteins the next one has to fit with the first one in both shape and charge. When all 40 proteins fit together they need to turn a rotor at very high speed. These proteins need to fit together to perform a specific function “mobility” that allow selective advantage to the bacteria.
After the first protein the following proteins sequences need to be very specific. The flagellum takes up 100 nucleotides in the bacteria DNA. There are 4^100000 possible ways to arrange this. This number represents a haystack that is orders of magnitude larger than our universe.
What paper is that? Surely it is already transcribed, otherwise how the hell did it find a new function?
Why is that a challenge? How far is it from function to function, does the entirety, or a colossal part of the space really need to be searched before a new function is found?
What experimental work has been done to assess that and why aren’t you satisfied with the results?
That would be mutation.
How do genes change? They are encoded by DNA sequences, which can mutate. So they change by mutating.
There are many types of mutations:
Point mutations (delete, substitute or insert single or a few nucleotides).
Fusions.
Duplications (which aren’t restricted to entire genes, it can be smaller parts of genes)
Inversions.
Frameshifts.
And so on.
So we know the mechanism of change, that one we can definitely cross off the list.
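For anyone who wants the list above in concrete form, here is a minimal sketch of those mutation types as operations on a DNA string. The function names and the toy sequences are made up for illustration, and real mutational processes are of course far messier than string edits.

```python
# Toy string-level versions of the mutation types listed above.
# All names here are hypothetical; this is an illustration, not a model.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def substitute(seq, pos, base):
    """Point substitution: replace the base at pos."""
    return seq[:pos] + base + seq[pos + 1:]

def insert(seq, pos, base):
    """Single-base insertion; in a coding region this is a frameshift."""
    return seq[:pos] + base + seq[pos:]

def delete(seq, pos):
    """Single-base deletion; in a coding region this is also a frameshift."""
    return seq[:pos] + seq[pos + 1:]

def duplicate(seq, start, end):
    """Tandem duplication of seq[start:end] (a whole gene or just a part)."""
    return seq[:end] + seq[start:end] + seq[end:]

def invert(seq, start, end):
    """Inversion: replace seq[start:end] with its reverse complement."""
    flipped = "".join(COMPLEMENT[b] for b in reversed(seq[start:end]))
    return seq[:start] + flipped + seq[end:]

def fuse(seq_a, seq_b):
    """Fusion: join two previously separate sequences end to end."""
    return seq_a + seq_b

if __name__ == "__main__":
    print(substitute("ACGT", 1, "T"))
    print(duplicate("ACGT", 1, 3))
    print(invert("AACG", 1, 3))
```

Every one of these takes an existing sequence and returns a nearby one, which is the point: change starts from what is already there.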
What is “functional space A” and what is “functional space B”?
What is a “mathematical model that can repeatably get you” between them?
Do you even know what you are asking for? Is it a computer simulation? An equation? A theorem? What?
Obviously, selection can only select that which exists. Sure.
Suppose we don’t know how that specific enzyme evolved, then what?
I’m also confused you’d even ask this question, since to explain that we’d have to do phylogenetics, yet you don’t think phylogenetics explain how new genes evolve.
Of course there are millions of species of bacteria that manage to survive without flagella, and many different versions of flagella.
Texas sharpshooter argument. Draw the bulls-eye around the bullet hole.
There is no necessity for flagella to exist.
And yet flagella exist! How? If everything everyone knows is wrong or insufficient, shall we pretend we know nothing at all? Why?
From Box 1 of the Nature Reviews Genetics paper I linked earlier:
Seems like more than 2 methods. In addition, if you feel the proposals are too “vague” you could consider looking up the references. Review papers of this sort are by nature and necessity high level overviews. Further reading is needed if you want the actual details.
I see others have already responded but I’ll briefly pick up on this.
colewd
I’ve heard the “lock-and-key” objection before and it isn’t a good analogy. Again you are assuming uniqueness, when many putative sequences may fill the same rôle, and immediate exact specificity, when systems only need to be good enough and refining specificity is a task amenable to selection.
Perhaps relevant: https://www.quantamagazine.org/20140122-a-new-physics-theory-of-life/
Alan Fox,
Why do you think you have large sequence liberty (each protein can have a wide variety of amino acid sequences) in building a multi protein complex that performs a specific function?
Many things wrong with the flagellum argument. I think the most important problem is assuming that a flagellum was pre-specified or anticipated or needed or whatever.
Second major thing wrong, as you point out, is that the claim that a functional flagellum is an all or nothing thing and cannot evolve incrementally.
Basically, when it became apparent that gaps in the fossil record were a losing argument, Behe invented gaps in the genome. This seemed safe, because genomes don’t leave fossils. The history of genome evolution is lost.
And that might be that, except it is possible to reconstruct ancestral sequences and ancestral intermediates.
Dave Carlson,
I see a third method is horizontal gene transfer, but the paper you referenced said that it was a secondary method in modern animals vs bacteria.
The question that I have not seen answered in any of these papers is the origin of the original sequences, i.e. you can shuffle an exon, but where did the original exon sequence come from?
According to Alan Fox protein folds are so prevalent in sequential space that the size of the space does not matter. Do you believe this is the case for most proteins? If so why would proteins be conserved over so many generations?
GlenDavidson,
I hope that gpuccio will immediately retract this disgraceful insinuation that I possess intellectual honesty! My actual view is that we in fact have plenty of clue how such genes arise.
Here, for example, is another one, concentrating on our favourite taxonomic tribe, the Hominini. Favoured, no doubt, because of the cuteness of chimps …
colewd,
And yet clearly it isn’t, because variations on these functional sequences appear throughout life. These are the ‘differences’ we talk of, the things that vary upon the background fabric of common descent. Your argument is self-defeating, soon becoming god-of-the-gaps. There are no intermediates between viable sequences. Sez you. There isn’t a path of ‘needles’, just rather a lot of variant but unconnected needles. Again, sez you. Until one illuminates the entirety of space, it is deemed unconnected.
I do feel we’ve been through all this ‘sequence space’ once too often, though. This being Groundhog Day, colewd seems to have forgotten the bulk of the prior discussions.
Allan Miller,
🙂
That is a goalpost shift. You asked about the source of de novo genes. Several people provided you with cites to relevant papers.
There’s nothing wrong with starting a new topic, but you should recognize that your original question has been addressed.
Do you mean molecule by molecule back to the origin of life? Or what?
Where did a specific chicken come from?
Why is that a problem? All that says is that many different factors affect what makes a protein “functional” or not, what it doesn’t say is that this makes it harder to produce functional proteins. It seems you merely assume this.
Is it especially prevalent for nuclear proteins or did you just pull that one out of your ass?
Yes, they can, this is just ANOTHER way to re-state the fact that there are deleterious mutations. But do they do that all the time?
Right, same thing again. How many times are you going to state the same principle in different words?
Only some sequences are functional, and the function is context-specific. We all already know this. Re-stating it in different words 6 times isn’t going to make evolution impossible.
How many needles are there in the haystack? What is the “density” of needles, how far are they apart? Do you actually know? Taking the proteins that actually exist, how far are they apart using the known mechanisms of change(the types of mutations I listed)?
Wouldn’t that be 4^100? Regardless, does that entire space have to be searched to find something functional, or to find a flagellum? And did the “search” start at a totally random place, or did many of the proteins already exist, and were they close by in sequence space?
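To put numbers on the two exponents being thrown around, here is a quick back-of-the-envelope sketch. The helper name is made up, and the only inputs are the figures already quoted in this thread (4 bases, 100 versus 100000 positions).

```python
import math

def log10_sequence_count(alphabet_size, length):
    """Return log10 of alphabet_size ** length without building the huge number."""
    return length * math.log10(alphabet_size)

if __name__ == "__main__":
    # 4 nucleotides; the two lengths are the ones quoted in the comments.
    for length in (100, 100_000):
        exponent = log10_sequence_count(4, length)
        print(f"4^{length} is roughly 10^{exponent:.0f}")
```

So 4^100 is about 10^60 and 4^100000 is about 10^60206. Either way, the arithmetic only gives the size of the space; it says nothing about how much of it has to be searched, which is the actual question.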
Because that is usually what research into the evolution of existing protein structures reveals.
Interestingly, a significant fraction of the proteins that make up the flagellum are homologous to each other, meaning ancestral stages of the flagellum proteins were “nearby” in sequence space to proteins used in later stages. So duplications happened, and the duplicates mutated, and found new functions as part of an altered structure.
Others are homologous to proteins used in functions that aren’t part of the flagellum (so either the flagellum proteins evolved from these, or these evolved from flagellum proteins, by duplication and divergence). So the flagellum, as is indicated by the evidence, didn’t have to evolve by evolving every single protein de novo, one at a time, from some totally random place in sequence space.
A vocabulary of ancient peptides at the origin of folded proteins.
Regardless of whether sequence space is highly saturated with function or not, proteins will be conserved if they form part of structures or functions that are themselves affecting many other components of cellular biochemistry and function.
For example, suppose you mutated the aminoacyl-tRNA synthetase for alanine so that it instead started aminoacylating its tRNA with leucine. This would affect literally every single protein in the cell. So because the function of that protein is so deeply ingrained in all cellular function, there is an extremely low chance it can change without negatively affecting something else critical for cellular function. It would be absurd to think all the proteins of the cell could replace all of their alanines with leucine and retain their function.
Interestingly, though, there are some amino acids that are latecomers to life (the last parts of the genetic code to evolve), such as tryptophan. There have been experiments in which the codons for tryptophan were reassigned to an artificial amino acid, which subsequently managed to replace tryptophan at every position in every protein the cell used.
Chemical Evolution of a Bacterial Proteome.
There are over twenty thousand tryptophan codons in all the proteins coded for by the E. coli genome. Every single one was replaced, through evolution, by a synthetic but chemically related amino acid in which two carbon atoms (in tryptophan) are replaced by a sulphur atom in an aromatic ring (in Tpa). The experiment ran for about two years, with slow selection gradually adapting the cells to replace all the tryptophan in the entire proteome with the synthetic Tpa amino acid.
Simply put, this should be impossible if you are correct. It isn’t, so you are not correct.
Writing comments in Gordon (KairosFocus) Mulling’s patronizingly arrogant style will also get you banned.
Rumraket,
So your story (or Ken Miller’s) is that some of the flagellum proteins have similar sequences to other proteins inside e coli bacteria. All those similar proteins mutated slightly so they could come together and build an outboard motor every 20 minutes or when e coli cell division occurs. This process evolved by random unguided processes through 4^1000000 of sequential space? Remember, there is no selective advantage until the motor is built. I cannot wrap my head around this possibility maybe you can.
Rumraket,
I am struggling to see how an experiment with synthetic amino acid replacements supports a blind unguided process creating new proteins.
*facepalm*
double facepalm
colewd,
A reminder of what you wrote in February:
I too am looking forward to the discussion. When is it going to happen?
colewd,
We have both read and cited the papers on this and the range is 10^11 to 10^64 for 100 AA. All these numbers are a problem for current evolutionary theory and I believe the sequence function is rarer for eukaryotic nuclear proteins. Your argument requires that you underestimate the problem in order to fit the theory IMHO.
Patrick,
This is true, but only part of the original question has been addressed, not the entire question. There are certainly de novo genes throughout evolution that require new sequences.
Patrick,
I have read Wagner’s book and enjoyed it.
Don’t you understand that would only be a problem if every time an organism reproduces, the resulting genome was completely randomized? That’s certainly not what happens so stop pulling the ridiculous tornado-in-a-junkyard strawman please. It’s really not that hard, FFS
The range of what? I can’t parse this as an answer to the questions asked, except possibly “how big is the haystack”, which was the least interesting of the questions. It doesn’t matter how big the haystack is if all the needles are collected in a small region or if the density of needles is high. That’s what you ought to be asking. One question you might ask yourself is what percentage of 100-amino-acid polypeptides have some selectable function in some environment, which is more useful than asking what percentage could occupy a particular spot in a particular flagellum.
colewd,
Um, did you notice that it destroys your “islands of function” and “needle in a haystack” objections?
All that is required (it should be obvious) is that the neighbourhood contains function. The absolute size of the space is doubleplusirrelevant, as is its average functional density. Especially when one considers that, at a great many residues, all that is required is a polar (or a hydrophobic) acid, it don’t matter what it is. People might be a tad more convincing if they stopped blindly sticking the number 20 in as exponent at every opportunity.
John Harshman,
The size of the haystack is the most important question if you are not near the region which performs a function that can add survivability to the organism. Especially if the needles are all bunched together. At some point a new enzyme does not add survivability you need another function like mobility.
No, the density of the haystack is the most important question. If you are not near the region yadda yadda, that could happen in a big or small haystack. Now, if all the needles are bunched together in a big haystack, that means that a random sequence is unlikely to have any function, but it also means that any function can be reached from some existing protein, in which case gene duplication and divergence would be the main factor. And if density is high all over, new proteins from random bits of junk can be a big source. In neither case is the size of the haystack, per se, very relevant.
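A rough way to see why density near where you already are matters more than total size: a toy calculation, assuming each single-substitution neighbour of an existing sequence is independently functional with some local density. The independence assumption, the function name and every number below are invented for illustration; none of them is a measured value.

```python
import math

def p_functional_neighbour(length, alphabet_size, local_density):
    """Chance that at least one single-substitution neighbour is functional.

    Assumes (unrealistically) that each neighbour is functional independently
    with probability local_density. The total size of the space never appears.
    """
    neighbours = length * (alphabet_size - 1)
    # 1 - (1 - d)^n, computed via log1p/expm1 so tiny densities stay accurate.
    return -math.expm1(neighbours * math.log1p(-local_density))

if __name__ == "__main__":
    # Hypothetical 100-residue protein over 20 amino acids: 1900 neighbours.
    for density in (1e-3, 1e-11):
        p = p_functional_neighbour(100, 20, density)
        print(f"local density {density:g}: P(functional neighbour) = {p:.3g}")
```

With these made-up inputs, a local density of one in a thousand gives a better than even chance of a functional neighbour one step away, while one in 10^11 gives essentially none. The haystack is exactly the same size in both cases.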
Oh, and I think you may have lost sight of what the needles represent in the analogy.
You also failed to explain what you meant, which I would appreciate. What were your numbers intended to be?
Cole’s tactic is effin hilarious:
1. Deny that there are evolutionary mechanisms
2. Ignore the responses and how the known mechanisms constrain genetic variation
3. Pretend that according to evolution, “random mutation” means that the entire genome is randomized
4. Assume that the sequence space is immutable and determined by each organism’s genome size (because of course, evolution is false)
5. Calculate some bogus probability based on imaginary infinite numbers.
6. Therefore gawd