# Natural selection can put Functional Information into the genome

It is quite common for ID commenters to argue that it is not possible for evolutionary forces such as natural selection to put Functional Information (or Specified Information) into the genome. Whether they know it or not, these commenters are relying on William Dembski’s Law of Conservation of Complex Specified Information, which is supposed to show that Complex Specified Information cannot be put into the genome. Many people have argued that this theorem is incorrect. In my 2007 article I summarized many of these objections and added some of my own.

One of the sections of that article gave a simple computational example of mine showing natural selection putting nearly 2 bits of specified information into the genome, by replacing an equal mixture of A, T, G, and C at one site with 99.9% C.

This post is intended to show a more dramatic example along the same lines.

Suppose that we have a large population of wombats and we are following 100 loci in their genome. We will make the wombats haploid rather than diploid, to make the argument simpler (diploid wombats would give a nearly equivalent result). At each locus there are two possible alleles, which we will call 0 and 1. We start with equal gene frequencies 1/2 and 1/2 of these two alleles at each locus. We also assume no association (no linkage disequilibrium) between alleles at different loci. Initially the haplotypes (haploid genotypes) are all combinations from 00000…000 to 11111…111, all equiprobable.

Let’s assume that the 1 allele is more fit than the 0 allele at each locus. The fitness of 1 is 1.01, and the fitness of 0 is 1. We assume that the fitnesses are multiplicative, so that a haploid genotype with M alleles 1 and 100-M alleles 0 has fitness 1.01 raised to the Mth power. Initially the numbers of 1s and 0s will be nearly 50:50 in almost all genotypes. The fraction of genotypes that have 90:10 or more will be very small, in fact less than 0.0000000000000000154 (about 1.54 × 10^-17). So very few individuals will have high fitnesses.
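The starting point can be checked by direct counting. The following is only a sketch under the assumptions stated above (100 independent loci, both alleles at frequency 1/2, so all haplotypes are equally likely); the names `fitness` and `initial_tail_fraction` are illustrative and are not taken from the original calculation.

```python
from fractions import Fraction
from math import comb

N_LOCI = 100  # number of loci followed in the example


def fitness(m, w1=1.01):
    """Multiplicative fitness of a haplotype carrying m copies of allele 1."""
    return w1 ** m


def initial_tail_fraction(threshold=90, n_loci=N_LOCI):
    """Exact fraction of the 2**n_loci equiprobable haplotypes
    that carry at least `threshold` copies of allele 1."""
    count = sum(comb(n_loci, k) for k in range(threshold, n_loci + 1))
    return Fraction(count, 2 ** n_loci)


tail = initial_tail_fraction()
print(float(tail))                # initial fraction with 90 or more 1 alleles
print(fitness(50), fitness(100))  # a typical haplotype vs. the best one
```

Relative to a haplotype with no 1 alleles, a typical haplotype with 50 of them has fitness about 1.64, and the all-1 haplotype about 2.70.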

What will happen to these multiple loci? The gene frequency of the 1 allele rises at each locus. The straightforward equations of theoretical population genetics show that after 214 generations of natural selection, the gene frequency of the 1 allele at each locus will be 0.8937253. The fraction of genotypes having 90:10 or more will then be 0.500711. So the distribution of genotypes has moved far enough toward ones of high fitness that over half of them have 90 or more 1s. If you feel that this is not far enough, consider what happens after 500 generations. The gene frequencies at each locus are then 0.99314, and the fraction of the population with 90 or more 1s is then more than 0.999999999.
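These figures can be reproduced with a few lines of code. The sketch below assumes the deterministic haploid model described above, in which each generation multiplies the odds p/(1-p) of the 1 allele by the fitness ratio 1.01, and it assumes the loci stay independent so that the number of 1 alleles per haplotype is binomial. The function names are illustrative, not the author's.

```python
from math import comb


def allele_frequency(generations, w1=1.01, w0=1.0, p0=0.5):
    """Frequency of allele 1 after deterministic haploid selection.

    Each generation multiplies the odds p/(1-p) by the fitness ratio w1/w0.
    """
    odds = (p0 / (1.0 - p0)) * (w1 / w0) ** generations
    return odds / (1.0 + odds)


def fraction_at_least(p, threshold=90, n_loci=100):
    """Binomial tail: fraction of haplotypes with at least `threshold`
    copies of allele 1, assuming independent loci."""
    return sum(
        comb(n_loci, k) * p ** k * (1.0 - p) ** (n_loci - k)
        for k in range(threshold, n_loci + 1)
    )


for t in (0, 214, 500):
    p = allele_frequency(t)
    print(t, p, fraction_at_least(p))
```

Here the tail is taken as "90 or more" 1 alleles at every time point, so that the three rows are directly comparable.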

The essence of the notion of Functional Information, or Specified Information, is that it measures how far out on some scale the genotypes have gone. The relevant measure is fitness. Whether or not my discussion (or Dembski’s) is sound information theory, the key question is whether there is some conservation law which shows that natural selection cannot significantly improve fitness by improving adaptation. My paper argued that there is no such law. This numerical example shows a simple model of natural selection doing exactly what Dembski’s Law of Conservation of Complex Specified Information (LCCSI) said it cannot do. I should note that Dembski set the threshold for Complex Specified Information far enough out on the fitness scale that we would have needed to use 500 loci in this example. We could do so; I used 100 loci here because the calculations gave less trouble with underflows.
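One way to put a number on "how far out" is sketched below. It is an assumption made for illustration, not necessarily the exact definition used by Dembski or in the 2007 article: take the fraction of the initial, equiprobable haplotypes that are at or beyond a given number of 1 alleles, and report minus the base-2 logarithm of that fraction. The function name `bits_beyond` is hypothetical.

```python
from math import comb, log2


def bits_beyond(threshold, n_loci):
    """-log2 of the fraction of equiprobable haplotypes that carry
    at least `threshold` copies of allele 1 (assumed measure)."""
    count = sum(comb(n_loci, k) for k in range(threshold, n_loci + 1))
    # fraction = count / 2**n_loci, so -log2(fraction) = n_loci - log2(count)
    return n_loci - log2(count)


print(bits_beyond(90, 100))   # the 90-or-more region used in this example
print(bits_beyond(100, 100))  # the single best haplotype at 100 loci
print(bits_beyond(500, 500))  # the single best haplotype at 500 loci
```

Under this assumed measure the all-1 haplotype at 100 loci is 100 bits out and the all-1 haplotype at 500 loci is 500 bits out, which is one way to read the remark that 500 loci would be needed to reach Dembski's threshold.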

I hope that ID commenters will take examples like this into account and change their tune.

Let me anticipate some objections and quickly answer them:

1. This is an oversimplified model, you were not realistic. Dembski’s theorems were intended to show that even in simple models, Specified (or Functional) Information could not be put into genomes. It is therefore appropriate to check that in such simplified models, where we can do the calculation. For if natural selection is in trouble in these simple models, it is in trouble more generally.

2. You have not allowed for genetic drift, which would be present in any finite population. For simplicity I left it out and did a completely deterministic model. Adding in genetic drift would complicate the presentation enormously, but would still result in the achievement of a population with all 11111…1111 genotypes after only a modest number of additional generations.

3. If fitness differences are due to inviability of some genotypes, fitnesses could not exceed 1. Yes, but making the 0 allele have fitness 1/1.01 = 0.9900990099… and the 1 allele have fitness 1 could then be used, and the results would be exactly the same, as long as the ratio of fitnesses of 0 and 1 is still 1:1.01.

4. You just followed gene frequencies — what about frequencies of haplotypes? This case was set up with multiplicative fitnesses so that there would never be linkage disequilibrium, so only gene frequencies need to be followed.
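The answers to objections 3 and 4 can both be checked numerically. The sketch below is illustrative only: the function names are hypothetical, it tracks two loci rather than 100 to keep the haplotype table small, and it omits recombination (which can only shrink an existing association between loci, not create one). It runs selection once with fitnesses 1.01 and 1, once with the rescaled fitnesses 1 and 1/1.01, and reports the linkage disequilibrium D at the end.

```python
from itertools import product


def select(freqs, w1, w0):
    """One generation of haploid selection with multiplicative fitnesses.

    `freqs` maps a haplotype (a tuple of 0/1 alleles) to its frequency.
    """
    weighted = {
        h: f * (w1 ** sum(h)) * (w0 ** (len(h) - sum(h)))
        for h, f in freqs.items()
    }
    mean_fitness = sum(weighted.values())
    return {h: w / mean_fitness for h, w in weighted.items()}


def allele_frequency(freqs, locus=0):
    """Frequency of allele 1 at the given locus."""
    return sum(f for h, f in freqs.items() if h[locus] == 1)


def disequilibrium(freqs):
    """Two-locus linkage disequilibrium D = x11*x00 - x10*x01."""
    return freqs[(1, 1)] * freqs[(0, 0)] - freqs[(1, 0)] * freqs[(0, 1)]


# Start with all four haplotypes equally frequent (no association).
start = {h: 0.25 for h in product((0, 1), repeat=2)}
original, rescaled = start, start
for _ in range(214):
    original = select(original, 1.01, 1.0)         # fitnesses 1.01 and 1
    rescaled = select(rescaled, 1.0, 1.0 / 1.01)   # same ratio, none above 1

print(allele_frequency(original), allele_frequency(rescaled))
print(disequilibrium(original))  # remains zero up to rounding error
```

Because the multiplicative fitness of a haplotype factors into one term per locus, the haplotype frequencies remain products of the gene frequencies, which is why only the gene frequencies need to be followed.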

I trust also that people will not raise all sorts of other matters (the origin of life, the bacterial flagellum, the origin of the universe, quantum mechanics, etc.). To do so would be to admit that they have no answer to this example, which shows that natural selection can put functional information into the genome.

Posted by Joe Felsenstein.

Been messing about with phylogenies, coalescents, theoretical population genetics, and stomping bad mathematical arguments by creationists for some years.

## 228 thoughts on “Natural selection can put Functional Information into the genome”

1. I’m being argumentative because I think you have falsely characterized functional sequences as isolated. I’m waiting for the evidence.

2. olegt: So, you are starting with a population that is initially near a peak of its fitness. The fitness landscape changes and the peak shifts away from its previous position. The population will follow the gradient toward the new (local) maximum. Once it reaches the maximum, it will stay there. Right?

Yes.

3. SCheesman: Why do you think I see a problem with that?

Because earlier you wrote this:

SCheesman: OK, then I need to add minor and limited. Eventually the gradient returns to the mean, like a random walk in a large bowl. This is not what evolution requires, in a nutshell.

We’ve agreed that a change in the fitness landscape would lead the population to the new location of the fitness maximum. Why do you think it is at odds with Darwinian evolution?

4. petrushka: I’m being argumentative because I think you have falsely characterized functional sequences as isolated. I’m waiting for the evidence.

I am coming to realize that you see everything as absolutes. I suggest that some functional sequences are isolated, and you accuse me of denying that any are connected. I say that there are things that simple RM+NS cannot accomplish, and you accuse me of ignoring experiments that change the proportions of alleles in a population. Even Elizabeth has offered examples of things she struggles with, but with you there is no difficulty anywhere, not a shred of skepticism or hesitation. Who is the “true believer” here? Why don’t you show me evidence of the power of RM+NS to evolve anything more than trivial changes, ones that demonstrably improve the fitness of a population over their form “in the wild”, or ones that create de novo some new multi-component biological system that confers a new survival benefit? As part of the work, relate (as does Lenski, whose work I greatly admire) changes to specific genomic variations.

5. What about it? No one can demonstrate that such a thing is possible- ya I know that you think there is fossil evidence to support it but you need genetic/ biological evidence, not speculations based on what you think you see in the fossil record.

6. It still doesn’t do anything- it doesn’t construct new, useful multi-protein configurations.

So, sure, you can have that, because with that you still have nothing.

Good work…

7. Tell me how the inner ear is not a complex new function created by incremental change.

8. No, I’ve been answering your questions long enough now. Tell me how RM+NS is sufficient to explain the evolution of the inner ear by incremental change. You’re good at posing questions. How good are you at answering them? What environmental changes led to the change? Why is the current system superior to the previous one, based on fitness? What specific evolutionary mechanisms were responsible? What were the key mutational changes and how did they proceed?

9. Actually you haven’t answered the question I’ve been asking. I really have only one big question, and that is how do you know that function is isolated? What is your evidence?

If you have answered that, you could point me to the answer. Perhaps I missed it.

Now I think on the question of the inner ear you are simply being absurd. We have ample evidence that variation and selection can modify the shape and size of bones, even within the span of a few hundred years. Dog breeds are a prime example.

But anyway, if you actually believe that the inner ear evolution is in question, I at least know something more about your thought processes.

10. petrushka: Actually you haven’t answered the question I’ve been asking. I really have only one big question, and that is how do you know that function is isolated? What is your evidence? If you have answered that, you could point me to the answer. Perhaps I missed it. Now I think on the question of the inner ear you are simply being absurd. We have ample evidence that variation and selection can modify the shape and size of bones, even within the span of a few hundred years. Dog breeds are a prime example. But anyway, if you actually believe that the inner ear evolution is in question, I at least know something more about your thought processes.

The evolution of any system that has been described as irreducibly complex. You might be able to change the size of dog’s bones by breeding, but that is a universe away from creating an entire system of hearing and balance (which is what the inner ear provides, with numerous multi-component systems), which somehow, through a series of lucky accidents, is supposed to have originated from a jaw.

If you don’t like my examples, use Elizabeth’s – OOL and the ribozome. These give you no inner doubts of the efficacy of natural processes?

11. The evolution of any system that has been described as irreducibly complex. You might be able to change the size of dog’s bones by breeding, but that is a universe away from creating an entire system of hearing and balance

What part of the inner ear system is not represented by transitional forms?

12. petrushka: What part of the inner ear system is not represented by transitional forms?

Any part that doesn’t fossilize, I expect. You tell me what is.

13. SCheesman: “They pose the problem differently; in their case you need to specify all the bits to achieve a particular function, like a combination lock, or a series of right/left turn directions to get you to a specific destination. ”

I think we can agree that all of the steps, (i.e. “information”), don’t have to be there for it NOT to work.

All the bit combinations however, are “explored” in “n life-form/generations”.

That’s the take away point, that the “key” we are looking for can be obtained in “n steps”, not “(2^n) steps”.

When that last bit, in the last generation flips, we have our functionality, and we have it in “n” bits, not (2^n) bits.

14. Toronto: This OP is about natural selection leading to Functional Information. I think we can agree that all of the steps, (i.e. “information”), don’t have to be there for it NOT to work. All the bit combinations however, are “explored” in “n life-form/generations”. That’s the take away point, that the “key” we are looking for can be obtained in “n steps”, not “(2^n) steps”. When that last bit, in the last generation flips, we have our functionality, and we have it in “n” bits, not (2^n) bits.

The problem in the OP in no way corresponds to the “key in the lock” problem. As it is laid out it is exactly correct. It starts with a non-zero fitness and there is a gradient each step of the way. For the key in the lock problem there is no gradient, as only if all of the bits match exactly is the fitness non-zero. Personally, I have no problem assenting to the proposition, that in situations where the model as described at the start of this OP faithfully maps onto the real-life genome, then the results will occur as predicted and fitness will increase with mathematical precision. This OP, if I am not mistaken, is pretty well identical to the “Me Thinks It is Like a Weasel” problem, translated into bits.

15. SCheesman: The problem in the OP in no way corresponds to the “key in the lock” problem. As it is laid out it is exactly correct. It starts with a non-zero fitness and there is a gradient each step of the way. For the key in the lock problem there is no gradient, as only if all of the bits match exactly is the fitness non-zero. Personally, I have no problem assenting to the proposition, that in situations where the model as described at the start of this OP faithfully maps onto the real-life genome, then the results will occur as predicted and fitness will increase with mathematical precision. This OP, if I am not mistaken, is pretty well identical to the “Me Thinks It is Like a Weasel” problem, translated into bits.

We’ve gone over this many times already. This thread is about situations where fitness gradients lead a population to a fitness peak. It is emphatically not about completely random landscapes. Can we stop chasing this rabbit trail? If you are interested in defending the idea of random fitness landscapes, why not open a separate thread?

16. SCheesman,

SCheesman: “The problem in the OP in no way corresponds to the “key in the lock” problem.”

1) The only way the “key in the lock” problem CAN be solved is by “information”, according to ID.
2) Can this “information” be generated by natural selection as addressed by this OP?

Imagine a population of (65,535 – 1) where your target window is 16 bits.

There is a very good chance that one out of those individuals is only one bit AWAY from your ID “IC target”.

From that one individual, we can expect the “target” to be reached within 65,535 offspring.

With two surviving offspring from each generational parent along this lineage, we would expect a doubling of the population every generation leading to the target being reached from the base generation within 16 generations providing only one bit changes from the base for each offspring.

Mathematically, IC is not a problem, but at that point, the UPB is meaningless and Dembski’s argument needs biological help, not mathematical.

17. SCheesman,

It’s only 16 offspring at a single bit change when evolution as defined by evos is used.

18. olegt: We’ve gone over this many times already. This thread is about situations where fitness gradients lead a population to a fitness peak. It is emphatically not about completely random landscapes. Can we stop chasing this rabbit trail? If you are interested in defending the idea of random fitness landscapes, why not open a separate thread?

I was not criticizing the thread for the problem it posed, merely trying to explain to Toronto the difference in the two situations. Apparently I am the only one here trying to shine light on the differences.

19. Anyway, do all of you feel this is played out? Let’s start fresh elsewhere. I may or may not jump into the next thread(s).

20. I think this thread is played out; I may come in on a newer one at some point.

21. SCheesman,

SCheesman: “I was not criticizing the thread for the problem it posed, merely trying to explain to Toronto the difference in the two situations. Apparently I am the only one here trying to shine light on the differences.”

The weakness for ID is that “IC” is NOT a problem when it comes to its generation by natural (non-ID), processes as claimed by this post.

Even starting from “random” scratch in a population, if the population is large enough, the “random” start argument means that an n bit window, will statistically have a single population member 1 bit away from “IC” functionality in a population of (2^n) individuals.

Starting with that single member, (i.e. now an island of functionality), a single bit change from that base, can appear statistically, within 16 of its offspring.

22. SCheesman: I was not criticizing the thread for the problem it posed, merely trying to explain to Toronto the difference in the two situations. Apparently I am the only one here trying to shine light on the differences.

You’re the only one here who thinks that random landscapes have any relevance to this thread. They don’t.
