by Joe Felsenstein and Michael Lynch
The blogs of creationists and advocates of ID have been abuzz lately about exciting new work by William Basener and John Sanford. In a peer-reviewed paper in the Journal of Mathematical Biology, they have presented a mathematical model of mutation and natural selection in a haploid population, and they find in one case, which they describe as realistic, that natural selection is unable to prevent the continual decline of fitness. This is presented as correcting R.A. Fisher’s 1930 “Fundamental Theorem of Natural Selection”, which they argue is the basis for all subsequent theory in population genetics. The blog postings on that will be found here, here, here, here, here, here, and here.
One of us (JF) has argued at The Skeptical Zone that they have misread the literature on population genetics. The theory of mutation and natural selection, developed during the 1920s, was relatively fully developed before Fisher’s 1930 book. Fisher’s FTNS has been difficult to understand, and subsequent work has not depended on it. But that still leaves us with the issue of whether the B and S simulations show some startling behavior, with selection seemingly unable to prevent deleterious mutations from continually rising in frequency. Let’s take a closer look at their simulations.
Basener and Sanford show equations, mostly taken from a paper by Claus Wilke, for changes in genotype frequencies in a haploid, asexual species experiencing mutation and natural selection. They keep track of the distribution of fitness values on a continuous scale, in continuous time. Genotypes at different values of the fitness scale have different birth rates. There is a distribution of fitness effects of mutations, as displacements on the fitness scale. An important detail is that the genotypes are haploid and asexual: they have no recombination, so they do not mate.
After giving the equations for this model, they present runs of a simulation program. In some runs with distributions of mutations that show equal numbers of beneficial and deleterious mutations, all goes as expected: the genetic variance in the population rises, and as it does the mean fitness rises more and more. But in their final case, which they argue is more realistic, there are mostly deleterious mutations. The startling outcome in the simulation in that case is the absence of an equilibrium between mutation and selection. Instead the deleterious mutations go to fixation in the population, and the mean fitness of the population steadily declines.
Why does that happen? For deleterious mutations in large populations, we typically see them come to a low equilibrium frequency reflecting a balance between mutation and selection. But they’re not doing that at high mutation rates!
The key is the absence of recombination in these clonally-reproducing haploid organisms. In effect each haploid organism is passed on whole, as if it were a copy of a single gene. So the frequencies of the mutant alleles should reflect the balance between the selection coefficient against the mutant (which is said to be near 0.001 in their simulation) versus the mutation rate. But they have one mutation per generation per haploid individual. Thus the mutation rate is, in effect, 1000 times the selection coefficient against the mutant allele. The selection coefficient of 0.001 means about a 0.1% decline in the frequency of a deleterious allele per generation, which is overwhelmed when one new mutant per individual comes in each generation.
In the usual calculations of the balance between mutation and selection, the mutation rate is smaller than the selection coefficient against the mutant. With (say) 20,000 loci (genes) the mutation rate per locus would be 1/20,000 = 0.00005. That would predict an equilibrium frequency near 0.00005/0.001, or 0.05, at each locus. But if the mutation rate were 1, we predict no equilibrium, but rather that the mutant allele is driven to fixation because the selection is too weak to counteract that large a rate of mutation. So there is really nothing new here. In fact 91 years ago J.B.S. Haldane, in his 1927 paper on the balance between selection and mutation, wrote that “To sum up, if selection acts against mutation, it is ineffective provided that the rate of mutation is greater than the coefficient of selection.”
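The arithmetic above can be checked with a few lines of Python. This is only a sketch of the textbook one-locus haploid recursion (selection against the mutant, followed by one-way mutation to it), not the Basener/Sanford program; the function names `next_freq` and `iterate` are our own, chosen for illustration.

```python
def next_freq(q, s, u):
    """Mutant frequency after one generation.

    q: current frequency of the deleterious allele
    s: selection coefficient against it
    u: mutation rate from the normal allele to the mutant
    """
    q_sel = q * (1.0 - s) / (1.0 - s * q)  # selection acts first
    return q_sel + u * (1.0 - q_sel)       # then one-way mutation


def iterate(q0, s, u, generations):
    """Apply the recursion repeatedly, starting from frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_freq(q, s, u)
    return q


# Mutation rate below the selection coefficient: settles near u/s = 0.05.
balanced = iterate(0.0, s=0.001, u=0.00005, generations=50_000)

# Mutation rate above the selection coefficient: the mutant goes to fixation.
swamped = iterate(0.0, s=0.001, u=0.01, generations=5_000)

# Mutation rate of 1, as in the whole-genome case discussed above.
overwhelmed = iterate(0.0, s=0.001, u=1.0, generations=10)

print(balanced, swamped, overwhelmed)
```

In this recursion the only equilibria are a frequency of u/s and a frequency of 1, so once u exceeds s the interior equilibrium disappears and fixation is all that is left, which is Haldane's point.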
If Basener and Sanford’s simulation allowed recombination between the genes, the outcome would be very different — there would be an equilibrium gene frequency at each locus, with no tendency of the mutant alleles at the individual loci to rise to fixation.
If selection acted individually at each locus, with growth rates for each haploid genotype being added across loci, a similar result would be expected, even without recombination. But in the Basener/Sanford simulation the fitnesses do not add. Instead they generate linkage disequilibrium, in this case negative associations that leave us with selection at the different loci opposing each other. Add in recombination, and there would be a dramatically different, and much more conventional, result.
Technical Oddities
Most readers may want to stop there. We add this section for those more familiar with population genetics theory, simply to point out some mysteries connected with the Basener/Sanford simulations:
1. One odd assumption that they make is that any fitness class that has a frequency below 1 part in a billion gets set to 0. This is not a reasonable way to take genetic drift into account, as all fitness classes are subject to random fluctuations. We imagine such a treatment is a minor issue, relative to the enormous mutation pressure imposed in their study. But someone should check this, which can be done as their Javascript source can be downloaded and then made comprehensible by a Javascript beautifier.
2. The behavior of their iterations in some cases is, well, weird. In the crucial final simulation, the genetic variance of fitness rises, reaches a limit, bounces sharply off it, and from then on decreases. We’re not sure why, and suspect a program bug that we have not yet located. We have found that if we run the simulation for many more generations, such odd bouncings of the mean and variance off of upper and lower limits are ultimately seen. We don’t think that this has much to do with mutation overwhelming selection, though.
3. We note one mistake in the Basener and Sanford work. The organisms’ death rates are 0.1 per time step. That would suggest a generation time of about 10 time steps. But Basener and Sanford take there to be one generation per unit of time. That is incorrect. However, the mutation rate and the selection coefficient are still 1 and 0.001 per generation, even if the generations are 10 units of time.
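The generation-time estimate in point 3 is the usual result that a constant death rate of d per time step gives a mean lifespan of 1/d steps. A minimal Python sketch of that calculation follows. It assumes, for illustration only, that death is applied as a fixed probability once per discrete time step; the function name `mean_lifespan` is hypothetical and does not come from the Basener/Sanford code.

```python
def mean_lifespan(death_rate, max_steps=10_000):
    """Expected number of time steps lived when each step carries the
    same probability of death (assumption: discrete steps)."""
    expected = 0.0
    alive = 1.0  # probability of still being alive at the start of the step
    for step in range(1, max_steps + 1):
        expected += step * alive * death_rate  # dies during this step
        alive *= 1.0 - death_rate              # survives to the next step
    return expected


# A death rate of 0.1 per time step gives a lifespan of about 10 steps.
print(mean_lifespan(0.1))
```

The same 1/d figure holds if time is treated as continuous and lifetimes are exponentially distributed.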
Joe Felsenstein, originally trained as a theoretical population geneticist, is an evolutionary biologist who is Professor Emeritus in the Department of Genome Sciences and the Department of Biology at the University of Washington, Seattle. He is the author of the books “Inferring Phylogenies” and “Theoretical Evolutionary Genetics”. He frequently posts and comments here.
and
Michael Lynch is the director of the Biodesign Center for Mechanisms of Evolution at Arizona State University, and author of “The Origins of Genome Architecture” and, with Bruce Walsh, of “Genetics and Analysis of Quantitative Traits”. Six of his papers are cited in the Basener/Sanford paper.
Joe and Michael,
I’ll bet it was difficult to resist the temptation to refer to “the B and S simulations” as “the BS simulations”. 🙂
Here’s a link to the PDF of the Basener/Sanford paper.
J & M:
They haven’t provided their original, readable source code?
These things are not my thing, but it’s still doing math on a presumption of mutations actually happening and creating biology.
It’s like math done that proves there must be biology out in the universe, or math that proves there can’t be. I’ve seen both say it.
Mutationism in genes as an origin for populations arriving and existing, so many zillions of times, to justify the complexity and diversity we have is unlikely if math matters in these things.
Both sides are squeezing numbers on an untested hypothesis.
They have made available their Javascript. It is in a compressed form (with indentation omitted and with the statements not separated onto different lines). But there are Javascript beautifier websites out there, so I used one on it. I found that their code is not commented, but that it has comprehensible variable names. So I acquit them of any charge of trying to have it be incomprehensible, otherwise they would have used variable names like aa, ab, ac, …
I was able to understand it and relate it to the variables in their math.
Joe,
How did you manage to get Mike Lynch involved in all this?
Mike is an old friend. He has actually gotten involved in refuting creationist arguments before, as in his papers refuting Behe and Snoke some years ago. I think his only hesitancy was that he did not know as much about which blog was which, and what level of explanation was needed. Since his ox was gored multiple times in the Basener/Sanford paper, that helped motivate him.
J & M:
How do they combine the fitnesses? (I haven’t read the paper yet.)
Well, as long as they didn’t forget anything important…
Recombination, the minor detail.
Glen Davidson
Would bacteria qualify as without recombination? I have wondered why they do not become extinct.
Something must be wrong with the model or with the parameters.
They don’t combine them, they just come up with a distribution of overall fitnesses of the mutants.
If they had separate loci, randomly associated (no linkage disequilibrium), then the fitness of a haploid genome would be the sum of the individual locus fitnesses, if those loci did not interact.
The result would be that the distribution of overall fitnesses would be (in technical terms) a convolution of the distributions at the individual loci, and probably something like a distribution of a convolution of a Poisson-distributed number of Normal distributions. A lot different from the distribution that they use.
That, it turns out, matters — a lot.
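To make that concrete, here is a rough Python sketch of the kind of distribution I mean: each genome receives a Poisson-distributed number of new mutations, each mutation has a Normally distributed effect, and the effects are added. All the numbers and names here (`sample_effects`, two mutations per genome, a mean effect of -0.001) are made up for illustration; none of them come from the Basener/Sanford paper or code.

```python
import math
import random


def poisson_draw(rng, mean):
    """Knuth's method for a Poisson-distributed count (fine for small means)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def genome_fitness_effect(rng, mutations_per_genome, mean_effect, sd_effect):
    """Sum of the additive effects of one genome's new mutations."""
    n_mut = poisson_draw(rng, mutations_per_genome)
    return sum(rng.gauss(mean_effect, sd_effect) for _ in range(n_mut))


def sample_effects(n_genomes, mutations_per_genome, mean_effect, sd_effect, seed=1):
    """Draw the total fitness change for many genomes (hypothetical parameters)."""
    rng = random.Random(seed)
    return [
        genome_fitness_effect(rng, mutations_per_genome, mean_effect, sd_effect)
        for _ in range(n_genomes)
    ]


effects = sample_effects(20_000, mutations_per_genome=2.0,
                         mean_effect=-0.001, sd_effect=0.0005)
print(sum(effects) / len(effects))  # should be close to 2.0 * -0.001
```

The mean of such a sum is the mean number of mutations times the mean effect per mutation, and a noticeable fraction of genomes get no new mutations at all.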
Bacteria do have a number of different kinds of recombination system, all a bit odd and haphazard: See Wikipedia: bacterial recombination.
I should add that Javascript is not, AFAIK, compiled, so the executable Javascript they made available is the source code. I think that it is an interpreted language, and it looks a lot like C.
Unformatted and with no comments? D’oh!
All asexual lineages qualify (although there are some backdoors available, as Joe pointed out). But bacteria typically have smaller genomes, which reduces the number of deleterious mutations they acquire per generation per individual.
Well, it was to be expected that the “realistic case” turned out to not be realistic.
Nice post, Joe. Thanks.
What are the definitions of beneficial and fitness? The detrimental mutations are defined as such a priori?
If I read Joe’s OP correctly, detrimental mutations reduce the birth rate (which negatively affects population growth rate):
And beneficial ones increase the birth rate? And this is all determined a priori which they will be?
And how about fitness, what are we calling that?
In summary, the B.S. model shows that you can get deleterious mutations to overcome selection if you set the parameters of your simulation for failure.
Now that’s surprising.
The tone of this JF and ML post seems to me to be... not sure how to put this... of respectful criticism. There are no harsh words for the authors or their motivations. But the flaws in the paper seem pretty fundamental.
The work of the prominent IDers ranges from superficially legitimate arguments with only minor flaws to disgracefully bad arguments coupled with outright lies. Where on the spectrum is this? Should the reviewers be faulted for even publishing it or are the problems associated with it of the sort that some working scientists might make?
Why don’t you write Basener and Sanford with your questions? It must apply equally to them.
OP stickied
A couple of questions:
Have recombinations been observed in LTEE? Or have they been observed to rescue any wild population? Any at all?
Regarding small genomes: doesn’t that imply less junk and the relative risk that a mutation would be critical? Isn’t it the rate of mutation in functional sequences that counts, rather than the number?
Just one more question:
How does the number of generations in LTEE compare with the number of generations in the OP simulation? How long will LTEE have to run in order to test meltdown?
Hi Joe
Thank you for this excellent OP.
However, I am confused. I must be really missing something because of my lack of training in Bioinformatics.
Why would B&S suggestions even blip on any collective radar given Muller’s long-standing “Ratchet” should presumably slay any such shibboleth?
Secondly – I did read through Arlin’s paper and again I am confused!
Isn’t Arlin’s reassessment of history really nothing more than a rehash of history?
I was under the impression that much of what Arlin describes as inadequacies of OMS has in fact already been addressed.
Gould’s Spandrels jump to mind as just one “for example”…
What am I missing?
Yes:
The Javascript beautifier took care of the formatting issue by indenting the code nicely. The variable names were also informative. And I did not bother to try to figure out all the code for producing the output — just the part that changed the variables.
The model is one in which time is continuous (in the program it advances in small steps). Each genotype has a birth rate and a death rate. In such a model the fitness is the difference between the birth rate and the death rate of the genotype. (It is the net rate of increase).
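For readers who want to see what that means in practice, here is a bare-bones Python sketch of such a model. It is not the Basener/Sanford code: mutation is left out entirely, there are only two genotypes, and the names (`step`, `run`, `mean_fitness`) and the rates are invented for illustration.

```python
def step(abundances, birth_rates, death_rate, dt):
    """Advance each genotype by one small time step: dn/dt = (b - d) * n."""
    return [n + dt * (b - death_rate) * n
            for n, b in zip(abundances, birth_rates)]


def frequencies(abundances):
    """Convert abundances to genotype frequencies."""
    total = sum(abundances)
    return [n / total for n in abundances]


def mean_fitness(abundances, birth_rates, death_rate):
    """Population mean of fitness, where fitness = birth rate - death rate."""
    return sum(f * (b - death_rate)
               for f, b in zip(frequencies(abundances), birth_rates))


def run(abundances, birth_rates, death_rate, dt, n_steps):
    """Iterate the small time steps without mutation."""
    for _ in range(n_steps):
        abundances = step(abundances, birth_rates, death_rate, dt)
    return abundances


births = [0.10, 0.11]  # hypothetical birth rates for two genotypes
death = 0.1            # same death rate for both
start = [1.0, 1.0]
end = run(start, births, death, dt=0.01, n_steps=100_000)

print(mean_fitness(start, births, death), mean_fitness(end, births, death))
print(frequencies(end))
```

With no mutation the genotype with the higher net rate of increase simply takes over, and the mean fitness climbs toward that genotype's fitness.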
And yes, phoodoo, the fitnesses are determined a priori. Because the object of the whole calculation is to see how the mix of genotypes changes in the presence of new mutations.
Population genetic theory answers interesting questions like that, completely ignoring people who holler “but how do you know what the fitness is?” Because both Basener/Sanford and Mike Lynch and I know that the “what if” questions are important.
The problems are of the sort that some working scientists might make — but the reviewers should then be faulted for not noticing them.
Basener and Sanford’s work argues that there is a failure of selection to hold deleterious mutations to low frequencies, even in populations so large that there will be no Muller’s Ratchet. So it is important to evaluate the reasons for their simulation behaving as it did.
(You also asked about Arlin’s argument). Arlin is addressing a completely different issue, namely what is the point at which our understanding of evolutionary processes has changed enough to merit renaming the Modern Synthesis. A different issue entirely.
Thank you for clarification… as I mentioned; my foundation in Bioinformatics is inadequate to follow all the ins & outs of your OP.
Regarding Arlin’s post – I understand that he has raised a different issue altogether – but since he raised the question: how about a quick response to my embedded link asking whether he is in fact mistaken?
No, I think it is more useful to have this thread concentrate on the Basener/Sanford model.
Joe Felsenstein,
As always, I remain in your debt! On behalf of my students… THANK YOU!
Good question phoodoo!
Aren’t flightless birds the perfect examples of deleterious mutations overpowering natural selection?
Isn’t the loss of flight the realistic evidence that natural selection is unable to prevent the continual decline of fitness?
No.
Why not?
It’s a pure loss of function due to deleterious mutations…
Flightless birds’ flat breastbones lost the keel that anchors the strong pectoral muscles required for flight… How is that beneficial and an increase in fitness?
If you don’t have a clue how to answer the argument, don’t say NO just because you believe something or would like it to be the way you believe it…
Joe,
Thank you and Michael Lynch for reading Bill Basener and John Sanford’s paper on Fisher’s theorem. We (myself, and others) have alerted Bill and John to this posting at TSZ.
I think Bill and John will be appreciative of your review and criticisms and the attention you have drawn to their work.
That’s about all my role in this for the time being.
Thanks again.
I hope they’ll join the discussion.
It’s astonishing that someone could spend as much time as you do writing stuff about evolution, and still be this clueless about the bare basics of the theory.
Given that is the case, can you plot a graph for me such that it shows when the genome was perfect, i.e. most fit (presumably garden of eden?), and when it will decline to such a level that survival will be impossible?
If you can’t actually do that, don’t you think you should take a step back and reconsider your statement?
What the heck are you even saying Faizel? You are unable to address his issue, but you want to be a good cheerleader? You can’t even reach the level of criticizing his post. It’s more like you are criticizing yourself for being unable to think of anything to say.
Catch 22… therefore no direct answers… 😉
Phoodoo and J-mac, please create a new thread with your inanities. They are simply not relevant here.
Rumraket,
Look like Rumcrackhead is goin to take over as an admin soon…
Is this work just a rehash of Sanford’s dog turd Mendel’s Accountant program from a few years back? The one which was written so whatever values were entered the population always went extinct from “genetic entropy”?
A real evolution killer that turned out to be. 😀
J-Mac,
Then when bats evolved flight from their non-flying ancestors around 52 MYA that was a gain of function and increase in fitness, right?
Or to an IDiot is it the bats lost the ability to not fly?
What were they doing before they could fly, falling out of trees, and bumping into moths?
Living on the ground and in trees like their mammalian cousins chipmunks and squirrels.
Are you claiming bat evolution was a “loss of information” too?
Eating acorns?