Evolution Visualized

Posted on September 9, 2016 by petrushka

445 thoughts on “Evolution Visualized”

JoeCoder on September 14, 2016 at 4:55 pm said:

Rumraket: “Every time this has been done it turns out the nucleotide and genome-size differences are well within plausible evolutionary rates.”

Certainly. But evolutionarily, raw bits are very easy to come by and they’re not what I’m interested in. I think there are multiple ways to define information, but for our purposes a simple definition will suffice: Nucelotides that must have a specific letter, or else function is degraded. Even if only slightly. As well as “informatin” I also call this “strict” function. As opposed to the “loose” function for which ENCODE’s 80% estimate was criticized. Binding sites would have a higher density of strict function, while many four-fold degeneracy sites would not be strictly functional. Even though removing four-fold degeneracy sites would break a gene, they’re still needed to maintain codon alignment.

So how can we estimate the amount of strict function in genomes? Three approaches that I’ve come across:

First, genome wide association studies. Figure S1B from this paper estimates that only “only 4.9% of GWAS SNPs lie in coding sequence”.

If 2% of the genome is protein coding and if 50% of nucleotides within coding sequences are functional, then we can extrapolate total function. That gives us 2% * 50% / 4.9% = 20.4% of the genome being strictly functional. I say “at least” because our genomes have redundant backups for when one system fails, and these numbers don’t account for that.

Second, ENCODE 2012 found that “even with our most conservative estimate of functional elements (8.5% of putative DNA/protein binding regions) and assuming that we have already sampled half of the elements from our transcription factor and cell-type diversity, one would estimate that at a minimum 20% (17% from protein binding and 2.9% protein coding gene exons) of the genome participates in these specific functions, with the likely figure significantly higher.”

Although not every nucleotide within exons and binding sites is strictly functional, but this 20% estimate also excludes other functional elements.

Third, when conserved RNA structure is taken into account as well as DNA, we find that >20% is conserved, as a lower bound: “>20% of the human genome is subjected to evolutionary selection.” Assuming I’m following their work correctly.

Estimates based on Haldane’s work tell us that given typical human population genetics, it takes about 10 million years just to get about 1600 beneficial mutations, or in my terms 1600 nucleotides of information. So it seems difficult for evolution to produce 3 billion * 20% = 600 million nucleotides of information in our genomes. Even when we grant the shorter generation times and higher fecundity rates of earlier animals improving the rate and strength of selection.

Having 20% of the genome strictly functional would also lead to an unbearable genetic load, leaving evolution unable to even preserve that much information. Deleterious mutations almost always have higher selection coefficients than beneficial mutations. When the whole selection budget is spent trying and failing to remove deleterious mutations, it becomes even more difficult for beneficial mutations to fix.

This also generously assumes whole genomes can be built one mutation at a time, and that traversing neutral or deleterious space is never needed. I expect sometimes it would be, but I have no way of quantifying that so I’m not factoring that in to these calculations.

I am also not arguing that 80% of the genome is junk. I think some of it is junk and some of it fits a looser definition of function. But anyway let me know your thoughts and if you can spot any errors in my thinking. Thanks for the discussion so far : )
Rumraket on September 15, 2016 at 12:49 pm said:

JoeCoder:
Rumraket:“Every time this has been done it turns out the nucleotide and genome-size differences are well within plausible evolutionary rates.”

Certainly. But evolutionarily, raw bits are very easy to come by and they’re not what I’m interested in. I think there are multiple ways to define information, but for our purposes a simple definition will suffice:
Nucelotides that must have a specific letter, or else function is degraded. Even if only slightly.

Okay, how do you calculate that, using a known example from basic genetics and biochemistry?

I want a calculation, a formula I can use, put in a genetic sequence of known function, and compute the quantity of “functional information” in that sequence.

A good example is the Lac-Operon. We know how big it is in nucleotide letters, we know it’s sequence, we know what all the elements in the Lac-operon do.

What is the quantity of functional information in the Lac-operon? In bits, please. How many bits of functional information is there in the Lac-operon?

And then I want to use it to see how much functional information would be created (or destroyd) if, for example, a duplication of another gene from the E coli genome, is placed inside the Lac-Operon, followed by a frameshift mutation in the duplicate gene.

Then we pick out a single piece of the Lac-operon, which is the sequence of DNA that codes for the amino-acid sequence of the Beta-Galactosidase enzyme, and then we calculate the amount of functional information in the protein-coding DNA sequence of the Beta-Galactosidase enzyme.

Then using this formula, we go and take a look at other examples. For example, we could calculate the amount (the quantity, in bits) of functional information was created (or lost) when the gene-duplication in the Long-term evolution experiment that put the Citrate-transporter under control of a different promoter.

Until you can calculate this, your definition is worthless and we will not be able make use of it to make calculations.

Estimates based on Haldane’s work tell us that given typical human population genetics, it takes about 10 million years just to get about 1600 beneficial mutations, or in my terms 1600 nucleotides of information.

First, I think you will have to dig up a reference for that and put it to a population geneticist.
We hve some around here, such as Joe Felsenstein. I don’t have much insight in population genetics so this will be better suited for him.

So it seems difficult for evolution to produce 3 billion * 20% = 600 million nucleotides of information in our genomes. Even when we grant the shorter generation times and higher fecundity rates of earlier animals improving the rate and strength of selection.

The human genome evolved from the genome of the common ancestor we share with Chimps. The Chimp genome is virtually identical to ours. In other words, very very very little of the human genome has been subject to evolutionary change and selection since we and chimps split from a common ancestor.

If the proportion of the human genome that is functional, is 20% of roughly 3 billion, then it’s 600 million nucleotides. And the proportion of the Chimpanzee genome that is functional is also 20%, then it’s also about 600 million nucleotides. And the parts of the genome that are functional are pretty much the same. Then what is the problem again?

Having 20% of the genome strictly functional would also lead to an unbearable genetic load, leaving evolution unable to even preserve that much information.

Could you perhaps supply the calculations that support this contention?

Not that I think it matters that much, since people who know what they are talking about (among which I would not put ENCODE scientists) have estimated the proportion of the genome that is functional to be less than or equal to about 10%.

But regardless if it’s 10% or 20%, the same would be true for chimpanzees. And the gorilla genome is not that far off, and so on and so forth.

Deleterious mutations almost always have higher selection coefficients than beneficial mutations. When the whole selection budget is spent trying and failing to remove deleterious mutations, it becomes even more difficult for beneficial mutations to fix.

Wat? Literally, what the hell are you talking about? Does selection have a budget? Does that budget get spend by removing deleterious mutations before it can start fixing beneficial ones?

Where are you getting this crap from? And let me emphasize, that is some serious crap. Pulled straight out of your ass. Honestly.

This also generously assumes whole genomes can be built one mutation at a time, and that traversing neutral or deleterious space is never needed.

I don’t care what it assumes, it’s nonsensical blather you are pulling out of your rectum.

I recommend you start pondering how to calculate functional information (as in a quantity, in bits) of the Lac-Operon.

How do we calculate it?

I expect sometimes it would be, but I have no way of quantifying that so I’m not factoring that in to these calculations.

What calculations? I see numbers, but they’re asserted, not calculated.

I am also not arguing that 80% of the genome is junk. I think some of it is junk and some of it fits a looser definition of function. But anyway let me know your thoughts and if you can spot any errors in my thinking. Thanks for the discussion so far : )

You are welcome, thank you too.
Rumraket on September 15, 2016 at 1:44 pm said:

… I say “at least” because our genomes have redundant backups for when one system fails, and these numbers don’t account for that.

What backups are those?
Joe Felsenstein on September 15, 2016 at 3:17 pm said:

JoeCoder: Estimates based on Haldane’s work tell us that given typical human population genetics, it takes about 10 million years just to get about 1600 beneficial mutations, or in my terms 1600 nucleotides of information. So it seems difficult for evolution to produce 3 billion * 20% = 600 million nucleotides of information in our genomes. Even when we grant the shorter generation times and higher fecundity rates of earlier animals improving the rate and strength of selection.

Having 20% of the genome strictly functional would also lead to an unbearable genetic load, leaving evolution unable to even preserve that much information.

No. Haldane’s “cost of natural selection” is OK for the case where each substitution is compensating for a deterioration in the environment, The cost is then the reduction in fitness from that.

But when advantageous mutations occur in the absence of an environmental change, there is no cost.

I discussed this a few years ago in my paper on the substitutional load in American Naturalist. Well, actually it was 45 years ago, in 1971. The whole point of the paper was to figure out in what sense Haldane’s “cost” was a cost.

If you see a problem with that argument, let me know where I went wrong.
JoeCoder on September 17, 2016 at 3:05 am said:

Dr. Felsenstein, it’s a pleasure to have you join our debate. I’ve been wanting to test ideas against someone with your knowledge to make sure I’m not in error.

I’ve now read and studied your 1971 paper. I really like the way that your formula 8 simplifies calculating the number of excess offspring needed, given the initial frequency and number of generations. Unlike Haldane’s work I was able to easily follow your calculations.

I agree with your thesis that in a constant environment the cost of substitution is reduced. All else being equal, an advantageous mutation by definition increases the number of offspring, thus providing more offspring to pay the cost. But if the offspring having the beneficial mutation compete with those that do not have it, the haves will take resources from the have-nots and thus increase their rate of death. Thus the cost of selection returns. Alternatively if they do not complete then the population size will increase. But a population can’t increase in size indefinitely and this competition will always come into play at some point. So I think even if the constant environment gives a temporary reduction in cost, this cost must still later be repaid.

But I’m not sure if this constant environment scenario would come into effect. Wouldn’t deleterious load have the same effect as a deteriorating environment, since new deleterious mutations continuously turn a friendly environment hostile? Given the high deleterious rates we are discussing here, I would think that a constant environment would never exist. In your formula 9 an excessive deleterious rate would also make the excess offspring (d) zero, making the time to fix a beneficial mutation (k) infinite. I expect this strains the formula beyond its intent, but this would likely make even our 1600 beneficial mutations too generous.

It’s also possible there are other factors I’m not taking into account, or that I’ve misunderstood or miscalculated something.

Given realistic parameters, what do you think is a realistic rate of the rate of beneficial mutations? I posed this question to you and Dr. Moran a couple years ago and Moran guessed “a few thousand” in 6 million years of human evolution, a number not that far from Haldane’s.

I also wonder if an iterative simulation would be a better way to answer this question. Has anyone done this? If not I do have some familiarity with Mendel’s Accountant and could try it there. I’m not sure what you think of the program, but I’m a programmer by trade and I might be able to modify it, if it can’t reproduce the parameters that you would like to see.

Whether I’m right or wrong, thank you for introducing me to several new concepts in population genetics I had not yet encountered.
JoeCoder on September 17, 2016 at 3:24 am said:

Hello Rumraket,

Sorry it took me more than a day to get back to you. In my responses I give priority to members of the national academy of sciences, and I needed to first have time to read Dr. Felsenstein’s paper.

In the Lac-operon, count the number of nucleotides that reduce or disable its function if they are substituted. If you want the information count in bits, multiply this number by two, since each nucleotide is two bits. If one substitution leads to a reduction in function, two bits of information are destroyed. If that mutation later reverts, two bits of information are created. If a mutation destroys the whole thing, all those bits of information are destroyed.

It’s perfectly fine to use other definitions of information, but this definition is needed here because it is the same metric we use when we calculate the effects of deleterious mutations.

The problem with the 20% functional is two fold:

(1) Saying that none of that function arose since a chimp divergence only puts the problem elsewhere. When did it arise? Genome comparison studies have us sharing like 3-5% of our DNA with mammals in other orders.

(2) 20% function gives us about 20 deleterious mutations per generation. This is an unbearable genetic load. Deleterious mutations arrive faster than selection can remove them until we go extinct.

The 10% function estimates come from DNA conservation studies. These require taking evolution by common descent as a premise. Therefore it’s circular to use them to get the conclusion that <10% is functional and evolution is fine.

Given that Dr. Felsenstein’s paper is about the cost of substitution and calculating how much of a cost is too great, I hope you can now agree that I’m not making things up when I talk of selection having a budget 😛

The backups are redundant gene networks that kick in when primary networks are disabled. ENCODE reports: “Loss-of-function tests can also be buffered by functional redundancy, such that double or triple disruptions are required for a phenotypic consequence.” If you want to know more, see this talk by Dennis Noble. Start at 16:26 and watch about four minutes.
Joe Felsenstein on September 17, 2016 at 5:40 am said:

JoeCoder:

…

But if the offspring having the beneficial mutation compete with those that do not have it, the haves will take resources from the have-nots and thus increase their rate of death.Thus the cost of selection returns.Alternatively if they do not complete then the population size will increase.But a population can’t increase in size indefinitely and this competition will always come into play at some point.So I think even if the constant environment gives a temporary reduction in cost, this cost must still later be repaid.

I disagree. The whole structure of my argument is based on what happens when the population density is low. The cost of natural selection will be too high if the population cannot grow when at low density. Population density regulation will of course affect what density the population achieves once it can grow. But a population having too great a cost of natural selection will show that by its inability to grow back to a higher density when at low density.

But I’m not sure if this constant environment scenario would come into effect.Wouldn’t deleterious load have the same effect as a deteriorating environment, since new deleterious mutations continuously turn a friendly environment hostile?

An interesting point. Yes, but then we would have a starting gene frequency (for the more fit allele) that is very close to 1, not a low starting frequency. If every generation a new deleterious mutant arrives, at frequency $u$ , then $p_0 = 1-u$ so the condition in my paper $p_0(1+d)^k > 1$ would then be $(1-u)(1+d) > 1$ . Then you need $d > 1/(1-u)-1$ which will be $u/(1-u)$ , close to the usual calculation of the mutational load.

Given the high deleterious rates we are discussing here, I would think that a constant environment would never exist.In your formula 9 an excessive deleterious rate would also make the excess offspring (d) zero, making the time to fix a beneficial mutation (k) infinite.I expect this strains the formula beyond its intent, but this would likely make even our 1600 beneficial mutations too generous.

I am not following that paragraph. The usual mutational load calculations apply. Whether the population can tolerate a high mutational load depends on the reproductive excess $d$ . In any case we are talking about deleterious mutants that, in the deterministic theory in my 1971 paper, are not getting fixed.

Given realistic parameters, what do you think is a realistic rate of the rate of beneficial mutations?I posed this question to you and Dr. Moran a couple years ago and Moran guessed “a few thousand” in 6 million years of human evolution, a number not that far from Haldane’s.

That may not be way off, but you have to realize that much of the variation that is being acted on by selection would not be new mutations, but would be existing alleles floating around in the population.

I also wonder if an iterative simulation would be a better way to answer this question.Has anyone done this?If not I do have some familiarity with Mendel’s Accountant and could try it there.I’m not sure what you think of the program, but I’m a programmer by trade and I might be able to modify it, if it can’t reproduce the parameters that you would like to see.

I once tried to download that program but the site was very balky and I gave up. Mostly I wanted to see what was assumed about linkage, as if the chromosomes have no recombination within them, one could get a massive Muller’s Ratchet effect. I have not enough understanding of what they assumed. In downloading their program I was really looking for the documentation, and never got it. Did they make source code publicly available? Would be kind of hard to modify if they didn’t.

And note that I am disagreeing here with the notion that beneficial mutations impose a load.
Rumraket on September 17, 2016 at 4:32 pm said:

JoeCoder:
Hello Rumraket,

Sorry it took me more than a day to get back to you. In my responses I give priority to members of the national academy of sciences, and I needed to first have time to read Dr. Felsenstein’s paper.

LOL. Don’t let it get to your head that THE Joseph Felsenstein responds to your post to correct you.

I wouldn’t put that on my CV.

In the Lac-operon, count the number of nucleotides that reduce or disable its function if they are substituted. If you want the information count in bits, multiply this number by two, since each nucleotide is two bits. If one substitution leads to a reduction in function, two bits of information are destroyed. If that mutation later reverts, two bits of information are created. If a mutation destroys the whole thing, all those bits of information are destroyed.

Then your definiton of functional information is practically useless because it requires knowledge that’s practically impossible to obtain.

Literally the number of mutations that would have to be measured for their effect is practically impossible. Pick an arbitrary nucleotide, it’s got 3 other possible ones, it can also be deleted. And then there’s insertions, which are insertions of anything from a single nucleotide to, well another copy of the entire genome.

The Lac-Operon is thousands of nucleotides long. There’s a practically infinite amount of ways to mutate any given string when you allow all types of mutations.

You delete a nucleotide, functions is degraded ore lost. You delete two, it might be regained. You delete another one, loss again, transition nearby, regained. Etc. etc.

This is work that would take geologic eons to complete for single species.

What use is this definition then? Where’s the punchline? It is an absurdity to think this is the kind of information you would require to accept the possibility of evolution.
I dare say there is no other subject in all of science, indeed in your life at all, you put to the same standard of detail when you accept it’s postulates.

It would be analogous to rejecting a tale that a man has walked across Russia because it didn’t detail which particular straw of grass was stepped on every step of the way.
It betrays a hypocritical double standard to demand something like this.

It’s perfectly fine to use other definitions of information, but this definition is needed here because it is the same metric we use when we calculate the effects of deleterious mutations.

What do you mean “the same definition is used to calculate the effects of deleterious mutations”? I don’t think the definition you have detailed here has ever been used for anything. Certainly, noone has ever calculated the amount of functional information in anything by this method.

The problem with the 20% functional is two fold:

(1) Saying that none of that function arose since a chimp divergence only puts the problem elsewhere.

Who says that “none of that function” arose since the chimp divergence?

When did it arise?

If you’re talking about all the functions found in the human genome, then gradually over the entire history of life. Many of the basic cellular functions go back to bacteria over 3.5 billion years ago. Most of it, however, seems to have evolved from and since sometime around the origin of multicellularity.

Genome comparison studies have us sharing like 3-5% of our DNA with mammals in other orders.

Citation please.

(2) 20% function gives us about 20 deleterious mutations per generation. This is an unbearable genetic load. Deleterious mutations arrive faster than selection can remove them until we go extinct.

Suppose that’s high enough that we should have gone extinct a few million years ago, that would really just indicate the actual proportion of the genome that’s functional isn’t 20%, but less.

There are population geneticists right now that argue the genetic load for humans, due to modern medicine, IS too high, and we’re heading for some sort of meltdown.

The 10% function estimates come from DNA conservation studies. These require taking evolution by common descent as a premise. Therefore it’s circular to use them to get the conclusion that <10% is functional and evolution is fine.

Wait, wait, wait. Above you refer to a paper where they use conservation of RNA secondary structure as indicative of function. That means they do sequence conservation studies and calculate RNA secondary structure from primary sequence.
You write: “Third, when conserved RNA structure is taken into account as well as DNA, we find that 20% is conserved, as a lower bound: “20% of the human genome is subjected to evolutionary selection.” Assuming I’m following their work correctly.”
In other words, you assume common descent for that argument to work. So which one is it?
It is fine to use inter-species sequence comparison to probe the fraction of the genome that is functional, if the number you get implies too great a mutational load for common descent to be true, but NOT okay to use inter-species sequence comparisons to argue common descent is true?.

So your central contention is common descent is false because you have calculated the human genome to be 20% functional and this would incur too great a mutational load, meaning natural selection would not be able to get rid of all the deleterious mutations that would accumulate in a genome 3 billion base-pair genome that’s 20% functional. Right?

At a basic level, ignoring whether your figure is correct, that raises a lot of questions. Such as, why are the human and chimp genomes so similar? Why does this fossil series exist? (See picture). Why is the gorilla genome less similar, and the orangutan even less so, and so on? At some point, you can have so much information that indicates common descent, that if you can come up with a calculation that contradicts it, it is more probable that you have made a mistaken assumption or calculation error than it is that common descent is false.

Regardless, let’s take a look at that 20% figure again you made earlier:

First, genome wide association studies. Figure S1B from this paper estimates that only “only 4.9% of GWAS SNPs lie in coding sequence”.

If 2% of the genome is protein coding and if 50% of nucleotides within coding sequences are functional, then we can extrapolate total function.

No, we really can’t. LOL.

Sorry I didn’t spot this before, but no you really can’t just extrapolate the number of nucleotides from GWAS which fall inside and outside coding regions, that are implicated in trait or disease, into an estimation for the fraction of function for the rest of the genome.

What a GWAS like this instead indicates, is that most of our traits owe to non-coding DNA changes (changes in regulatory regions). Not how much of that non-coding DNA is functional. Like, at all. Basically that study just reaffirms the general evo-devo picture that most of our body form and function has evolved by evolution in developmental pathways, rather than changes in protein coding sequence.

But for your extrapolation to be valid this would require that large fractions of non-coding regions works pretty much like a coding sequence does. Yet we know that it really really doesn’t.
Literally nothing merits this extrapolation, and almost everything we know from genetics argues the opposite.

A GWAS can detect nucleotide substitutions that are implicated in disease, but a disease caused by a mutation that does not entail the nucleotide is from an otherwise functional piece of DNA. It is entirely possible that particular piece of DNA is normally nonfunctional, until it mutates and starts causing disease. For example a mutation that causes causes inteference with normal gene expression because it accidentally causes a noncoding sequence that sits in some long-dead and degraded retrotransposon, to look like a regulatory region of some sort.

Besides, I’m fine with saying as much as 100% of nucleotides in a coding sequence are functional, since the fundamental flaw in the argument you are making, lies with extrapolating function from the distribution of disease or trait-influencing mutations in GWAS, that lie inside and outside coding regions.
There’s another issue here, one raised by you bringing up that RNA secondary structure paper. RNA secondary structures are considerably more tolerant to mutations than protein coding regions. Meaning the overall fold is maintained even when sequence-similarity becomes low. Which would entail a much higher toleration of mutations in functional regions that are still functional as long as RNA secondary structure is maintained. This would imply, even if 20% of the genome was functional, that the mutational load on the RNA-structure conserving regions would be less than for other regions, meaning you’d have to treat them differently for a mutational load argument to work there.

That gives us 2% * 50% / 4.9% = 20.4% of the genome being strictly functional. I say “at least” because our genomes have redundant backups for when one system fails, and these numbers don’t account for that.
(…) The backups are redundant gene networks that kick in when primary networks are disabled.ENCODE reports: “Loss-of-function tests can also be buffered by functional redundancy, such that double or triple disruptions are required for a phenotypic consequence.”If you want to know more, see this talk by Dennis Noble.Start at 16:26

Right, so you bring a nice little example of something that has a level of redundancy incoporated. Yet where is the evidence that this redundancy comes from elsewhere in the genome, rather than already known to be functional regions already implicated in contributing function to heart rhythm?

Denis noble does not mention anything at all about what is causing the redundancy in this function. There’s nothing there about whether hitherto unexplored regions of the genome somehow sets into action and take over function for the knocked out gene. For all we know, this redundancy is in that already known to be functional ~10% of the genome.

This is extremely poor evidence for the idea that at least 20% of the genome is functional due to functional redundancy.
Denis mentions yeast knock-out experiments. Yeast have a tiny genome of roughly 12 megabases, they’re practically like bacteria with pretty much zero junk-DNA and a much, much smaller fraction of their genome being non-coding, than for example primates. I’m fine with a 100% functional yeast-genome.

Given that Dr. Felsenstein’s paper is about the cost of substitution and calculating how much of a cost is too great, I hope you can now agree that I’m not making things up when I talk of selection having a budget

Right, I concede I had no idea what you were talking about. So there’s probably somewhere around 2000 beneficial mutations that has happened since the chimp-human divergence. Now that I’ve read about it, what’s the issue again?
JoeCoder on September 18, 2016 at 5:25 am said:

Dr. Felsenstein,

I 100% agree that beneficial mutations do not impose a load. I remember reading that idea in ReMine’s paper and that immediately stuck out as incorrect.

However, I apologize that I don’t follow your paragraph about low population density. I think the simplest case to model is a constant environment. In that case the population either increases in response to a beneficial mutation (thus increasing density) or remains constant due to competition. How did we get a scenario with low population density? I’m missing something here.

we would have a starting gene frequency (for the more fit allele) that is very close to 1, not a low starting frequency

If the deleterious mutation rate is even as high as 1, then the frequency of all those deleterious alleles would also be 1. Above I am arguing that the deleterious rate may be as high as 20 per generation. I realize a value this high is a major point of contention and it may be useful for us to discuss that. But are you saying that the deleterious mutation rate must be less than 1? What do you think is the correct value?

If every generation a new deleterious mutant arrives, at frequency u, then p_0 = 1-u so the condition in my paper p_0(1+d)^k > 1 would then be (1-u)(1+d) > 1

That means that if the deleterious mutation rate (u) is even as high as 1, then the inequality (1-u)(1+d) > 1 will never be satisfied, no matter what value we use for d. Or maybe I’m missing something obvious?

On the paragraph on high deleterious rates that you didn’t follow: I am arguing that the deleterious rate is high enough that there is no reproductive excess (d).

I’ve had better luck with newer versions of Mendel’s Accountant. The UI is still a little clunky since the program runs as a local web server and the interface as a browser app. The program can be downloaded here (click Windows or Mac), with source code in this .tgz archive. The user manual is displayed within the application when you click the help tab. For your convenience I’ve uploaded the Mendel manual here.

I fully agree that having no recombination would be “a massive Muller’s Ratchet effect” and highly unrealistic. however on the population tab of Mendel’s Accountant, it defaults to 23 chromosomes with 1000 for the “number of linkage subunits.” On this value the user manual states:

“the number of linkage blocks should be an integer multiple of the number of chromosome (e.g. the default value of 989 is 43 times the default 23 chromosomes). MENDEL will automatically adjust to the nearest integer multiple (e.g. if you input 1000 and 23 chromosomes, MENDEL will use a value of 989). The number of linkage blocks is evenly distributed over a user-specified haploid number of chromosomes (default=23). We also offer the researcher the option (turn off “dynamic linkage”) of a simpler model involving the specification of a fixed number of linkage blocks and fully randomized recombination between all linkage blocks each generation (no chromosome number is required).”

Is this what you’re looking for?
JoeCoder on September 18, 2016 at 5:26 am said:

I also did three runs of Mendel with the default parameters, except increasing the offspring per female from 4 to 6 (reproductive rate =3), a sub-population size of 1000, and 1000 generations. Larger population sizes take a lot longer to run. The manual linked above shows the default options for all of the other parameters.

Simulation 1: Non-neutral mutation rate = 1, with 1% of those mutations being beneficial. After 1000 generations there were 9 beneficial mutations that fixed. This is about 2.7 times faster than Haldane’s estimated rate.

http://i.imgur.com/Y7tQpbd.png

Simulation 2: Non-neutral mutation rate = 1, with 0.1% of those being beneficial. After 1000 generations, 0.9 beneficial mutations that fixed. About 3.7 times slower than Haldane’s rate.

http://i.imgur.com/9HUzepG.png

Simulation 3: Non-neutral mutation rate = 20, with 0.1% of those being beneficial. After 1000 generations, 11.7 beneficial mutations that fixed. About 3.5 times faster than Haldane’s rate.

http://i.imgur.com/gi8iSFC.png

At first I was surprised that simulation 3 had a higher rate, but I immediately realized that having 20 times more mutations also meant 20 times more beneficial mutations. Even given the rate the deleterious load rockets upward.

If you’re not able to get Mendel to run, then I can do additional runs if you can tell me the parameters you’d like to see.
JoeCoder on September 18, 2016 at 5:28 am said:

Rumraket, I ran out of time tonight so a response to your points will have to wait. You have my apologies. I respond to Dr. Felsensein first because he is an authority in these matters and I therefore consider his feedback more valuable. I hope you don’t mind.
Rumraket on September 18, 2016 at 7:44 am said:

JoeCoder:
Rumraket, I ran out of time tonight so a response to your points will have to wait.You have my apologies.I respond to Dr. Felsensein first because he is an authority in these matters and I therefore consider his feedback more valuable.I hope you don’t mind.

Take all the time you need kiddo.
petrushka on September 18, 2016 at 1:31 pm said:

https://www.ucdavis.edu/news/essential-mouse-genes-could-guide-human-precision-medicine

One third of mouse genes studied were so critical that mutations caused death at the embryo stage. This seems consistent with estimates of spontaneous abortion in humans.

I’m wondering how these numbers affect genetic load. My naïve impression is that one third of mutations to functional DNA can never become fixed.
Mung on September 18, 2016 at 2:11 pm said:

JoeCoder: I 100% agree that beneficial mutations do not impose a load. I remember reading that idea in ReMine’s paper and that immediately stuck out as incorrect.

You probably misunderstood his paper. Remember, his paper was peer reviewed by the best. [ Not meant to be a knock on Joe. 😉 ]
Mung on September 18, 2016 at 2:21 pm said:

JoeCoder: The program can be downloaded here (click Windows or Mac), with source code in this .tgz archive. The user manual is displayed within the application when you click the help tab. For your convenience I’ve uploaded the Mendel manual here.

Thank you JoeCoder!
Mung on September 18, 2016 at 2:24 pm said:

petrushka: One third of mouse genes studied were so critical that mutations caused death at the embryo stage.

There’s been some obvious latching going on there!
Rumraket on September 18, 2016 at 4:44 pm said:

petrushka: https://www.ucdavis.edu/news/essential-mouse-genes-could-guide-human-precision-medicine

One third of mouse genes studied were so critical that mutations caused death at the embryo stage. This seems consistent with estimates of spontaneous abortion in humans.

I’m wondering how these numbers affect genetic load. My naïve impression is that one third of mutations to functional DNA can never become fixed.

Well, it does bring up an interesting question I think, which is how deleterious the average mutation in a functional protein coding sequence is, versus how deleterious the average mutation in a functional but non-coding sequence is.

In the article you link they tested mutations in protein-coding sequences. My suspicion would be that you’d get much higher tolerance of mutation in functional non-coding regions, such as target-sequences for DNA binding proteins such as enhancer and promoter sequences, for the same reason that a phenomenon like pervasive transcription takes place.

In other words, the constraints operating on coding sequences are much more stringent than those operating on non-coding sequences. But that brings up another point the long-term effects of which are not intuitively obvious to me, which is that even though certain genome regions might be more tolerant to mutations, eventually given enough time, you’d still end up with too many deleterious mutations in them if there are not enough beneficial ones to offset the effect.
JoeCoder on September 19, 2016 at 2:59 am said:

Rumraket, I’d like to now respond to your longer post #8 above. This is long, so please point it out to me if I’ve missed or misunderstood any of your points. I’m sure you will 🙂

I don’t know why you think it’s “practically impossible to obtain” the number of nucleotides in a gene that are sensitive to mutation? Studies measure this all the time. Five examples: In fly exons: “the average proportion of deleterious amino acid polymorphisms in samples is ≈70%”. In human exons: “19% of [amino acid altering] mutations are effectively neutral.” humans again: “at most 30% of non-synonymous mutations are neutral”. In the bacterial beta lactamase gene: “about one in four random single-residue changes are functionally neutral”. Ribosomal proteins in Salmonella: “most mutations (120 out of 126) are weakly deleterious and the remaining ones are potentially neutral.” Substitution mutations are much more common than all the kinds, and non-3n indels are almost always deleterious, so including your suggested permutations wouldn’t have much effect on these numbers. I’ve actually never heard someone argue that this can’t be calculated.

It’s very common in the literature to use “functional” and “subject to deleterious mutations” interchangeably. Ask any of the credentialed evolutionists who comment here if you don’t believe me. This is my definition as well, which I also call “information.”

On 3 to 5% of DNA shared, see figure 2 of this conservation study. Homo-canis, homo bos, homo mus, mus-equus, mus-canis, and mus-bos are all around 100 to 150Mb, which is 3-5% of the genome. Homo-equus is a little higher at 150 to 200MB, or 5 to 7%. However this dose not measure sequences that share RNA structure between these species. I don’t have numbers for that.

Most of [the function] seems to have evolved from and since sometime around the origin of multicellularity.

Let’s do a crude summation of how much function could be obtained in 700m years, given our numbers so far. Using my third run of Mendel’s accountant, that gives us 11.7 beneficial mutations every 1000 generations. If we grant 700 million generations in 700 million years, that comes to 8.2 million nucleotides of functional genome. That’s only 0.27% of the genome!

Here is one reason I think I am not being generous enough with this estimate:

(1) The population genetics of sponges and fish makes selection more efficient among them than it is in humans or mammals, and would increase the rate of beneficial fixation.

But here are four reasons I think I am being too generous:

(1.) A mutation rate of 20 would make us extinct a thousand times over.
(2.) I expect that 1 in 1000 mutations being beneficial is too generous by at least a couple orders of magnitude, or more.
(3.) It assumes every system can be built by successive mutations. Neutral and deleterious space never have to be traversed.
(4.) Simpler organisms have smaller genomes and thus a lower mutation rate.

So I overall this is one reason I do not find it compelling that evolution could have created us.

A better estimate would do multiple simulations using the differing population genetics at various points in evolutionary history. I hope to do that eventually, but that’s beyond the time I plan to invest in this debate.

if the number you get implies too great a mutational load for common descent to be true, but NOT okay to use inter-species sequence comparisons to argue common descent is true?

That’s right. Sometimes it make sense to temporarily grant the assumptions of a theory in order to show that it is not internally consistent. That is the approach I am taking here. But even if you accept theistic evolution or even reject common descent entirely, shared sequences can still give a lower bound for function. If they were not functional then neutral mutations (if given enough time) would make them non-identical.

It is entirely possible that particular piece of DNA is normally nonfunctional, until it mutates and starts causing disease.

I would expect such cases are exceedingly rare. Rare enough that they would make no discernible difference in my calculations. Do you know of any data otherwise? The junk DNA proponents already argue that 85% of DNA is already being spuriously transcribed without us all being genetic misfits.

RNA secondary structures are considerably more tolerant to mutations than protein coding regions.

Certainly. Above I assumed that 2% of the genome is protein coding. If RNA was just as sensitive to mutations, then 98% of deleterious mutations would be outside of protein coding regions. It’s only 95.1%.

even if 20% of the genome was functional, that the mutational load on the RNA-structure conserving regions would be less than for other regions

Taking the RNA structures that constitute part of the 20%: If they could tolerate lots of mutations then they would no longer stay similar in different species. That would make them not show up within that 20%.

This is extremely poor evidence for the idea that at least 20% of the genome is functional due to functional redundancy.

Somewhere I’ve communicated poorly. My 20% estimate has nothing to do with redundancy. I was only saying that redundancy would increase it beyond 20% by an unknown amount:

where is the evidence that this redundancy comes from elsewhere in the genome, rather than already known to be functional regions

Above I estimated that 50% of nucleotides that code for gene are functional / subject to deleterious mutations. This comes from taking the lower bound of 70% of amino acid altering mutations being deleterious, times 70% of mutations being amino acid-altering. Plus another 1% for synonymous mutations makes 50%, although I expect would be higher. This estimate comes from genes that are known to be functional. But the genome wide association studies will likely not find redundant functional regions in non-coding DNA, because if they are knocked out then another system compensates. So this is why redundant sequences increases the estimate.

Given the assumptions of random mutation + selection that conservation studies are built on, redundant sequences shouldn’t even exist because there is little to no selection to maintain them. So I’m not sure how to factor their presence into the 20% conserved estimate.

“So your central contention is common descent is false because… too great a mutational load…” -> Yes, although my argument is only against unguided (unintelligent) mutation+selection being able to create or even preserve our genomes. You could be a theistic evolutionist, invoke many miracles, and still accept every part of my argument.

It wasn’t my intent to argue against common descent here. But I actually do reject common descent, so I will give you my perspective on each of your points in a following post.
JoeCoder on September 19, 2016 at 3:03 am said:

Rumraket, on common descent I’ll first discuss general trends in the fossil record and then talk specifically about hominids.

Don Prothero says “Nearly all metazoans [meaning animals] show stasis, with almost no good examples of gradual evolution.” Doug Erwin says that the further we ascend the taxonomic hierarchy (species -> genus -> family -> order, etc.), the fewer intermediates there are:

“Darwin and the proponents of the Modern Synthesesis expected insensible gradiation of form from one species to the next, this is only sometimes found among extant species (for example, among cryptic species) and is rare in the fossil record. Gradiations in form are even less common at higher levels of the Linnean taxonomic hierarchy… In the past non-paleontologists have attempted to rescue uniformitarian explanations by ‘explaining away’ this empirical pattern as a result of various biases.”

This clumpiness is the opposite of what gradual evolution predicts, and it can’t be explained away by appealing to sampling biases (missing fossils). If evolutionary theory is true, we should expect MORE intermediates as the hierarchy is ascended, because there are wider gaps to fill. But instead we see less.

Let me try to quantify this. Suppose we could assign a single number to represent the differences between organisms. The distance between sapiens, neanderthals, and erectus is perhaps 10 points, with enough intermediates in between to reduce each step to 1 or 2 points. I agree they all share a common ancestor. The distance between tiktaalik and other fishapods is more like 50 points on our scale, not that those fossils have meaningful order. If you want to propose a common ancestors for cephalapods, arthropods, nematodes, and chordates, you can propose a sponge here or a hallucigenia there as intermediates, but I’d expect the differences are now something like 200-500 points on this scale. Someone name something else if we have better candidates? I’m not a fossil guy and I’m only discussing my view because Rumraket asked.

But we see a similar pattern if we try to build a family tree of designed objects. The end branches are full of similar species–hundreds of models of iPhones and androids all cluster tightly together. But if you go back further, what’s the common ancestor of an iPhone and a dell computer? What about an iPhone and a Prius? Just like the fossil record our gaps get bigger.

Among hominins, I think just about everything in the genus homo shares a common ancestor. Neanderthal nuclear DNA even groups within the variation seen in sapien DNA.

As for assembling fossils into a sequence like your image, I’m not convinced the signal overpowers the noise of convergence. For example I can arrange modern canine skulls into an even better sequence, but they’re all the same species. I could even throw a marsupial wolf at it would fit right in, but that’s a completely different order of mamallia. You would only know by comparing several minor features (e.g. extra molars, hole in palate) that group it with other marsupials. Canines share more genetics with bats, whales, and humans than they do marsupial wolves.

Or I could put together a sequence of aardvarks, anteaters and pangolins. We actually thought they were very closely related until genetics put them in different orders.

On genetics, humans and chimpanzees share about 95 to 96% of their DNA. But for 23% of that DNA, it is more similar to other apes than it is to Chimpanzees. If you assume evolution you can argue that this is from incomplete lineage sorting, but discordant gene trees are also what you’d get if you attempted to build a phylogeny of designed objects. So I don’t find it compelling that this pattern differentiates between common descent and common design.

See figure 2 from this paper for more examples if discordance. It seems to extend to all levels of the taxonomic hierarchy.

Hopefully I have not made any errors here. I know less about fossils and phylogeny than I do about population genetics.
JoeCoder on September 19, 2016 at 3:08 am said:

Mung, It may be possible that I’ve misread ReMine. I wonder if some of his reviewers were reluctant to publish because he accused so many others in the field of misrepresentation and deliberate obfuscation. Having a little tact can go a long way.
Joe Felsenstein on September 19, 2016 at 6:25 am said:

JoeCoder,

Thanks for describing the results of your runs. However some details are missing:

Simulation 1: Non-neutral mutation rate = 1, with 1% of those mutations being beneficial. After 1000 generations there were 9 beneficial mutations that fixed. This is about 2.7 times faster than Haldane’s estimated rate.

What was the selection coefficient favoring those beneficial mutrants? For unlinked loci I can use that to calculate what we might expect.

Simulation 2: Non-neutral mutation rate = 1, with 0.1% of those being beneficial. After 1000 generations, 0.9 beneficial mutations that fixed. About 3.7 times slower than Haldane’s rate.

Same questions.

Simulation 3: Non-neutral mutation rate = 20, with 0.1% of those being beneficial. After 1000 generations, 11.7 beneficial mutations that fixed. About 3.5 times faster than Haldane’s rate.

Same questions.

So … why wasn’t Haldane’s cost stopping those mutations from fixing? Or killing off the species?
Joe Felsenstein on September 19, 2016 at 6:32 am said:

JoeCoder: On the paragraph on high deleterious rates that you didn’t follow: I am arguing that the deleterious rate is high enough that there is no reproductive excess (d).

Using the usual mutational load calculation, if the total number of mutations per haploid genome was 1 per generation, the mutational load should be $1 - e^{-1}$ or about 0,63. The reproductive excess needed to survive this would be about $1.718$ ,
Joe Felsenstein on September 19, 2016 at 6:41 am said:

Mung: JoeCoder: I 100% agree that beneficial mutations do not impose a load. I remember reading that idea in ReMine’s paper and that immediately stuck out as incorrect.

You probably misunderstood his paper. Remember, his paper was peer reviewed by the best. [ Not meant to be a knock on Joe. ]

I reviewed Remine’s paper for Journal of Theoretical Biology. But after it was rejected from there and he put it into Creation Research Society Quarterly, he did not do revisions that reflected my review. So the version that appeared there is not “reviewed by the best” (assuming that I am the best, which on this issue is a fair assumption).

It is worth noting that Remine’s “cost” is a consequence of substitutions, not something that slows down fixation. It is a very different thing from the Haldane cost.
JoeCoder on September 19, 2016 at 5:42 pm said:

Dr. Felsenstein,

Thank you for taking the time to review this. Hopefully I can help move us along further:

What was the selection coefficient favoring those beneficial mutrants?

The selection coefficients follow a Weibull distribution. My screenshots above showed the parameters I changed on Mendel’s Basic tab. Here is a screenshot of the other tabs, all of which I left at their default values:

http://i.imgur.com/h2zniur.png

Section 3.2 of this paper from Sanford describes how these parameters are used to create the Weibull distribution. The maximum beneficial effect defaults to 0.01. You’re looking for an average value but I’m not sure how to get it from this distribution.

To gen an idea of what this distribution looks like, in Mendel I made every mutation beneficial and set heritability down to 0.001. I got this plot of accumulated beneficial mutations:

http://i.imgur.com/Va2oWeD.png

It looks like the average beneficial selection coefficients range from 10^-3 to 5×10^-9 and average around 5×10^-5 or 10^-6 .

why wasn’t Haldane’s cost stopping those mutations from fixing?

Above I misspoke: For “X beneficial mutations that fixed” I should have said “average beneficial mutations per individual”. This is also how the result is reported in Mendel. The distinction is that different lineages have accumulated different beneficial mutations.

My simulation 3 has a mutation rate of 20 with 0.1% of those mutations being beneficial. That gives every lineage gets a beneficial mutation once every 50 generations. In 1000 generations we should expect 1000/50/2 = 10 such mutations to accumulate in every lineage even if they were completely neutral. So I think about 10 of those 11.7 mutations are accumulating by neutral evolution alone.

What about the other 1.7? I expect our population is buying beer (beneficial mutations) before they pay the rent (removing deleterious mutations). It’s just as well since at u=20 they can’t afford the rent anyway.

Or killing off the species?

On Mendel’s Computation tab, the value for “go extinct when mean fitness reaches:” defaults to 0. With u=20, after 1000 generations the fitness had only declined from 1.0 to 0.48.
JoeCoder on September 19, 2016 at 5:43 pm said:

Dr. Felsenstein,

1 per generation… reproductive excess needed to survive this would be about 1.718,

I remember Dan Graur calculating this with a del. mutation rate of 80 (from ENCODE’s 80%) and getting some ridiculously huge number. But even at u=20 we get each woman needing almost a billion offspring. Is there a functional genome size that is realistic and still small enough to make this work? I read Dr. Moran’s blog frequently so I’ve seen many of the arguments.

However I do think the minimum_offspring = 2e^u formula is too harsh because it doesn’t take recombination into account. Suppose mutations accumulate to the point where mom and dad each have 100,000 del. mutations. If there are 10 offspring, 25% of them will inherit only 99,900 del. mutations. This can be modeled with a binomial distribution. Then with ~20 new mutations those offspring still have less than mom and dad.

But according to Sandford this solution only solves the problem if every del. mutation has an equal effect. I find this intuitive, because in the real world selection eliminates those with lowest fitness, rather than those with the highest count of deleterious mutations.

Let me know if I have any errors in my thinking. And thanks for your time so far.
petrushka on September 19, 2016 at 6:33 pm said:

I have to wonder if Mendel’s Account models anything in the rel world.

Has any species or any lineage gone extinct due to accumulation of deleterious mutations?
JoeCoder on September 19, 2016 at 6:59 pm said:

petrushka,

Mutagenesis has been used to drive some viral lineages to extinction, but I was curious if load has led to the extinction of anything more like us. I googled and found this paper:

“using previously published estimates of inbreeding in Neanderthals, and of the distribution of fitness effects from human protein coding genes, we show that the average Neanderthal would have had at least 40% lower fitness than the average human due to higher levels of inbreeding and an increased mutational load… Neanderthals have a relatively high ratio of nonsynonymous (NS) to synonymous (S) variation within proteins, indicating that they probably accumulated deleterious NS variation at a faster rate than humans do. It is an open question whether archaic hominins’ deleterious mutation load contributed to their decline and extinction.”

Neanderthals are indeed extinct. But I haven’t read the paper beyond the abstract and a skim of the intro. There may be better examples, but I haven’t looked.

Deleterious mutations accumulate the fastest in species with larger functional genomes, longer generation times, lower reproductive rates, and smaller populations. So these would be the most prone to extinction from del. mutations.

I have to wonder if Mendel’s Account models anything in the rel world.

If someone can show that it doesn’t in some regard, I’d like to see if I can modify its parameters or its source (if not too time consuming) to model what we need. I’ve read through quite a few criticisms of Mendel’s Accountant on various blogs and forums and found them wanting. But there’s a lot I don’t know so I’m interested ti see if there are valid criticisms.
petrushka on September 19, 2016 at 7:08 pm said:

I take that as a no.

The Neanderthals seem not to be entirely extinct, since they have living descendants.

Unless we are GMOs.
petrushka on September 19, 2016 at 7:32 pm said:

The Lenski experiment bottlenecked populations of bacteria for thousands of generations, with no diminishment of viability. Would you expect them eventually to go extinct?
JoeCoder on September 19, 2016 at 7:35 pm said:

Petrushka, can you name an animal species where deleterious load definitely did not contribute to their extinction? Even among those hunted to extinction by humans, perhaps with less deleterious load they would’ve had a higher reproduction rate and still be alive. The issue is we don’t have a good way of knowing.

Do you at least agree that perpetually increasing the number of deleterious mutations carried by a lineage will always lead to extinction?
JoeCoder on September 19, 2016 at 7:44 pm said:

petrushka,

The Lenski experiment

E coli get something like one mutation every 2000 cell divisions. Selection is also far stronger in microbes than in mammals. E coli likely have very little problem, if any, with the accumulation of deleterious mutations.

with no diminishment of viability

All 12 lineages lost their rbs operon. Some of them also lost genes related to “central metabolism (knockouts of pykF and nadR), cell wall synthesis (adjacent to the promoter of pbpA-rodA), and ill-defined functions (knockouts of hokB-sokB and yfcU)”
petrushka on September 19, 2016 at 7:49 pm said:

JoeCoder: Do you at least agree that perpetually increasing the number of deleterious mutations in a lineage will always lead to extinction?

The question is whether, in the last three of four billion years of continuous reproduction, that has ever happened.

I read that 99 percent of all known species are extinct. But most extinctions seem to result from rather massive and sudden changes in the environment.

Nevertheless, we are descended from a continuous, unbroken chain of reproduction. The tips of the tree sometimes die, and large branches are sometimes fall off due to win or lightening, but your meltdown doesn’t seem to happen.

I’ll ask one more time. Suppose you restart the Lenski experiment, only this time, for each generation, transfer only one cell to the new medium. What do you think will happen?
petrushka on September 19, 2016 at 7:56 pm said:

JoeCoder: E coli likely have very little problem, if any, with the accumulation of deleterious mutations.

Okay, name a lineage that went extinct due to genetic meltdown.
petrushka on September 19, 2016 at 8:15 pm said:

I might add that you picked for an example, a species of bacteria having an unusually low mutation rate per replication.
JoeCoder on September 19, 2016 at 8:16 pm said:

But most extinctions seem to result from rather massive and sudden changes in the environment… name a lineage that went extinct due to genetic meltdown

How do you differentiate that (or any other cause of extinction) from extinction by genetic load? As load increases, a species becomes more susceptible to extinction by predators, environment, disease, etc. Was JFK killed by an assassin or a bullet?

———

If the e coli are isolated after every generation then selection is non-existent and they probably would’ve accumulated more deleterious mutations than they did. Experiments like this will always lead to deleterious mutations accumulating faster than in the real world, because in the lab there is no selection to maintain the genes that are being lost.
JoeCoder on September 19, 2016 at 8:20 pm said:

you picked for an example, a species of bacteria having an unusually low mutation rate

petrushka, you are the one that picked e coli as an example by citing the LTEE, not me 😛
petrushka on September 19, 2016 at 8:26 pm said:

JoeCoder: How do you differentiate that (or any other cause of extinction) from extinction by genetic load? As load increases, a species becomes more susceptible to extinction by predators, environment, disease, etc.

People who worry about extinction have always worried about population size.

But meltdown doesn’t seem to be about lack of diversity. It seems to predict extinction in a stable environment.

Yes or no?
JoeCoder on September 19, 2016 at 8:32 pm said:

I would expect that as deleterious load increases, the population gradually decreases each time there is an environmental challenge. E.g. an unusually harsh winter or a temporary excess of predators. Then as the population decreases, inbreeding accelerates the fixation of deleteroius alleles, accelerating this process.

I think this can also happen in stable environments, but a fluctuating environment makes it more likely.
Joe Felsenstein on September 19, 2016 at 8:39 pm said:

We seem to have gone from a discussion of the Haldane cost calculation to a discussion of whether the mutation load is too big. Keep in mind:

a. Mutations per genome is not the same as deleterious mutations per genome, if a large fraction of the genome is junk DNA (as I would argue it is).

b. Humans may have atypical mutation rates. What would be more instructive would be measuring total mutation rates in nonhuman organisms (such as the deer mouse). Of course currently most genomics is on humans.

c. If most of our DNA is not junk, then not only are we dead in the long run, or even the medium run, we’re dead right now. It would be astonishing that you survived being born and trying to grow up.

d. I don’t understand JoeCoder’s argument about recombination. And I have worked a lot on the population genetics of recombination.

e. For the cost of natural selection, it would be more instructive to do a run with no deleterious or neutral mutants, and only one positive value of the selection coefficient. Then we could explore whether there was any sign of a “cost of natural selection” for beneficial mutations. (There won’t be, I predict).
keiths on September 19, 2016 at 8:41 pm said:

Joe,

c. If most of our DNA is not junk, then not only are we dead in the long run, or even the medium run, we’re dead right now. It would be astonishing that you survived being born and trying to grow up.

It’s a miracle. Therefore Jesus, heathen.
keiths on September 19, 2016 at 8:43 pm said:

JoeCoder:

If someone can show that it [Mendel’s Accountant] doesn’t in some regard, I’d like to see if I can modify its parameters or its source (if not too time consuming) to model what we need.

Do you know where to find the latest source code? There’s a source code directory at at Sourceforge, but it’s for version 1.5, which is six years old. A note there reads:

NOTE: This is an OLD distribution of MENDEL released in 2010.

This is only a source distribution of Mendel’s Accountant, which means the code must be compiled. This is the only release which contains the C-version of Mendel’s Accountant. Development of the C-version was discontinued due to the poor performance of C as compared with Fortran for doing numerical computations.
petrushka on September 19, 2016 at 8:46 pm said:

JoeCoder: I think this can also happen in stable environments, but a fluctuating environment makes it more likely.

I don’t see it happening in the world. So I have to assume mendel’s Account doesn’t model the real world.

ETA:P

It sounds like one of those bumblebees can’t fly things. Only MA seems to include an actual claim about how biological evolution works.
Joe Felsenstein on September 19, 2016 at 9:15 pm said:

I don’t want to try to get a program to model the real world. I want to look at situations with only a few kinds, or one kind, of mutation and see whether we can understand what is happening.
petrushka on September 19, 2016 at 9:19 pm said:

Joe Felsenstein: I don’t want to try to get a program to model the real world.

I assume because you realize that the real world is silly complex, and it doesn’t have mutations easily labeled as neutral or deleterious.
keiths on September 19, 2016 at 9:37 pm said:

Zachriel identified a serious bug in Mendel’s Accountant in 2009.

He notified the authors, but as of October of 2015 it still had not been fixed.
JoeCoder on September 19, 2016 at 9:46 pm said:

Dr. Felsenstein,

a. Certainly. In my first comment on this page, I made a case that at least 20% is functional. By functional I mean subject to deleterious mutations, not some looser definition of function. Mendel’s Accountant shows that if 10% of our genome is functional (u=10) then deleterious mutations accumulate at a rate of 4.5 per generation. That paper varies population size, models of selection, and number of linkage blocks, but without success. A del. mutation rate of 5 also led to decline, but a value of 1 did not so long as truncation selection was used:

“For a mutation rate of 1 per offspring on average, all mutations are selectively eliminated, so mutation accumulation is 0. With an average of 1 new mutation distributed in a Poisson manner and with four of every six offspring selectively eliminated, truncation selection is able to exclude every offspring that has one or more mutations.”

Although I think complete truncation selection is unrealistic.

b. If other mammals have a significantly lower mutation rate, I think there would then not be enough time to account for the neutral differences between our genomes? Ironically if we could find a rabbit in the precambrian it would then help evolution by giving us more time.

c. I think genetic redundancy does a lot to save us from deleterious mutations, at least temporarily From a recent paper: “Loss-of-function tests can also be buffered by functional redundancy, such that double or triple disruptions are required for a phenotypic consequence.”

d. I can be seriously lousy at communicating things. I’ll try again: If mom and dad both carry 100,000 deleterious mutations, do you agree that the offspring will not inherit exactly 50,000 del. mutations from both mom and dad? That instead they will inherit a number slightly more or less than this? If so, this is how some offspring can end up with fewer deleterious mutations than their parents.

My using the binomial distribution to model this might not be a good enough approximation because it doesn’t account for linkage.

e. I set Mendel to have 2 offspring per mother, so that there is 0 reproductive excess. Exactly one mutation per generation, which is has a huge 10% selection coefficient. After 1000 generations there were 1002.42 beneficial mutations per individual, which is very close to the neutral rate of fixation. So I think there is only cost to fix beneficial mutations at a rate faster than the neutral rate. Is this what you ware wanting to model, or have I misunderstood? I am wondering if this scenario (or my understanding of it) is too contrived to be of much use.
JoeCoder on September 19, 2016 at 9:52 pm said:

Zachriel identified a serious bug in Mendel’s Accountant in 2009.

keiths, A year ago I had a long discussion with Zachriel about his claim that Mendel modeled probability selection incorrectly. In javascript I wrote up a simple reproduction of the part he said was buggy, and it conformed to his expectations for how a correct implementation should behave. We ended with Zachriel telling me:

Looks like an interesting and valid test. Will take a look at it later, and let you know if it changes our position.

I haven’t heard anything further from him.
Patrick on September 19, 2016 at 9:55 pm said:

petrushka:
I have to wonder if Mendel’s Account models anything in the rel world.

Has any species or any lineage gone extinct due to accumulation of deleterious mutations?

Mendel’s Accountant has been discussed extensively here and at AtBC, among other sites. Zachriel summarized it succinctly there:

These are the primary attributes I’ve found in Mendel’s Accountant:

* Population of Genotypes (genotypic fitness).
* Genotype modified by heritability and noise to Phenotype (phenotypic fitness).
* Genotype further modified for chance of reproductive success to Working Fitness.
* Number of offspring proportional to sqrt(Phenotype).
* Reproduction with mutation.
* Throw in more random factors, such as random death.

> The calculation of Phenotype is not scaled.
> The calculation of Working Fitness is division by Randomnum. Not the sqrt(randomnum), not some other exponent. Or even a normal binomial to determine reproductive success.
> Number of offspring proportional to sqrt(Phenotype). Why the square root? Why not some other exponent?
> Random death? Isn’t that already accounted for in phenotypic fitness (for stillbirths) or chance of reproductive success?

You may as well throw in another factor that randomizes falling off a cliff and whether she has a headache, and just make up numbers or exponents or parameters that seem right. The problem is the qualitative nature of the simulation and the arbitrariness of some of the assumptions. I just don’t see Mendel’s Accountant being salvageable as a quantitative model of biology.
JoeCoder on September 19, 2016 at 9:55 pm said:

keiths,

Their sourceforge repository looks like it’s maintained by someone without much experience with source control. You can find version 2 inside the linux folder. Here is a direct link.
JoeCoder on September 19, 2016 at 9:58 pm said:

Patrick, I addressed all of this in the discussion I had with Zachriel a year ago. See my comment above directly before yours.