Genetic Algorithms: When Drift Overcomes Selection

Posted on February 7, 2016 by Mung

I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution. They are never quite clear which it is, nor do they say what it means to model or simulate evolution (what would be required) and how GAs qualify as either one or the other. My position is that GAs neither model nor simulate evolution. In addition to other reasons I’ve given in the past I’d like to present the following argument.

GAs are often used to demonstrate “the power of cumulative selection.” Given small population sizes drift ought to dominate yet in GAs drift does not dominate. Why not?

Three questions:

How do we determine the effective population size for a GA?
How do we calculate the value of the selection coefficient?
How do we determine when genetic drift will overcome the effects of selection?

In a GA written by keiths (a version of the WEASEL program) the default population size is 200.

#define POPULATION_SIZE 200 // total population size

Effective population size is the number of individuals in a population who contribute offspring to the next generation.

Even though the population size is 200, only one is selected to contribute offspring to the next generation.

#define NUM_SURVIVORS 1 // number of survivors per generation (must be less than POP_SIZE)

Given an effective population size of one, drift ought to dominate, but it doesn’t.
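For readers who have not seen the program, here is a minimal Python sketch of the kind of selection those two settings describe. It is not keiths's code: the target string, the alphabet, the mutation rate, and every function name below are assumptions made for illustration. The point it shows is that the single survivor is picked by ranking all 200 offspring on fitness, not by drawing one at random.

```python
import random

TARGET = "METHINKS IT IS LIKE A WEASEL"      # assumed target (Dawkins's phrase)
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "     # assumed 27-letter alphabet
POPULATION_SIZE = 200                        # offspring produced per generation
NUM_SURVIVORS = 1                            # parents kept for the next generation
MUTATION_RATE = 1.0 / len(TARGET)            # assumed; not taken from the original program


def matches(candidate):
    """Fitness: number of positions that agree with the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(parent, rng):
    """Copy the parent, changing each letter to a different one with MUTATION_RATE."""
    letters = []
    for letter in parent:
        if rng.random() < MUTATION_RATE:
            letter = rng.choice([c for c in ALPHABET if c != letter])
        letters.append(letter)
    return "".join(letters)


def next_survivors(survivors, rng):
    """Breed POPULATION_SIZE offspring and keep the NUM_SURVIVORS fittest."""
    offspring = [mutate(rng.choice(survivors), rng) for _ in range(POPULATION_SIZE)]
    offspring.sort(key=matches, reverse=True)
    return offspring[:NUM_SURVIVORS]


def run(generations, seed=0):
    """Start from random strings and apply truncation selection repeatedly."""
    rng = random.Random(seed)
    survivors = ["".join(rng.choice(ALPHABET) for _ in TARGET)
                 for _ in range(NUM_SURVIVORS)]
    for _ in range(generations):
        survivors = next_survivors(survivors, rng)
    return survivors
```

Calling `run(300)` returns the surviving string after 300 generations. Whether the real program also lets the parent compete with its offspring is not stated in the post, so the sketch leaves the parent out.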

Given an effective population size of one, what must the selection coefficient be for drift to not dominate selection?

I’d truly appreciate any assistance with the concepts or the math.

In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.

Reference:

Neutral Theory: The Null Hypothesis of Molecular Evolution

535 thoughts on “Genetic Algorithms: When Drift Overcomes Selection”

Mung on February 10, 2016 at 3:32 am said:

Patrick: A decent, honest person would therefore retract both of those claims.

My mistake for thinking that you might want to set the example for others to follow. Given your past, I should have known better.
Richardthughes on February 10, 2016 at 3:38 am said:

Bad day at the church, Mung?
Mung on February 10, 2016 at 3:38 am said:

Flint: But it’s very hard to guess what he’s trying to say.

It’s much easier to scoff and mock and assume whatever your little heart desires. Given that I am a theist, and that I believe in creation and design, what I actually say hardly matters.

Pretty sure that’s exactly what Lizzie had in mind for this site. Don’t let me disturb your little echo chamber.
Mung on February 10, 2016 at 3:44 am said:

Richardthughes: Bad day at the church, Mung?

Your contribution to this thread would be nil if I had not exposed you as one of the people mocking me for arguing that GAs do not model evolution.

Don’t you think you were being just a tad bit hypocritical in supporting the keiths narrative that I was just making things up?

Nah. Silly me.
Richardthughes on February 10, 2016 at 3:46 am said:

Mung: Your contribution to this thread would be nil if I had not exposed you as one of the people mocking me for arguing that GAs do not model evolution.

I actually pointed out that you failed laughably/miserably at coding a GA.

Mung: Don’t you think you were being just a tad bit hypocritical in supporting the keiths narrative that I was just making things up?

???
Mung on February 10, 2016 at 3:52 am said:

petrushka: If you mean completely novel sequences, we have observed random sequences that can be substituted into bacteria.

How do you know the sequence was random?

What difference does it make whether it was random?

You’ve just confirmed intelligent design is possible.
Flint on February 10, 2016 at 3:53 am said:

Mung: It’s much easier to scoff and mock and assume whatever your little heart desires. Given that I am a theist, and that I believe in creation and design, what I actually say hardly matters.

That says plenty, though.

Pretty sure that’s exactly what Lizzie had in mind for this site. Don’t let me disturb your little echo chamber.

Yes, I also believe Lizzie wanted a site that would concentrate more on science and less on preaching. You know, not just waving your arms, calling names, making assertions, and mocking people. Rather, SHOWING the design, providing a testable mechanism. At the very least, showing WHY GAs don’t model what they appear to model so successfully.

You seem to have something in common with Jonathan Wells, who has studied the actual science NOT to learn or contribute to it, but to find ways to undermine, misrepresent, or baffle with scientistical-sounding bullshit. Your religious approach doesn’t just invite contempt, it EARNS it.
Mung on February 10, 2016 at 4:00 am said:

Patrick: One of my weasel implementations uses crossover. Sexy mustalids, imagine that.

How many implementations do you have?

Maybe I should make a new OP. How many WEASEL implementations does it take to demonstrate the power of cumulative selection? I would have thought one would be enough.

I notice you have a GA engine. Why haven’t you offered it as an example of the generic GA I’ve been asking for? Does it require too much fine tuning before becoming useful?
Mung on February 10, 2016 at 4:04 am said:

Flint: Yes, I also believe Lizzie wanted a site that would concentrate more on science and less on preaching.

Was there too much preaching in my OP, or are GAs and genetic drift just not ‘sciency’ enough? I aim to please.
Richardthughes on February 10, 2016 at 4:12 am said:

Mung: I would have thought one would be enough.

That’s strange. You’ve seen a few yet still seem unsatisfied.
Mung on February 10, 2016 at 4:20 am said:

Richardthughes: I actually pointed out that you failed laughably/miserably at coding a GA.

You made an unsubstantiated assertion. You, like keiths, cannot admit when you are wrong. But I have to give you credit, I can’t recall you ever saying you admit when you are wrong.

My program “failed” because it ran out of memory and it was never coded as a GA and I never claimed it was coded as a GA. Your attempt to rewrite history is pathetic at best and dishonest at worst.

I even posted my code. It’s obvious it’s not coded as a GA. Even you ought to be able to see that (or maybe you really are just a retail clerk).

Please try to do better.
Mung on February 10, 2016 at 4:21 am said:

Pathetic. Seriously pathetic.
Richardthughes on February 10, 2016 at 4:23 am said:

Mung: You made an unsubstantiated assertion.

Link to it / post it here.

Mung: My program “failed” because it ran out of memory and it was never coded as a GA and I never claimed it was coded as a GA. Your attempt to rewrite history is pathetic at best and dishonest at worst.

So what were you trying to model / demonstrate? You could have posted your souffle recipe.
Joe Felsenstein on February 10, 2016 at 4:35 am said:

Mung: Right. And I decided not to be a pedant about it and kept my mouth shut because it really didn’t much change the point I was making. But thank you for including the haploid case.

Essentially all Weasels I have seen discussed are haploid. I just forgot that in the haploid case the condition should be $2Ns > 1$ rather than $4Ns > 1$.

Could you explain the disappearance of Ne in the haploid case?

It’s fairly trivial. In the case where there is a single string at the “adult” stage (as in all the Weasels we have discussed here), $N_e = 1$. That means that the quantity $2N_es$ can be shown, using a difficult proof involving highly advanced mathematics, to be equal to $2s$.
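Written out as a worked step, and assuming the haploid condition above is the rule of thumb for selection having a major effect relative to drift:

```latex
2 N_e s > 1, \qquad N_e = 1 \;\Longrightarrow\; 2s > 1 \;\Longrightarrow\; s > \tfrac{1}{2}
```

So, under that assumption, a single-adult Weasel needs a selection coefficient above one half before selection outweighs drift.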

Some suggestions for turning our little weasels into diploid digital organisms?

Have two strings saved at the adult stage.
Joe Felsenstein on February 10, 2016 at 4:47 am said:

I repeat:

1. When a Weasel has $N$ adults, and a very large number of offspring, and when sampling of $N$ individuals as the adults of the next generation is biased, so that each additional match to the target sequence makes that offspring $1+s$ times more likely to be chosen, the model we have is very close to being a Wright-Fisher model.

2. The WF model is a very important basic tool for analyzing the relative strengths of population-genetic processes. Mung can declare all Mung wants that Weasel processes “don’t model evolution”, but their close relatives the WF processes certainly do a lot of heavy lifting in analyzing evolutionary forces.

3. They allow us to see what are the conditions that are needed for natural selection to have a major effect compared to genetic drift.
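A rough Python sketch of the biased-sampling Weasel described in point 1 may help. Everything named here is an assumption for illustration: the target string, the 27-letter alphabet, the mutation rate, and the use of 100 offspring as a finite stand-in for "a very large number". Each offspring is weighted by $(1+s)$ raised to its number of matches, and the $N$ adults are drawn with those weights (with replacement, which is only an approximation when the offspring pool is finite).

```python
import random

TARGET = "METHINKS IT IS LIKE A WEASEL"      # assumed target
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "     # assumed alphabet


def matches(candidate):
    """Number of positions that agree with the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(parent, rate, rng):
    """Change each letter to a different letter with probability `rate`."""
    letters = []
    for letter in parent:
        if rng.random() < rate:
            letter = rng.choice([c for c in ALPHABET if c != letter])
        letters.append(letter)
    return "".join(letters)


def next_adults(adults, n_offspring, s, rate, rng):
    """One generation: breed offspring, then sample adults with weight (1+s)^matches."""
    offspring = [mutate(rng.choice(adults), rate, rng) for _ in range(n_offspring)]
    weights = [(1.0 + s) ** matches(child) for child in offspring]
    return rng.choices(offspring, weights=weights, k=len(adults))


def mean_matches_after(generations, n_adults, s, n_offspring=100, seed=0):
    """Average number of matches among the adults after the given number of generations."""
    rng = random.Random(seed)
    rate = 1.0 / len(TARGET)                 # assumed mutation rate
    adults = ["".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(n_adults)]
    for _ in range(generations):
        adults = next_adults(adults, n_offspring, s, rate, rng)
    return sum(matches(a) for a in adults) / n_adults
```

Comparing `mean_matches_after(200, 1, 0.0)` with `mean_matches_after(200, 1, 50.0)` contrasts pure drift ($s = 0$) with very strong selection when there is a single adult. Intermediate values of `s` and larger `n_adults` can be tried the same way.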
keiths on February 10, 2016 at 8:14 am said:

Mung:

Could you explain the disappearance of Ne in the haploid case?

Joe:

It’s fairly trivial. In the case where there is a single string at the “adult” stage (as in all the Weasels we have discussed here), $N_e = 1$. That means that the quantity $2N_es$ can be shown, using a difficult proof involving highly advanced mathematics, to be equal to $2s$.

Oh, snap!

Mung, the proof is similar to the one showing that if you want to know the value of a product, it helps to know more than one of the factors.

Keep up the good mathematical work!
Allan Miller on February 10, 2016 at 8:33 am said:

Mung,

I don’t have any issue with an evolution that is fine tuned to solve specific problems, like GAs.

How do you distinguish that kind of evolution (which ‘uses’ mutation, crossover, selection, drift) from the other kind?
keiths on February 10, 2016 at 8:45 am said:

Mung, to Rich:

Your contribution to this thread would be nil if I had not exposed you as one of the people mocking me for arguing that GAs do not model evolution.

Rich:

I actually pointed out that you failed laughably/miserably at coding a GA.

Right. He tried to implement Weasel, and this is what happened.

Mung:

Don’t you think you were being just a tad bit hypocritical in supporting the keiths narrative that I was just making things up?

Rich:

???

Mung,

Don’t you think it’s a tad hypocritical to make up a story about me making up a story about you making things up?
keiths on February 10, 2016 at 9:03 am said:

Mung, to Rich:

My program “failed” because it ran out of memory and it was never coded as a GA and I never claimed it was coded as a GA. Your attempt to rewrite history is pathetic at best and dishonest at worst.

Now you’re talking about a different program of yours, which also failed. (And why the quotes around the word “failed”? It failed.) The GA you failed at was an attempt at implementing Weasel. Rich is right.

Regarding the second failed (non-GA) program, it failed because it was poorly implemented. It filled memory with permutations instead of evaluating each permutation as it was generated.
keiths on February 10, 2016 at 9:09 am said:

Mung,

Pathetic. Seriously pathetic.

True, but don’t be so hard on yourself. Treat it as a learning opportunity.
Patrick on February 10, 2016 at 1:46 pm said:

Mung: My mistake for thinking that you might want to set the example for others to follow. Given your past, I should have known better.

Tell you what — If you can provide references to comments in the previous thread where you said what you claimed to have said in this thread, I will publicly apologize. If you can’t, you retract your claim to have done so and your accusations of quote mining. Deal?
Patrick on February 10, 2016 at 1:58 pm said:

Mung: How many implementations do you have?

Maybe I should make a new OP. How many WEASEL implementations does it take to demonstrate the power of cumulative selection? I would have thought one would be enough.

I implemented two. One to show how to do it with my GA engine (that one uses crossover) and one to show that there is no rational reading of Dawkins’ description in “The Blind Watchmaker” that supports the idea of latching. That second one also shows how easy it is to implement the algorithm (174 lines of heavily commented, easy to read C).

I notice you have a GA engine. Why haven’t you offered it as an example of the generic GA I’ve been asking for? Does it require too much fine tuning before becoming useful?

I don’t know what you mean by either “generic” or “fine tuning” in this context. My GA engine implements the basics of mutation (including crossover) and selection. To apply it to a particular problem, just:

1. Create a class to represent the characteristics of the problem
2. Implement a method to create instances of the problem class
3. Implement the required generic functions for the problem:
– genome-length
– fitness
– fitness-comparator
4. Implement a terminator function
5. Run solve

This can solve any problem that:

a) Can be defined in such a way that it is amenable to a GA approach
b) Fits in the memory and storage available on the computer running the engine
c) Requires no more generations to solve than the person running it is willing to wait

I wouldn’t call that generic, but your definition may differ.
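To make the five steps concrete, here is a small Python sketch of the same shape. It is not the engine described above: the problem class (`OneMax`, maximize the number of 1 bits), the tournament selection, the elitism, and every parameter value are assumptions chosen to keep the example short.

```python
import random


class OneMax:
    """Step 1: hypothetical problem class; a genome is a list of 0/1 bits."""
    def __init__(self, bits):
        self.bits = list(bits)


def make_instance(rng, length=20):
    """Step 2: create a random instance of the problem class."""
    return OneMax(rng.randint(0, 1) for _ in range(length))


# Step 3: the problem-specific functions.
def genome_length(individual):
    return len(individual.bits)


def fitness(individual):
    return sum(individual.bits)


def fitness_comparator(a, b):
    """True when fitness value a is better than fitness value b."""
    return a > b


def terminator(generation, best):
    """Step 4: stop on a perfect genome or after 500 generations."""
    return fitness(best) == genome_length(best) or generation >= 500


def crossover(mother, father, rng):
    cut = rng.randrange(1, genome_length(mother))
    return OneMax(mother.bits[:cut] + father.bits[cut:])


def mutate(individual, rate, rng):
    return OneMax(bit ^ 1 if rng.random() < rate else bit for bit in individual.bits)


def best_of(population):
    best = population[0]
    for candidate in population[1:]:
        if fitness_comparator(fitness(candidate), fitness(best)):
            best = candidate
    return best


def solve(pop_size=40, rate=0.02, seed=1):
    """Step 5: run generations until the terminator says stop."""
    rng = random.Random(seed)
    population = [make_instance(rng) for _ in range(pop_size)]
    generation, best = 0, best_of(population)

    def pick():
        # Two-way tournament: the fitter of two random individuals becomes a parent.
        a, b = rng.sample(population, 2)
        return a if fitness_comparator(fitness(a), fitness(b)) else b

    while not terminator(generation, best):
        children = [mutate(crossover(pick(), pick(), rng), rate, rng)
                    for _ in range(pop_size - 1)]
        population = [best] + children       # elitism: carry the best forward unchanged
        best = best_of(population)
        generation += 1
    return generation, best
```

`solve()` returns the generation count and the best individual found. Swapping in a different problem means replacing only the class and the step 2 to 4 functions, plus `crossover` and `mutate` if the genome is not a bit list.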
petrushka on February 10, 2016 at 2:07 pm said:

This is a kind of observation/question.

In testing my own weasel program I found it worked best with a rather high mutation rate. In fact, the limit on rate seems to coincide with the point at which there are only a few offspring having double mutations. The smaller the number of offspring and the shorter the string being evaluated, the higher the optimum mutation rate.
DNA_Jock on February 10, 2016 at 3:09 pm said:

petrushka,

In order to make the “final” step, you need the wrong letter to mutate to the right letter, AND none of the other (correct) letters to mutate, hence the optimum you noticed.
As the string gets longer, a true weasel’s behavior departs dramatically from the behavior expected for a latching weasel.
🙂
Tried to point this out to kairosfocus, with wholly predictable results…
Of course, you can write a “cheating weasel”, wherein there is exactly one mutation per offspring.
😮
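The "final step" argument can be put into a few lines of Python. This is a sketch under stated assumptions: a 27-letter alphabet, a per-letter rate that means an actual change to a different letter, and exactly one wrong letter remaining. The function names are made up for illustration. It covers only the last step, not the whole run, and says nothing about the effect of the number of offspring.

```python
def final_step_probability(rate, length, alphabet_size=27):
    """Chance that one offspring completes the target when exactly one letter is wrong."""
    fix_last = rate / (alphabet_size - 1)        # the wrong letter changes to the right one
    keep_rest = (1.0 - rate) ** (length - 1)     # none of the correct letters change
    return fix_last * keep_rest


def best_rate(length, steps=10000):
    """Grid search for the per-letter rate that maximizes the final-step probability."""
    rates = [i / steps for i in range(1, steps)]
    return max(rates, key=lambda r: final_step_probability(r, length))
```

Setting the derivative of `rate * (1 - rate) ** (length - 1)` to zero gives `rate = 1 / length`, so for this step the best rate rises as the string gets shorter, and `best_rate(length)` should land close to that value.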
petrushka on February 10, 2016 at 3:22 pm said:

DNA_Jock: In order to make the “final” step, you need the wrong letter to mutate to the right letter, AND none of the other (correct) letters to mutate,

Mostly true, but not logically necessary. It is possible for a letter to mutate to the same letter. I know this happens in my program, because I flag mutated words, and sometimes they are unchanged from the parent.

My own program/game involves making any acceptable Scrabble word. Since there are a rather large number of targets, double mutations are sometimes useful.
DNA_Jock on February 10, 2016 at 3:40 pm said:

pedant,
🙂
Mea culpa, I was using a biologist’s definition of ‘mutate’, rather than a coder’s.
OMagain on February 10, 2016 at 10:07 pm said:

Mung: My program “failed” because it ran out of memory and it was never coded as a GA and I never claimed it was coded as a GA.

I had assumed you programmed for a living. My apologies.
colewd on February 11, 2016 at 12:26 am said:

petrushka,

I think you are very perceptive here regarding the difficulty of creating protein sequences. I challenged one of my friends who was both a programmer and a physicist to create a sequence of the alpha helix of hemoglobin which is 141 amino acids. The trial and error problem is large. 20^141 possible combinations. A super computer could not go through this search given the time equal to the age of the universe. The origin of these sequences is not an easy problem. We better get on the highway and search further 🙂 Random change clearly had nothing to do with this process.
petrushka on February 11, 2016 at 12:43 am said:

Not a tornado in a junkyard.

Indeed.
Flint on February 11, 2016 at 1:02 am said:

petrushka:
Not a tornado in a junkyard.

Indeed.

Yes, indeed. Let’s assume a specific target before we start. Let’s pretend selection cannot happen. And THEN, let’s marvel at the astronomical odds against hitting that target. Only a god could have done it, obviously.
Mung on February 11, 2016 at 1:17 am said:

OMagain: I had assumed you programmed for a living. My apologies.

You people crack me up.
DNA_Jock on February 11, 2016 at 1:53 am said:

colewd: Random change clearly had nothing to do with this process.

This is quite a statement.
😉
Curiously, the most successful attempts to produce novel polypeptides in the lab have involved using random mutation and selection iteratively to create proteins that bind extremely tightly to particular targets (Szostak).
keiths on February 11, 2016 at 2:11 am said:

OMagain, to Mung:

I had assumed you programmed for a living.

God forbid! (So to speak.)
GlenDavidson on February 11, 2016 at 2:40 am said:

colewd: The origin of these sequences is not an easy problem. We better get on the highway and search further 🙂 Random change clearly had nothing to do with this process.

Not an easy problem, but how easy it is for you to judge that one particular process wasn’t a part of it.

Prejudging the matter certainly isn’t the way to find out what happened.

Glen Davidson
Tom English on February 11, 2016 at 3:38 am said:

DNA_Jock: Mea culpa, I was using a biologist’s definition of ‘mutate’, rather than a coder’s.

As a computer scientist, I hate seeing programmers change specifications to suit what they find convenient to implement. That’s what this boils down to.

Mutation is change, and the mutation rate is the rate at which letters actually do change. It’s not the rate at which a code segment that changes a letter with probability 26/27 is executed. Someone who wants to leave the code segment the way it is needs to do something like this:

MUTATION_RATE = 1.0 / SENTENCE_LENGTH

LAZY_MUTATION_RATE = MUTATION_RATE * ALPHABET_SIZE / (ALPHABET_SIZE - 1.0)

Using LAZY_MUTATION_RATE internally will give an effective mutation rate of MUTATION_RATE. If the adjustment were not so easy, the reporting of lazy mutation rates would not be so annoying.
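A quick way to check the adjustment is to simulate it. The sketch below assumes a 27-letter alphabet and a 28-character sentence. `lazy_mutate` and `observed_change_rate` are hypothetical names for a "lazy" operator that redraws from the whole alphabet and so sometimes redraws the same letter.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "     # assumed alphabet
ALPHABET_SIZE = len(ALPHABET)                # 27
SENTENCE_LENGTH = 28                         # assumed sentence length

MUTATION_RATE = 1.0 / SENTENCE_LENGTH
LAZY_MUTATION_RATE = MUTATION_RATE * ALPHABET_SIZE / (ALPHABET_SIZE - 1.0)


def lazy_mutate(letter, rate, rng):
    """With probability `rate`, redraw the letter from the full alphabet (may repeat it)."""
    if rng.random() < rate:
        return rng.choice(ALPHABET)
    return letter


def observed_change_rate(rate, trials, rng):
    """Fraction of trials in which the letter actually changed."""
    changed = sum(lazy_mutate("A", rate, rng) != "A" for _ in range(trials))
    return changed / trials
```

With enough trials, `observed_change_rate(LAZY_MUTATION_RATE, ...)` should settle near `MUTATION_RATE`, while `observed_change_rate(MUTATION_RATE, ...)` should come out lower by a factor of 26/27.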
petrushka on February 11, 2016 at 6:54 am said:

The actual mutation rate would be important if you were researching population genetics, but I don’t see the importance in a game.

I do see the need to be aware of definitions.
OMagain on February 11, 2016 at 8:30 am said:

Mung: You people crack me up.

It’s not mutual.
Tom English on February 11, 2016 at 5:46 pm said:

petrushka: The actual mutation rate would be important if you were researching population genetics, but I don’t see the importance in a game.

I do see the need to be aware of definitions.

When neo-Paleyists have a peer-reviewed publication, to be used as evidence in court, of defective analysis of defective Weasel programs attributed to Dawkins by “Oxfordensis,” it is not just a game. What is worst about their analysis, in which the alphabet size ranges from 1 to 100 (it is 27 in Dawkins’s monkey/Shakespeare model of cumulative selection), is that the actual mutation rate depends strongly on the alphabet size. This isn’t just nit-picking on my part. I’m talking about gross distortion in the analysis.

Forgive a batshit77-ish dump of links. They’re all to posts I did in 2010. The first two are the most germane.

Willie can’t stop whipping the Weasel

The roly poly and the cockroach

Sure mutation in Python

Feeling charitable toward Baylor’s IDC cubs
petrushka on February 11, 2016 at 6:51 pm said:

I’ve looked at the papers and will try to implement a correct mutation function and a correct mutation rate.

But what I have is still just a game.
colewd on February 11, 2016 at 6:58 pm said:

GlenDavidson,

I agree. The statement should be corrected to read. A process involving random search is very unlikely to evolve a protein like hemoglobin.
petrushka on February 11, 2016 at 7:02 pm said:

colewd:
GlenDavidson,
I agree. The statement should be corrected to read. A process involving random search is very unlikely to evolve a protein like hemoglobin.

Would you accept that the Designer created hemoglobin in [well known] incremental steps, or would you prefer to close your eyes to evidence?
Tom English on February 11, 2016 at 7:12 pm said:

petrushka: But what I have is still just a game.

Wish I hadn’t written “just a game.” Mathematical analyses of games have important applications in science and engineering.
Tom English on February 11, 2016 at 7:34 pm said:

colewd: I challenged one of my friends who was both a programmer and a physicist to create a sequence of the alpha helix of hemoglobin which is 141 amino acids. The trial and error problem is large. 20^141 possible combinations. A super computer could not go through this search given the time equal to the age of the universe. The origin of these sequences is not an easy problem. We better get on the highway and search further 🙂 Random change clearly had nothing to do with this process.

You embody the failure of the ID movement. The Center for the Renewal of Science and Culture will never get the creationists that it’s pulled under the Big Tent of ID to stop demonstrating, in the most obvious of ways, that they really are creationists.

I would have no objection to you, as a person, if you identified yourself only as a creationist, and dropped the “intelligent design” deception.
cubist on February 11, 2016 at 7:56 pm said:

colewd:
GlenDavidson,

I agree. The statement should be corrected to read. A process involving random search is very unlikely to evolve a protein like hemoglobin.

Please explain what you mean when you say “like hemoglobin”. In specific, please provide a reasonably clear explanation of what qualities a protein must possess in order for said protein to be considered “like hemoglobin”.
dazz on February 11, 2016 at 8:07 pm said:

Proteins fold like spaghetti. Only the FSM can account for the origin of proteins.
Flint on February 11, 2016 at 8:09 pm said:

colewd:
GlenDavidson,

I agree. The statement should be corrected to read. A process involving random search is very unlikely to evolve a protein like hemoglobin.

What I enjoyed was that (1) you were informed that new proteins in fact ARE discovered in the lab by random mutation plus selection; and (2) you simply ignored this entirely.
cubist on February 11, 2016 at 8:15 pm said:

Flint: What I enjoyed was that (1) you were informed that new proteins in fact ARE discovered in the lab by random mutation plus selection; and (2) you simply ignored this entirely.

Perhaps colewd merely regards those randomly-generated proteins as not being “like hemoglobin”. I hope colewd cares to clarify what, exactly, makes a protein sufficiently “like hemoglobin” that “[a] process involving random search is very unlikely to evolve” said protein.
Flint on February 11, 2016 at 8:33 pm said:

cubist: Perhaps colewd merely regards those randomly-generated proteins as not being “like hemoglobin”. I hope colewd cares to clarify what, exactly, makes a protein sufficiently “like hemoglobin” that “[a] process involving random search is very unlikely to evolve” said protein.

Well, the original observation about how new proteins are actually developed, was in response to his claim that nothing involving random mutations could possibly produce a protein AT ALL. So if he’s not just ignoring this, he’s at least moving the goalposts. NOW he wants something “like hemoglobin”. But of course, those proteins developed in the lab were intended for specific applications, and to have specific properties.

Maybe if it should be the case that someone HAS used evolutionary processes to make a protein “like hemoglobin”, he’ll ignore that (or say it’s not “like” enough for him).

Clearly he’s a creationist, and creationists as a rule do not let reality interfere with their convictions. I think colewd has been applying time-tested evasion techniques like ignoring, moving goalposts, and changing the subject.
colewd on February 11, 2016 at 9:25 pm said:

Tom English,

I am skeptical of evolutionary theory. A little over a year ago I became interested in this debate when I realized the genome was a mathematical sequence. Prior to that I believed in the theory as all of you do. I believe that a creationist is trying to push the facts to support a religious belief. I am not. I am trying to understand the facts. As I told Flint, if he can surface factually supported evidence for trial, error, and selection causing large-scale evolutionary change, it will reduce my skepticism. Until then I believe that the mechanism of large-scale evolutionary change remains undiscovered.
JoeCoder on February 11, 2016 at 9:31 pm said:

Patrick writes: “[r/creation is] not an underground forum, it’s just a subreddit that was made non-public because the creationists posting there were getting their asses kicked in every argument.”

I’m a moderator of r/creation and I will not see my subreddit slandered in this way. If you think this statement is accurate I would invite you to speak with some of our resident skeptics:

1. From our user fidderstix, an atheist writing in response to a comment similar to your own: “Its mostly u/joecoder posting links to blogs or websites which discuss reviews, popular scientific findings etc. Usually there’s one every day or other day and quite often there’s some very good discussion there. I’m actually involved in a large debate thread and i haven’t been downvoted at all or anything. Apparently atheist comments make up 30 percent of all posts on that sub, despite the fact that it’s locked… There are quite a few of the basic mistakes like mistaking atheism for gnostic atheism and several abuses of fallacy calling. All in all its a pretty good sub, given the user base. U/Joecoder does a great job managing it.”

2. From u/masters1125, a theistic evolutionist: “This has been my experience as well. This sub serves a couple purposes in my opinion. 1. Allows me to learn about YEC arguments, and to share the things I’ve learned about evolution and physics with open-minded creationists. 2. More importantly, there are a few (much fewer than you would expect) rabidly scientifically illiterate people and it gives them a place to rant and yell and say phrases like ‘assumption, true facts, and appeal to authority’ without being downvoted or mocked. This keeps the rest of reddit just a little cleaner. It is also has some of the best moderating I’ve seen on a smaller sub.”

3. From u/darianr: I’m on r/creation and I love it. I’m a theistic evolutionist or evolutionary Creationist and JoeCoder the admin does a fantastic job. The articles some people share are mundane and very basically boring seldom… There are some fantastic creationists and agnostic or atheist commenters there. I learn a lot from both.

4. From u/InsistYouDesist: “the application process at /r/creation pretty much lets anyone in, I asked and said I didn’t think creationism was credible at all but was curious as to what issues they had with evolution. My post history at the time was me arguing with a creationist so I didn’t get my hopes up but they let me in, really polite bunch :)”

Reddit’s atheist forum has several thousand times more users than we do, and we screen our members to keep out the “lol spaghetti monster” crowd, and to keep us at a majority of our own creation and ID members. Reddit already has many subs for debate and we need a place to discuss amongst ourselves–but with enough critics to still test our ideas.

petrushka writes: “I shudder to think what credentials you must have to be invited.”

I recently had an evolutionary biologist and ID skeptic request that I revoke his access, because he spent too much time debating there and it was interfering with real life. I was sad to see him go and invited him to return again later.