Genetic Algorithms: When Drift Overcomes Selection

I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution. They are never quite clear which it is, nor do they say what it means to model or simulate evolution (what would be required) and how GAs qualify as either one or the other. My position is that GAs neither model nor simulate evolution. In addition to other reasons I’ve given in the past I’d like to present the following argument.

GAs are often used to demonstrate “the power of cumulative selection.” Given small population sizes, drift ought to dominate, yet in GAs it does not. Why not?

Three questions:

  1. How do we determine the effective population size for a GA?
  2. How do we calculate the value of the selection coefficient?
  3. How do we determine when genetic drift will overcome the effects of selection?

In a GA written by keiths (a version of the WEASEL program) the default population size is 200.

#define POPULATION_SIZE 200 // total population size

Effective population size is the number of individuals in a population who contribute offspring to the next generation.

Even though the population size is 200, only one is selected to contribute offspring to the next generation.

#define NUM_SURVIVORS 1 // number of survivors per generation (must be less than POPULATION_SIZE)

Given an effective population size of one, drift ought to dominate, but it doesn’t.

Given an effective population size of one, what must the selection coefficient be for drift to not dominate selection?

I’d truly appreciate any assistance with the concepts or the math.

In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.

Reference:

Neutral Theory: The Null Hypothesis of Molecular Evolution

535 thoughts on “Genetic Algorithms: When Drift Overcomes Selection”

  1. Mung:

    I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution. They are never quite clear which it is, nor do they say what it means to model or simulate evolution (what would be required) and how GAs qualify as either one or the other.

    Quotes and links, please. I’m disinclined to take your word for it, for good reason.

    In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.

    Neither I nor the original inventor of WEASEL, Richard Dawkins, claim that it “models or simulates evolution”.

    You are attacking yet another straw man. Don’t you ever tire of it, Mung? It just makes you look weak and incompetent.

  2. Mung, genetic algorithms are a tool used to investigate the behavior of genetic systems that have selection, recombination, mutation, genetic drift, and sometimes migration. What cases you run depends on what cases you are interested in, what questions you want answered.

    I know a fair amount about them because I wrote my first genetic simulation program 52 years ago, and have written many since. When David Fogel put together his book reprinting early evolutionary simulation papers, I gave him some advice and sources.

    They are useful to investigate outcomes particularly of cases where one cannot do the relevant mathematical theory, for example cases with multiple interacting linked loci.

    As to whether they “simulate evolution”, I’ll let others argue about that. I just know that they are computer models that have multiple genotypes that have fitnesses, have recombination between loci, have mutation processes, and have finite populations, and sometimes multiple populations. Many other complications can also be built in. Population geneticists want to know how such systems will behave. I would say that knowing how such systems behave is relevant to understanding what to expect from evolution.

    If your opinion is that they aren’t relevant, well, I’m always happy to answer questions.

  3. Mung:

    How do we determine the effective population size for a GA?
    How do we calculate the value of the selection coefficient?
    How do we determine when genetic drift will overcome the effects of selection?

    Answer to 1: It depends on what you are interested in investigating.

    Answer to 2: In simple cases where each genotype has a known fitness, there may be a simple answer to this by comparing the fitnesses. If the loci interact there may be no simple answer.

    Answer to 3: We could do simulations and find out. That’s the beauty of having genetic simulation programs. In simple cases where there is a selection coefficient s and population size N, we expect natural selection to start having a substantial effect approximately when 4Ns > 1.

  4. Didn’t Mung fail in his attempt to write a GA? Does that qualify him as a non-authority?

  5. Rich,

    Didn’t Mung fail in his attempt to write a GA?

    Yes, completely. He tried to do a version of Weasel, with the following dismal results:

    Mung,

    The code you provided confirms that you don’t understand Weasel or cumulative selection at all.

    For the benefit of folks who don’t read Ruby, here’s what Mung’s program does:

    1. Create a single genotype and initialize it randomly from the set of all unique characters contained in “METHINKS IT IS LIKE A WEASEL”.

    2. Start with the first character in the random genotype. Keep mutating that character and only that character until it matches the “M” in “METHINKS”. Never change it again.

    3. Repeat step 2 for the second character, then the third character, and so on until the end of the genotype is reached, at which point the entire genotype will match the target phrase.

    That’s it, believe it or not.

    Jesus, Mung, no wonder you thought that Weasel would have trouble with changing targets. Your program certainly would.

    At least you copied the target phrase correctly.

    Why not study my code so that you’ll understand what the rest of us are talking about? Or Google “Dawkins weasel” and read some of the informal descriptions of the algorithm?

    Rich:

    Does that qualify him as a non-authority?

    There isn’t a bigger non-authority around, except perhaps for phoodoo.

  6. The failed GA cemented his status as a non-authority, but Mung also drove the point home by

    a) trying and failing to show that there was something illegitimate about Weasel’s demonstration of the power of cumulative selection; and

    b) claiming that a change in population size would make it necessary to change the fitness function, so that the probabilities of reproductive success would add up to one.

  7. Of course genetic algorithms model (some aspects of) evolution. Whether any particular algorithm is a good or bad model can be argued. No model is perfect, and it’s foolish to think they model every aspect.

    Physicists model the solar system all the time in simulations. Why are genetic algorithms not analogous for evolution?

    Models can be extremely useful, as Joe Felsenstein points out, for understanding certain aspects of physical and biological systems, even if they do not capture every single detail.

  8. Progress has been made. We are no longer wasting 600 posts arguing about latching. There were entire years when the great mathematician Dembski couldn’t accept the possibility that cumulative selection could occur if “correct” letters continued to be subject to mutation.

    I believe Atom was the first IDist at UD to actually write a WEASEL program and test this. I don’t think he posts there any more.

    One thing GAs can’t do is model the phase space. If they could model chemistry, they could model biological evolution. But not with current computer resources.

    Without being able to model the multidimensionality of chemistry, you get static selectors with static fitness. You can’t move into Wagner’s next room. At least not to the degree required to model biology.

  9. If GAs didn’t work, they’d be cited as evidence against evolution by natural selection (among other processes).

    After all, what inspired evolutionary algorithms in the first place?

    The reaction against them is one measure of their success in demonstrating the power of unintelligent selection.

    Glen Davidson

  10. Mung,

    In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.

    Great, because the arguments of Dembski (later Marks, and then Ewert) depend on fobbing off Dawkins’s monkey/Shakespeare model of cumulative selection as a model of natural selection. In No Free Lunch, Dembski puts the WEASEL in the chapter “Evolutionary Algorithms” (which he identifies as the most important in the book). In Being as Communion, he puts the WEASEL in the chapter “Natural Selection.” Really!

    So, if you want to depart from ID orthodoxy, and observe that it’s insane to conflate evolutionary algorithms and biological models, then more power to you.

  11. Mung,

    I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution. They are never quite clear which it is, nor do they say what it means to model or simulate evolution (what would be required) and how GAs qualify as either one or the other. My position is that GAs neither model nor simulate evolution.

    I agree with you, although probably not in the way you mean. I am persuaded by Lizzie’s argument in the Evolving Wind Turbines thread:

    Yes. I don’t think it’s a “simulation”. It’s an actual EXAMPLE of Darwinian processes optimising the configuration of something so that it performs a function.

    It’s a simulation in the sense that the wind-tunnel is a simulation of real wind. But the process itself isn’t a simulation. It’s an exemplar.

    As I noted in that thread, I’m coming to agree with her. These kinds of systems aren’t models of evolution, they are demonstrations of evolution in action. The process itself is taking place in software rather than chemistry and the software version is far less complex than the organic versions we observe, but it is the same process. This system isn’t modeling evolution, it’s implementing evolution.

    There are GAs and EAs that do simulate some aspect of biological evolution, Thomas Schneider’s ev for example, but those are both simulations and exemplars.

  12. It would be an advance if IDists could figure out how GAs work and then advance to discussing similarities and differences to biology. Since there are both similarities and differences, this could take the form of an adult discussion.

  13. In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.

    Slow hand clap.

    Next!

  14. If Mung were genuinely interested in learning about Weasel rather than scoring cheap points against it, he could take my code, vary the parameters, and observe the results.

    Besides allowing you to vary operational parameters like population size and mutation rate, the code also allows the intermediate results to be displayed with as much detail as desired.

    // adjustable parameters
    #define POPULATION_SIZE 200 // total population size
    #define NUM_SURVIVORS 4 // number of survivors per generation (must be less than POPULATION_SIZE)
    #define GENOME_LEN 28 // number of loci in each genome (each locus is one character)
    #define MUTATION_RATE 0.01 // probability that a locus is mutated per generation
    #define FITNESS_THRESHOLD 29 // program stops when this fitness threshold is met or exceeded
    #define DISPLAY_INTERVAL 1 // display results every n generations, where n == DISPLAY_INTERVAL
    #define PAUSE_TIME 50000 // pause time in microseconds
    #define GENOMES_TO_DISPLAY 1 // in STEP_MODE, number of (the fittest) genomes to display at each step
    #define STEP_MODE 1 // if set, program will pause every n generations and display the genomes
    #define ENABLE_LATCHING 0 // if set, enable latching
    #define ENABLE_SELECTION 1 // if set, enable selection; otherwise, select survivors randomly

  15. keiths,

    If Mung were genuinely interested in learning about Weasel rather than scoring cheap points against it, he could take my code, vary the parameters, and observe the results.

    Or he could just follow the algorithm that Dawkins described in detail in “The Blind Watchmaker”. It only takes a lunch hour to implement.

  16. I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution.

    I’m not sure I have seen that claim, though perhaps I would not have noticed it.

    You cannot model or simulate evolution without modeling or simulating the changing environment. But the environment is too hard and too unpredictable to model or simulate.

    I take GAs to model some aspects of evolution. And what aspects they model presumably depends on the programmer’s goals. It seems to me that GAs can be useful for this, as long as we understand their limitations.

  17. keiths: I’m disinclined to take your word for it, for good reason.

    Assume I am posting in good faith. If you can’t do that perhaps this is not the site for you.

  18. Richardthughes: Didn’t Mung fail in his attempt to write a GA? Does that qualify him as a non-authority?

    No Mung did not fail his attempt to write a GA, and I don’t claim to be an authority.

    But speaking of failures, you were invited to gather all your GA-writing friends to show us how it’s done and failed to do so.

  19. Joe Felsenstein: Mung: How do we determine the effective population size for a GA?

    Answer to 1: It depends on what you are interested in investigating.

    I think you misunderstood the question. I don’t want to know what value to use for the population size, as you say that may vary depending on the problem.

    I want to know, given an already coded GA, how do we determine the effective population size. In the example so graciously provided by keiths, I arrived at a figure of 1, because only one ‘organism’ survives to produce 200 offspring.

    I want to know if my thinking is correct on that.

    The idea here is to examine when drift ought to overcome the effects of selection and if it is not doing so, attempt to understand why not.

    If Ne=1 why isn’t drift the dominant factor?

    Are we really seeing “the power of cumulative selection” or are we seeing “the power of cumulative random sampling”?

  20. Tom English: So, if you want to depart from ID orthodoxy, and observe that it’s insane to conflate evolutionary algorithms and biological models, then more power to you.

    I’m a maverick, like Salvador. 🙂

  21. Mung: Assume I am posting in good faith. If you can’t do that perhaps this is not the site for you.

    You may be posting in good faith, but have a history of being wrong. Just like WJM.

  22. Mung: No Mung did not fail his attempt to write a GA, and I don’t claim to be an authority.

    But speaking of failures, you were invited to gather all your GA-writing friends to show us how it’s done and failed to do so.

    Fun with GAs has continued without your involvement in my threads Mung. I don’t know if that’s because they were beyond you, you didn’t like the results or some other reason.

  23. keiths: Besides allowing you to vary operational parameters like population size and mutation rate, the code also allows the intermediate results to be displayed with as much detail as desired.

    I’d like to compare your mutation rate to the rate of fixation.

  24. Richardthughes: You may be posting in good faith, but have a history of being wrong.

    Yes. Yes. Even when I’m right I am wrong. That’s a given here at TSZ. I’ve learned to live with it and not accuse everyone who posts here of posting in bad faith.

  25. Joe Felsenstein: Mung: How do we calculate the value of the selection coefficient?

    Answer to 2: In simple cases where each genotype has a known fitness, there may be a simple answer to this by comparing the fitnesses. If the loci interact there may be no simple answer.

    In the example GA provided by keiths, 199 members of the population get killed off and one member survives and produces 200 offspring to seed the next generation.

    Can the selection coefficient be calculated from that information?

  26. Richardthughes: So we agree that KeithS is justified in not taking your word for it.

    From the fact that you agree with keiths it does not follow that he is justified.

  27. If someone has made X truth claims and Y have been right:

    1) what are the odds that the next (X+1) truth claim is correct?
    2) what is a good threshold for accepting claims from a source as true?

  28. Rich:

    Fun with GAs has continued without your involvement in my threads Mung. I don’t know if that’s because they were beyond you, you didn’t like the results or some other reason.

    Both (a) and (b).

    Mung:

    Yes. Yes. Even when I’m right I am wrong.

    When you’re wrong, you’re wrong, and that happens so often that no one needs to pretend that you’re wrong when you’re right.

    In the example GA provided by keiths, 199 members of the population get killed off and one member survives and produces 200 offspring to seed the next generation.

    No. One member survives and is retained in the population, while 199 members are killed off and replaced by mutated copies of the survivor:

    // preserve the survivors, but convert the rest into mutated copies of the survivors
    for (i=0; i < POPULATION_SIZE - NUM_SURVIVORS; i++) {
        mutate(&genome_array[i%NUM_SURVIVORS], &genome_array[i+NUM_SURVIVORS]);
    }

    Mung:

    Given an effective population size of one, drift ought to dominate, but it doesn’t.

    Are you expecting that Weasel will fail to converge if drift dominates?

  29. Ah, I see what Mung is thinking — and he’s wrong, of course.

    Mung, to Joe:

    The idea here is to examine when drift ought to overcome the effects of selection and if it is not doing so, attempt to understand why not.

    If Ne=1 why isn’t drift the dominant factor?

    Are we really seeing “the power of cumulative selection” or are we seeing “the power of cumulative random sampling”? [Emphasis added]

    He’s hoping that Weasel somehow cheats and uses drift, not selection, to achieve its success.

    That’s goofy. Without selection, Weasel won’t converge on its target. Mung can demonstrate that for himself by setting ENABLE_SELECTION to 0 and running my program.

    He’s got a bad case of weaselphobia.

    Sorry, Mung, but you’re going to have to suck it up. Weasel works, and it demonstrates the power of cumulative selection.

  30. Not to mention the stupidity of demanding a “general purpose” GA or else evolution is false. Mung never ceases to entertain.

  31. Not only did he demand a general-purpose GA — he wants one that can solve any problem:

    Meanwhile, you and all the other fans of GA’s here at TSZ just can’t be bothered to create an OP and publish the generic GA that can solve any problem.

    Of course, no one here claimed that there was such a GA.

    Meanwhile, Mung is being chased by a Weasel.

  32. Mung: In the example GA provided by keiths, 199 members of the population get killed off and one member survives and produces 200 offspring to seed the next generation.

    Can the selection coefficient be calculated from that information?

    The effective population size then is 1. The selection coefficient is basically infinite, in favor of any mutant that brings one closer to the target, because it is always chosen. So no problem satisfying 4Ns > 1.

    If one wanted to try a lower selection coefficient, you’d need to modify the program so that newborns are evaluated as to how many changes they have that are closer to the target. Then one would put a weight on each offspring that was 1+s raised to the k-th power, where k was the number of favorable mutants the individual possessed.

    Then sample one survivor from these newborns, where the sampling is weighted by that weight. This would make the model nearly a Wright-Fisher model. Movement toward the target would be slower. I’d predict that if s < 1/4 it would often not get very close. Raising N would help.

  33. keiths,

    1. Start with a random string of 28 characters.
    2. Make 100 copies of the string (reproduce).
    3. For each character in each of the 100 copies, with a probability of 5%, replace (mutate) the character with a new random character.
    4. Compare each new string with the target string “METHINKS IT IS LIKE A WEASEL”, and give each a score (the number of letters in the string that are correct and in the correct position).
    5. If any of the new strings has a perfect score (28), halt. Otherwise, take the highest scoring string, and go to step 2.

    Is this the same basic algorithm that you are using? Do you compare the mutation events against a target?

  34. That’s the essential weasel algorithm, unless there’s some subtle typo I didn’t notice.

  35. Richardthughes, there is a glaring problem with your claim and the claim of keiths and that is that keiths agrees with me. Maybe keiths is wrong too though.

    Further, here you are in a prior thread:

    Richardthughes:
    Shit GAs work! That means that they must not model evolution…

    You wouldn’t be one of those that thinks that GAs model evolution though.

  36. dazz:
    Not to mention the stupidity of demanding a “general purpose” GA or else evolution is false. Mung never ceases to entertain.

    Oh good. dazz showed up. Speaking of entertaining:

    dazz:
    Mung: GA’s do not model Darwinian evolution

    You can kick and moan all you want buddy, but they do.

  37. keiths: One member survives and is retained in the population, while 199 members are killed off and replaced by mutated copies of the survivor:

    A fine bit of pedantry!

  38. keiths: He’s hoping that Weasel somehow cheats and uses drift, not selection, to achieve its success.

    Um, no. The question is, given an effective population size of 1, why doesn’t drift overcome the effects of selection.

  39. petrushka,
    I know this program’s algorithm is 30 years old. Are there any newer cumulative mutation simulators that can find a sequence without a target?

  40. dazz: Not to mention the stupidity of demanding a “general purpose” GA or else evolution is false.

    I never made such a claim.

  41. Mung: Um, no. The question is, given an effective population size of 1, why doesn’t drift overcome the effects of selection.

    Drift takes over when the selective pressure is low. But here you have high selective pressure.

  42. keiths: Not only did he demand a general-purpose GA — he wants one that can solve any problem.

    That is false. My claim was that it can’t be done.

    keiths:
    Of course, no one here claimed that there was such a GA.

    AFAIK, no one claimed there wasn’t such a GA, except me. You have evidence to the contrary?

  43. Neil Rickert: Drift takes over when the selective pressure is low.

    From my link in the OP:

    In other words, in small populations, the stochastic effects of random genetic drift overcome the effects of selection.

    But here you have high selective pressure.

    Given a small effective population size, how high does it have to be before it has a greater effect than drift?

  44. Neil Rickert: Drift takes over when the selective pressure is low. But here you have high selective pressure.

    Quibble: Drift is always there but selection gets stronger once the selection coefficient is high enough. (In this case selection is extremely strong).
