Genetic Algorithms: When Drift Overcomes Selection

I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution. They are never quite clear which it is, nor do they say what it means to model or simulate evolution (what would be required) and how GAs qualify as either one or the other. My position is that GAs neither model nor simulate evolution. In addition to other reasons I’ve given in the past I’d like to present the following argument.

GAs are often used to demonstrate “the power of cumulative selection.” Given small population sizes, drift ought to dominate, yet in GAs it does not. Why not?

Three questions:

  1. How do we determine the effective population size for a GA?
  2. How do we calculate the value of the selection coefficient?
  3. How do we determine when genetic drift will overcome the effects of selection?

In a GA written by keiths (a version of the WEASEL program) the default population size is 200.

#define POPULATION_SIZE 200 // total population size

Effective population size is the number of individuals in a population who contribute offspring to the next generation.

Even though the population size is 200, only one is selected to contribute offspring to the next generation.

#define NUM_SURVIVORS 1 // number of survivors per generation (must be less than POP_SIZE)
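
As a rough illustration of what a single survivor per generation implies, the sketch below picks the one fittest genome out of a population. It is not taken from keiths's program: the names fitness and select_survivor, and the fitness measure itself (the count of characters matching the target, as in common WEASEL implementations), are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical fitness measure: the number of positions at which the
 * genome matches the target string. */
int fitness(const char *genome, const char *target) {
    int score = 0;
    for (size_t i = 0; genome[i] != '\0' && target[i] != '\0'; i++) {
        if (genome[i] == target[i]) {
            score++;
        }
    }
    return score;
}

/* Return the index of the fittest genome among n candidates.
 * Ties go to the lowest index, so the choice is fully deterministic. */
size_t select_survivor(const char *const pop[], size_t n, const char *target) {
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (fitness(pop[i], target) > fitness(pop[best], target)) {
            best = i;
        }
    }
    return best;
}
```

With POPULATION_SIZE candidates and NUM_SURVIVORS set to 1, a step like this would run once per generation, and every genome in the next generation would descend from the index it returns.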

Given an effective population size of one, drift ought to dominate, but it doesn’t.

Given an effective population size of one, what must the selection coefficient be for drift to not dominate selection?

I’d truly appreciate any assistance with the concepts or the math.

In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.

Reference:

Neutral Theory: The Null Hypothesis of Molecular Evolution

535 thoughts on “Genetic Algorithms: When Drift Overcomes Selection”

  1. Nevermind:

    Because of this, sequential fixation and stochastic tunneling are
    thought to be the primary ways that complex adaptations become
    fixed. However, in view of the fact that the underlying limitation is
    an unavoidable aspect of statistics—that independent rare events
    only very rarely occur in combination—it seems certain that all
    chance-based mechanisms must encounter it.

    When I browse through Axe’s argument I am struck by the [hidden] assumption that whatever happened was planned or foreseen or pre-specified.

    The smarter creationists avoid obvious mathematical mistakes, but I don’t see them discussing the Texas sharpshooter problem. That is, the problem of painting the bull’s-eye after the bullet has hit something.

    Whether the mutations required for a complex adaptation occur sequentially or in parallel is important, but not as important as the logical error of assuming that what happened had to happen.

    Do you understand that winning the lottery can be absolutely blocked by having car trouble on the way to buying the ticket, but that this is a separate problem, unrelated to whether your numbers come up?

    Lynch seems to address the problem of whether it is possible to buy the ticket, despite obstacles. Axe seems obsessed with whether it is reasonable that John Doe won.

    The big problem that creationists have is in assuming that the winner was destined to win, and that evolution has to explain why John won and not someone else.

  2. Patrick,

    That software link you provided doesn’t look like it matches the specifications I provided. Mine at least tries to model S-coefficient concepts.

  3. Sal,

    This is just goofy.

    1. The “flaw” in Weasel isn’t a flaw. Dawkins never intended for Weasel to be an accurate model of biological evolution. It simply demonstrates the power of cumulative selection.

    2. An implementation of “Drifting Weasel” wouldn’t demonstrate anything that scientists don’t already know. If the product of the effective population size and the selection coefficient is low enough, then drift will overcome selection. It’s well known, and the Nature article emphasizes it:

    It is important to note that the effectiveness of selection on a mutation depends both on the fitness effect of this mutation (the selection coefficient s) and on the effective population size (Ne). Specifically, when the product Ne * s is much less than 1, the fate of mutations is essentially determined by random genetic drift. In other words, in small populations, the stochastic effects of random genetic drift overcome the effects of selection. Thus, all mutations for which Ne * s is much less than 1 can be considered effectively neutral.

    If you simply want to demonstrate that evolutionary biologists are right, then go ahead and implement Drifting Weasel. The exploding weasel graphic will be as pointless and puerile as the fart noises in the Judge Jones video.
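
The rule of thumb quoted from the Nature article can be written down in a few lines. This is only a sketch under stated assumptions: selection_coefficient uses one common convention (relative fitness of the mutant equals 1 + s), and the cutoff argument is a hypothetical parameter, since "much less than 1" has no sharp boundary.

```c
/* One common convention: the mutant's fitness relative to the reference
 * type is 1 + s, so s = w_mutant / w_reference - 1. */
double selection_coefficient(double w_mutant, double w_reference) {
    return w_mutant / w_reference - 1.0;
}

/* Returns 1 when |Ne * s| falls below the caller's cutoff, 0 otherwise.
 * The cutoff (for example 0.1) is an assumption standing in for
 * "much less than 1"; it is not a value given in the article. */
int is_effectively_neutral(double ne, double s, double cutoff) {
    double product = ne * s;
    if (product < 0.0) {
        product = -product;   /* use the magnitude for deleterious s */
    }
    return product < cutoff;
}
```

For instance, with s = 0.01 and a cutoff of 0.1, Ne = 1 gives a product of 0.01 (effectively neutral under this cutoff), while Ne = 200 gives a product of 2 (not neutral).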

  4. stcordova,

    it only has a 4% chance of getting passed on to the next generation, and there is a 96% chance bad mutations get passed on to the next generation!

    That’s a rather confusing spec. Beneficial mutations having a lesser chance than detrimental of getting passed on to the next generation would be in complete opposition to what beneficial and detrimental reduce to. A bit like coding a fair die where one of the faces has a 1% chance of coming up, and another 37%.

    What you might mean is that for every beneficial mutation, 26 detrimental ones are produced. Be that as it may, I confidently predict that the genomes will still become enriched in beneficial mutations with s values exceeding the minimum satisfying the condition that the product of population size and s (Ne * s) > 1. Surviving variation will be strongly skewed in favour of the beneficial.

  5. Drift actually helps GAs. It broadens out the search space, provides paths not accessible to incremental benefit, and allows discovery of higher optima (Weasel is immune to this constraint because there are no unavailable paths).

    Same goes for crossover and mutation … to an extent. There is a Happy Zone where bimbling about is useful. No doubt people can imagine situations where this is not true, and apply this intuition to the entire arena.

  6. Allan Miller: That’s a rather confusing spec. Beneficial mutations having a lesser chance than detrimental of getting passed on to the next generation would be in complete opposition to what beneficial and detrimental reduce to. A bit like coding a fair die where one of the faces has a 1% chance of coming up, and another 37%.

    It seems more like trying to reverse the definition of beneficial and detrimental.

    But if a mutation is actually detrimental, it will have less chance of being passed on.

    The ratio of detrimental to beneficial that Sal is advocating seems like nothing more than a rewording of his claim that genomes deteriorate.

  7. stcordova,
    That’s not an underground forum, it’s just a subreddit that was made non-public because the creationists posting there were getting their asses kicked in every argument.

    It evolved to something different once it went private. 🙂

  8. Allan:

    There is a Happy Zone where bimbling about is useful.

    I want to bimble about in the Happy Zone!

    I’d never heard “bimbling” before. Is that a Britishism?

  9. That’s a rather confusing spec.

    What’s the population size? If only one survivor per generation, this won’t be much better than a random walk since 96% of the time a new bad mutant gets “fixed” into the population for the next run, and only 4% of the time does a good one get “fixed” for the next run. Hence the rate of bad (away from the target) is flowing in at 24 times the rate of the good. The good will be overwhelmed and the string will essentially go toward getting randomized.

    That illustrates the problems of naively believing complexity can be built up cumulatively with real world genomes, mutations, and high probability of mutations breaking or failing to create coordinated relationships that are as complex if not more so than integrated software.

    Anyway, Mung was curious what happens when drift is admitted into the model. Short answer: WEASEL explodes — kaboom!

  10. petrushka,

    What Axe is claiming is that Lynch’s model assumes that when you get a positive mutation and move closer to the target, that mutation and the target gene will remain fixed and not be affected by a deleterious mutation going forward. If this is what Lynch’s model assumes, then Axe’s criticism is valid. I need to spend more time with the calculations and see if that is what Lynch is saying. I am also not sure if Axe’s model moves it into the only-one-solution-fits category as you mentioned. Here is the argument from Axe’s paper.

    This indeed appears to be the case. Specifically, of all the possible evolutionary paths a population can take, the analysis of Lynch and Abegg considers only those special paths that lead directly to the desired end, the complex adaptation. This is best illustrated with an example. Suppose a population carries an allele that confers no selective benefit in its current state (e.g., a pseudogene or a gene duplicate) but which would confer a benefit if it were to acquire five specific nucleotide changes relative to that initial state, which we will again refer to as stage 0. Lynch and Abegg assign a waiting time of (5u)^-1 for a stage-1 allele to become fixed in this situation, which is valid only if we can safely assume that the population remains at stage 0 during this wait. But this cannot be assumed. A stage-0 allele of kilobase length, for example, would have about 200-fold more correct bases than incorrect ones (with respect to the complex adaptation), which means the rate of degradation (i.e., fixation of changes that make the complex adaptation more remote) would be about 600-fold higher than the rate of progression to stage 1. It is therefore very unlikely in such a case that the population will wait at stage 0 long enough to reach stage 1, and the situation becomes progressively worse as we consider higher stages.

  11. stcordova: That illustrates the problems of naively believing complexity can be built up cumulatively with real world genomes

    He said, having demonstrated nothing of the sort.

  12. stcordova: That illustrates the problems of naively believing complexity can be built up cumulatively with real world genomes, mutations, and high probability of mutations breaking or failing to create coordinated relationships that are as complex if not more so than integrated software.

    I don’t exactly follow how you can derive anything from a program that is designed to behave in ways that have nothing to do with biology.

  13. stcordova,

    96% of the time a new bad mutant gets “fixed” into the population for the next run, and only 4% of the time does a good one get “fixed”

    What makes them ‘good’ or ‘bad’? I’m glad I don’t have to do requirements gathering with you.

  14. colewd,

    Biologists do not attempt to predict the path of evolution, nor do they assume targets. I await the analysis of smarter people than me, but it appears to me that Lynch is addressing the possibility that something that has already happened could have happened.

    Axe seems to be calculating the Dembski probability that something specific will happen.

    The two concepts are not equivalent, and it is not surprising that when Axe sets up a lab experiment to see if a specific thing will happen, it doesn’t. But he is not asking a question that a sane biologist would ask.

    Again, it looks like the rather typical creationist argument stemming from the assumption that what is was meant to be.

  15. stcordova,

    Anyway, Mung was curious what happens when drift is admitted into the model. Short answer: WEASEL explodes — kaboom!

    Literally? 😀

    Your approach to code is reminiscent of phoodoo’s – “I’ll tell you this for nothing, and I don’t have to write no stupid program”.

    If you make the program stochastic, it will approach the target more slowly. The extent depends on the degree of stochasticity you introduce. But of course you could just as easily introduce (more) stochasticity into the mutational process as well – e.g. by setting fidelity to zero. There are – quelle surprise – parameter settings that will not result in success.

  16. I am amused that Axe could make this critique of Lynch and Abegg :

    Specifically, of all the possible evolutionary paths a population can take, the analysis of Lynch and Abegg considers only those special paths that lead directly to the desired end, the complex adaptation.

    when he is also the author of this.

  17. DNA_Jock:
    I am amused that Axe could make this critique of Lynch and Abegg :
    when he is also the author of this.

    Might be a good time for a thread on Axe.

  18. Sal,

    Anyway, Mung was curious what happens when drift is admitted into the model. Short answer: WEASEL explodes — kaboom!

    Um, no. There is already drift in Weasel, yet it converges just fine.

    Weasel will fail with the right parameter settings, just as real-life selection will fail under the right conditions.

    In other words, the evolutionists are right.

  19. I can’t tell whether Sal believes what he is saying or is just performing for the reddit /r/creation-ists.

    Either way, I wonder if any of them are dim enough to buy what he’s selling.

  20. stcordova,

    That software link you provided doesn’t look like it matches the specifications I provided. Mine at least tries to model S-coefficient concepts.

    No, you just asked for an example of a GA that doesn’t guarantee that the most fit genome always survives and the least fit never does. I provided that example.

  21. petrushka,

    Biologists do not attempt to predict the path of evolution, nor do they assume targets. I await the analysis of smarter people than me, but it appears to me that Lynch is addressing the possibility that something that has already happened could have happened.

    I agree with this and would like to get more comments on this debate. I think the right question to ask is: is an observed adaptation likely to be caused by a trial and error mechanism?

  22. dazz:

    Keiths, can’t seem to find your weasel algo. Care to repost please?

    http://pastebin.com/syLy2UJP

    You can compile it under Linux with ‘gcc -std=gnu99 weasel.c -o weasel’.

    ETA: Note that this version has been changed to mutate the survivors as well as their offspring. You can convert it back to the old version by commenting out these lines:

    // now mutate the survivors themselves
    for (i=0; i<NUM_SURVIVORS; i++) {
      mutate(&genome_array[i], &genome_array[i]);
    }

  23. colewd: I agree with this and would like to get more comments on this debate. I think the right question to ask is: is an observed adaptation likely to be caused by a trial and error mechanism?

    That’s a bit like asking how likely it is that your parents met at the right time and had sex at the right time to produce exactly you. Regress that back a few thousand generations and compute the probability.

    The only question we have any hope of answering is, were there any insurmountable obstacles that suggest some miraculous intervention?

    Behe confronts this directly and tries to show a few examples of adaptations that were “impossible”. Malaria drug resistance being the prime example.

    I suppose the biggest problem with Behe’s analysis is that we know from observation that his impossible scenario takes about ten years in real life.

    I would suggest a few cautions for creationists. One would be that we don’t know every possible pathway linking a before and after sequence. Two would be that just because we see one detrimental step doesn’t mean it was actually detrimental. It may have been “rescued” by some compensating mutation that we failed to observe.

    But drug resistance is a rather special case. If bacteria always came up with resistance, none of our antibiotics would ever work. There’s no guarantee that need will produce adaptation. And no reason to believe that adaptations we see were specified or designed, or anything other than what happened.

  24. keiths: Note that this version has been changed to mutate the survivors as well as their offspring.

    Ah, so that’s what allows for drift then, right?

    Thanks

  25. dazz,

    Ah, so that’s what allows for drift then, right?

    You’ll still see drift without those lines, but it will be much less frequent.

    That’s because in the newer version, neutral mutations in the survivor will automatically become fixed unless the survivor is displaced by one of its offspring. In the older version, neutral mutations can only become fixed by “piggybacking” on an offspring that is about to displace the survivor. (All of the above assuming that NUM_SURVIVORS is set to 1.)

    So what Mung regarded as “pedantry” on my part is actually an important distinction.
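
The difference between the two versions can be sketched as a single flag on the breeding step. This is a hedged reconstruction, not keiths's code: mutate_copy, breed, GENOME_LEN, ALPHABET, and the choice to always substitute a different character are all assumptions made for illustration, with pop[0] standing in for the lone survivor.

```c
#include <stdlib.h>
#include <string.h>

#define GENOME_LEN 28      /* length of "METHINKS IT IS LIKE A WEASEL" */
#define ALPHABET_SIZE 27   /* 26 capital letters plus space */
static const char ALPHABET[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ";

/* Copy src into dst; each position is replaced by a different symbol
 * with probability rate. src and dst may be the same buffer. */
void mutate_copy(const char *src, char *dst, size_t len, double rate) {
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        if ((double)rand() / ((double)RAND_MAX + 1.0) < rate) {
            const char *pos = strchr(ALPHABET, c);
            int idx = pos ? (int)(pos - ALPHABET) : 0;
            if (idx >= ALPHABET_SIZE) {
                idx = 0;   /* guard against a stray NUL in the genome */
            }
            /* an offset of 1..26 guarantees a different symbol */
            c = ALPHABET[(idx + 1 + rand() % (ALPHABET_SIZE - 1)) % ALPHABET_SIZE];
        }
        dst[i] = c;
    }
}

/* Refill slots 1..n-1 with mutated copies of the survivor in pop[0].
 * If mutate_survivor is nonzero, the survivor is then mutated in place
 * (the newer behaviour); otherwise it is carried over untouched. */
void breed(char pop[][GENOME_LEN + 1], size_t n, double rate, int mutate_survivor) {
    for (size_t i = 1; i < n; i++) {
        mutate_copy(pop[0], pop[i], GENOME_LEN, rate);
        pop[i][GENOME_LEN] = '\0';
    }
    if (mutate_survivor) {
        mutate_copy(pop[0], pop[0], GENOME_LEN, rate);
    }
}
```

With mutate_survivor set to 0, a neutral change can only reach the next generation inside an offspring that displaces the survivor; with it set to 1, the survivor itself can carry the change forward.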

  26. petrushka,

    That’s a bit like asking how likely it is that your parents met at the right time and had sex at the right time to produce exactly you. Regress that back a few thousand generations and compute the probability.

    If we ask the question what is the cause of me being here the mechanism is easy to describe. And the mechanism is also fun 🙂 Putting a sperm and egg together to create a new animal is a very well understood and repeatable process. The process that creates novel genes is a little foggier.

  27. colewd: I think the right question to ask is: is an observed adaptation likely to be caused by a trial and error mechanism?

    Close, but no cigar.
    A question one could ask is

    Given this observed adaptation, what is the probability that a trial and error mechanism could have caused it?

    but the more relevant question to ask is

    Given this observed adaptation, what is the probability that a trial and error mechanism could have caused {it or something equivalent to it}?

    Which is a much trickier question…
    This is the Texas Sharp Shooter issue that petrushka was referring to, which point you apparently missed.

    What’s the probability that you would get the combination of grandparental chromosomes that you did? 0.5^46 = 1 in 70 million million.
    That’s before we consider recombination, or your parents ‘getting to know each other’, or where their genes came from…
    The probability of YOU is vanishingly, UPB small.
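
The figure quoted above can be checked with a few lines of arithmetic. The sketch assumes, as the comment does, 46 independent 50/50 choices and ignores recombination; binary_outcomes is a hypothetical helper name.

```c
#include <stdint.h>

/* Number of equally likely outcomes of n independent fair binary
 * choices, i.e. 2^n. Only valid for n < 64. */
uint64_t binary_outcomes(unsigned n) {
    return (uint64_t)1 << n;
}
```

Here binary_outcomes(46) is 70,368,744,177,664, roughly 7 x 10^13, which matches the "1 in 70 million million" quoted for 0.5^46.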

  28. keiths: In the older version, neutral mutations can only become fixed by “piggybacking” on an offspring that is about to displace the survivor

    That’s exactly what I was missing. I see what you mean now. Thanks again

  29. DNA_Jock,

    Given this observed adaptation, what is the probability that a trial and error mechanism could have caused {it or something equivalent to it}?

    I agree with you. Thanks for the improvement. The mechanism has to create some useful biological function. Do you agree that when the evolutionary process proceeds that the functional requirements become more specific?

  30. colewd:
    DNA_Jock,

    I agree with you. Thanks for the improvement. The mechanism has to create some useful biological function.

    As I understand it, not necessarily, or at least not directly. Lenski’s E. coli that developed the knack of metabolizing citrate needed two mutations, neither of which was useful by itself. I kinda recall that several of his populations actually developed one or the other, but since they weren’t useful, these were lost. But you could say that ultimately “the mechanism” did produce a useful function.

    Do you agree that when the evolutionary process proceeds that the functional requirements become more specific?

    Why would that be? The evolutionary process has been going on here on earth for nearly 4 billion years, and the functional requirements today are no more specific than they have ever been.

  31. colewd:

    DNA_Jock,
    Given this observed adaptation, what is the probability that a trial and error mechanism could have caused {it or something equivalent to it}?

    I agree with you. Thanks for the improvement. The mechanism has to create some useful biological function.

    Not so sure we agree; perhaps I could have expressed myself better. Regarding the TSS problem, “equivalent” would refer to any change conferring a selectable advantage that is as improbable or more improbable than the observed adaptation…You need to consider the universe of ‘equivalently’ unlikely outcomes, not just the one you observe. Hence the analogy to the mind-blowingly unlikely nature of YOUR birth. Asking for the probability of something that is functionally ‘equivalent’ to some particular adaptation that you choose is cherry-picking: as good as catalase or as good as rubisco?

    Do you agree that when the evolutionary process proceeds that the functional requirements become more specific?

    No, I do not. It is true that for some functions, the local optima tend to become narrower at higher levels of optimization – many enzyme sequences, for instance. Could you perhaps explain this to kairosfocus, so that he will stop using Durston’s data to estimate the rarity of islands of function? But for big, messy, heavily networked systems like the control of gene expression in eukaryotes, the requirements for a selectable improvement in reproductive ability remain very non-specific, and the number of potential ways to improve are myriad…

  32. Flint: Well, technically he didn’t say it CAN be done…

    Yeah, that might be important, given that people are trying to claim the exact opposite.

    …he was simply mocking others for failure to “bother” to produce one.

    Yes, he knew it was a fool’s errand. Couldn’t be done.

  33. Mung: At least you admit to quote-mining.

    Except that (1) he admitted to no such thing, and (2) the quoted statement in no way misrepresented your point.

  34. Mung: At least you admit to quote-mining.

    I find it unlikely that I might or might not agree with Mung or perhaps inadvertently, but not without any trace of possible notational foreknowledge, seemingly contradict or confirm the appearance of misunderstanding.

    I hope that clarifies my position.

  35. Mung: Yeah, that might be important, given that people are trying to claim the exact opposite.

    Yes, he knew it was a fool’s errand. Couldn’t be done.

    Hopefully, you can see that your comment can be easily interpreted in different ways. It could be EITHER the case that you were saying it was easy, but nobody had made the effort, OR that it is impossible which is why there isn’t one.

    You actually said neither that it can nor that it can’t be done. However, you made no equivalent effort to mock the ID folks for not producing one themselves.

    It’s all too easy to write ambiguous statements when YOU know what you mean, and expect others to also know what you mean. I’d venture that most ambiguous statements are made without the realization that misinterpretation is possible, and might even be the most obvious interpretation to others.

  36. petrushka: I find it unlikely that I might or might not agree with Mung or perhaps inadvertently, but not without any trace of possible notational foreknowledge, seemingly contradict or confirm the appearance of misunderstanding.

    I hope that clarifies my position.

    Well, why didn’t you just SAY so in the first place?

  37. petrushka: I find it unlikely that I might or might not agree with Mung or perhaps inadvertently, but not without any trace of possible notational foreknowledge, seemingly contradict or confirm the appearance of misunderstanding.

    I hope that clarifies my position.

    Surely there’s at least some small range of uncertainty, though.

    Glen Davidson

  38. keiths: The difference has a considerable effect on the amount of drift.

    First, you should tell us how to measure “the amount of drift” in your program. Then perhaps we can test for how it is affected. But I’m betting you won’t.

    Then you can explain how drift in your program overcomes “[a] selection coefficient [that] is basically infinite.”

    If you subjected that last member of the population to copying and mutation and the mutation rate was low enough there would be a probability that it would be no different than the parent. So there would be no difference. There’s no need to keep a copy of the parent untouched by mutation. Why pretend otherwise?

  39. keiths: Turn off selection and the program won’t converge on the target.

    LoL. So?

    From vastly different starting populations your program always converges on the target. Congratulations, you’ve just demonstrated convergence!

    Are you ever going to address the question raised by the OP?

  40. keiths: But it isn’t population size alone that allows drift to overpower selection. The selection coefficient also matters, as the Nature article makes clear:

    Did you even read the OP?

    2. How do we calculate the value of the selection coefficient?

    Given an effective population size of one, what must the selection coefficient be for drift to not dominate selection?

    Typical Mung sloppiness.

    Hilarious. Next time read the OP.
