I often encounter posters here at TSZ who claim that Genetic Algorithms (GAs) either model or simulate evolution. They are never quite clear which it is, nor do they say what modeling or simulating evolution would require, or how GAs qualify as one or the other. My position is that GAs neither model nor simulate evolution. In addition to the other reasons I’ve given in the past, I’d like to present the following argument.
GAs are often used to demonstrate “the power of cumulative selection.” Given small population sizes, drift ought to dominate, yet in GAs it does not. Why not?
Three questions:
- How do we determine the effective population size for a GA?
- How do we calculate the value of the selection coefficient?
- How do we determine when genetic drift will overcome the effects of selection?
In a GA written by keiths (a version of the WEASEL program) the default population size is 200.
#define POPULATION_SIZE 200 // total population size
Effective population size is the number of individuals in a population who contribute offspring to the next generation.
Even though the population size is 200, only one is selected to contribute offspring to the next generation.
#define NUM_SURVIVORS 1 // number of survivors per generation (must be less than POPULATION_SIZE)
Given an effective population size of one, drift ought to dominate, but it doesn’t.
Given an effective population size of one, what must the selection coefficient be for drift to not dominate selection?
I’d truly appreciate any assistance with the concepts or the math.
In any event, there is no way that this GA (the keiths WEASEL program) either models or simulates evolution.
Reference:
Mung:
Quotes and links, please. I’m disinclined to take your word for it, for good reason.
Neither I nor the original inventor of WEASEL, Richard Dawkins, claim that it “models or simulates evolution”.
You are attacking yet another straw man. Don’t you ever tire of it, Mung? It just makes you look weak and incompetent.
Mung, genetic algorithms are a tool used to investigate the behavior of genetic systems that have selection, recombination, mutation, genetic drift, and sometimes migration. What cases you run depends on what cases you are interested in, what questions you want answered.
I know a fair amount about them because I wrote my first genetic simulation program 52 years ago, and have written many since. When David Fogel put together his book reprinting early evolutionary simulation papers, I gave him some advice and sources.
They are useful to investigate outcomes particularly of cases where one cannot do the relevant mathematical theory, for example cases with multiple interacting linked loci.
As to whether they “simulate evolution”, I’ll let others argue about that. I just know that they are computer models that have multiple genotypes that have fitnesses, have recombination between loci, have mutation processes, and have finite populations, and sometimes multiple populations. Many other complications can also be built in. Population geneticists want to know how such systems will behave. I would say that knowing how such systems behave is relevant to understanding what to expect from evolution.
If your opinion is that they aren’t relevant, well, I’m always happy to answer questions.
Mung:
Answer to 1: It depends on what you are interested in investigating.
Answer to 2: In simple cases where each genotype has a known fitness, there may be a simple answer to this by comparing the fitnesses. If the loci interact there may be no simple answer.
Answer to 3: We could do simulations and find out. That’s the beauty of having genetic simulation programs. In simple cases where there is a selection coefficient s and population size N, we expect natural selection to start having a substantial effect approximately when 4Ns > 1.
Didn’t Mung fail in his attempt to write a GA? Does that qualify him as a non-authority?
Rich,
Yes, completely. He tried to do a version of Weasel, with the following dismal results:
Rich:
There isn’t a bigger non-authority around, except perhaps for phoodoo.
The failed GA cemented his status as a non-authority, but Mung also drove the point home by
a) trying and failing to show that there was something illegitimate about Weasel’s demonstration of the power of cumulative selection; and
b) claiming that a change in population size would make it necessary to change the fitness function, so that the probabilities of reproductive success would add up to one.
Of course genetic algorithms model (some aspects of) evolution. Whether any particular algorithm is a good or bad model can be argued. No model is perfect, and it’s foolish to think they model every aspect.
Physicists model the solar system all the time in simulations. Why aren’t genetic algorithms analogous for evolution?
Models can be extremely useful, as Joe Felsenstein points out, for understanding certain aspects of physical and biological systems, even if they do not capture every single detail.
Progress has been made. We are no longer wasting 600 posts arguing about latching. There were entire years when the great mathematician Dembski couldn’t accept the possibility that cumulative selection could occur if “correct” letters continued to be subject to mutation.
I believe Atom was the first IDist at UD to actually write a WEASEL program and test this. I don’t think he posts there any more.
One thing GAs can’t do is model the phase space. If they could model chemistry, they could model biological evolution. But not with current computer resources.
Without being able to model the multidimensionality of chemistry, you get static selectors with static fitness. You can’t move into Wagner’s next room. At least not to the degree required to model biology.
If GAs didn’t work, they’d be cited as evidence against evolution by natural selection (among other processes).
After all, what inspired evolutionary algorithms in the first place?
The reaction against them is one measure of their success in demonstrating the power of unintelligent selection.
Glen Davidson
Mung,
Great, because the arguments of Dembski (later Marks, and then Ewert) depend on fobbing off Dawkins’s monkey/Shakespeare model of cumulative selection as a model of natural selection. In No Free Lunch, Dembski puts the WEASEL in the chapter “Evolutionary Algorithms” (which he identifies as the most important in the book). In Being as Communion, he puts the WEASEL in the chapter “Natural Selection.” Really!
So, if you want to depart from ID orthodoxy, and observe that it’s insane to conflate evolutionary algorithms and biological models, then more power to you.
Mung,
I agree with you, although probably not in the way you mean. I am persuaded by Lizzie’s argument in the Evolving Wind Turbines thread:
As I noted in that thread, I’m coming to agree with her. These kinds of systems aren’t models of evolution, they are demonstrations of evolution in action. The process itself is taking place in software rather than chemistry and the software version is far less complex than the organic versions we observe, but it is the same process. This system isn’t modeling evolution, it’s implementing evolution.
There are GAs and EAs that do simulate some aspect of biological evolution, Thomas Schneider’s ev for example, but those are both simulations and exemplars.
It would be an advance if IDists could figure out how GAs work and then advance to discussing similarities and differences to biology. Since there are both similarities and differences, this could take the form of an adult discussion.
Slow hand clap.
Next!
If Mung were genuinely interested in learning about Weasel rather than scoring cheap points against it, he could take my code, vary the parameters, and observe the results.
Besides allowing you to vary operational parameters like population size and mutation rate, the code also allows the intermediate results to be displayed with as much detail as desired.
// adjustable parameters
#define POPULATION_SIZE 200 // total population size
#define NUM_SURVIVORS 4 // number of survivors per generation (must be less than POPULATION_SIZE)
#define GENOME_LEN 28 // number of loci in each genome (each locus is one character)
#define MUTATION_RATE 0.01 // probability that a locus is mutated per generation
#define FITNESS_THRESHOLD 29 // program stops when this fitness threshold is met or exceeded
#define DISPLAY_INTERVAL 1 // display results every n generations, where n == DISPLAY_INTERVAL
#define PAUSE_TIME 50000 // pause time in microseconds
#define GENOMES_TO_DISPLAY 1 // in STEP_MODE, number of (the fittest) genomes to display at each step
#define STEP_MODE 1 // if set, program will pause every n generations and display the genomes
#define ENABLE_LATCHING 0 // if set, enable latching
#define ENABLE_SELECTION 1 // if set, enable selection; otherwise, select survivors randomly
keiths,
Or he could just follow the algorithm that Dawkins described in detail in “The Blind Watchmaker”. It only takes a lunch hour to implement.
Writing one from scratch forces you to understand it.
I’m not sure I have seen that claim, though perhaps I would not have noticed it.
You cannot model or simulate evolution without modeling or simulating the changing environment. But the environment is too hard and too unpredictable to model or simulate.
I take GAs to model some aspects of evolution. And what aspects they model presumably depends on the programmer’s goals. It seems to me that GAs can be useful for this, as long as we understand their limitations.
GA doubters should check out the Waze app.
Assume I am posting in good faith. If you can’t do that perhaps this is not the site for you.
I can assume you are posting in good faith and are also mistaken.
No, Mung did not fail in his attempt to write a GA, and I don’t claim to be an authority.
But speaking of failures, you were invited to gather all your GA-writing friends to show us how it’s done and failed to do so.
I think you misunderstood the question. I don’t want to know what value to use for the population size, as you say that may vary depending on the problem.
I want to know, given an already coded GA, how do we determine the effective population size. In the example so graciously provided by keiths, I arrived at a figure of 1, because only one ‘organism’ survives to produce 200 offspring.
I want to know if my thinking is correct on that.
The idea here is to examine when drift ought to overcome the effects of selection and if it is not doing so, attempt to understand why not.
If Ne=1 why isn’t drift the dominant factor?
Are we really seeing “the power of cumulative selection” or are we seeing “the power of cumulative random sampling”?
I’m a maverick, like Salvador. 🙂
You may be posting in good faith, but have a history of being wrong. Just like WJM.
Fun with GAs has continued without your involvement in my threads, Mung. I don’t know if that’s because they were beyond you, you didn’t like the results, or some other reason.
I’d like to compare your mutation rate to the rate of fixation.
Yes. Yes. Even when I’m right I am wrong. That’s a given here at TSZ. I’ve learned to live with it and not accuse everyone who posts here of posting in bad faith.
Mung,
So we agree that KeithS is justified in not taking your word for it.
In the example GA provided by keiths, 199 members of the population get killed off and one member survives and produces 200 offspring to seed the next generation.
Can the selection coefficient be calculated from that information?
From the fact that you agree with keiths it does not follow that he is justified.
If someone has made X truth claims and Y of them have been right:
1) what are the odds that the next (X+1th) truth claim is correct?
2) what is a good threshold to accept claims from a source being true?
Rich:
Both (a) and (b).
Mung:
When you’re wrong, you’re wrong, and that happens so often that no one needs to pretend that you’re wrong when you’re right.
No. One member survives and is retained in the population, while 199 members are killed off and replaced by mutated copies of the survivor:
// preserve the survivors, but convert the rest into mutated copies of the survivors
for (i = 0; i < POPULATION_SIZE - NUM_SURVIVORS; i++) {
mutate(&genome_array[i%NUM_SURVIVORS], &genome_array[i+NUM_SURVIVORS]);
}
Mung:
Are you expecting that Weasel will fail to converge if drift dominates?
Ah, I see what Mung is thinking — and he’s wrong, of course.
Mung, to Joe:
He’s hoping that Weasel somehow cheats and uses drift, not selection, to achieve its success.
That’s goofy. Without selection, Weasel won’t converge on its target. Mung can demonstrate that for himself by setting ENABLE_SELECTION to 0 and running my program.
He’s got a bad case of weaselphobia.
Sorry, Mung, but you’re going to have to suck it up. Weasel works, and it demonstrates the power of cumulative selection.
Not to mention the stupidity of demanding a “general purpose” GA or else evolution is false. Mung never ceases to entertain.
Not only did he demand a general-purpose GA — he wants one that can solve any problem:
Of course, no one here claimed that there was such a GA.
Meanwhile, Mung is being chased by a Weasel.
The effective population size then is 1. The selection coefficient is basically infinite, in favor of any mutant that brings one closer to the target, because it is always chosen. So no problem satisfying 4Ns > 1.
If one wanted to try a lower selection coefficient, one would need to modify the program so that each newborn is evaluated on how many changes it has that bring it closer to the target. Then one would weight each offspring by (1+s) raised to the k-th power, where k is the number of favorable mutations the individual possesses.
Then sample one survivor from these newborns, where the sampling is weighted by that weight. This would make the model nearly a Wright-Fisher model. Movement toward the target would be slower. I’d predict that if s < 1/4 it would often not get very close. Raising N would help.
Methinks it is like a biomorph.
keiths,
Is this the same basic algorithm that you are using? Do you compare the mutation events against a target?
That’s the essential weasel algorithm, unless there’s some subtle typo I didn’t notice.
Richardthughes, there is a glaring problem with your claim and the claim of keiths and that is that keiths agrees with me. Maybe keiths is wrong too though.
Further, here you are in a prior thread:
You wouldn’t be one of those that thinks that GAs model evolution though.
Oh good. dazz showed up. Speaking of entertaining:
A fine bit of pedantry!
Um, no. The question is: given an effective population size of 1, why doesn’t drift overcome the effects of selection?
petrushka,
I know this program’s algorithm is 30 years old. Are there any newer cumulative mutation simulators that can find a sequence without a target?
I never made such a claim.
Drift takes over when the selective pressure is low. But here you have high selective pressure.
That is false. My claim was that it can’t be done.
AFAIK, no one claimed there wasn’t such a GA, except me. You have evidence to the contrary?
From my link in the OP:
In other words, in small populations, the stochastic effects of random genetic drift overcome the effects of selection.
Given a small effective population size, how high does the selection coefficient have to be before selection has a greater effect than drift?
Quibble: drift is always there, but selection overcomes it once the selection coefficient is high enough. (In this case selection is extremely strong.)
I answered that. s > 1/4, roughly.