In his endless pursuit of that wascally Weasel, Mung made the following silly claim:
GAs are often used to demonstrate “the power of cumulative selection.” Given small population sizes drift ought to dominate yet in GAs drift does not dominate.
That is clearly false, but for the benefit of Mung (and his cousin Elmer) I have modified my Weasel program to incorporate both drift and selection. They can now see for themselves that small population sizes are insufficient to guarantee that drift dominates selection.
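The code is here. Compile it under Linux using “gcc -std=gnu99 weasel.c -o weasel -lm”. (The -lm comes last because some linkers only resolve a library against object files listed before it.)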
Run the program and type ‘h’ to see a list of interactive commands:
c – clear the histogram data
f – change the selection coefficient
h – print this help message
m – change the mutation rate
p – pause until a key is pressed
q – quit the program
s – toggle selection on/off
t – change the target phrase
The program generates and updates a scaled histogram showing the number of generations spent at each possible Hamming distance from the target.
Worked first try. With the default selection coefficient it seems to get stuck, or at least slow down.
I need a brief description of what the histogram is showing. I find it confusing that it does not self-clear.
Now I have a 64-bit executable. Tomorrow I will make a 32-bit exe.
Is there a place to upload them?
Joe F:
In your program the adult of the next generation is not randomly chosen from the newborn pool. Your program has a single adult chosen not randomly but rather specifically, from which the newborn pool is then generated.
I ask again, what was the point of changing the population size and leaving the effective population size untouched?
petrushka,
Excellent!
Yes. With that selection coefficient, the equilibrium Hamming distance is far from a perfect match. In my runs, it’s 12.
The Hamming distance is a measure of how close the parent phrase is to the target phrase at any point in time. A Hamming distance of 0 means there is a perfect match. A Hamming distance of 1 means there is a one-character mismatch, and a Hamming distance of 28 means there are no matches.
Over time, the parent phrase will “wander” closer to the target, then away, then closer again. At each generation, the program notes the Hamming distance of the new parent and increments the corresponding bin in the histogram.
The peak in the histogram will occur where there is an equilibrium between (average) drift and (average) selection.
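For anyone who wants to see the bookkeeping spelled out, here is a minimal sketch of the Hamming distance calculation and the per-generation histogram update described above. It is not taken from weasel.c; the names hamming_distance, record_generation, histogram and PHRASE_LEN are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PHRASE_LEN 28  /* length of "METHINKS IT IS LIKE A WEASEL" */

/* Count the positions at which two equal-length phrases differ. */
static int hamming_distance(const char *a, const char *b, size_t len)
{
    int mismatches = 0;
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            mismatches++;
    }
    return mismatches;
}

/* One bin per possible distance, 0..PHRASE_LEN inclusive (hypothetical name). */
static unsigned long histogram[PHRASE_LEN + 1];

/* Record one generation spent at the parent's current distance
 * and return that distance. */
static int record_generation(const char *parent, const char *target)
{
    assert(strlen(parent) == PHRASE_LEN && strlen(target) == PHRASE_LEN);
    int d = hamming_distance(parent, target, PHRASE_LEN);
    histogram[d]++;
    return d;
}
```

Calling record_generation once per generation with the current parent makes each bin a count of generations spent at that distance. Nothing is ever subtracted, which is why the display does not clear itself until the 'c' command is used.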
I haven’t done it myself. Some places won’t allow you to directly upload .exe files, so you might need to change the extension before uploading and then give instructions for renaming after download.
Mung:
There’s more than one kind of selection coefficient, Mung. Here’s Joe’s description of the relevant kind, from his online PopGen book:
Mung, describe in your own words what pick_survivor does.
Keiths, for anyone running on Windows in a DOS box you might need to set the box to a smaller font size. It’s best if you open cmd first, rather than just double clicking on the exe. The box does not automatically stretch to display all the lines.
Mung,
Yes, it is.
You don’t understand the code, do you?
How does the program know the target? How do you calculate the Hamming distance to an unknown target?
I’ll let Joe F analyze the math. The program behaves well and does what seems reasonable.
Mung, the code is right there. Take a look.
Okay. Mods, could one of you give me edit permissions on the OP so that I can add tips like this? I also want to update the code link to point to the latest version.
Mung:
petrushka:
Do some work of your own, Mung. We’re not here to spoon-feed you.
Tests, keiths. Tests.
Automated Tests as Documentation
If you understand your code, you can write tests. If you can’t write tests that demonstrate that your code meets the requirements that’s not my fault.
You are claiming that your revised weasel code demonstrates that my claim was false. It doesn’t.
Mung,
I’ve already tested my code, and I continue to test it as we go along.
But here’s your chance to shine. Write your own tests and find a fatal bug that, when fixed, shows that you were right all along and that
Good luck.
You do understand the program, don’t you?
I have the before and after versions of the code. I am looking at the code. Both versions of the code. If you have some other version of the code that contradicts anything I have posted about the code, do say so.
Post your tests. The code.
There’s more than one way to test a program. Think, Mung.
So? Which kind of selection coefficient does your program use and why did you choose that particular kind of selection coefficient?
Mung, the code itself contradicts your claim regarding pick_survivors.
If you disagree, simply write out your analysis of how it works.
Subjective keiths.
petrushka,
Yes, please. I’ll get some popcorn.
This is an excellent Mung meltdown. Where’s Rich?
If this is in fact the case, you can write a test to demonstrate that your claim is true.
I too enjoy popcorn. Eating popcorn isn’t the best way to demonstrate that popcorn doesn’t exist.
keiths has no evidence to support his claims and calls out to Richardthughes for validation. Excellent!
So you don’t need to do anything to demonstrate your claim, but claims about your claims have to be demonstrated?
Can’t say I’m totally surprised.
Apart from, as already noted, the code itself.
As you are unwilling to explain the code, as you see it, the joke is on you.
Heh. Misdirect the people questioning your understanding to an irrelevant point. Check!
Generation: 25280 Number of survivors per generation: 20
Selection: ON Coefficient: 0.50 Mutation rate: 0.01
Organism 0 Fitness 3.79e+004 Hamming distance 2
METHI KS IT IS LIKJ A WEASEL
Histogram of Hamming distances:
0: 1372205943417473106 X
1: 864712057832082567
2: 366790486251605232
3: 201921448
4: 4672670202244390967
5: 36
6: 13
7: 0
Fighting a couple of fires at work.
First up, excellent code – job well done.
I’ve been trying to be less confrontational with Mung; I keep hoping that when we do this sort of thing he’ll engage and embark on a journey of meaningful discovery. Instead we’ve had a Gallienesque “test your code!” again. Work through it, Mung. If it works, that’s okay!
You’re becoming Joe G. Stop it, if only for your own sake! Also, you can call me Rich if you’d like.
windows 64 bit executable:
http://www.filedropper.com/keiths
original source code:
http://www.filedropper.com/he14ehgn
petrushka,
After I complete the CAPTCHA, I get the following message (in a very small font) with no actual download:
File Details:
Filename: keiths.exe
Size: 0 KB, Type: exe
However, I can download the source code.
Possibly an anti-malware measure?
It worked for me.
I will try again later with renamed files.
Weasel version 02232016
Press ‘h’ at any time for help
Target Phrase: METHINKS IT IS LIKE A WEASEL
Generation: 17690 Number of survivors per generation: 1
Selection: ON Coefficient: 5.00 Mutation rate: 0.01
Organism 0 Fitness 2.82e+012 Hamming distance 12
YQTHIVXH QQ W LIMW AEWEASEL
Histogram of Hamming distances:
0: 0
1: 0
2: 0
3: 0
4: 0
5: 4
6: 193 XXXX
7: 664 XXXXXXXXXXXXXXX
8: 1260 XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
9: 1445 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
10: 1962 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
11: 2400 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
12: 2580 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
13: 2464 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
14: 1810 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
15: 1109 XXXXXXXXXXXXXXXXXXXXXXXXX
16: 769 XXXXXXXXXXXXXXXXX
17: 419 XXXXXXXXX
18: 167 XXX
19: 154 XXX
20: 37
21: 28
22: 77 X
23: 12
24: 9
25: 31
26: 8
27: 49 X
28: 39
Weasel version 02232016
Press ‘h’ at any time for help
Target Phrase: METHINKS IT IS LIKE A WEASEL
Generation: 17690 Number of survivors per generation: 2
Selection: ON Coefficient: 5.00 Mutation rate: 0.01
Organism 0 Fitness 1.02e+021 Hamming distance 1
METHINKS IT IS LIKE A WEASEC
Histogram of Hamming distances:
0: 3131 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
1: 5048 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
2: 5487 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
3: 2573 XXXXXXXXXXXXXXXXXXXXXXXXXXXX
4: 613 XXXXXX
5: 373 XXXX
6: 143 X
7: 9
8: 25
9: 5
10: 12
11: 47
12: 30
13: 28
14: 6
15: 53
16: 12
17: 36
18: 3
19: 0
20: 5
21: 1
22: 1
23: 4
24: 11
25: 13
26: 9
27: 12
28: 0
Weasel version 02232016
Press ‘h’ at any time for help
Target Phrase: METHINKS IT IS LIKE A WEASEL
Generation: 17690 Number of survivors per generation: 2
Selection: ON Coefficient: 2.50 Mutation rate: 0.01
Organism 0 Fitness 3.26e+012 Hamming distance 5
METHISKS IEJIS LIKEGA DEASEL
Histogram of Hamming distances:
0: 0
1: 0
2: 165 XX
3: 1373 XXXXXXXXXXXXXXXXXXXXXXXX
4: 1948 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
5: 3198 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
6: 3408 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7: 2939 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
8: 1805 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
9: 1180 XXXXXXXXXXXXXXXXXXXX
10: 846 XXXXXXXXXXXXXX
11: 389 XXXXXX
12: 94 X
13: 25
14: 7
15: 28
16: 9
17: 22
18: 13
19: 68 X
20: 70 X
21: 16
22: 25
23: 3
24: 6
25: 32
26: 21
27: 0
28: 0
Weasel version 02232016
Press ‘h’ at any time for help
Target Phrase: METHINKS IT IS LIKE A WEASEL
Generation: 17690 Number of survivors per generation: 4
Selection: ON Coefficient: 2.50 Mutation rate: 0.01
Organism 0 Fitness 1.71e+015 Hamming distance 0
METHINKS IT IS LIKE A WEASEL
Histogram of Hamming distances:
0: 13026 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
1: 3779 XXXXXXXXXXXXXXXXX
2: 406 X
3: 136
4: 40
5: 15
6: 20
7: 10
8: 35
9: 7
10: 14
11: 5
12: 5
13: 9
14: 27
15: 40
16: 23
17: 7
18: 3
19: 4
20: 5
21: 10
22: 13
23: 9
24: 1
25: 2
26: 16
27: 4
28: 19
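A note on the "Fitness" figures in these printouts: they are consistent with each matching character multiplying fitness by (1 + s), that is, fitness = (1 + s)^(28 - Hamming distance). For example, with a coefficient of 5.00 and a Hamming distance of 12, 6^16 is about 2.82e+012, which is the value shown. This is inferred from the posted output, not read from weasel.c, so treat the sketch below as an assumption about the fitness rule; the function name is made up.

```c
#include <assert.h>

/* Assumed rule (inferred from the printouts, not from weasel.c):
 * each matching character multiplies fitness by (1 + s). */
static double fitness(int phrase_len, int hamming, double s)
{
    assert(hamming >= 0 && hamming <= phrase_len);
    double f = 1.0;
    for (int i = 0; i < phrase_len - hamming; i++)
        f *= 1.0 + s;
    return f;
}
```

Under this rule, fitness(28, 1, 5.0) is about 1.02e21 and fitness(28, 0, 2.5) is about 1.71e15, matching the second and fourth runs. It also shows why a small coefficient gives selection so little to work with: at s = 0.5, one extra matching character only raises fitness by half.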
petrushka,
OK, so it appears that the effectiveness of selection in the program now depends on selection coefficient and effective population size. Was that your point?
Math is not my strong point. I’m depending on you guys to interpret this.
The obvious answer to your question is that the program does what I expected it to do when I changed the parameters. Whether it is a useful program is above my pay grade.
works for me too
petrushka,
On second thought, I’m not so sure that I understand the printout. What is the histogram a histogram of? That is, what are the Hamming distances of? What is organism 0? If that’s the starting point, then none of the runs seem to be showing selection, just random diffusion at a greater or lesser rate, with the mode still close to the start.
32 bit windows exe
http://www.filedropper.com/keiths32
I asked keiths about that earlier, and he responded.
Here’s my understanding.
The numbers to the left of the colon are distance from the target.
The numbers to the left of the Xs are the number of times the survivor/parent has been that far from the target. The histogram represents the relative frequency for any given distance from the target.
If you let it run a long time it resembles a bell curve.
With a low coefficient, it is supposed to wander. That’s the point. Drift prevents perfect and permanent achievement of the target. By changing the parameters, you can make it behave like Dawkins’s Weasel.
John:
With the default parameter settings, ‘organism 0’ refers to whatever organism is chosen to be the parent of the next generation. If the number of survivors is changed to something other than one using the ‘v’ command, then organism 0 is just one of the survivors. The other survivors are not displayed, though that can be changed by modifying the GENOMES_TO_DISPLAY parameter and recompiling.
As petrushka explained, the histogram shows the amount of time (i.e. the cumulative number of generations) spent at a particular distance from the target. It is updated and scaled continuously as the program runs.
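The "scaled" part can be sketched as follows. In the printouts posted above, the tallest bin gets a bar of roughly 60 X's and every other bin gets a proportional bar, rounded down, which is why bins with small counts show a number but no X's. The width of 60 is an estimate from that output, and the function names are hypothetical rather than taken from weasel.c.

```c
#include <assert.h>
#include <stdio.h>

#define BAR_WIDTH 60  /* assumed: estimated from the posted printouts */

/* Length of the bar for one bin, scaled so the fullest bin gets `width` X's.
 * Integer division truncates, so small counts can produce an empty bar. */
static int bar_length(unsigned long count, unsigned long max_count, int width)
{
    assert(width > 0 && count <= max_count);
    if (max_count == 0)
        return 0;
    return (int)(count * (unsigned long)width / max_count);
}

/* Print every bin as "distance: count XXXX". */
static void print_histogram(const unsigned long *bins, int nbins)
{
    unsigned long max_count = 0;
    for (int d = 0; d < nbins; d++) {
        if (bins[d] > max_count)
            max_count = bins[d];
    }
    for (int d = 0; d < nbins; d++) {
        printf("%2d: %lu ", d, bins[d]);
        for (int x = bar_length(bins[d], max_count, BAR_WIDTH); x > 0; x--)
            putchar('X');
        putchar('\n');
    }
}
```

With the first posted run, where the fullest bin holds 2580 generations, a bin of 193 gets 4 X's, a bin of 77 gets 1, and a bin of 39 gets none, as in that printout. Because the scale is recomputed from the current maximum on every refresh, the bars keep their shape as the counts grow.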
The genotypes are initialized randomly when the program starts, so the initial Hamming distance is high — typically 27. That means the action will initially occur toward the bottom of the histogram. With the default settings, selection will reduce the Hamming distance, pushing the activity upward until it settles into a dynamic equilibrium around 12. The histogram then “fills in” with the peak at the equilibrium distance.
You can then fiddle with the selection coefficient, the mutation rate, and the number of survivors (i.e. the effective population size) to see how the histogram responds.
To no one’s surprise except Mung’s, selection overpowers drift at some settings but not at others. That is, you are unlikely to achieve a perfect match (a distance of 0) with some settings, but virtually certain to achieve it with others.
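One common way to combine drift and selection in a single step is to pick each survivor at random, with probability proportional to its fitness. The sketch below illustrates that general technique; it is not the program's actual survivor-picking routine, and the name pick_weighted and its interface are made up. The caller supplies the random number u so that the function is easy to test.

```c
#include <assert.h>

/* Return index i with probability fitness[i] / sum(fitness).
 * u must be a uniform random number in [0, 1), supplied by the caller. */
static int pick_weighted(const double *fitness, int n, double u)
{
    assert(n > 0 && u >= 0.0 && u < 1.0);

    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += fitness[i];

    double threshold = u * total;
    double running = 0.0;
    for (int i = 0; i < n; i++) {
        running += fitness[i];
        if (threshold < running)
            return i;
    }
    return n - 1;  /* guard against floating-point round-off */
}
```

If all fitness values are equal, every candidate is equally likely and the choice is pure drift. As the fitness ratios grow, the fittest candidate is chosen almost every time and the choice approaches strict selection. Everything in between is a mixture of the two.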
What in each case is the starting position? What is the single organism 0 displayed for each run? It appears to me that organism 0 is your starting point, and what that would mean is that in each case there is a symmetrical binomial distribution around the starting point. If there were directional selection operating we should see a skew in the distribution in the direction of 0, which appears not to be the case. What we see here is selection entirely overcome by drift.
If I understand this at all.
John,
See this comment above.
A quick batch of tests indicates to me that at most settings, there is no meltdown. The genome reaches an equilibrium level of fitness which can be moved up or down, but it never latches and never fixes at either end of the fitness spectrum.
It’s always wobbling.
This little exercise has convinced me of something that has been percolating for a while.
It is, perhaps, useful to cease thinking of a target, and think instead of fitness in the abstract. The weasel phrase provides a numeric measure of fitness, but does not need to be thought of as a goal. It is okay to reach an equilibrium level of fitness and hover about it.
What doesn’t seem to happen is genetic meltdown. I’m sure there are parameters that will result in meltdown, but there seems to be a rather wide window of stability.
petrushka,
Yes, and that’s what’s expected in theory.
Joe derived an equation for the equilibrium point in terms of u (the mutation rate) and s (the selection coefficient), assuming an effective population size of one and an infinite number of offspring. I plan to see how well the program matches the equation over a range of u and s values.
Keiths, I think it would be fairly easy to calculate a mean and sd for each pair of coefficients and survivor number.
You could make an outer loop that runs through the permutations and prints out just the parameters and mean fitness. That could be imported into some graphing software.
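The mean and spread can be read straight off the histogram bins, so no extra per-generation bookkeeping is needed. Here is a rough sketch, assuming the bins are held in an array indexed by Hamming distance; the struct and function names are made up, and it summarizes Hamming distance rather than fitness.

```c
#include <assert.h>

/* Summary of a histogram whose bin index is the Hamming distance. */
struct hist_stats {
    double mean;      /* mean distance, weighted by generations */
    double variance;  /* population variance of the distance */
};

static struct hist_stats histogram_stats(const unsigned long *bins, int nbins)
{
    assert(nbins > 0);
    double n = 0.0, sum = 0.0, sum_sq = 0.0;
    for (int d = 0; d < nbins; d++) {
        n += (double)bins[d];
        sum += (double)d * (double)bins[d];
        sum_sq += (double)d * (double)d * (double)bins[d];
    }

    struct hist_stats s = {0.0, 0.0};
    if (n > 0.0) {
        s.mean = sum / n;
        s.variance = sum_sq / n - s.mean * s.mean;
    }
    return s;
}
```

The standard deviation is sqrt(s.variance), using sqrt from <math.h> (link with -lm). An outer loop over the parameter settings could clear the bins, run for a fixed number of generations, and print one line of parameters plus these two numbers for import into graphing software.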
Just to make Mung happy.