Weasel Wars – Directed Evolution

Posted on February 27, 2016 by Mung

This is just an effort to help keep Joe’s thread focused and to help keep it from being derailed. People can use it or not. I hope they will.

CharlieM: Can someone explain to me, why is all of this not just a model of directed evolution? Surely it is set up to be directed towards a target?

Allan Miller: It gets towards the target by means of variation and selection of genotypes in the current population. The programmer does not direct it towards the target. Indeed, there would be no point in writing GAs for problem solution if it were simply a matter of specifying a target and directing the program to find it.

Consider a small modification – instead of distance from target, evaluate fitness by adding up the ASCII bits. Those with the greater sum are fitter than those with a lesser. There is no mention of a distant target – although it is clear that the program will converge on a string of all Zs, the program doesn’t know this. It is not drawn by, nor directed towards, that target. It is simply doing generational evaluation of fitter and less fit genotypes in the current population.
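A minimal Ruby sketch of the modification described above, reading "adding up the ASCII bits" as summing the character codes (an assumption; the wording is debated later in the thread). The character set, population size, mutation rate, and the names `fitness`, `mutate`, and `evolve` are illustrative choices, not taken from any program posted here.

```ruby
# Sketch only: fitness is the plain sum of character codes.
# No target string appears anywhere in the program.
ALPHABET = [" "] + ("A".."Z").to_a   # assumed Weasel-style character set
LENGTH = 28
OFFSPRING = 100        # illustrative value
MUTATION_RATE = 0.05   # illustrative value

# Sum of the character codes of a genotype (an array of one-character strings).
def fitness(genotype)
  genotype.sum(&:ord)
end

# Copy the genotype, replacing each character with a random one at MUTATION_RATE.
def mutate(genotype, rng)
  genotype.map { |c| rng.rand < MUTATION_RATE ? ALPHABET.sample(random: rng) : c }
end

# Each generation: breed OFFSPRING mutants, keep the fittest of mutants plus parent.
def evolve(generations, rng)
  parent = Array.new(LENGTH) { ALPHABET.sample(random: rng) }
  history = [fitness(parent)]
  generations.times do
    candidates = Array.new(OFFSPRING) { mutate(parent, rng) }
    candidates << parent
    parent = candidates.max_by { |g| fitness(g) }
    history << fitness(parent)
  end
  [parent.join, history]
end

final, history = evolve(500, Random.new(1))
puts final
puts "fitness went from #{history.first} to #{history.last}"
```

Because the parent competes with its own offspring in this sketch, the fitness never decreases from one generation to the next. The selection step only ever compares sums within the current population.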

Discuss.

340 thoughts on “Weasel Wars – Directed Evolution”

Allan Miller on March 1, 2016 at 1:42 pm said:

fifthmonarchyman,

Directed evolution and “natural selection” are oxymorons in the view of the average materialist.

No they aren’t. Oxymoron being a contradiction in terms, right?

If, as several ID-enthusiasts here are saying, directed evolution and natural selection are effectively synonymous, I don’t know what the fuss is about.

Of course there is an additional possibility, that in addition to the effects of simple offspring number differential on frequencies, there is an entity that is differentially and actively choosing which individuals live and die – this entity is actually causing change in allele frequency. This isn’t an oxymoron either. It’s what happens in ‘artificial’ selection by humans, after all. An active agent is simply another potential force causing differential offspring numbers. Artificial selection is a subset of NS IMO.

To equate them smells of pantheism or deism (or Calvinism) and we can’t have that now can we.

To equate them smells of whatever the people trying to equate them think it smells of. No point being coy if that’s what they mean, but that’s not what they are saying. They are not saying that variants having differential offspring numbers in a particular environmental stress (cold, antibiotic, etc) are being sifted by an intelligence, but by the causal … agency, for want of a better word. The cold, the antibiotic, etc. There is no point denying this happens. It may or may not be all that happens, but it certainly happens.

To the average materialist the idea of the environment directing either directly or as the intermediary force of another being is pure heresy.

If the environment causes differential success, that is entirely orthodox.

Just another drive-by answer.
Alan Fox on March 1, 2016 at 2:06 pm said:

CharlieM: There you have it, environmental conditions directing evolution. You have already agreed on this type of directed evolution. Natural selection is just an effect the environment has on its inhabitants.

@ CharlieM

Catching up, and just to reiterate what Allan Miller has been saying, you can consider the environment as the designer in natural selection. I’m happy if you want to call “natural selection” by the alternative name of “environmental design”. So neither do I disagree with your quoted remarks.
petrushka on March 1, 2016 at 2:27 pm said:

Allan Miller:
CharlieM,
So you think ALL evolution by natural selection is ‘directed’ evolution, with environment as director. So all GAs with a selection routine are directed in that sense. So they are implementing Natural Selection. So why are we arguing?

I see no argument.

I don’t understand why this kind of discussion keeps getting started.

Except that people like Denton keep waving the pinata of Destiny over the heads of IDists.

Well, it’s an ignorant idea. For any given genome in biology there are thousands, perhaps millions, of functional genomes just one standard size mutation away. It’s the Robert Frost Road That Might Be Taken.

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

All the alleles in a population are — by definition — nearly the same in fitness. But they lead in different directions, and once one allele becomes dominant in a population, the way forward for that population is constrained.

Looking back, it seems like an intentional choice, and the results seem designed.

Frost was being ironic.
Mung on March 1, 2016 at 6:34 pm said:

Mung:
# Dear Programmer, don’t be fooled
# The code you write has no effect on the outcome of the program
ALPHABET_SIZE = 27
PHRASE_LENGTH = 28
char_set = (32..126).map {|i| i.chr}.shuffle.slice(0, ALPHABET_SIZE)
target_phrase = PHRASE_LENGTH.times.map {char_set.sample}

Added comments to ensure the programmer was aware that he wasn’t doing the programming and therefore had no control over anything the program actually does.
CharlieM on March 1, 2016 at 8:12 pm said:

Allan Miller:
CharlieM,

I am simply offering my opinion, as a consumer of your prose, that your analogies serve more to confuse – not just your reader, but you.

There is a tension between your desire to communicate and anyone else’s willingness to put the effort in to comprehend. When someone says “it’s like …”, and it’s not like that at all, much time can be wasted talking about something which isn’t in fact the thing being discussed. But it’s your dollar.

Well whoever puts in the effort to come to the conclusion that “it’s not like that at all”, could perhaps then make use of their effort by explaining to me why it’s not like that at all. That way we all learn from the error of my ways.
CharlieM on March 1, 2016 at 8:22 pm said:

Allan Miller:
CharlieM,

So you think ALL evolution by natural selection is ‘directed’ evolution, with environment as director. So all GAs with a selection routine are directed in that sense. So they are implementing Natural Selection. So why are we arguing?

Well would you argue with my belief that ALL evolution by natural selection is very similar to the evolution of new breeds of animal by artificial selection? In fact artificial selection can serve as a model for natural selection. They are both limited in that they alter the character of the populations being changed but these populations still remain within the type. The only difference being natural selection is more subtle than artificial selection.
Flint on March 1, 2016 at 8:48 pm said:

CharlieM: Well would you argue with my belief that ALL evolution by natural selection is very similar to the evolution of new breeds of animal by artificial selection?

I think your belief is absolutely correct. As far as the animals’ breeding success is concerned, you ARE the environment. And your selection coefficient is very very high.

In fact artificial selection can serve as a model for natural selection. They are both limited in that they alter the character of the populations being changed but these populations still remain within the type. The only difference being natural selection is more subtle than artificial selection.

I wouldn’t use the word “subtle” here. Natural selection is different only in the sense that the selection coefficient is much smaller, so phenotypes change much more slowly.

As for “remaining within type”, I don’t know what this means. Yes, all offspring of beetles will be beetles. All offspring of mammals will be mammals. But if you mean dogs and cats are different “types”, you have to explain what differences you consider important and why. There was a time in the past when dogs and cats had a common ancestor. Offspring from that ancestor were the same “type” (and still are) unless and until we rather arbitrarily say they aren’t.
Flint on March 1, 2016 at 8:52 pm said:

Mung: Added comments to ensure the programmer was aware that he wasn’t doing the programming and therefore had no control over anything the program actually does.

Your struggles to deliberately misunderstand and misinterpret get comical after a while. The programmer controls what the program does, but this does NOT mean the programmer knows what the result will be. If he knew, why write a program at all? Your “argument” is like saying a pocket calculator should come with a warning that whoever calculates with it has no control over what it does, simply because that person doesn’t know the result before the calculation is performed.
Mung on March 1, 2016 at 10:24 pm said:

Flint: The programmer controls what the program does, but this does NOT mean the programmer knows what the result will be. If he knew, why write a program at all?

Do you think Richard Dawkins had some doubt in his mind about whether his Weasel program would find the target phrase? Yet he wrote the program anyways.

Allan suggested writing a program where he knew the outcome would be all Zs.

People hounded me to write my own Weasel program. I did so, even when I knew what the outcome would be. Then they laughed at it because I exposed the Weasel for what it was.

People write programs all the time knowing what the outcome will be.

https://rosettacode.org/wiki/Evolutionary_algorithm

Funny how people seem to think Weasel is the canonical GA.

Also, you don’t seem to know it, but you are taking my side. You are essentially making the same claim I was making. The programmer may not know the specific outcome in advance, but that doesn’t mean the programmer doesn’t control what the program does, including finding the target.
Mung on March 1, 2016 at 10:25 pm said:

While the parent is not yet the target…

LoL.
Allan Miller on March 1, 2016 at 10:36 pm said:

CharlieM,

Well whoever puts in the effort to come to the conclusion that “it’s not like that at all”, could perhaps then make use of their effort by explaining to me why it’s not like that at all. That way we all learn from the error of my ways.

That is rather the point. People who persist in engineering or programming analogies for chemistry are unlikely to be persuaded that it is not, in fact, engineering or programming, no matter what I say. Better, in my opinion, to study the systems themselves, than to be sidetracked by what they are a bit like.

But persuading you to share that opinion is beyond my rhetorical skill.
Allan Miller on March 1, 2016 at 10:48 pm said:

CharlieM,

Well would you argue with my belief that ALL evolution by natural selection is very similar to the evolution of new breeds of animal by artificial selection? In fact artificial selection can serve as a model for natural selection.

Well … yeah. I presume you are familiar with the work of one C. Darwin in that regard? But, as I say, AS is actually a subset of NS. NS is all changes in allele frequency that result from a consistent correlation between genotype (via phenotype) and survival/breeding. That includes traits which appeal to breeders. It is not separate from them.

They are both limited in that they alter the character of the populations being changed but these populations still remain within the type.

The limitation is only in the supply of new mutations in a given time period. Dogs aren’t throwing up new species because, inter alia, their effective population sizes are very small, due in no small part to strong directional selection. Smaller effective population sizes mean fewer new mutations. They aren’t a particularly good model for wild populations and long-term (million-year) time frames. There is no limitation inherent in NS (or its stochastic counterpart, drift). Mutation can move indefinitely through genotype space without ever being contained by a ‘type’. Or at least, no-one has ever been able to point to a reflective barrier or other mechanism that prevents indefinite modification, for any species. The hoary old ‘they are still dogs’ argument does not really fly. Was the ancestor from a million years ago a dog? Two million? Ten? A billion?
Allan Miller on March 1, 2016 at 11:00 pm said:

Mung,

Allan suggested writing a program where he knew the outcome would be all Zs.

No, I suggested writing a program using the bitsum as a selection criterion. It so happens that I know that the maximum bitsum is all Z’s. I can compute the genotype of the maximally fit genome. That does not make all Z’s the ‘target’ of this program. It is simply the sole occupant of the fitness peak (in that example).

How about a program where the fittest genome is a 28-letter string of bitsum 1891, and closer approaches to this, in either direction, are more likely to survive? That happens to be the bitsum of Weasel***, but many phenotypes/genotypes will satisfy that. What’s the target now? I have no idea what the winning string will be (though I do know it will have a bitsum of 1891, because I can do some analysis).

Not all finishing states can be anticipated by analysis. So is it the existence of a computable ~~finishing~~ static state that makes a GA directed? That would seem a bit stupid – I might not even be clever enough to work out that all Z’s is the fittest genotype. Is it finishing states as computed by rilly clever people that makes a GA directed?

Or is the ‘known finishing state’ issue (also known as ‘target’) not something of a hareng rouge?

*** eta – 1891 is the hex sum, not the bit sum.
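The 1891 figure can be checked directly. The short Ruby sketch below treats it as the sum of the ASCII character codes of the Weasel phrase and, separately, counts the set bits; the helper names `code_sum` and `set_bits` are made up for illustration.

```ruby
PHRASE = "METHINKS IT IS LIKE A WEASEL"

# Sum of the character codes (what the thread goes on to call a plain sum).
def code_sum(str)
  str.bytes.sum
end

# Total number of 1 bits across all characters (a literal "bit sum").
def set_bits(str)
  str.bytes.sum { |b| b.to_s(2).count("1") }
end

puts "length:   #{PHRASE.length}"    # 28
puts "code sum: #{code_sum(PHRASE)}" # 1891
puts "set bits: #{set_bits(PHRASE)}"

# Any rearrangement of the same characters has the same code sum,
# so many different strings sit at 1891.
puts code_sum(PHRASE.reverse)        # 1891
```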
keiths on March 1, 2016 at 11:01 pm said:

Mung,

People hounded me to write my own Weasel program.

Oh, those nasty people! They urged you to code? How awful that must have been!

I did so, even when I knew what the outcome would be. Then they laughed at it because I exposed the Weasel for what it was.

No, we laughed at it because it exposed you:

Mung,

The code you provided confirms that you don’t understand Weasel or cumulative selection at all.

For the benefit of folks who don’t read Ruby, here’s what Mung’s program does:

1. Create a single genotype and initialize it randomly from the set of all unique characters contained in “METHINKS IT IS LIKE A WEASEL”.

2. Start with the first character in the random genotype. Keep mutating that character and only that character until it matches the “M” in “METHINKS”. Never change it again.

3. Repeat step 2 for the second character, then the third character, and so on until the end of the genotype is reached, at which point the entire genotype will match the target phrase.

That’s it, believe it or not.
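For readers who want to see the three steps in code, here is a rough Ruby sketch of the procedure as described in this comment. It is a reconstruction from the description, not Mung's actual program; the name `locked_position_search` and the mutation counter are invented for illustration.

```ruby
TARGET = "METHINKS IT IS LIKE A WEASEL"
CHARS = TARGET.chars.uniq   # only the characters that occur in the target

# Step 1: one random genotype drawn from CHARS.
# Steps 2 and 3: fix each position in turn, left to right, and never revisit it.
def locked_position_search(target, chars, rng)
  genotype = Array.new(target.length) { chars.sample(random: rng) }
  mutations = 0
  target.each_char.with_index do |wanted, i|
    until genotype[i] == wanted
      genotype[i] = chars.sample(random: rng)   # mutate this position only
      mutations += 1
    end
  end
  [genotype.join, mutations]
end

result, mutations = locked_position_search(TARGET, CHARS, Random.new(3))
puts result
puts "mutations used: #{mutations}"
```

Note that there is no population and no comparison between competing genotypes here: each position is compared against the target directly, which is the difference from cumulative selection that the comment is pointing at.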

Jesus, Mung, no wonder you thought that Weasel would have trouble with changing targets. Your program certainly would.

At least you copied the target phrase correctly.

Why not study my code so that you’ll understand what the rest of us are talking about? Or Google “Dawkins weasel” and read some of the informal descriptions of the algorithm?
Flint on March 1, 2016 at 11:51 pm said:

Mung: Do you think Richard Dawkins had some doubt in his mind about whether his Weasel program would find the target phrase? Yet he wrote the program anyways.

People write programs all the time knowing what the outcome will be.

Funny how people seem to think Weasel is the canonical GA.

Dawkins’ program was not intended to be a GA of any sort. It was a pedagogical exercise to demonstrate the power of selection. And you seem to be equivocating about the word “outcome”. If I write a program to add a long column of large numbers, I know the outcome will be the sum. I don’t know what the sum will be. So which outcome are you talking about?

Also, you don’t seem to know it, but you are taking my side. You are essentially making the same claim I was making. The programmer may not know the specific outcome in advance, but that doesn’t mean the programmer doesn’t control what the program does, including finding the target.

Your posts are generally ambiguous in this way. Yes, the programmer knows what the program does and knows how it does so. Again, you seem to be equivocating between the “target” as a specific result, and the “target” as a description of the sort of result expected. Biologists know that evolution will produce organisms, in endless variety, all suited to some niche where they can survive. If this is the “target”, then yes evolution has a target. But nobody can predict specifically what those organisms will be in full detail, or what those niches will be either.
Flint on March 2, 2016 at 12:00 am said:

Allan Miller:
No, I suggested writing a program using the bitsum as a selection criterion. It so happens that I know that the maximum bitsum is all Z’s.

What do you mean by “bitsum”? If we are talking about the ASCII characters A..Z in uppercase, then you will converge on either ‘O’ or ‘W’, both of which have 5 set bits (hex values 4F or 57). ASCII ‘Z’ has only 4 set bits (hex 5A).

If we are talking about the entire range of ASCII codes, then you will converge on ‘DEL’ (value 7F).
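The set-bit counts quoted here are easy to verify. A small Ruby sketch, with `popcount` and `max_popcount_chars` as illustrative helper names:

```ruby
# Number of 1 bits in an integer character code.
def popcount(code)
  code.to_s(2).count("1")
end

# All codes in the list that share the highest set-bit count.
def max_popcount_chars(codes)
  best = codes.map { |c| popcount(c) }.max
  codes.select { |c| popcount(c) == best }
end

upper = ("A".."Z").map(&:ord)
puts max_popcount_chars(upper).map(&:chr).inspect   # ["O", "W"]
puts popcount("Z".ord)                               # 4
puts max_popcount_chars((0..127).to_a).inspect       # [127], i.e. DEL (0x7F)
```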
keiths on March 2, 2016 at 12:04 am said:

I think he meant something more like a checksum, in which case all Z’s would in fact maximize it.
Flint on March 2, 2016 at 12:14 am said:

keiths:
I think he meant something more like a checksum, in which case all Z’s would in fact maximize it.

Can you explain? That would not resemble any checksum I’m familiar with. A straight checksum is calculated by adding together all the ASCII values in a sequence, ignoring carries. There is certainly no guarantee that the result will be anything close to a maximum number of set bits.

A CRC is more complex but still has nothing to do with maximizing set bits.

(And what is his initial population? If it is the characters in ‘WEASEL’ there is no ‘Z’ in the alphabet. If it’s the entire ASCII range, his method would produce a different result. Or does he intend the full 00-FF range of extended ASCII?)
keiths on March 2, 2016 at 12:29 am said:

Flint,

Can you explain? That would not resemble any checksum I’m familiar with. A straight checksum is calculated by adding together all the ASCII values in a sequence, ignoring carries. There is certainly no guarantee that the result will be anything close to a maximum number of set bits.

That’s right. I’m just saying that if he were maintaining a 2-byte checksum for a 28-byte character string formed from the Weasel character set (all upper-case alphabetic characters plus the space character), then the checksum would be at its maximum for a string containing all Z’s.
Flint on March 2, 2016 at 12:40 am said:

keiths:
Flint,

That’s right. I’m just saying that if he were maintaining a 2-byte checksum for a 28-byte character string formed from the Weasel character set (all upper-case alphabetic characters plus the space character), then the checksum would be at its maximum for a string containing all Z’s.

This is very misleading. IF we stipulate that the value of 5A hex (90 decimal) is the largest permitted value, this will produce a 2-byte maximum value as the checksum ONLY because there is no overflow ((28 * 90) < 65536). But then, it's not a checksum, it's only a sum. Why not say "maximize the numerical sum"?

I THINK he's saying that if one were to sum a column of N numerical values, the maximum such sum would be reached when every number in the column was the highest numerical value permitted.
Mung on March 2, 2016 at 12:47 am said:

Allan Miller: That does not make all Z’s the ‘target’ of this program. It is simply the sole occupant of the fitness peak (in that example).

One fitness peak. One occupant. Even if we draw a circle around it after you hit it it would still be a target.
keiths on March 2, 2016 at 12:47 am said:

This is very misleading. IF we stipulate that the value of 5A hex (90 decimal) is the largest permitted value, this will produce a 2-byte maximum value as the checksum ONLY because there is no overflow ((28 * 90) < 65536). But then, it's not a checksum, it's only a sum.

There’s no rule requiring checksums to overflow, Flint. What makes a checksum a checksum is the fact that it is used to check the integrity of the associated data.
Mung on March 2, 2016 at 12:50 am said:

Flint: Dawkins’ program was not intended to be a GA of any sort.

Seriously?

https://rosettacode.org/wiki/Evolutionary_algorithm
Flint on March 2, 2016 at 12:57 am said:

keiths: There’s no rule requiring checksums to overflow, Flint. What makes a checksum a checksum is the fact that it is used to check the integrity of the associated data.

Sigh. If there is no overflow, then a simple sum and a “checksum” are identical. If there WERE overflow in this example, then a sum would STILL be what produced the string of Zs, and a checksum would not.

(And a simple sum is just as good an integrity check as a simple checksum, except it might require more bytes to implement).
Flint on March 2, 2016 at 1:02 am said:

Mung: Seriously?

https://rosettacode.org/wiki/Evolutionary_algorithm

Yes, seriously. A true genetic algorithm does not have a pre-specified target. It has a “target region” AKA a description of the sort of target desired. If the WEASEL program were modified to converge on, say, any grammatical sentence that includes the maximum number of the initial population of letters, it would be much closer to a GA.

But in my book, if the target is pre-specified in full detail, nothing is actually being searched or solved. Dawkins’ sole purpose was to say “hey, look how enormously much faster we reach the target if selection is used.” And Dawkins SAID that his purpose wasn’t to model evolution, only to show that pure random chance was a misunderstanding, probably deliberate.
Mung on March 2, 2016 at 1:03 am said:

Flint: If it is the characters in ‘WEASEL’ there is no ‘Z’ in the alphabet.

Don’t say that in front of keiths. He gets very upset when you point out that the size of the search space can be significantly reduced just by taking out the characters not in the target phrase and by only generating candidate solutions with the actual relevant letters. I thought that was a nice touch that my Weasel program added.

All it takes to make a Weasel phrase:

[" ", "A", "E", "H", "I", "K", "L", "M", "N", "S", "T", "W"]

🙂
Flint on March 2, 2016 at 1:08 am said:

Mung: Don’t say that in front of keiths. He gets very upset when you point out that the size of the search space can be significantly reduced just by taking out the characters not in the target phrase and by only generating candidate solutions with the actual relevant letters. I thought that was a nice touch that my Weasel program added.

All it takes to make a Weasel phrase:

[" ", "A", "E", "H", "I", "K", "L", "M", "N", "S", "T", "W"]

I have a question for you (and I don’t know the answer!). Let’s say we keep increasing the initial alphabet, one letter (not in the weasel phrase) at a time. Will the additional time required be more-or-less linear (x iterations per added letter), or will it be something like log e?
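One way to explore this question is to measure it. The Ruby sketch below is a hypothetical experiment, not anyone's posted program: it runs a simple Weasel-style loop (100 offspring, 5% per-character mutation, parent kept in the pool, all assumed values) and reports how many generations it takes as characters outside the phrase are added to the alphabet. No result is claimed here; the numbers depend on the parameters and on the random seed.

```ruby
TARGET = "METHINKS IT IS LIKE A WEASEL"
REQUIRED = TARGET.chars.uniq                  # the 12 characters the phrase needs
EXTRAS = (33..126).map(&:chr) - REQUIRED      # printable characters not in the phrase

# Generations needed to reach TARGET, or nil if `limit` is exhausted first.
def generations_to_target(alphabet, rng, offspring: 100, rate: 0.05, limit: 100_000)
  target = TARGET.chars
  score = ->(g) { g.zip(target).count { |a, b| a == b } }   # matching positions
  parent = Array.new(target.length) { alphabet.sample(random: rng) }
  (0..limit).each do |generation|
    return generation if parent == target
    brood = Array.new(offspring) do
      parent.map { |c| rng.rand < rate ? alphabet.sample(random: rng) : c }
    end
    brood << parent                                         # parent competes too
    parent = brood.max_by { |g| score.call(g) }
  end
  nil
end

rng = Random.new(42)
[0, 15, 40, 80].each do |extra|
  alphabet = REQUIRED + EXTRAS.first(extra)
  runs = Array.new(3) { generations_to_target(alphabet, rng) }
  puts "alphabet size #{alphabet.size}: #{runs.inspect}"
end
```

Plotting generations against alphabet size over many runs would show whether the growth looks linear or logarithmic for a given set of parameters.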
Mung on March 2, 2016 at 1:15 am said:

Flint: A true genetic algorithm does not have a pre-specified target. It has a “target region” AKA a description of the sort of target desired.

Don’t tell keiths. He thinks his Weasel program is a real GA.

So a “real GA” has a “target region.” So does the Weasel program.

Within that “target region” how many targets must there be for it to be a “real GA”?

Frankly, there’s not much sense in using a GA if there’s only a single target region. A point my Weasel program made quite explicit. But that was lost on keiths. For you see, for keiths, and others like him, Weasel must be a GA else it doesn’t serve much of any useful purpose at all.

So here, again, you are actually arguing on my side. LoL.
keiths on March 2, 2016 at 1:24 am said:

Flint,

I think the only thing we’re actually disagreeing about is your claim that if there’s no overflow, it isn’t a checksum.

If the sum is used to confirm the integrity of the data, then it’s a checksum. Whether it overflows is irrelevant. A 0x0000 checksum at the end of a block containing all zeros is still a checksum.

Even in a context where an overflow isn’t possible for any data pattern, the sum is still a checksum.

What makes it a checksum is not the fact that it overflows. It’s the fact that it can be used to check the integrity of the data.
Flint on March 2, 2016 at 1:34 am said:

keiths:
Flint,

I think the only thing we’re actually disagreeing about is your claim that if there’s no overflow, it isn’t a checksum.

Not quite. I will try once more. A checksum MAY overflow, a simple sum may not. If the string CAN overflow the bucket, it MAY NOT produce all Zs, whereas a simple sum ALWAYS WILL. And that’s the difference. In the general application, it may not be known in advance whether overflow will happen. If a checksum is implemented, then the result in such cases is not predictable. If a sum is implemented, the result is fully defined.

In my experience, it is Very Bad Programming to ASSUME that boundary conditions will never be exceeded. Sooner or later, someone will use that code with a data set that exceeds the boundary condition, never realizing such a condition is built in.

NOTE that I’m not saying that a sum cannot be used as an integrity check. Of course it can – and at the margin it even does a better job.

(However, both a simple sum and a checksum aren’t usually used for integrity checks because it’s too easy for errors to cancel and offset. So a CRC is preferred. Which is beside the point of the convergence.)
keiths on March 2, 2016 at 1:35 am said:

Mung,

He gets very upset when you point out that the size of the search space can be significantly reduced just by taking out the characters not in the target phrase and by only generating candidate solutions with the actual relevant letters.

That’s odd. I don’t recall getting very upset about that. You wouldn’t be making stuff up again, would you?

I remember laughing about it, because it was a classic and klutzy Mung reversal.

You had just demanded to know why I restricted the offspring to the characters in the target phrase. When I informed you that there was no such restriction, you demanded to know why there wasn’t!

Typical confused Mungishness.
Flint on March 2, 2016 at 1:42 am said:

Mung: Don’t tell keiths. He thinks his Weasel program is a real GA.

So a “real GA” has a “target region.” So does the Weasel program.

No. What I was trying to say (you are goddamn hard to talk to) is that a “target region” BY DEFINITION includes some fairly large number of possible solutions. In Dawkins’ weasel program, there is only one, single, possible solution. Please note that you carefully omitted my definition of a “target region”, and THEN you redefined it as something else!!

Within that “target region” how many targets must there be for it to be a “real GA”?

Enough so that the programmer has essentially no chance of predicting the specific solution generated.

Frankly, there’s not much sense in using a GA if there’s only a single target region. A point my Weasel program made quite explicit. But that was lost on keiths. For you see, for keiths, and others like him, Weasel must be a GA else it doesn’t serve much of any useful purpose at all.

Of course, Dawkins wasn’t trying to produce his target sentence so he could be surprised by it. He was showing that it could be produced within a short time using selection IN ADDITION to random mutation. And that’s why I don’t consider it a GA. But my notion may be idiosyncratic.

Weasel DID INDEED serve a useful purpose – to show the power of selection. NOT to produce fascinating new words or sentences.
keiths on March 2, 2016 at 1:46 am said:

Flint,

Here’s what you wrote:

IF we stipulate that the value of 5A hex (90 decimal) is the largest permitted value, this will produce a 2-byte maximum value as the checksum ONLY because there is no overflow ((28 * 90) < 65536). But then, it’s not a checksum, it’s only a sum.

Your last sentence is false. Even if there’s no possibility of overflow, it’s still a checksum.

Many checksums overflow. That doesn’t mean that they have to overflow in order to qualify as checksums.
Flint on March 2, 2016 at 1:49 am said:

keiths:
Flint,

Here’s what you wrote:

Your last sentence is false. Even if there’s no possibility of overflow, it’s still a checksum.

Many checksums overflow. That doesn’t mean that they have to overflow in order to qualify as checksums.

OK, if you wish to use “sum” and “checksum” interchangeably, there’s nothing I can say to you. I pointed out that there IS IN FACT a potentially meaningful difference in the implementations, but this doesn’t seem to penetrate. I pointed out that IF there is an overflow, then the prediction of all Zs will fail for that very reason. You ignored that too. I give up.

A SUM was the intended implementation. In this case, a 2-byte (but not a 1-byte) checksum works out exactly like a sum. But that is a coincidence.
keiths on March 2, 2016 at 1:54 am said:

Flint,

OK, if you wish to use “sum” and “checksum” interchangeably, there’s nothing I can say to you.

I’m not using them interchangeably. Many sums are not checksums.

I’m simply pointing out that a sum needn’t overflow in order to qualify as a checksum. That is where we disagree.
Flint on March 2, 2016 at 1:59 am said:

keiths:
Flint,

I’m not using them interchangeably. Many sums are not checksums.

I’m simply pointing out that a sum needn’t overflow in order to qualify as a checksum. That is where we disagree.

No, we don’t disagree. If it just so happens that a checksum does NOT overflow, then in effect it is a simple sum. Which also works as an integrity check.

So I will say this one more time. IF there is no overflow, the result will be a string of all Z characters in this case. If there IS an overflow, then a SUM will STILL produce a string of all Zs, and a checksum will not.

In this case, the target was a string of all Zs. Using an incorrect implementation that JUST HAPPENS to produce the correct result in this case, doesn’t make that implementation correct. Sheesh.
Mung on March 2, 2016 at 2:06 am said:

Flint: I pointed out that IF there is an overflow, then the prediction of all Zs will fail for that very reason.

That’s what Allan gets for not actually getting in and coding it.
keiths on March 2, 2016 at 2:12 am said:

Flint,

I think you would agree that this is simple and obvious stuff. It would be very surprising if we disagreed on the technical details, which are straightforward.

Our disagreement is semantic. You claim that overflow is an essential characteristic of a checksum:

IF we stipulate that the value of 5A hex (90 decimal) is the largest permitted value, this will produce a 2-byte maximum value as the checksum ONLY because there is no overflow ((28 * 90) < 65536). But then, it’s not a checksum, it’s only a sum. [Emphasis added]

I disagree, because I think it’s obvious that a checksum is still a checksum even if it fails to overflow.

If you retract the bolded sentence, then I think we are back in agreement.

Sheesh, indeed.
Flint on March 2, 2016 at 2:19 am said:

keiths:
Our disagreement is semantic. You claim that overflow is an essential characteristic of a checksum:

OK, I think you have pinpointed the disagreement. For me, if there is NO POSSIBILITY of overflow, this is different from if there IS a possibility of overflow. EVEN IF in a particular instance, no overflow occurs. I don’t claim overflow is an essential characteristic of a checksum, I claim that the POTENTIAL for overflow is essential, and is the sole distinction between a sum and a checksum.

In the particular case we’re looking at, a sum guarantees the predicted result NO MATTER WHAT. A checksum mimics the sum so long as there is no overflow.

The target in this case REQUIRES that no overflow CAN EVER take place. I think this distinction is important. You apparently do not.
Flint on March 2, 2016 at 2:27 am said:

Flint: Read what keiths and I have been saying. If Allan had coded it using the ASCII (NOT EBCDIC) range A..Z, and used a 2-byte checksum, he’d get a string of all Zs because his string is too short for the difference between a sum and a 2-byte checksum to interfere. If he had used a 1-byte checksum, however, he would NOT have produced a string of all Z characters. In any case, Allan didn’t talk about either a sum or a checksum, he talked about a “bitsum”, which would produce something very different from his predicted string of Z characters.
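
For readers following the arithmetic, here is a minimal sketch of the three quantities under discussion. It assumes an additive checksum is simply the sum truncated to a fixed number of bytes. The function names and the "rival" string are made up for illustration and do not come from anyone's actual Weasel code.

```python
def plain_sum(text: str) -> int:
    """True arithmetic total of the ASCII codes (Python ints never overflow)."""
    return sum(text.encode("ascii"))


def checksum(text: str, width_bytes: int) -> int:
    """Additive checksum: the same total, truncated to width_bytes (carries dropped)."""
    return plain_sum(text) % (1 << (8 * width_bytes))


all_z = "Z" * 28
# Hypothetical 28-letter competitor, hand-picked so its total is 255 modulo 256.
rival = "A" * 8 + "I" + "Z" * 19

for name, s in (("all Z", all_z), ("rival", rival)):
    # Columns: plain sum, 2-byte checksum, 1-byte checksum
    print(name, plain_sum(s), checksum(s, 2), checksum(s, 1))
```

With 28 letters the plain sum and the 2-byte checksum agree for both strings, so all Zs wins under either. Under the 1-byte checksum the all-Z string scores 216 while the rival scores 255, so selection on that value would not settle on all Zs.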
keiths on March 2, 2016 at 2:33 am said:

Flint,

OK, I think you have pinpointed the disagreement. For me, if there is NO POSSIBILITY of overflow, this is different from if there IS a possibility of overflow.

No, that isn’t the disagreement. Of course there’s a difference between a possibility of overflow and no possibility of overflow.

The disagreement is over the meaning of “checksum”. To me, a sum qualifies as a checksum if it can be used to check the integrity of the accompanying data. To you, it’s only a checksum if there is a possibility of overflow.

I think that’s silly. A sum that performs the function of a checksum is a checksum. Overflow is not essential to that function. The check works either way.

Overflow occurs in many checksums, but it is not an essential characteristic.
Flint on March 2, 2016 at 2:53 am said:

keiths:
Flint,

No, that isn’t the disagreement. Of course there’s a difference between a possibility of overflow and no possibility of overflow.

The disagreement is over the meaning of “checksum”. To me, a sum qualifies as a checksum if it can be used to check the integrity of the accompanying data. To you, it’s only a checksum if there is a possibility of overflow.

I think that’s silly. A sum that performs the function of a checksum is a checksum. Overflow is not essential to that function. The check works either way.

Overflow occurs in many checksums, but it is not an essential characteristic.

I’m smelling doubletalk here. You agree the only difference between a sum and a checksum is overflow. You agree that if there is overflow, these produce different results. You agree that a sum and a checksum are not the same thing.

And then you post, over and over, that they ARE the same thing in some cases, which are those cases where there is no overflow!

So I’m going to sign off here with my definition: a checksum CAN overflow, a sum can NOT overflow. Whether or not they actually do in some implementation is irrelevant. If the possibility of overflow IS the difference between a sum and a checksum, and if that possibility is the ONLY difference, then BY YOUR OWN DEFINITION overflow is THE ONE AND ONLY essential characteristic of a checksum.

(But I’m going to guess that you are using the word “checksum” to indicate any and every possible way of validating the integrity of a data set – it might be a sum, or a CRC, or a bitsum, it might be XORing together all the bytes, it might be an encryption where there is (or maybe isn’t?) a decryption key or password, etc. If this is your definition of a checksum, you aren’t making it clear.)
keiths on March 2, 2016 at 3:12 am said:

Slow down, Flint. You’re confusing what you want me to have said with what I actually have said.

I’m smelling doubletalk here. You agree the only difference between a sum and a checksum is overflow.

No, I don’t. Many checksums can overflow, but that is not an essential characteristic.

You agree that if there is overflow, these produce different results.

Most of the time, yes.

You agree that a sum and a checksum are not the same thing.

I certainly agree that many sums are not checksums. Checksums are sums, however.

And then you post, over and over, that they ARE the same thing in some cases, which are those cases where there is no overflow!

Huh? All checksums (of the kind we’re talking about) are sums. That’s why the word ‘sum’ appears at the end of ‘checksum’. Whether they happen to overflow is a separate question.

If the possibility of overflow IS the difference between a sum and a checksum…

No.

…and if that possibility is the ONLY difference…

No.

…then BY YOUR OWN DEFINITION overflow is THE ONE AND ONLY essential characteristic of a checksum.

No, because your antecedents are incorrect.

I think the discussion will go smoother if you’ll stick to responding to my actual words rather than to your paraphrases of them.
Flint on March 2, 2016 at 3:31 am said:

keiths:
Slow down, Flint. You’re confusing what you want me to have said with what I actually have said.

No, I don’t. Many checksums can overflow, but that is not an essential characteristic.

Most of the time, yes.

I certainly agree that many sums are not checksums. Checksums are sums, however.

Huh? All checksums (of the kind we’re talking about) are sums. That’s why the word ‘sum’ appears at the end of ‘checksum’. Whether they happen to overflow is a separate question.

No.

No.

No, because your antecedents are incorrect.

I think the discussion will go smoother if you’ll stick to responding to my actual words rather than to your paraphrases of them.

I am doing my best to guess what you mean. I’m guessing wrong every time, so this is like talking to Mung.

So I guess I need to ask. What exactly do you see as the difference between a checksum (which may overflow) and a sum (which may not overflow) if the difference is not the possibility of overflow?
keiths on March 2, 2016 at 3:49 am said:

Flint,

I am doing my best to guess what you mean. I’m guessing wrong every time, so this is like talking to Mung.

Why guess when you can reread my own words? I’m not being cryptic — you just need to resist the temptation to project your own desired meanings into what I’m saying.

So I guess I need to ask. What exactly do you see as the difference between a checksum (which may overflow) and a sum (which may not overflow) if the difference is not the possibility of overflow?

As I just explained, it is not true that sums cannot overflow. Checksums — including the ones that can overflow — are sums. Please note the last three letters in the word ‘checksum’.

And as I also just explained:

The disagreement is over the meaning of “checksum”. To me, a sum qualifies as a checksum if it can be used to check the integrity of the accompanying data. To you, it’s only a checksum if there is a possibility of overflow.

Are you actually reading my comments, or are you just skimming them?
Flint on March 2, 2016 at 4:07 am said:

keiths: Are you actually reading my comments, or are you just skimming them?

I am equally convinced that you have simply ignored all the comments I’ve made that don’t fit what you wish to hear.

So let me try a different approach. The goal is for the program to converge on a list of all Z characters. The character set is ASCII A..Z. The Z character has the highest numerical value. In order for the string of all Zs to occur, what we wish to do is SUM all of the characters. A list of all Zs maximizes the SUM.
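
A minimal sketch of the kind of program being described, assuming a single parent, 100 single-letter mutants per generation, and selection on the plain sum of ASCII codes. The population size, generation count, and all names here are illustrative choices, not anyone's actual implementation. Note that the fitness function never mentions a target string.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def fitness(candidate: str) -> int:
    """Plain sum of the ASCII codes; no target string is consulted."""
    return sum(candidate.encode("ascii"))


def mutate(parent: str, rng: random.Random) -> str:
    """Copy the parent with one randomly chosen position replaced by a random letter."""
    pos = rng.randrange(len(parent))
    return parent[:pos] + rng.choice(ALPHABET) + parent[pos + 1:]


def evolve(length: int = 28, offspring: int = 100,
           generations: int = 2000, seed: int = 0) -> str:
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in range(length))
    for _ in range(generations):
        brood = [mutate(parent, rng) for _ in range(offspring)]
        # Keep the fittest of the brood and the parent, so fitness never decreases.
        parent = max(brood + [parent], key=fitness)
    return parent


print(evolve())
```

Because Python integers do not overflow, `fitness` here is a true sum in the sense used above, and the run settles on a string of all Zs.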

Now, a 1-byte checksum will NOT produce the target list if there are more than two characters in the list, due to overflow. A 2-byte checksum will NOT produce the desired list if the number of characters becomes larger than 728 characters, again due to overflow. THEREFORE, for our program to produce the correct result for a list of ANY size, a checksum is the wrong implementation. Only a SUM will hit the target.
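
The length at which a fixed-width checksum can first wrap follows directly from the largest per-character value. The sketch below assumes an unsigned accumulator, and the helper name is hypothetical. It is evaluated for both 90 (Z in decimal, 0x5A) and 132 (the same code written in octal), since the threshold depends on which figure is plugged in.

```python
def max_safe_length(width_bytes: int, max_char_value: int) -> int:
    """Longest string whose worst-case sum still fits in width_bytes unsigned bytes."""
    largest = (1 << (8 * width_bytes)) - 1
    return largest // max_char_value


for width in (1, 2):
    # Columns: accumulator width in bytes, limit for value 90, limit for value 132
    print(width, max_safe_length(width, 90), max_safe_length(width, 132))
```

With 90 as the maximum, a 1-byte accumulator is safe for up to 2 characters and a 2-byte accumulator for up to 728. With 132 the limits are 1 and 496.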

(And I don’t know what you mean by “it is not true that sums cannot overflow.” A sum is a numerical total of all the elements being added together. It cannot overflow by definition. If the programmer intends a numerical sum and does not guard against overflow, he will produce the WRONG ANSWER. To me, getting the answer wrong makes a difference. It would make NASA very unhappy as well.)

So on to the implementation phase. Take a bucket N bits wide, and start adding values to it. Now, what happens when there is a carry? Well, you can implement your sum using more bytes (so as not to lose data), or you can ignore the carry to keep your bucket small. But these are different implementations, and can produce different results.

When you say that sums can overflow, that checksums might not be able to overflow, that checksums are sums, but that they are different, this is doubletalk. Sums by definition are sums – they are actual numerical totals. There can be no overflow, or else the total is WRONG. I don’t know how to make this any plainer.
petrushka on March 2, 2016 at 4:42 am said:

I can’t think of a likely implementation of weasel where the sum of the bits in the population string would overflow any likely data type.
keiths on March 2, 2016 at 6:44 am said:

Flint,

As I said earlier, I doubt very much that we are going to disagree on the technical details of deploying checksums. They’re too obvious and straightforward.

Yes, of course a one-byte checksum would not be maximized by a 28-letter ASCII sequence of all Z’s. It’s blindingly obvious. That’s why I specified a two-byte checksum right at the beginning!

The disagreements I see (and there are now two of them) are semantic:

1. You claim that a checksum is not really a checksum unless there is a possibility of overflow. That’s silly, for reasons I’ve already given.

2. You now claim that a sum isn’t a sum if there is overflow. That’s also silly.

The word ‘sum’ is routinely used to refer to the results of an addition even when there is overflow. As a programmer, I’m surprised you don’t know this.

[And just to be clear, I’m using “overflow” in the less formal sense that encompasses both signed and unsigned arithmetic.]

Examples from the Web:

The rules for detecting overflow in a two’s complement sum are simple:

If the sum of two positive numbers yields a negative result, the sum has overflowed.

If the sum of two negative numbers yields a positive result, the sum has overflowed.

Otherwise, the sum has not overflowed.

And:

In each case, we compute the sum, and note if there was an overflow.

And:

The half adder adds two input bits and generates a carry and sum…

And:

If we allow the program to add a two bit number and remember the carry for later, then use that carry in the next sum the program can add any number of bits the user wants even though we have only provided a two-bit adder.

And:

If the sum of two numbers with the sign bits off yields a result number with the sign bit on, the “overflow” flag is turned on.

Need I go on?

Like it or not, a checksum is a sum, even when there is overflow.
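
The quoted rules can be tried out directly. This is a rough sketch that simulates 8-bit two's complement addition and a chained 2-bit adder in Python. The widths and function names are assumptions for illustration only. In both cases the truncated result is what the quoted sources call the "sum".

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(raw: int) -> int:
    """Interpret an 8-bit pattern as a two's complement value."""
    return raw - (1 << BITS) if raw & SIGN else raw


def add_signed(a: int, b: int):
    """Add two 8-bit signed values; return (wrapped sum, overflowed?)."""
    result = to_signed((a + b) & MASK)
    # Overflow: two non-negatives give a negative, or two negatives give a non-negative.
    overflowed = (a >= 0 and b >= 0 and result < 0) or (a < 0 and b < 0 and result >= 0)
    return result, overflowed


def add_chained(a: int, b: int, chunk_bits: int = 2, chunks: int = 4):
    """Add unsigned values with a chunk_bits-wide adder, passing the carry along."""
    mask = (1 << chunk_bits) - 1
    total, carry = 0, 0
    for i in range(chunks):
        shift = i * chunk_bits
        part = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        total |= (part & mask) << shift   # keep the low bits of this chunk
        carry = part >> chunk_bits        # remember the carry for the next chunk
    return total, carry                   # final carry set means the total wrapped


print(add_signed(100, 27))     # fits in 8 bits
print(add_signed(100, 28))     # two positives, negative result
print(add_signed(-100, -29))   # two negatives, non-negative result
print(add_chained(200, 100))   # 300 does not fit in 8 bits
```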
Allan Miller on March 2, 2016 at 8:54 am said:

keiths,

I think he meant something more like a checksum, in which case all Z’s would in fact maximize it.

Yes, sorry to be confusing. When first suggested, I really did mean the sum of bits, but then flipped to the sum of hex values. Hence all Z’s being maximal. Lazy terminology. Sorry for providing derail opportunities!
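
The difference between the two readings is easy to check. A quick sketch, assuming uppercase ASCII letters and taking "sum of bits" to mean the number of 1 bits in each character code:

```python
import string


def bit_count(ch: str) -> int:
    """Number of 1 bits in the character's ASCII code."""
    return bin(ord(ch)).count("1")


letters = string.ascii_uppercase
most_bits = max(bit_count(c) for c in letters)

print([c for c in letters if bit_count(c) == most_bits])  # letters with the most set bits
print(max(letters, key=ord))                              # letter with the largest code
```

Counting set bits favours O and W (five bits each), while Z has only four. Summing the character codes favours Z. So a literal bit count would not converge on all Zs, whereas the sum of values does.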
Allan Miller on March 2, 2016 at 8:58 am said:

Mung,

One fitness peak. One occupant. Even if we draw a circle around it after you hit it it would still be a target.

Yes, well done. In a situation where there is only one occupant of a fitness peak, there is indeed only one occupant of a fitness peak.

Are you therefore saying that the concept of a ‘target’ applies only to programs that stop changing at a single convergent value, and no other program (such as one with multiple peaks, or multiple sequences at the same fitness) has a ‘target’?