# Gpuccio’s Theory of Intelligent Design

Gpuccio has made a series of comments at Uncommon Descent and I thought they could form the basis of an opening post. The comments following were copied and pasted from Gpuccio’s comments starting here

To onlooker and to all those who have followed this discussion:

I will try to express again the procedure to evaluate dFSCI and infer design, referring specifically to Lizzie’s “experiment”. I will also try to clarify, as I do so, some side aspects that are probably not obvious to all.

Moreover, I will do that one step at a time, in as many posts as necessary.

Creating CSI with NS
Posted on March 14, 2012 by Elizabeth
Imagine a coin-tossing game. On each turn, players toss a fair coin 500 times. As they do so, they record all runs of heads, so that if they toss H T T H H H T H T T H H H H T T T, they will record: 1, 3, 1, 4, representing the number of heads in each run.

At the end of each round, each player computes the product of their runs-of-heads. The person with the highest product wins.
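The scoring rule is easy to make concrete. Here is a minimal Python sketch (the function name and the H/T string encoding are my own choices, not Elizabeth’s):

```python
def runs_product(tosses):
    """Product of the lengths of all runs of heads in a toss sequence
    such as "HTTHHHT"; a sequence with no heads scores 0."""
    runs = [len(r) for r in tosses.split("T") if r]
    product = 1
    for r in runs:
        product *= r
    return product if runs else 0

# Elizabeth's example: H T T H H H T H T T H H H H T T T
# records runs of 1, 3, 1, 4, so the product is 1 * 3 * 1 * 4 = 12.
print(runs_product("HTTHHHTHTTHHHHTTT"))  # → 12
```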

In addition, there is a House jackpot. Any person whose product exceeds 10^60 wins the House jackpot.

There are 2^500 possible runs of coin-tosses. I’m not sure exactly how many of that vast number of possible series would give a product exceeding 10^60. However, if some bright mathematician can work it out for me, we can work out whether a series whose product exceeds 10^60 has CSI. My ballpark estimate says it has.

That means, clearly, that if we randomly generate many series of 500 coin-tosses, it is exceedingly unlikely, in the history of the universe, that we will get a product that exceeds 10^60.

However, starting with a randomly generated population of, say 100 series, I propose to subject them to random point mutations and natural selection, whereby I will cull the 50 series with the lowest products, and produce “offspring”, with random point mutations from each of the survivors, and repeat this over many generations.
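Her MatLab script isn’t reproduced in this post, but the procedure she describes is easy to sketch. A minimal Python version follows (the 300-generation budget, the fixed seed, and the one-point-mutation-per-offspring scheme are my guesses; the 100-string population and the cull-the-bottom-50 step follow her description). It only shows the best product climbing; reaching the 10^60 jackpot may take far longer, or stall at a local maximum, as she notes:

```python
import random

N_BITS, POP_SIZE, SURVIVORS = 500, 100, 50   # as in Elizabeth's description

def product_of_runs(s):
    """Product of the lengths of the runs of 1s in bit-string s (0 if none)."""
    runs = [len(r) for r in s.split("0") if r]
    p = 1
    for r in runs:
        p *= r
    return p if runs else 0

def mutate(s):
    """A single random point mutation: flip one bit."""
    i = random.randrange(len(s))
    return s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:]

random.seed(0)
pop = ["".join(random.choice("01") for _ in range(N_BITS)) for _ in range(POP_SIZE)]
best_start = max(map(product_of_runs, pop))

for generation in range(300):
    # cull the 50 lowest-product series, breed one mutant from each survivor
    pop.sort(key=product_of_runs, reverse=True)
    pop = pop[:SURVIVORS] + [mutate(s) for s in pop[:SURVIVORS]]

best_final = max(map(product_of_runs, pop))
print(best_start, "->", best_final)
```

Because the top 50 survive intact each round, the best product never decreases; whether point mutation alone can thread its way to the jackpot is exactly the local-maximum question Elizabeth raises next.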

I’ve already reliably got to products exceeding 10^58, but it’s possible that I may have got stuck in a local maximum.

However, before I go further: would an ID proponent like to tell me whether, if I succeed in hitting the jackpot, I have satisfactorily refuted Dembski’s case? And would a mathematician like to check the jackpot?

I’ve done it in MatLab, and will post the script below. Sorry I don’t speak anything more geek-friendly than MatLab (well, a little Java, but MatLab is way easier for this)

Now, some premises:

a) dFSI is a very clear concept, but it can be expressed in two different ways: as a numeric value (the ratio between target space and search space, expressed in bits a la Shannon); let’s call that simply dFSI; or as a categorical value (present or absent), derived by comparing the value obtained that way with some pre-defined threshold; let’s call that simply dFSCI. I will be especially careful to use the correct acronyms in the following discussion, to avoid confusion.

b) To be able to discuss Lizzie’s example, let’s suppose that we know the ratio of the target space to the search space in this case, and let’s say that the ratio is 2^-180, and therefore the functional complexity for the string as it is would be 180 bits.

c) Let’s say that an algorithm exists that can compute a string whose product exceeds 10^60 in a reasonable time.
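Premises a) and b) can be made concrete with a short sketch (the function names are mine, and the 150-bit default threshold anticipates the choice made at point 6 below):

```python
import math

def dFSI_bits(target_space, search_space):
    """Numeric dFSI: -log2 of the target-space / search-space ratio,
    expressed in bits a la Shannon."""
    return -math.log2(target_space / search_space)

def dFSCI(target_space, search_space, threshold_bits=150):
    """Categorical dFSCI: present (True) iff the numeric dFSI exceeds
    the pre-defined threshold."""
    return dFSI_bits(target_space, search_space) > threshold_bits

# Premise b): a 2^320-state target in a 2^500-state search space gives a
# ratio of 2^-180, i.e. 180 bits of functional complexity.
print(dFSI_bits(2**320, 2**500))  # → 180.0
print(dFSCI(2**320, 2**500))      # → True
```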

If these premises are clear, we can go on.

Now, a very important point. To go on with a realistic process of design inference based on the concept of functionally specified information, we need a few things clearly defined in any particular example:

1) The System

This is very important. We must clearly define the system for which we are making the evaluation. There are different kinds of systems. The whole universe. Our planet. A lab flask. They are different, and we must tailor our reasoning to the system we are considering.

For Lizzie’s experiment, I propose to define the system as a computer or informational system of any kind that can produce random 500-bit strings at a certain rate. For the experiment to be valid as a test of a design inference, some further properties are needed:

1a) The starting system must be completely “blind” to the specific experiment we will make. IOWs, we must be sure that no added information is present in the system in relation to the specific experiment. That is easily realized by having the system assembled by someone who does not know what kind of experiment we are going to make. IOWs, the programmer of the informational system just needs to know that we need random 500-bit strings, but he must be completely blind to why we need them. So, we are sure that the system generates truly random outputs.

1b) Obviously, an operator must be able to interact with the system, and must be able to do two different things:

- To input his personal solution, derived from his personal intelligent computations, so that it appears to us observers exactly like any other string randomly generated by the system.

- To input in the system any string that works as an executable program, whose existence will not be known to us observers.

OK?

2) The Time Span:

That is very important too. There are different Time Spans in different contexts. The whole life of the universe. The life of our planet. The years in Lenski’s experiment.

I will define the Time Span very simply, as the time from Time 0, which is when the System comes into existence, to Time X, which is the time at which we observe for the first time the candidate designed object.

For Lizzie’s experiment, it is the time from Time 0 when the specific informational system is assembled, or started, to time X, when it outputs a valid solution. Let’s say, for instance, that it is 10 days.

OK?

3) The specified function

That is easy. It can be any function objectively defined, and objectively assessable in a digital string. For Lizzie’s experiment, the specified function will be:

Any string of 500 bits where the product calculated as described exceeds 10^60

OK?

4) The target space / search space ratio, expressed in bits a la Shannon. Here, the search space is 500 bits. I have no idea how big the target space is, and apparently neither does Elizabeth. But we both have faith that a good mathematician can compute it. In the meantime, I am assuming, just for discussion, that the target space is 320 bits big, so that the ratio is 180 bits, as proposed in the premises.

Be careful: this is not yet the final dFSI for the observed string, but it is a first evaluation of its higher threshold. Indeed, a purely random System can generate such a specified string with a probability of 1:2^180. Other considerations can certainly lower that value, but not increase it. IOWs, a string with that specification cannot have more than 180 bits of functional complexity.

OK?

5) The Observed Object

We must observe, in the System, an Object at time X that was not present, at least in its present arrangement, at time 0.

The Observed Object must comply with the Specified Function. In our experiment, it will be a string with the defined property, that is outputted by the System at time X.

Therefore, we have already assessed that the Observed Object is specified for the function we defined.

OK?

6) The Appropriate Threshold

That is necessary to transform our numeric measure of dFSI into a categorical value (present / absent) of dFSCI.

In what sense does the threshold have to be “appropriate”? That will be clear if we consider the purpose of dFSCI, which is to reject the null hypothesis of a random generation of the Observed Object in the System.

As a preliminary, we have to evaluate the Probabilistic Resources of the system, which can be easily defined as the number of random states generated by the System in the Time Span. So, if our System generates 10^20 random strings per day, in 10 days it will generate 10^21 random strings, that is about 70 bits.
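The arithmetic behind “about 70 bits” takes a couple of lines (the rate and time span are the example figures from the text):

```python
import math

strings_per_day = 10**20          # the System's assumed generation rate
time_span_days = 10
total_random_states = strings_per_day * time_span_days   # 10^21 strings
print(math.log2(total_random_states))   # ≈ 69.76, i.e. "about 70 bits"
```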

The Threshold, to be appropriate, must be many orders of magnitude higher than the probabilistic resources of the System, so that the null hypothesis may be safely rejected. In this particular case, let’s go on with a threshold of 150 bits, certainly larger than needed, just to be on the safe side.

7) The evaluation of known deterministic explanations

That is where most people (on the other side, at TSZ) seem to become “confused”.

First of all, let’s clarify that we have the duty to evaluate any possible deterministic mechanism that is known or proposed.

As a first hypothesis, let’s consider the case in which the mechanism is part of the System, from the start. IOWs the mechanism must be in the System at time 0. If it comes into existence after that time because of the deterministic evolution of the system itself, then we can treat the whole process as a deterministic mechanism present in the System at time 0, and nothing changes.

I will treat separately the case where the mechanism appears in the system as a random result in the System itself.

Now, first of all, have we any reason here to think that a deterministic explanation of the Observed Object can exist? Yes, we have indeed, because the very nature of the specified function is mathematical and algorithmic (the product of the sequences of heads must exceed 10^60). That is exactly the kind of result that can usually be obtained by a deterministic computation.

But, as we said, our System at time 0 was completely blind to the specific problem and definition posed by Lizzie. Therefore, we can be safely certain that the system in itself contains no special algorithm to compute that specific solution. Arguing that the solution could be generated by the basic laws of physics is not a valid alternative (I know, some darwinist at TSZ will probably argue exactly that, but out of respect for my intelligence I will not discuss that possibility).

So, we can more than reasonably exclude a deterministic explanation of that kind for our Observed Object in our System.

7) The evaluation of known deterministic explanations (part two)

But there is another possibility that we have the duty to evaluate. What if a very simple algorithm arose in the System by random variation? What if that very simple algorithm can output the correct solution deterministically?

That is a possibility, although a very unlikely one. So, let’s consider it.

First of all, let’s find some real algorithm that can compute a solution in reasonable time (let’s say less than the Time Span).

I don’t know if such an algorithm exists. In my premise c) at post #682 I assumed that it exists. Therefore, let’s imagine that we have the algorithm, and that we have done our best to ensure that it is the simplest algorithm that can do the job (it is not important to prove that mathematically: it’s enough that it is the best result of the work of all our mathematician friends or enemies; IOWs, the best empirically known algorithm at present).

Now we have the algorithm, and the algorithm must obviously be in the form of a string of bits that, if present in the System, will compute the solution. IOWs, it must be the string corresponding to an executable program appropriate for the System, and that does the job.

We can obviously compute the dFSI for that string. Why do we do that?

It’s simple. We have now two different scenarios where the Observed Object could have been generated by RV:

7a) The Observed Object was generated by the random variation in the System directly.

7b) The Observed Object was computed deterministically by the algorithm, which was generated by the random variation in the System.

We have no idea of which of the two is true, just as we have no idea if the string was designed. But we can compute probabilities.

So, we compute the dFSI of the algorithm string. Now there are two possibilities:

- The dFSI for the algorithm string is higher than the tentative dFSI we already computed for the solution string (higher than 180 bits). That is by far the most likely scenario, probably the only possible one. In this case, the tentative value of dFSI for the solution string, 180 bits, is also the final dFSI for it. As our threshold is 150 bits, we infer design for the string.

- The dFSI for the algorithm string is lower than the tentative dFSI we already computed for the solution string (lower than 180 bits). There are again two possibilities. If it is however higher than 150 bits, we infer design just the same. If it is lower than 150 bits, we state that it is not possible to infer design for the solution string.

Why? Because a purely random pathway exists (through the random generation of the algorithm) that leads deterministically to the generation of the solution string, and the total probability of that whole process is higher than our threshold allows (IOWs, its complexity is lower than 150 bits).

OK?
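The two-branch rule in 7a)/7b) condenses to a few lines. A hedged sketch, assuming (as I read the argument) that the string’s final dFSI is the minimum of its direct complexity and the complexity of the simplest known algorithm that outputs it:

```python
def design_inference(solution_dFSI, algorithm_dFSI, threshold=150):
    """Infer design iff the final dFSI, capped by the complexity of the
    simplest known generating algorithm, still exceeds the threshold."""
    final_dFSI = min(solution_dFSI, algorithm_dFSI)
    return final_dFSI > threshold

print(design_inference(180, 400))  # → True: algorithm harder than the string
print(design_inference(180, 160))  # → True: 160 bits still exceeds 150
print(design_inference(180, 120))  # → False: a cheap random pathway exists
```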

8) Final considerations

So, some simple answers to possible questions:

8a) Was the string designed?

A: We infer design for it, or we infer it not. In science, we never know the final truth.

8b) What if the operator inputted the string directly?

A: Then the string is designed by definition (a conscious intelligent being produced it). If we inferred design, our inference is a true positive. If we did not infer design, our inference is a false negative.

8c) What if the operator inputted the algorithm string, and not the solution string?

A: Nothing changes. The string is still designed, because it is the result of the input of a conscious intelligent operator, although an indirect input. Again, if we inferred design, our inference is a true positive. If we did not infer design, our inference is a false negative. IOWs, our inference is completely independent of how the designer designed the string (directly or indirectly).

8d) What if we do not realize that an algorithm exists, and the algorithm exists and is less complex than the string, and less complex than the threshold?

A: As already said, we would infer design, at least until we are made aware of the existence of such an algorithm. If the string really originated randomly through a random emergence of the algorithm, that would be a false positive.

But, for that to really happen, many things must become true, and not only “possible”:

a) We must not recognize the obvious algorithmic nature of that particular specified function.

b) An algorithm must really exist that computes the solution and that, when expressed as an executable program for the System, has a complexity lower than 150 bits.

I am absolutely confident that such a scenario can never be real, and so I believe that our empirical specificity of 100% will always be confirmed.

Anyway, the moment anyone shows the algorithm with those properties, the design inference for that Object is falsified, and we have to assert that we cannot infer design for it. This new assertion can be either a false negative or a true negative, depending on whether the solution string was really designed (directly or indirectly) or not (randomly generated).

That’s all, for the moment.

AF adds “This was done in haste. Any comments regarding errors and omissions will be appreciated.”

## 263 thoughts on “Gpuccio’s Theory of Intelligent Design”

1. Joe: Differential reproduction due to heritable random variation (mutation)= natural selection

The “random” is extraneous. Nor is mutation the only source of variation. Natural selection can occur when there are existing variations in a population, regardless of whether there is a source for novel variations.

2. I said randomly generated genomes. I’d say the chance that Lizzie’s program generated 100 strings of all 0′s at random are about the same as her generating CSI.

Nice try, Mung, but your own words betray you:

There is more to a GA exploring certain kinds of digital space than differences due to relative fitness (usually defined by a fitness function or map), mutation and (optionally) recombination.

For example, potential solutions must be encoded into a “chromosome.”

Encoding potential solutions into a chromosome implies there is a problem to be solved.

Information about which potential solutions are more likely to solve the problem must be implemented. [emphasis mine]

The bolded statements are both wrong, for the reason I already gave:

Suppose Lizzie’s program initialized the genomes to all 0′s. Then there would be no potential solutions among the initial genomes, yet the program would still converge to a solution.

This shows that you don’t need to encode potential solutions into the chromosome, and you don’t need to implement “information about which potential solutions are more likely to solve the problem.”

P.S. I think you’re being unnecessarily modest by refusing to nominate yourself for your own Junk for Brains Award. You’ve earned it, Mung.

P.P.S. Please remind your buddy Upright Biped that Reciprocating Bill has some questions for him, and that Allan and I have refuted the latest version of his “semiotic argument for ID”.  Upright seems to be afraid of defending his argument, as usual.

3. Mung: Today’s Junk for Brains winner is Zachriel, who chides ID’ers for “assuming that evolutionary processes are no better than random assembly” while appealing to random assembly by a random process such as recombination.

Mung: I guess Zachriel can speak for himself, but randomness (as stochasticity) is of course part of the evolutionary process – recombination and mutation for sure, but stochastic influences on allele frequency changes come into it as well. The mistake would be to assume that this means that evolutionary ‘search’ is simply a matter of pulling sequences out of a metaphorical bag until, with a probability of 1 in n (the number of possible sequences) the desired sequence is hit. A random walk including random recombinations of existing sequence isn’t the same as random ‘assembly’ repeatedly from scratch.

If you don’t make that mistake, well done you, but many people, from Fred Hoyle onwards, have.

• How many times do you need to point out that evolution doesn’t test the universe of sequences, but just the next door neighborhood?

How many times do you need to point out that the mere existence of alleles refutes the vertical cliff caricature of the sequence landscape?

Recombination is a larger scale mutation than base point change, but it is still a walk in the neighborhood.

4.  Zachriel: The “random” is extraneous.

Joe: Only to equivocators, like yourself.

Hmm. You provided this definition on your own blog: “Natural selection is the result of differences in survival and reproduction among individuals of a population that vary in one or more heritable traits.”

Natural selection occurs on existing variation. Consider a population of moths, some of which are white, and others are black. Natural selection might occur if white moths are preferentially eaten by birds, leaving the black moths to leave offspring. This is true regardless of the original source of the variation.

Zachriel: Natural selection can occur when there are existing variations in a population.

Joe: Yes, it can.

There you are then.

5. Mung: Zachriel, who chide’s ID’ers for “assuming that evolutionary processes are no better than random assembly” while appealing to random assembly by a random process such as recombination.

We admit, our language was poorly chosen. Thought it was clear in context, but you may not have followed the entire thread. By random assembly, we meant where each sequence is completely randomized with respect to previous sequences. Now that the point has been clarified, perhaps you would like to respond to our actual point.

Z: That’s the error of IDists. They assume that evolutionary processes are no better than searching completely randomized sequences—but that’s simply not the case. If you recombine workable protein sequences, you are much more likely to find a new workable protein sequence than searching completely randomized sequences.

6. Mung: There is at least the possibility that a solution will be found among the first 100 randomly generated genomes, though she doesn’t actually check to see if that is the case.

Zachriel: Yes, that’s the nature of randomness and its fit to a landscape.

Mung: iow, I’m right. you know it. But you don’t have the guts to tell keiths.

There doesn’t seem to be any disagreement between our position and keiths’. There’s apparently some confusion on your use of the word “solution”.

Mung: For example, potential solutions must be encoded into a “chromosome.”

Your statement appears to imply that someone is “encoding” solutions into the “chromosome”. If the sequences are randomized, assuming we are fitting solutions to a landscape of some sort, then some may naturally fit better, albeit probably poorly, than others.

Perhaps you are referring to the nature of the landscape. Some landscapes are such that evolutionary algorithms don’t work well, or don’t work at all. That depends on the specifics, of course, but nearly all the objections raised by kairosfocus, gpuccio, Mung and others are so general as to apply to all landscapes. For instance, recombination is a very powerful mechanism across many rugged landscapes, and evolutionary algorithms work millions of times faster for such landscapes than simply choosing random sequences one after another.

Mung: Every chromosome generated by the GA is a potential solution. Else what is the point of generating them?

Of course they are *potential* solutions, though solution may or may not be a single entity, but a best fit, for instance.

Allan Miller: In a GA you could start with a string of length zero

Mung: What would a string of length zero consist of?

∅. You sound like someone who was just told about the existence of zero. Word Mutagenation usually starts with the single-letter word “O”.
http://www.zachriel.com/mutagenation/Sea.asp

• Mung

What would a string of length zero consist of?

I presume you mean ‘what’s the real-world equivalent?’ rather than ‘how would you code it?’, but just in case, an example of a string of length zero in VBA would be one which returned zero to the VBA LEN function. In COBOL, a group with a next level OCCURS DEPENDING ON X where X is set to zero. I’m sure many languages offer the same kind of thing. The null, the empty set, the nothing. While the ‘replication’ function of biological replicators is a vital part of the string, that role is taken by the copy method in a GA, so the strings themselves don’t actually need to consist of anything at the start. The point of bringing them up is to point out that such strings are not likely to be ‘solutions’ to any worthwhile GA, so you aren’t necessarily ‘pre-seeding’ the population with anything.

So consider the zero-length digital organism as the absolute minimal replicator common to all GAs. As long as a method exists that occasionally adds random bits to a string, something will soon emerge, and variations between these ‘non-null’ bit-strings can be evaluated by the selection module. A set of strings of length zero evidently cannot vary, but they can still ‘compete’ via drift. You can still replicate and remove strings of length zero from a population.

But you could learn something about evolution by observing the behaviour of such populations, particularly inevitable coalescence of ancestry, before building up to something more elaborate. That’s the whole point of modelling, to reduce to essentials then reconstruct. The behaviour of finite replicating populations with no selection, mutation or recombination tells you a lot about the role of replication.

(I do know that such organisms do not actually exist…).

Lizzie could easily have started from a string of length zero. If no possibility of extension existed, it would not work. But given a proportion of mutations that add bases (just like reality), the nature of her fitness function would mean that strings would simply extend forever. But if a ‘length cap’ were also part of the fitness function – the longer a string, the more likely to break and die, say – then provided it was not so punitive as to disallow 500-bit strings, a 500-bit string with product > 10^60 could easily be generated, even from a null string. More generally, the GA would be expected to converge on the highest product available to strings of a length at or just below a ‘breakiness threshold’ – an ‘optimal’ length where the reward for higher products is counterbalanced by the penalty for length.

Try it.
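Here is one way to try it: a Python sketch in which every lineage starts from the empty string. The 40/10/50 insertion/deletion/flip mix, the 520-bit ‘breakiness’ cap, the seed, and the generation count are all arbitrary choices of mine, not Allan’s:

```python
import random

def product_of_runs(s):
    """Product of the lengths of the runs of 1s in bit-string s (0 if none)."""
    runs = [len(r) for r in s.split("0") if r]
    p = 1
    for r in runs:
        p *= r
    return p if runs else 0

def fitness(s, cap=520):
    """Reward the runs product, but strings past the cap 'break and die'."""
    return -1 if len(s) > cap else product_of_runs(s)

def mutate(s):
    """One mutation: insert a random bit (40%), delete a bit (10%),
    or flip a bit (50%); empty strings can only gain an insertion."""
    op = random.random()
    if op < 0.4 or not s:
        i = random.randrange(len(s) + 1)
        return s[:i] + random.choice("01") + s[i:]
    i = random.randrange(len(s))
    if op < 0.5:
        return s[:i] + s[i + 1:]
    return s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:]

random.seed(2)
pop = [""] * 100                  # every lineage starts from the null string
for generation in range(2000):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:50] + [mutate(s) for s in pop[:50]]

best = max(pop, key=fitness)
print(len(best), fitness(best))
```

With a hard cap the strings tend to grow toward it; a graded length penalty instead should settle the population near Allan’s ‘optimal’ length.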

7. Mung: Today’s Junk for Brains winner is Zachriel

We really appreciate the honor, but we have already won the coveted Once Twice Thrice Quadrice Banned Award.

8. gpuccio,

It is completely wrong to model NS using IS, because they have different form and power.

As I said, you help me to refine my concepts, and I appreciate that.

Before someone states that I am changing arguments, I would suggest that you read again my original definitions of IS and NS, from which this statement can very clearly be derived:

“d) NS is different from IS (intelligent selection), but only in one sense, and in power:

d1) Intelligent selection (IS) is any form of selection where a conscious intelligent designer defines a function, wants to develop it, and arranges the system to that purpose. RV is used to create new arrangements, where the desired function is measured, with the maximum possible sensitivity, and artificial selection is implemented on the basis of the measured function. Intelligent selection is very powerful and flexible (whatever Petruska may think). It can select for any measurable function, and develop it in relatively short times.

d2) NS is selection based only on fitness/survival advantage of the replicator. The selected function is one and only one, and it cannot be any other. Moreover, the advantage (or disadvantage, in negative selection) must be big enough to result in true expansion of the mutated clone and in true fixation of the acquired variation. IOWs, NS is not flexible (it selects only for a very tiny subset of possible useful functions) and is not powerful at all (it cannot measure its target function if it is too weak).”

This seems to be getting to the essence of our disagreement, especially when combined with your following comment:

A distinction without a difference. The model shows that the mechanisms of the modern synthesis are quite capable of generating functional complexity in excess of that required by your dFSCI.

This is exactly the type of wrong statement that has prompted me to analyze in detail this issue. Have you read my post #910 in the old thread? Please, refer to it for any following discussion on this.

Yes, I re-read your 910 where you discuss what level of functionality is selectable. I find your thresholds to be arbitrarily selected, but that’s not relevant to the essential difference I think we’re finding.

What I see is you focusing on the details of how a model is implemented rather than on the concepts being modeled. Yes, in many GAs the environment is modeled via a fitness function that is designed to accomplish some goal and the threshold for terminating the simulation is set (independently of the fitness function and other components of the GA) to recognize when that goal has been reached. None of that changes the fact that the GA is a model of an observed, natural process.

It doesn’t matter if you label the model “intelligent selection” to somehow distinguish it from “natural selection”, what matters is that the pertinent mechanisms of the model are the same as those we observe in real biological systems.

Heritable variation with differential reproductive success does, demonstrably, generate large amounts of functional complexity, according to your own definition. The only reason not to consider the results of these mechanisms of the modern synthesis to have dFSCI is because you define dFSCI in terms of knowledge about the provenance of the results and you define those mechanisms as “deterministic”.

If you disagree, and I suspect you do, please explain why your distinction between “intelligent” and “natural” selection has any bearing on what is being modeled rather than the details of how the model is implemented.

9. gpuccio,

I spent some time yesterday looking through the Tierra project and it does appear to meet your criteria for what you think is a proper model of natural selection. Before I go further with it, I would like a clarification from you.

I’m curious, how would you measure functional complexity in such an environment? Would it simply be the length in bits of the digital organisms? If an organism with sufficient functional complexity to meet your dFSCI threshold were to appear, would you consider it to have dFSCI or would the fact that it arose through evolutionary mechanisms, which might even be tracked mutation by mutation, mean that the dFSCI medal could never be earned?

It’s easy. I would proceed like Lenski. I would “freeze” (copy) the virus periodically to examine its code. If and when any functional string of code expresses a new function that helps the virus to reproduce, and therefore partially or totally replaces the simpler version, then it will be easy enough to evaluate the functional complexity of that new string of code, with the usual methods detailed at the beginning of your thread at TSZ.

What “usual methods”? How, exactly, would you compute the functional complexity of a digital organism in Tierra? Patrick noted in the threads he referenced on Mark Frank’s blog that Tierra results in a number of different reproductive strategies, including parasitism and hyper-parasitism. What is the functional complexity of those organisms?

• During the first 20,000 generations in the Lenski experiment, mutations occurred that were neutral with regard to citrate metabolism, but which turned out to be crucial after subsequent mutations.

How does the intelligent selector identify  and promote precursor changes?

10. gpuccio: d1) Intelligent selection (IS) is any form of selection where a conscious intelligent designer defines a function, wants to develop it, and arranges the system to that purpose.

A clarification please. What do you mean by “system”? If you mean the entire simulation, then obviously that would preclude any and all simulations.

11. To onlooker (at TSZ):

Any fitness function in any GA is intelligent selection, and in no way it models NS.

Please, do not consider any more that statement. Keiths is right, it was a wrong generalization.

Thank you for saying that. I appreciate your willingness to admit error.

d2) NS is selection based only on fitness/survival advantage of the replicator. The selected function is one and only one, and it cannot be any other… IOWs, NS is not flexible (it selects only for a very tiny subset of possible useful functions)…

Fitness functions measure and reward fitness, by definition. But there are many, many, fitness functions, not just one. We have to select the right fitness function for the application.

A fitness function that measures and rewards whiteness is fine if we are modeling a scenario in which whiteness contributes to survival and reproduction, as it does in the evolution of arctic hares. Not so much if we are modeling the evolution of tortoises, or running a GA that optimizes antenna designs.

A fitness function that rewards shell strength is fine if we are modeling a scenario in which shell strength contributes to survival and reproduction, as it does in the evolution of tortoises. Not so much if we are modeling the evolution of arctic hares.

A fitness function that rewards “product of run lengths in a sequence of 1′s and 0′s”, as in Lizzie’s example, is perfectly acceptable if we are modeling a hypothetical world in which “product of run lengths” contributes to survival and reproduction. Not so much if “primeness of run lengths” is the true criterion.

These are all Darwinian processes. The point of Lizzie’s example is to show how a Darwinian process can solve a problem (and generate dFSCI) without any information from the fitness function other than “better” or “worse”. Real world Darwinian evolution can also solve problems (and generate dFSCI) without any information from the environment other than “better” (you survived and produced lots of viable offspring) and “worse” (you died early or failed to reproduce for some other reason).

Your claim that there can be “one and only one [fitness function], and it cannot be any other” is false. There are many possible fitness functions. Some of them lead to the production of dFSCI, others don’t. The question is not whether it is legitimate to use other fitness functions. The question is whether a particular fitness function is legitimate for the scenario being modeled.

12. keiths: Suppose Lizzie’s program initialized the genomes to all 0?s. Then there would be no potential solutions among the initial genomes

Mung: That is false. Again, you demonstrate that you don’t understand what is being discussed. They would still be a potential solution. Just not a good solution. Just not an actual solution. A string of 500 0′s is still in the search space.

But I was reminded of a challenge I had issued. That challenge consisted in setting all strings to the same initial value, rather than having them randomly generated.

So please, have Lizzie initialize all her starting population of strings to all 0′s. By all means. Let’s see how well it performs then.

Mung – I thought you’d written your own version that took 10 seconds? Surely you could try the amendment yourself.

Here’s my prediction: a uniform starting population of all-0′s will be little different from a completely variable one in its ability to search the space. They will initially all be the same, and all products will be zero, therefore the fitness function will have nothing to select on, therefore the population will simply ‘drift’, replicating the same monotonous point in space. But if mutation occurs, variation will arise, the fitness function gains traction, and the ‘random (stochastic) walk’ has taken its first baby steps.

As I have said elsewhere, you could start with one digital ‘organism’ of bit-length zero and still find the peak, provided the mutation method includes something that can increase the number of bits in a string – a biologically ‘real’ amendment.

This is the relevance of ‘being able to write a GA’ to evolution. Or more so, to run one and play with it and see what happens when you fiddle with the various subroutines – the selection, mutation, recombination methods – and their parameters. They are all at least intended to duplicate the ‘real’ processes of the evolutionary synthesis. If they don’t, you need to be able to explain to the profs using them why they are barking up the wrong tree – and you can’t do that if you don’t even know what you are talking about. Give it a whirl – twiddle the knobs; turn them up, down or completely off – it won’t bite you.
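To make the prediction above concrete, here is a minimal Ruby sketch of the all-zeros challenge. All names and parameters (genome length, population size, one bit-flip per offspring, number of generations) are my own choices for illustration, not Lizzie's actual code; the fitness function is the product-of-run-lengths score from her game.

```ruby
# Minimal sketch (assumed parameters) of the all-zeros challenge: start every
# genome at 0's, cull the lower-scoring half each generation, give each
# survivor one mutated offspring, and watch fitness climb from zero.
LEN = 100
POP = 50

# product of the lengths of contiguous runs of 1's; all-zero genomes score 0
def product_of_runs(genome)
  runs = genome.scan(/1+/)
  runs.empty? ? 0 : runs.map(&:length).inject(1, :*)
end

# flip exactly one randomly chosen bit -- no latching
def mutate(genome)
  g = genome.dup
  i = rand(LEN)
  g[i] = (g[i] == '0' ? '1' : '0')
  g
end

srand(2012)
population = Array.new(POP) { '0' * LEN }   # uniform all-zeros start
initial_best = population.map { |g| product_of_runs(g) }.max

200.times do
  population.sort_by! { |g| -product_of_runs(g) }  # best first
  survivors = population[0, POP / 2]               # cull the worse half
  population = survivors + survivors.map { |g| mutate(g) }
end
final_best = population.map { |g| product_of_runs(g) }.max
```

As predicted, the uniform start stalls only until mutation introduces variation; from then on the fitness function has something to select on and the best score rises above zero.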

• So please, have Lizzie initialize all her starting population of strings to all 0′s. By all means. Let’s see how well it performs then.

Mung – I thought you’d written your own version that took 10 seconds? Surely you could try the amendment yourself. Here’s my prediction: a uniform starting population of all-0′s will be little different from a completely variable one in its ability to search the space. They will initially all be the same, and all products will be zero, therefore the fitness function will have nothing to select on, therefore the population will simply ‘drift’, replicating the same monotonous point in space. But if mutation occurs, variation will arise, and the fitness function gains traction.

On the off chance that Mung doesn’t want to try this with his own code for some reason, I just ran the test. If you get him to make a prediction, I’ll share my results and see who of the two of you is closest.

13. Mung:

Again, you demonstrate that you don’t understand what is being discussed. They would still be a potential solution. Just not a good solution. Just not an actual solution.

A string of 500 0′s cannot be a solution. Something that cannot be a solution is not a “potential solution.”

A string of 500 0′s can be mutated over time into a solution, but we know without a doubt that a string of 500 0′s is not a solution. It’s not a “good solution.” It’s not an “actual solution.” It’s not a “potential solution”. The only kind of solution it is is a “non-solution.”

A string of 500 0′s is still in the search space.

Obvioulsy [heh], and if that’s all you meant by “potential solution” then I would have no objection. However, you clearly think that information has to be smuggled into the initial genomes, as your full statement reveals:

There is more to a GA exploring certain kinds of digital space than differences due to relative fitness (usually defined by a fitness function or map), mutation and (optionally) recombination.

For example, potential solutions must be encoded into a “chromosome.”

Encoding potential solutions into a chromosome implies there is a problem to be solved.

Information about which potential solutions are more likely to solve the problem must be implemented. [emphasis mine]

This is reinforced by your challenge:

But I was reminded of a challenge I had issued. That challenge consisted in setting all strings to the same initial value, rather than having them randomly generated.

So please, have Lizzie initialize all her starting population of strings to all 0′s. By all means. Let’s see how well it performs then.

I’m very curious. Why do you think it would be a problem if the initial genomes were set to all 0′s?

Have you thought this through?

14. Mung: Darwinian evolution does not need “the right fitness landscape” to work. (What would a “wrong” fitness landscape look like?)

The vast majority of conceivable landscapes are not amenable to evolutionary algorithms, such as highly chaotic or random landscapes. Landscapes that are amenable to evolutionary algorithms usually exhibit local structure. Indeed, some IDers argue that protein fitness landscapes are too rugged for evolution to be effective.
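The role of local structure can be sketched in a toy Ruby experiment (setup and names assumed, purely illustrative): a one-bit-flip hill-climber on a smooth landscape (OneMax, where fitness is simply the count of 1's) versus a landscape with no local correlation at all.

```ruby
# Toy sketch (assumed setup): a single-bit-flip hill-climber exploits local
# structure on a smooth landscape, but has nothing to exploit on an
# uncorrelated one.
def climb(len, steps, &landscape)
  current = Array.new(len) { rand(2) }
  steps.times do
    candidate = current.dup
    i = rand(len)
    candidate[i] = 1 - candidate[i]              # flip one random bit
    current = candidate if landscape.call(candidate) >= landscape.call(current)
  end
  landscape.call(current)
end

srand(7)
smooth = ->(g) { g.sum }                  # OneMax: neighbors have similar fitness
rugged = ->(g) { g.join.hash.abs % 100 }  # uncorrelated: neighbors unrelated
smooth_score = climb(20, 500, &smooth)
rugged_score = climb(20, 500, &rugged)
```

On the smooth landscape the climber reliably reaches the peak (fitness 20); on the uncorrelated landscape its final score is no better than a random sample's.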

15. Mung: GAs work with a coding of the parameter set, not the parameters themselves.

Well, yes. That’s the genetic part of genetic algorithms, which are a subset of evolutionary algorithms. So? What did you think it meant?

16. Mung,

You issued this challenge:

But I was reminded of a challenge I had issued. That challenge consisted in setting all strings to the same initial value, rather than having them randomly generated.

So please, have Lizzie initialize all her starting population of strings to all 0′s. By all means. Let’s see how well it performs then.

If, as you claim, you don’t think that Lizzie smuggled information into the genomes by initializing them randomly, then what is the point of your challenge?

After you answer, go ahead and take your version of Lizzie’s program, set the initial genomes to all 0′s, and let us know what happens. We’ll see how it compares to Patrick’s results.

It should be amusing.

P.S. I see you’re also confused about fitness landscapes. Here’s a thought: Wouldn’t it make sense to learn about evolution and GAs before condescending to people who actually understand them?

17. Mung: “However I freely admit it is not an exact duplicate of Lizzie’s program just written in another language, I rather attempted to capture the “spirit” of what she built. “

You don’t understand what Elizabeth is trying to do, do you?

Instead of coding right away, why don’t you just describe, in English, what you think you are attempting to do with your exercise.

18. Welcome, DrBot

Apologies. Your comment, like everyone’s first comment, was held in moderation and I only just spotted it. Where’s Lizzie!

19. Mung quotes Allan Miller:

While the ‘replication’ function of biological replicators is a vital part of the string, that role is taken by the copy method in a GA, so the strings themselves don’t actually need to consist of anything at the start. The point of bringing them up is to point out that such strings are not likely to be ‘solutions’ to any worthwhile GA, so you aren’t necessarily ‘pre-seeding’ the population with anything.

So consider the zero-length digital organism as the absolute minimal replicator common to all GAs. As long as a method exists that occasionally adds random bits to a string, something will soon emerge, and variations between these ‘non-null’ bit-strings can be evaluated by the selection module. A set of strings of length zero evidently cannot vary, but they can still ‘compete’ via drift. You can still replicate and remove strings of length zero from a population.     [Emphasis Mung's]

Mung, the GA expert, then presumes to ‘correct’ Allan by quoting from a book:

: Introduction to Evolutionary Computing

The choice of representation forms an important distinguishing feature between different streams of evolutionary computing. From this perspective GAs and ES can be distinguished from (historical) EP and GP according to the data structure used to represent individuals. In the first group this data structure is linear, and it’s length is fixed, that is, it does not change during a run of the algorithm.    [Emphasis Mung's]

Mung apparently hopes that none of us have heard of Google. Google ‘GA variable length chromosome’ and you get page after page of links to authoritative descriptions and discussions of GAs that fit the bill.

Oops.

20. You’re right, Toronto. Mung has no clue.

Two days ago, he announced his version of Lizzie’s program with typical Mungian bombast, even grandly labeling his program ‘Mung World’:

Mung World

ok, so I created my own version of Lizzie’s program.

took less than 10 seconds
1522 generations

What’s the big deal?

Allan and Patrick called Mung’s bluff, asking him to respond to his own programming challenge. Suddenly the bluster turned into excuses, apologies and requests for advice:

Mung World

I tossed my program together in a short evening. I am actually rather pleased with it, I even managed to make it object-oriented (for the most part).

However I freely admit it is not an exact duplicate of Lizzie’s program just written in another language, I rather attempted to capture the “spirit” of what she built.

It’s a bit rough around some of the edges, but I would like suggestions on how it can be improved.

I call my digital organisms LiddleLizzards, in honor of Elizabeth.

Here’s my LiddleLizzard class. I think the first thing that can use improvement is the mutate method, it’s pretty rough. .

# mutates this chromosome
def mutate
  chromosome[rand(500) - 1] = '1'
end

All I do here is set one position in the chromosome to a ’1′. If it’s a zero it gets changed, if it’s a ’1′ it’s like a neutral mutation. I don’t know what that cashes out to in terms of a mutation rate, if someone wants to tell me.

Some potential modifications:

1. Set the chosen locus to either a zero or a one, that would not be too difficult to code.

2. Explicitly set the mutation rate.

3. Create a Mutation object that is passed in when the digital organism is created that encapsulates its mutation parameters.

4. Pass in the length of the string to generate rather than hard-coding it in a constant.

Honest evaluation, criticism, and suggestions for improvement are welcomed. You can leave comments at that link as well.

I had a look at the code for Mung’s LiddleLizzard class. It’s atrocious, and it bears no resemblance to Lizzie’s program.

His ‘mutate’ method sets a random bit in the chromosome to 1. It never sets bits to 0. That’s right — Mung’s program latches! KF will be apoplectic.

Mung’s fitness function looks for the longest sequence of consecutive 1′s in the chromosome. The length of that longest sequence is the fitness value. That’s it. No kidding.

That’s just the class definition. I’d hate to see the rest of the code.

Either Mung has absolutely no idea what Lizzie’s program does, or he doesn’t know how to code. Or both.

21. Darwinian evolution does not need “the right fitness landscape” to work. (What would a “wrong” fitness landscape look like?)

Your problem, keiths (and apparently the problem of a few others over there at TSZ), is that you don’t know what a fitness landscape represents.

We’ll need to clarify what is meant by “landscape.” To me a fitness landscape isn’t something that is there waiting to be discovered (or climbed, ala Mount Improbable), it’s something that is created as populations evolve.

I see you’re also confused about fitness landscapes. Here’s a thought: Wouldn’t it make sense to learn about evolution and GAs before condescending to people who actually understand them?

It wasn’t entirely clear what sort of landscape(s) he was talking about, so I decided to wait and find out. You, otoh, plow ahead unabated.

• Mung:

Think about what a fitness landscape for a 64-bit encryption key would look like – you have 2^64 (18,446,744,073,709,551,616) possible key values and only one of them is right.

This gives a binary fitness result of either 0 or 1. All a GA will do with a landscape like this is drift – any mutation that doesn’t produce the correct key value will result in a fitness of zero. Until the result is found (and the program terminates) all population members will have the same zero fitness value, all will reproduce with equal probability, so there will be no differential reproduction. All you get is random drift – you might as well use a random search, because in a search space like this a GA will do no better than random sampling.
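DrBot's point can be sketched in a few lines of Ruby (key length and population size are assumed for illustration): on an all-or-nothing landscape, every member of a random population scores zero, so selection has no signal whatsoever.

```ruby
# Minimal sketch of the needle-in-a-haystack landscape (assumed sizes):
# every population member scores 0, leaving selection nothing to act on.
srand(3)
KEY_BITS = 32
key = Array.new(KEY_BITS) { rand(2) }
fitness = ->(g) { g == key ? 1 : 0 }           # all-or-nothing fitness

population = Array.new(100) { Array.new(KEY_BITS) { rand(2) } }
scores = population.map { |g| fitness.call(g) } # all zeros, with overwhelming
                                                # probability, for a 32-bit key
```

With every score equal, fitness-proportional reproduction degenerates to uniform sampling: pure drift, exactly as described.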

22. Mung – here is a fitness landscape for a 10 bit key (1024 possible values) – visualised in two dimensions.

00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000100000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000
00000000000000000000000000000000

What is the most effective method of navigating this fitness landscape?

• DrBot: What is the most effective method of navigating this fitness landscape?

Favorite Pet?

23. I think we should all encourage Mung in his endeavor. He is at least trying to understand what Lizzie and a few others here did. That’s pretty rare at UD.

Mung, you do need to modify the mutation function to switch a bit randomly  in either direction.

I am also not sure how your fitness is defined. Not knowing Ruby, I can’t parse this piece of code

# calculates the "fitness" of this chromosome
def fitness
  score = 0
  chromosome.scan(/1*/).each do |str|
    score = str.length if str.length > score
  end
  score
end

• I entirely agree. It’s tempting to fight snark with snark, but there is a genuine profundity (IMO) at the heart of the ‘minimal’ GA, where there is only replication of replication: inevitable fixation of ancestry. (The reasons why this is ‘profound’ may not be immediately clear!)

The null is a very useful starting point, one all budding designers of organisms (in whose shoes you can put yourself, as God-of-the-GA) should be aware of. It’s hard to spot the null in ‘real’ organisms, because the little bastards keep on hopping about all over the place.

• olegt,

I am also not sure how your fitness is defined.

From an earlier comment:

Mung’s fitness function looks for the longest sequence of consecutive 1′s in the chromosome. The length of that longest sequence is the fitness value. That’s it. No kidding.

24. Mung: We’ll need to clarify what is meant by “landscape.”

A fitness landscape represents relative reproductive fitness. It can represent a real-life situation, such as protein function, or be an abstraction.

Mung: To me a fitness landscape isn’t something that is there waiting to be discovered (or climbed, ala Mount Improbable), it’s something that is created as populations evolve.

Certainly, real biological landscapes change, but static landscapes are often sufficient for particular studies, such as protein evolution. Keep in mind that even though the fitness landscape may be static in a simulation, the total environment changes due to competition with neighbors. In any case, it makes sense to start with static landscapes, but the basic behavior is often similar with dynamic landscapes.

Mung: The way I understand Zachriel’s argument is that he is appealing to a bag or assortment of pre-existing components (aka protein domains) that can be used in proteins and that their availability for use somehow lends less of a random character to the process (making a functional protein more likely) even though the main proposed mechanism for this shuffling is recombination, itself a random process and the protein domains themselves also arose largely as a result of a random process (perhaps “guided” by “natural selection”).

Not quite. The assumption is that splice and insertion points on existing replicators are random, not along functional divisions. Even then, it’s easy to show that the chance of successful results can be millions or billions of times greater than using randomized sequences.

Mung: Personally I have no conflict with regular repeated processes going on inside living organisms because to me that smacks of teleology.

There are many types of recombination (e.g. sexual), however, we’re concerned with random processes.

Mung: “In the first group this data structure is linear, and it’s length is fixed, that is, it does not change during a run of the algorithm.”

Yes, that’s a common type of genetic algorithm, but certainly not the only one, otherwise, you couldn’t simulate genomes that change size during evolution.

25. Mung’s comment reported above suggests that he WOULD have “problems” with things that DON’T “smack of teleology”

It seems to me to be an odd thing to have said, and quite revealing of a way of thinking that is not the most fruitful.

• The teleology would have to be of a pretty extreme nature, in the case of recombination. How do you actively get two combinationally useful genetic segments together, while still allowing their bearers free will? Unless randomisation is part of the design, for its novelty-generating capacity. In which case, blind and undirected processes can

26. olegt,

Mung, you do need to modify the mutation function to switch a bit randomly in either direction.

And there should be a mutation rate parameter that is independently applied to each bit. Currently, every call to mutate() sets exactly one bit in a genome, never more or less.

The corrected fitness function should work like this:

1. If the genome is all 0′s, return a fitness of 0.
2. Otherwise, initialize the fitness to 1.
3. Scan the genome for contiguous runs of 1′s.
4. For each run, let fitness = fitness * length of run.
5. Return fitness.
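The five steps above can be sketched in Ruby, assuming (as in Mung's class) that the chromosome is a string of '0' and '1' characters; the method name is mine.

```ruby
# Sketch of the corrected fitness function described in steps 1-5 above,
# assuming the genome is a string of '0'/'1' characters.
def corrected_fitness(chromosome)
  runs = chromosome.scan(/1+/)   # step 3: contiguous runs of 1's
  return 0 if runs.empty?        # step 1: all-0 genome scores 0
  # steps 2, 4, 5: start at 1, multiply by each run length, return the product
  runs.map(&:length).inject(1) { |fitness, len| fitness * len }
end
```

For example, `corrected_fitness('0110111')` is 2 × 3 = 6, and an all-zero string scores 0 rather than crashing the product at zero-length runs.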

Mung has shared only the class definition, so I can’t comment on what changes are needed in the rest of his code.

• To make a child, you need to take each bit, one by one.  Compute a random number between zero and one. If the mutation rate exceeds this random number you apply the mutation.

To apply the mutation, compute a random integer between zero and one and this becomes the new bit. It may be the same as the old bit.

• Petrushka, I agree with your first paragraph, but not with the second:

To apply the mutation, compute a random integer between zero and one and this becomes the new bit. It may be the same as the old bit.

I think it’s better to always flip the bit when you apply the mutation. Otherwise it’s not a mutation at all, and the true mutation rate is lower than the parameter indicates.

On the other hand, you might argue that by sometimes leaving the bit unchanged, you are modeling neutral mutations. Even if that is your intent, however, the approach is still flawed because it fixes the ratio of neutral to non-neutral mutations at 1:1. Better to either make that ratio adjustable, or not to bother modeling neutral mutations at all.

• I think it’s better to always flip the bit when you apply the mutation. Otherwise it’s not a mutation at all, and the true mutation rate is lower than the parameter indicates.

Nope. A blind mutation can result in no change. What you are suggesting is not conceptually different from latching. It requires the mutation process to have some knowledge of the current state, which is forbidden.

Assuming that “mutation rate” means an observed rate of actual change, the only “fair” way to define it is to set the rate of mutation events so that it takes the “no change” case into account.

Now, I could be wrong about how typical GAs work. I took my definition from Elsberry’s version of weasel. From my understanding, it is important for the mutation event not to know anything.

• Petrushka,

A blind mutation can result in no change.

A mutation is by definition a change in the genome. No change means no mutation. If it were otherwise, then every moment that a gene remained stable would count as a mutation, and the mutation rate would be astronomical.

What you are suggesting is not conceptually different from latching.

It’s the antithesis of latching. Any bit in any genome can change in any generation. Values are never locked in.

It requires the mutation process to have some knowledge of the current state, which is forbidden.

I think this may be the source of your confusion: in real life, we identify mutations after the fact by observing that the genome has changed. A mutational process doesn’t “know” the content of the genome before or after the change. It doesn’t even know that the genome has changed. It doesn’t possess the “forbidden knowledge” to which you refer.

By counting the number of genomic changes that we observe after a given period, we can estimate the mutation rate by dividing the number of changes by the time elapsed. (If we’re really anal, we can factor in the probability of mutations happening and then being reversed, but this can be neglected if the mutation rate per locus is sufficiently low).

In a model or a GA, things work backwards. We don’t identify mutations after the fact. Using the mutation rate, we decide beforehand whether a mutation should happen right now. If the answer is yes, then we change the genome. If we didn’t change the genome, it wouldn’t be a mutation — by definition.

In order to change the genome in the model, we have to know its current value. This isn’t “forbidden knowledge”. It’s essential knowledge, without which the model would be inaccurate.

The model is distinct from the thing being modeled. “Forbidden knowledge” in the real world is not the same as forbidden knowledge in the model.  It’s okay to use real-world “forbidden knowledge”  in order to make the model accurate.  What’s not okay is to model the use of real-world “forbidden knowledge”.

• Your analysis works on a binary code, but becomes more difficult as the number of possible values goes up. Is the mutation engine supposed to do a lookup to see what values are possible that will result in a change?

That seems not to model reality. It would seem more realistic simply to tweak the mutation rate so the change rate is what you want it to be.

Incidentally, I love being able to have a disagreement and discussion that isn’t stupid. What we have here is a dispute over arbitrary definitions.

In Weasel you have 26 possible values rather than two. That’s in the same ballpark as the number of amino acids. A mutation resulting from a quantum event has no way of knowing if it is making a change.

And the actual difference in effective rate is not that great.

I just think a blind event is conceptually cleaner, regardless of the numeric base.

• Incidentally, I love being able to have a disagreement and discussion that isn’t stupid.

Me too! It’s a real pleasure to disagree with an intelligent person who’s open to reasoned argumentation and actually tries to understand what I’m saying, and why.

I’ll respond to your other points later. Right now I’m out of time.

• My two cents, you can’t go far wrong if you mirror biology. DNA polymerase makes a mistake once every 10^7 to 10^8 bits. That would equate to a RAND threshold examined for every bit but which only causes a flip once every 10^7 or so bits copied. Once your mutational threshold has been breached, you must mutate (flip) – that’s what the mutational error is.

Polymerase ‘knows’ what the bit is when it does its job properly. It’s not so much that the mutation process has knowledge of the current state, but that you are simulating the situation where the accurate-copying process (which must know the current state) gets it wrong.

There are other mutational processes with subtly different characteristics, more akin to a lump of radioactive material. Atoms either decay (flip) or they don’t. Bits stay stable most of the time, and the mutational process does not need to look at them to see if they can stay stable. Stochastically, with a mean of so many events per time t per genome, a stable bit ‘decays’ – the mutational process zooms in on a bit at random, and flips it.

So there are two possibilities with biological merit – check every bit but flip every nth (stochastically) or zoom in on a random bit in a time-dependent manner and flip it. The first can occur during replication, the second at any time.

The first could be coded a bit more efficiently than checking a RAND for every bit. You know you are replicating everything that goes into the copy method, which gives you N x L bits (organisms x genome length), of which you want a stochastic 1 in 10^7 (say) to be incorrect. So just flip them, at random. You should never need to look at a bit you aren’t going to flip, in short.
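The shortcut described above might look like the following sketch (the names and the stochastic rounding of the expected count are my assumptions, not Allan's code): draw how many of the N × L copied bits should be miscopied this generation, then flip exactly that many randomly chosen positions, never inspecting the bits you don't touch.

```ruby
# Sketch of the batch shortcut: instead of testing a random number for every
# copied bit, compute the expected number of copying errors across the whole
# generation and flip that many randomly chosen bits.
def mutate_generation!(genomes, rate)
  len = genomes.first.length
  total_bits = genomes.size * len
  mean = total_bits * rate
  # crude stochastic rounding of the expected mutation count; a fuller
  # implementation might draw from a Poisson distribution instead
  count = mean.floor + (rand < mean % 1 ? 1 : 0)
  count.times do
    g = genomes[rand(genomes.size)]
    i = rand(len)
    g[i] = (g[i] == '0' ? '1' : '0')   # the chosen bit MUST flip
  end
  genomes
end

srand(11)
pop = Array.new(100) { '0' * 1_000 }   # 100,000 bits copied per generation
mutate_generation!(pop, 1e-4)          # expect about 10 flips
ones = pop.sum { |g| g.count('1') }
```

This touches only the bits it flips, so its cost scales with the mutation count rather than the genome size, which matters at biologically realistic rates like 1 in 10^7.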

How important this is depends on the application, of course. No-one’s going to write a problem-solving GA worrying too much about biologically accurate mutation rates and methods.

• petrushka: Is the mutation engine supposed to do a lookup to see what values are possible that will result in a change?

You’re not actually bouncing cosmic rays into the electronic memory, but modeling, in the abstract, biological mutation. In this case, we would want our results to match the definition of biological mutation, which is the change in a base.

Ironically, in Word Mutagenation, we didn’t bother as it wasn’t worth the trouble with 26 letters, as it didn’t affect the results significantly. With binary, it would slow the effective mutation rate by half.

Case 2 ' Point Mutation

i = Random(0, lengthWord - 1)   ' random position
j = Random(97, 122)             ' a - z
Mutation = Left(Word, i) & Chr(j) & Right(Word, lengthWord - i - 1)

27. Mung: A Question About CSI

I have a program that generates a random string of ASCII characters.

“y\x1EB.\x01UF\x1CLy)V(PHP\x04\rp=v~ i/pG\e_@\x0E\x06kf-FH\x00VBM]\ttLyu&eN\x1D>gA9*=o{cGV\x182ORh\x12uH56″

I can simply describe this string as “a program generated string.” Why doesn’t that demonstrate that I have “generated CSI”?

How’d you guess our PIN?

• What Mung has there is a 735-bit individual from Lizzie’s digital binary menagerie (assuming his program generates all possibilities from the ASCII set, including control characters).

There are 2^735 such creatures. They can simply be regarded as all the 735-bit binary numbers from all-zeroes to all-1′s. If you generate one at random, it may help to view it as actually picking the ordinal position of the string in that set, as much as the string itself. No position in the set is more ‘privileged’ than any other. You can easily map other base-n systems (eg DNA) onto this assumed underlying ‘binary’ set. Inside your computer, that’s pretty much what they all are anyway.

That’s the digital genotype. It’s what would be inherited, if you copied it. Whether it has ‘CSI’ or not depends entirely on how strings are evaluated – upon phenotype. This is related to the fitness function, but is not quite the same.

If your phenotype involves slicing it into 7-bit chunks and generating a unique bitmap corresponding to the ASCII character set, and the evaluation further depends upon a human operator recognising it as meaningful text, then obviously this would not stand out. If CSI depended upon the string looking like a bit of Bill Gates’s horse in photoshop, however, it could have CSI in one system but not the other. And if you sliced it another way and turned it into a segment of RNA with added promoters, it might become a useful fragment of protein.

The point is, that each of those evaluation methods gives a completely different structure to the distribution of ‘well-formed strings’ in the space. ‘CSI’ is context-dependent. And whatever it is, you can potentially generate strings with more of it by selection, mutation and recombination among strings with less.

If function was defined as everything with a zero in it, almost the entire space would have it. If it was defined by a run of 200 consecutive 0′s, there would be much less. But you can’t take a space evaluated under one rule (eg meaningful English) and assume that all spaces of the same size (eg protein or RNA) will be structured in the same way with regards to the distribution of ‘surprise’, or its navigability.
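A toy enumeration illustrates the point (length 10 rather than 500, so the whole space can be listed exhaustively; run length 5 rather than 200, for the same reason): the same space holds very different densities of "function" depending purely on the evaluation rule.

```ruby
# Toy illustration: density of 'functional' strings in the same space under
# two different evaluation rules, enumerated exhaustively for length 10.
n = 10
space = (0...2**n).map { |i| i.to_s(2).rjust(n, '0') }

any_zero = space.count { |s| s.include?('0') }      # rule: contains a 0
long_run = space.count { |s| s.include?('0' * 5) }  # rule: run of five 0's
```

Under the first rule 1023 of the 1024 strings are "functional" (everything but all-1's); under the second only 112 are. Same space, same size, completely different structure of the "well-formed" subset.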

28. 158 Mung October 5, 2012 at 7:21 pm

If I were attempting to simulate biological evolution, I think you might have a valid point, though there seems to be some debate over there at TSZ on this point.

But the purpose of my “GA” is not to simulate biological evolution, but rather to simply illustrate how easy it is to generate CSI. Lizzie, imo, made her program unnecessarily complex. But she perhaps had a different goal in mind than I did.

If you think Lizzie’s program generates CSI, I would like to know why you think mine doesn’t.

Does the fact that my mutate function “latches” make a difference that makes a difference?

There are some interesting reasons why you might want to add that kind of randomness. However, I don’t want to spoil the joy of learning, so for now I will refrain from explaining why you need it. Instead, I would like to encourage you to finish the program and send it on its merry way to the top of the fitness landscape. Go ahead, it’s going to be fun.

29. 159 Mung October 5, 2012 at 8:08 pm

We are looking for patterns in the “chromosome.”

Each pattern in the chromosome that matches the regular expression (a one followed by one or more ones) is assigned to the variable str. (chromosome.scan(/1*/).each do |str|)

So then we are interested in how many contiguous “1″s are in the string that matched the pattern.

score = str.length if str.length > score

If we find a pattern of 1s that has more 1s than the previous number of ones we assign that value to the score.

So “fitness” is decided based upon the number of contiguous 1′s that are found in the sequence.

If I understand this correctly, your fitness function is the length of the longest string of ones in a given binary sequence. That does not match Lizzie’s fitness function.

But of greater interest to me is, why does my GA not model the same thing as Lizzie’s GA? Did I not generate CSI? Why not?

Your program has not run yet. It did not produce digital organisms from a suitably small target space.

30. Mung: “My code is object-oriented, hers is barely even procedural. “

Why does that matter?

I don’t mean that in a “my favourite language is better than your favourite language” sort of way, I really mean that it doesn’t matter in any way.

Object-oriented code was being written before C++ or Smalltalk or any other language that explicitly defined objects existed, simply because experienced programmers started to use structures in that manner.

When I hear that code is in any way “better” because it explicitly uses formal classes, I cringe!

Mung: “Well written code is as descriptive as English text. “

Any "runnable" code is going to be hard to figure out for anyone who is not familiar with the syntax and operators of that language.

If you use pseudocode instead, everyone, no matter what their language of choice is, can understand it.

Mung:

# calculates the "fitness" of this chromosome
def fitness
  score = 0
  chromosome.scan(/1*/).each do |str|
    score = str.length if str.length > score
  end
  score
end

Toronto:

functionProcessPopulation( PopulationSize, functionFitness )
{
    variableBestFitness = 0;

    for( PopulationSize-- )
    {
        functionFitness( &Members[PopulationSize], &variableBestFitness );
    }

    // and so on
}

That doesn’t run but gives an idea of what you want to do and allows everyone to give some constructive criticism without needing you to explain the nuances of Ruby.

It's even better to simply say in English what you want to do before coding and get feedback before you commit to anything.

31. Mung: “Object oriented code was being written before the concept of objects even existed. What a moron. “

It was obviously being written before you were born!

I’ve used structures in assembler to map both data and pointers to functions.

Sound familiar?

Yup, it’s an object definition!

Every time I need to instantiate an object, I just point at some memory and use the structure as a template for the object.

32. Mung: “Because ignorance is no excuse. “

I don’t understand why your side is so emotional all the time!

I started working on C++ in 1988 and discovered an amazing thing.

This language that I thought would finally help bad programmers do a better job did exactly the opposite!

It gave them the ability to “virtually” hide mistakes that affected the whole system.

33. Mung: “No explanation of why “pseudo code” is any different from “runnable” code is offered. “

Mung, do you think pseudocode will compile?

Seriously though, there is a part of this sentence you don’t understand and I want to help.

"If you use pseudocode instead, everyone, no matter what their language of choice is, can understand it."

Everyone, (that means “all people”), can understand, (that means they don’t have to keep asking “why”), no matter what their language of choice is, ( that means even if they use a computer language other than yours).

The side benefit of pseudocode is that it never fails to compile because its purpose is to illustrate concepts, not to generate code.

The fact that you expected an explanation for pseudocode amazes me.

34. Toronto: It's even better to simply say in English what you want to do before coding and get feedback before you commit to anything.

Mung: “Working code trumps whatever fantasy world you’re in. “

Let me guess; you generate a “specification” for your code after the “functionality” has been observed.

I bet you have no bugs, just “features”!

At least you’re consistent with ID’s concept of “specified function” in that whatever you end up with must be what you wanted.

Bad engineering though as kairosfocus will agree with me.

• How quickly distractive red herrings obscure the path of truth!

Joe F tries to illustrate an evolutionary principle to GP using a simplified GA, and many moons later we are discussing the relevance of pseudocode as a helpful intermediary on the requirements -> coding path, and the extent to which more ‘traditional’ structural elements map onto objects!

OK, I'll join in – you'll get more quickly and reliably to working code if you pseudocode it first, and desk-run it. And if you are speccing it up for someone else to evaluate or code, formalising the logic and structure removes some of the ambiguities of an English spec, while avoiding the excessive detail and technical arcana of actual code. Just simple, almost-BASIC stuff – If-else-endif, For-each end-for, While end-while.

35. Mung:

No explanation of why “pseudo code” is more understandable is offered.

No explanation of why “pseudo code” is any different from “runnable” code is offered.

The fact that you expected an explanation for pseudocode amazes me.

And me. If Mung spent just half as much energy learning as he does flaming people, he wouldn’t have to ask questions like that.

36. Is the mutation engine supposed to do a lookup to see what values are possible that will result in a change?

It could, but it doesn’t have to. Here’s another way to do it: Suppose each locus contains one out of N symbols which are represented by the integers 0 through N-1. To mutate a locus, just generate a random number between 1 and N-1 (inclusive). Add the random number to the existing symbol, modulo N, and you have your new symbol for that locus.
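The modulo-N scheme just described can be sketched in a few lines of Ruby (mutate_locus and alphabet_size are my illustrative names, not from any program in the thread):

```ruby
# Sketch of the modulo-N mutation described above; names are mine.
def mutate_locus(symbol, alphabet_size)
  # The offset is drawn from 1..N-1, so the result always differs from
  # the input -- no lookup needed to guarantee a change.
  offset = rand(1..(alphabet_size - 1))
  (symbol + offset) % alphabet_size
end
```

For bits (N = 2) the only possible offset is 1, so this reduces to an unconditional flip.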

What we have here is a dispute over arbitrary definitions.

Yes, in the sense that any definition is arbitrary. However, the terms in question have an established usage in biology. A mutation in biology, as the name implies, is a change in the genome. The mutation rate is the number of these changes divided by the elapsed time. I think we minimize confusion by using these established definitions when talking about our models.

A mutation resulting from a quantum event has no way of knowing if it is making a change.

You still seem uneasy about the idea of modeling a mutation by “reading” the current value at a locus and replacing it with a different value. Here’s a way to convince yourself that this is legitimate. Imagine we have a live organism alongside a model of that same organism. The model uses my proposed method of modeling mutations.

Both the organism and the model start out in the same state (let’s call it the “before” state). Now imagine that a mutation occurs in the organism, and a corresponding mutation occurs in the model. If the model did something illegitimate by looking at the “before” value at the locus, then the “after” states of the organism and the model should be different. But if you work it out, they’re not. The states remain exactly the same, and the model hasn’t gained any advantage over the real organism by looking at the “before” value of the locus in question.

And the actual difference in effective rate is not that great.

Using your approach, the effective mutation rate is 50% lower than the programmed rate when the loci are bits, as in Lizzie’s program. That seems pretty significant to me.

• I’d agree – you are using a process that cannot make a mistake to simulate one that can. The randomness is in isolating the bit you are going to make your ‘accidentally-on-purpose’ mistake on. Once you have your bit, it must mutate. You can simply add 1 to it for binary flips, but even then, to do that the OS must ‘know’ what the bit is set to, even if you don’t check its value in your code.

For more bit-types, you could model a probability distribution for the various errors and ‘pick’ one based upon this. You can’t do transition-transversion bias, for example, without looking at bits.
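The "probability distribution over the various errors" idea can be sketched like this; the 2:1 transition bias is an arbitrary illustrative weight, not a measured one:

```ruby
BASES = %w[A C G T]
TRANSITION = { 'A' => 'G', 'G' => 'A', 'C' => 'T', 'T' => 'C' }

# Pick a mutant base from a weighted distribution: the transition is
# (arbitrarily) twice as likely as either transversion.
def biased_mutate(base)
  transversions = BASES - [base, TRANSITION[base]]
  pick = rand(4)                      # weights: transition 2, transversions 1 + 1
  pick < 2 ? TRANSITION[base] : transversions[pick - 2]
end
```

Note that, as the comment above says, the method has to look at the current base to know which outcomes count as transition and which as transversions.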

37. Mung, you said: “Darwinian evolution does not need “the right fitness landscape” to work. (What would a “wrong” fitness landscape look like?)”

I replied: "Think about what a fitness landscape for a 64 bit encryption key would look like – you have 2^64 (18,446,744,073,709,551,616) possible key values and only one of them is right."

You said: “I’m sorry, but that question doesn’t even make sense to me. 64 bit encryption keys don’t exhibit differential reproduction, do they? In what sense is one 64 bit encryption key more fit than another 64 bit encryption key?”

Exactly – that is my point, that is what a “wrong” fitness landscape looks like. Perhaps I should walk you slowly through it again.

Mung, you said: “Darwinian evolution does not need “the right fitness landscape” to work. (What would a “wrong” fitness landscape look like?)”

A GA can be used to search any problem space for potential solutions. Your code example searches for the longest string of 1's in a binary string; my example problem also involves a search for a binary string. In your system there is a proportional relationship between the bit string and the result of the fitness evaluation; in my example the relationship is all-or-nothing: the fitness evaluation succeeds only for the exact key.

This demonstrates why some problem spaces, and their resulting fitness landscapes (or if you prefer, error landscapes) are not amenable to an evolutionary search.

Mung: “Why are you calling it a fitness landscape? Fitness is a measure of differential reproduction. So, context.”

It is called a fitness landscape when a GA is used as the search algorithm. We are discussing GA’s and you had asked “What would a “wrong” fitness landscape look like?” I gave an example of the fitness landscape you get when applying a GA to a problem for which it is unsuitable.

Or to put it another way – if you apply a GA to some problems, like finding an encryption key, you don't get differential reproduction (and it should be obvious why when you look at the fitness landscape that the GA has to navigate).
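The point is easy to see in code. A minimal sketch of such a needle-in-a-haystack landscape (the key value is made up, and shortened for brevity):

```ruby
SECRET_KEY = 0x5EC2E7   # hypothetical key, purely illustrative

# Needle-in-a-haystack fitness: every candidate scores 0 except the
# exact key, so selection has no gradient anywhere to follow.
def key_fitness(candidate)
  candidate == SECRET_KEY ? 1 : 0
end
```

A candidate one bit away from the key scores exactly the same as a candidate that shares no bits with it, which is why a GA can do no better than blind search here.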

38. Just been reading a series of posts from Mung:

keiths:

His ‘mutate’ method sets a random bit in the chromosome to 1. It never sets bits to 0. That’s right — Mung’s program latches!

Mung:

So? You think the results will be different if I mutate to either 0 or 1? Will that prevent me from generating CSI, like Lizzie?

News Flash! Latching prevents algorithmic CSI generation!

I would think it encourages it, but what do I know. =P

keiths:

Mung's fitness function looks for the longest sequence of consecutive 1's in the chromosome. The length of that longest sequence is the fitness value.

Mung:

So? Will that prevent me from generating CSI, like Lizzie?

Why can't fitness be judged according to "the longest sequence of contiguous 1's"? You don't say.

There's absolutely nothing to prevent fitness being defined in that way. But it becomes trivial – it is pretty clear, without even running the thing, that a program that

1) rewards the longest string of H's and

2) makes T's mutable but not H's

is going to throw up a string of all-H's in pretty short order.

The second is the problem; it’s biologically unrealistic, and was the criticism levelled at Weasel.

So just allow a more realistic mutation function, knocking genomes downhill as well as up (which is not actually always a bad thing, and is why more rugged and dynamic landscapes are more interesting) and apply the ‘product-of-runs-of-heads’ fitness criterion. You wonder if anyone but you understands Lizzie’s algorithm, but you seem curiously unwilling to even replicate it, in order to see the relevance of the various subroutines. Just follow the spec! (That has a familiar ring).
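A non-latching replacement for the mutate method might look like this (assuming, as in Mung's Ruby, that the chromosome is a String of '0'/'1' characters):

```ruby
# Flip one randomly chosen bit. Unlike a latching mutate, this can move
# fitness down as well as up.
def mutate(chromosome)
  copy = chromosome.dup
  i = rand(copy.length)
  copy[i] = (copy[i] == '1' ? '0' : '1')
  copy
end
```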

• Better yet, change them one at a time. Mutation first. See what impact this has with a fitness function that relies on uninterrupted stretches of sequence where every bit counts. Which is also biologically unrealistic, but less importantly so for GA searching.

39. 163 Mung October 5, 2012 at 9:25 pm

I never asserted that my fitness function matched Lizzy’s fitness function in every detail. What’s wrong with my fitness function?

Why does Lizzy’s fitness function lead to the generation of CSI while mine does not?

Lizzie claims her fitness function maximizes something. So does mine .

Just trying to understand what your program is supposed to do, Mung. I was under the mistaken impression that it was similar to Lizzy’s. Now I see that it is not.

A small nit to pick. A fitness function does not maximize something. It is merely a quantitative measure. A selection procedure could maximize a fitness function. I might comment on your program later (have errands to run), but I largely agree with Allan Miller above.

You have a large enough search space (500 bits) and a small enough target space (a sequence of all ones). Your organisms are guaranteed to get there as the fitness landscape is smooth all the way up. Your algorithm will get there in a few thousand steps, doing much better than a blind search, which will take forever to find the sequence of all ones.

Does this produce CSI? Certainly the S and the I. Your friends at UD will complain that the C is missing: we know the solution in advance and there is nothing complex about that. More on that later.
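For anyone who wants to see the convergence described above, here is a runnable sketch – my own reconstruction, not Mung's or Lizzie's actual code – of the longest-run-of-ones GA: a smooth landscape, non-latching point mutation, and culling of the bottom half each round. The genome is shortened from 500 bits so the sketch converges quickly.

```ruby
GENOME_LEN = 64
POP_SIZE   = 20

# Fitness: length of the longest run of contiguous '1's.
def fitness(chromosome)
  chromosome.scan(/1+/).map(&:length).max || 0
end

# Non-latching mutation: flip one randomly chosen bit.
def mutate(chromosome)
  copy = chromosome.dup
  i = rand(copy.length)
  copy[i] = (copy[i] == '1' ? '0' : '1')
  copy
end

# Evolve until some chromosome is all ones; return the generation count.
def evolve
  pop = Array.new(POP_SIZE) { Array.new(GENOME_LEN) { rand(2) }.join }
  generations = 0
  until pop.any? { |c| fitness(c) == GENOME_LEN }
    pop = pop.sort_by { |c| -fitness(c) }
    survivors = pop.first(POP_SIZE / 2)
    pop = survivors + survivors.map { |c| mutate(c) }
    generations += 1
  end
  generations
end
```

Because the unmutated survivors are retained, the best fitness never decreases, and the run of ones ratchets steadily upward – the "smooth all the way up" behaviour described above.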

40. Mung: I have a program that generates a random string of ASCII characters. I can simply describe this string as “a program generated string.” Why doesn’t that demonstrate that I have “generated CSI”?

Any number of random strings meet the description. While the complexity is high, the specification is low based on that description.

Zachriel: How’d you guess our PIN?

Good! But there’s a point there. By your description, the sequence has no specificity. But by our description, it’s unique, and therefore is high in specified complexity.

Of course, what are the chances you guessed our PIN? This is an example of what Dembski would call a post-specification. Nevertheless, it shows how the calculation of specified complexity changes depending on our understanding.

Mung: So if the program were to produce a string of 72 characters that match precisely the first 72 characters of this post, it would be reasonable to make a design inference.

The program code says it’s random, so you would actually have a conundrum. It could just as easily be a programming error, such as not resetting the randomizer seed.

Mung: Why are you calling it a fitness landscape? Fitness is a measure of differential reproduction. So, context.

That’s right. In this case, all sequences have fitness at the floor, except for one.

Mung: But the purpose of my “GA” is not to simulate biological evolution, but rather to simply illustrate how easy it is to generate CSI.

Yes, depending on the definition of complexity. It is very easy, especially on very simple landscapes. Your landscape, as we understand it, is a simple hill. Evolutionary algorithms quickly climb hills. All roads lead to the top.

But this also shows the problematic nature of specified complexity. If we define the specification as the highest possible fitness, there is only one sequence that qualifies, and that makes it highly improbable in the Dembskian sense. On the other hand, if we define the specification as reaching 90% of the highest possible fitness, then the specified complexity is much lower.

Mung: Oh, so a fitness landscape is a representation?

Yes. Like any simulation.

Mung: It represents relative reproductive fitness?

Yes, that is what it represents. We could imagine a library which has the relative fitness of every possible genome, though due to technological limitations, we either substitute a smaller library (a landscape map), or utilize an abstraction.

Mung: Is protein function synonymous with reproductive rate?

It certainly can be, as has been experimentally demonstrated. However, as far as specified complexity is concerned, that’s unimportant.

• Mung: Is protein function synonymous with reproductive rate?

Zachriel: It certainly can be, as has been experimentally demonstrated.

I’d go for a more definitive ‘no’. Breaking proteins (rendering them nonfunctional in engineering or chemical terms), can increase reproductive rate. One imagines the fitness landscape of a particular peptide sequence as being the set of all sequences of that length, each sitting on a contour which represents the fitness of that sequence in an assumed static context. ‘Function’ would require a different mapping – colour, for example, with shading to represent yet another dimension – the degree of that activity. Colour and shading would not necessarily align with the rise and fall of the landscape.

• Well, synonymous is too strong a word, as you point out, but the contours are often isomorphic within the domain of interest.

41. Just trying to understand what your program is supposed to do, Mung. I was under the mistaken impression that it was similar to Lizzy’s.

That’s because Mung said it was:

Mung World

ok, so I created my own version of Lizzie’s program.

took less than 10 seconds
1522 generations

What’s the big deal?

• Ouch! That doesn’t sound right.

And in the next comment, Mung adds more nonsense:

Lizzie has encoded a potential solution to her problem in each member of her starting population. It matters not that they were randomly generated.

She didn’t of course. These were generated as random sequences.

1.) if we change her encoding her program will not work. It depends upon sequences of 0′s and 1′s. Just try changing that and see what happens.

Again, the initial sequences were random. Any sequence will do. Mung can supply his own initial sequences if he wants to. The program will run in the same way, unless he supplies one of the solutions, of course.

2.) There is at least the possibility that a solution will be found among the first 100 randomly generated genomes, though she doesn’t actually check to see if that is the case.

And the probability of that would be one in 10 to the power of 150 or something like that. So what?

Sorry, Mung, but from this it is safe to conclude that you don’t understand what Lizzie did. At least not when you wrote that. Maybe you do now, but I have seen no evidence of that.

• I can dig out my own code that I wrote when we were discussing Lizzie’s program. There should be no difference whatsoever even if one starts with all zeroes. The landscape is pretty smooth at the bottom.

• Don’t bother.  Everyone but Mung understands that there won’t be any significant difference.

He needs to do the experiment himself in order for the lesson to sink in.

• I’m not following this. I don’t know which pseudocode you are referring to or what the program is attempting to do.

42. If anyone wants to know why ID/creationists can’t understand genetic algorithms, here are all the reasons (and emotions) in this recent ID/creationist video in which numerous concepts are conflated and mangled beyond recognition.

Compare this with Thomas Kindell’s sneering presentation; or with Werner Gitt’s “In the Beginning was Information.”

This is why Salvador T. Cordova – one of their own, no less – was so brutally rebuffed over at UD when he had the temerity to suggest that Sewell was wrong.

43. Toronto:

Mung: “I didn’t try to make a game of it. “

Funny!

He’s unusually quiet today. The weather must be crappy in

Mung World

44. 181 Mung October 7, 2012 at 1:13 pm

There seems to be some misunderstanding about my program. Its fundamental purpose is to show how easy it is to generate CSI using a simple algorithm. Isn't that what Lizzie's program is designed to show as well?

IMO, Lizzie’s program is needlessly complex and takes too long to achieve the desired result.

The fundamental question that needs to be asked and answered is, why does her program generate CSI while mine does not?

As I have already said, the target of your program is a sequence of all ones. My suggestion would be to ask your friends at UD whether it qualifies as complex. My hunch is that they might not agree.

In contrast, Lizzie’s target space features sequences that are more complex. She did not know in advance what they would be, although with some effort one can figure that out.

In any event, complexity is in the eye of the beholder. By some measures (e.g., Kolmogorov’s), your sequence is quite simple: it can be concisely described as “500 ones in a row.” I don’t think a typical solution from Lizzie’s target space can be described in a similarly concise way. On the other hand, your UD friends may not find these solutions particularly complex. If that happens, we get evidence that complexity (and thus CSI) has a subjective component.

• One could make increasing Kolmogorov, or Shannon, information the fitness function. Whether that landscape is searchable by a GA, I don’t know, but it would be of interest.

• I doubt that it is possible, especially in the case of the Kolmogorov complexity. How would an algorithm figure out the shortest possible description of a string? (It is uncomputable in general.) The Kolmogorov complexity is a mathematical abstraction rather than a practical tool.

For Shannon information, one would have to treat the string as a stream of bits and determine their predictability using a finite window. I suppose one could use some standard compression algorithm as a proxy and make the size of the compressed file the fitness function.

Either way, it would be easy to get strings that have very high fitness by using a good generator of random numbers.
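The compression-as-proxy idea is easy to try with Ruby's built-in zlib binding. As noted, a random string lands near the top of this "fitness" scale, because it is nearly incompressible:

```ruby
require 'zlib'

# Proxy fitness: size of the deflated string. Regular strings compress
# well (low score); random strings barely compress at all (high score).
def compressed_size(str)
  Zlib::Deflate.deflate(str).bytesize
end

ordered = '01' * 250                       # 500 chars, highly regular
random  = Array.new(500) { rand(2) }.join  # 500 chars, incompressible
```

Comparing compressed_size(ordered) with compressed_size(random) shows the random string scoring far higher, which is exactly the objection raised above.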

• Yeah, I guess I didn’t think that through!

It does raise the question why people would think any informatic measure – including CSI – is appropriate to treatment of digital biological strings in isolation. The unaddressed first-level layer for proteins is folding – a property deriving from all the atoms and bonds in the string, and their constraint upon it adopting a particular lowest-energy configuration in a ‘reasonable’ time ‘sufficiently’ often.

Obsessing over primary sequence emphasises the peptide bond, as the preserver of that primary sequence, but ignores all other interactions. Proteins are made sequentially, but they certainly don’t fold sequentially (even though they begin folding as they are being extruded, a whole peptide can be stretched out and will then ‘twang’ back into its favoured configuration).

Holding key atoms in space is not easily deducible from primary sequence, and many equivalent ‘functional’ neighbours can derive from very distant sequence-neighbours, likewise close sequence-neighbours can flip between very different functional regions. Assignment of the label CSI, and dismissing the ability of ‘evolutionary’ search (selectively neutral and non-neutral reproduction + RV) to find it needs to take this into account.

45. Petrushka,
It's an implementation of the coin toss game, aka Lizzie's Experiment, as detailed in the original post.

Imagine a coin-tossing game. On each turn, players toss a fair coin 500 times. As they do so, they record all runs of heads, so that if they toss H T T H H H T H T T H H H H T T T, they will record: 1, 3, 1, 4, representing the number of heads in each run.

What you see on the screen are 100 runs of 500 coin tosses and the product of those scores, each minute another round is played where the bottom 50% are overwritten by mutated versions of the top 50%.
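For reference, the product-of-runs score from the original post can be written in a few lines of Ruby ('1' standing in for heads; by the empty-product convention, a toss with no heads scores 1):

```ruby
# Product of the lengths of all runs of heads ('1's) in a toss sequence.
def product_of_runs(tosses)
  tosses.scan(/1+/).map(&:length).reduce(1) { |product, len| product * len }
end
```

The example sequence from the original post, H T T H H H T H T T H H H H T T T, encodes as '10011101001111000', with runs 1, 3, 1, 4 and a score of 1 × 3 × 1 × 4 = 12.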

• I wasn’t sure whose program was being emulated. I’m a bit concerned about the pace. Lizzy’s program ran for millions of generations, I believe.

Also, the graphics prevent me from viewing the output on a tablet.

I believe the justification for calling it CSI is that the set of strings satisfying the halting criterion is sufficiently sparse as to satisfy Dembski's probability bound.

• I'll make a reduced version that'll work better on, well, almost everything. I can, and will, speed it up, but it was intended just as a visual aid, really. I'll patch it up as comments come in on the AtBC thread linked.

46. Mung:

There seems to be some misunderstanding about my program. Its fundamental purpose is to show how easy it is to generate CSI using a simple algorithm. Isn't that what Lizzie's program is designed to show as well?

The purpose of Lizzie’s program is to show that a Darwinian process can generate CSI without using the fitness function to “smuggle in” information about the form of a solution.

The fundamental question that needs to be asked and answered is, why does her program generate CSI while mine does not?

I don't think you've told us explicitly what the target of your program is. If olegt's surmise is correct and your target is a sequence of all 1's, then your program does generate CSI: it finds a single specified target pattern out of a search space containing 2^500 patterns.

The problem is not that your program doesn’t generate CSI. It’s that an ID proponent might try to argue that because your fitness function rewards long runs of 1′s, it is in effect telling the program “make the runs longer.” In other words, the IDer might argue that your fitness function is delivering too much information and telling the program how to generate a solution.

I don’t think that it is, but Lizzie skirts that problem entirely by using a fitness function that clearly does not tell the program anything about the form of the solutions. A human can look at your fitness function and instantly see that longer run lengths will increase fitness. It’s obvious. The same is not true of Lizzie’s fitness function. If you increase a particular run length in Lizzie’s program, you may end up decreasing another.  Will the fitness increase?  It depends.  It’s not immediately obvious to a human.  The mutation engine certainly doesn’t figure it out and take advantage of the knowledge.

Also, your complaint about the runtime of Lizzie’s program is easily addressed. I wrote a C program that uses her fitness function with a parameterized population size and mutation rate. It achieves a solution in less than a minute.

47. gpuccio:

So, Szostak could easily engineer a protein with a strong binding to ATP (however useless in any biological context) because he knew what he wanted (an ATP binding protein), he measured and selected that function at very trivial levels in random sequences, he amplified, mutated, and intelligently selected the resulting sequences for that function. Good design, and very bad interpretation of the results, still echoed by yourself for bad reasoning.

So he caused the mutation?

But if the organism existed in the wild, and changing environmental conditions favoured a stronger binding to ATP, then either the mutations could never happen, or, if they did, they would not result in a survival advantage – even if conditions favoured those mutations?

algorithms are not conscious. They have no experience of purpose. Therefore, they cannot recognize function, unless in their code something has already been defined as "functional".

Why do they need to be? A GA that rewards the efficient progress of a robot across a landscape does not need to recognise the purpose and function of a neural oscillator in generating leg motion, or understand how the distribution of mass in a limb affects gait efficiency. More efficient walkers have more offspring – that is all the reproductive element of such a GA does, and that is the extent of its 'knowledge'.

So, Lizzie’s algorithm can compute answers to the question that is already embedded in it: it can do nothing else. My pi computing algorithm can compute pi: it can do nothing else.

Is the physical morphology and neural architecture of the robot embedded within a GA whose fitness function consists of the sum of two variables – Distance walked and the inverse of energy consumed?

Bear in mind that the GA doesn’t know what these two variables represent – It doesn’t even know that there are two variables – it just gets a number.

In some algorithms, the function can be defined more generically, so that they will be more flexible in their performance. But a new function, that is not covered by the definitions embedded in the algorithm, will never be recognized by the algorithm, and therefore no dFSCI related to that new function will ever be computed by the algorithm, because the algorithm cannot recognize that function.

The most generic fitness function for a GA would be “number of offspring produced”. For the walking robot it is “number of offspring produced is proportional to efficient locomotion”. I would quibble about calling this ‘function’ as you do – what is specified here is a behavioural outcome. The functions that produce those outcomes are not defined.

Of course the robot will never fly – or will it? If the simulated environment does not explicitly forbid flight, and flight offers a reproductive advantage, then you might well get flight. In fact you get some surprises; I've seen walking robots that evolved to fly in simulations when the GA found and exploited an unknown flaw in the simulation.

Try taking a good look at Karl Sims' work here: http://www.karlsims.com/evolved-virtual-creatures.html

The physical form and neural controllers of these 'creatures' are generated at random; the fitness function is, in most cases, just "distance travelled in its lifetime". The functional parts of the creatures that evolved effective locomotion are not specified in the fitness function – it only specifies the behavioural outcome.
