Evo-Info 3: Evolution is not search

Introduction to Evolutionary Informatics, by Robert J. Marks II, the “Charles Darwin of Intelligent Design”; William A. Dembski, the “Isaac Newton of Information Theory”; and Winston Ewert, the “Charles Ingram of Active Information.” World Scientific, 332 pages.
Classification: Engineering mathematics. Engineering analysis. (TA347)
Subjects: Evolutionary computation. Information technology–Mathematics.

Marks, Dembski, and Ewert open Chapter 3 by stating the central fallacy of evolutionary informatics: “Evolution is often modeled by as [sic] a search process.” The long and the short of it is that they do not understand the models, and consequently mistake what a modeler does for what an engineer might do when searching for a solution to a given problem. What I hope to convey in this post, primarily by means of graphics, is that fine-tuning a model of evolution, and thereby obtaining an evolutionary process in which a maximally fit individual emerges rapidly, is nothing like informing evolution to search for the best solution to a problem. We consider, specifically, a simulation model presented by Christian apologist David Glass in a paper challenging evolutionary gradualism à la Dawkins. The behavior on exhibit below is qualitatively similar to that of various biological models of evolution.

Animation 1. Parental populations in the first 2000 generations of a run of the Glass model, with parameters (mutation rate .005, population size 500) tuned to speed the first occurrence of maximum fitness (1857 generations, on average), are shown in orange. Offspring are generated in pairs by recombination and mutation of heritable traits of randomly mated parents. The fitness of an individual in the parental population is, loosely, the number of pairs of offspring it is expected to leave. In each generation, the parental population is replaced by surviving offspring. Which of the offspring die is arbitrary. When the model is modified to begin with a maximally fit population, the long-term regime of the resulting process (blue) is the same as for the original process. Rather than seek out maximum fitness, the two evolutionary processes settle into statistical equilibrium.

Figure 1. The two bar charts, orange (Glass model) and blue (modified Glass model), are the mean frequencies of fitnesses in the parental populations of the 998,000 generations following the 2,000 shown in Animation 1. The mean frequency distributions approximate the equilibrium distribution to which the evolutionary processes converge. In both cases, the mean and standard deviation of the fitnesses are 39.5 and 2.84, respectively, and the average frequency of fitness 50 is 0.0034. Maximum fitness occurs in only 1 of 295 generations, on average.

I should explain immediately that an individual organism is characterized by 50 heritable traits. For each trait, there are several variants. Some variants contribute 1 to the average number of offspring pairs left by individuals possessing them, and other variants contribute 0. The expected number of offspring pairs, or fitness, for an individual in the parental population is roughly the sum of the 0-1 contributions of its 50 traits. That is, fitness ranges from 0 to 50. It is irrelevant to the model what the traits and their variants actually are. In other words, there is no target type of organism specified independently of the evolutionary process. Note the circularity in saying that evolution searches for heritable traits that contribute to the propensity to leave offspring, whatever those traits might be.
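In code (and, as explained in the sanity check below, my implementation really does keep only the 0-1 contributions, not the traits themselves), an individual reduces to 50 binary values, and fitness to their sum. Here is a minimal Python sketch of that representation; the array library and the seed are my choices, not part of the model:

```python
import numpy as np

rng = np.random.default_rng(0)

# An individual reduces to the 0-1 contributions of its 50 heritable traits.
# Which trait variants the 1s and 0s stand for is irrelevant to the model.
genome = rng.integers(0, 2, size=50)

# Fitness is, loosely, the expected number of offspring pairs: the sum of
# the 50 contributions, so it ranges from 0 to 50.
fitness = int(genome.sum())
assert 0 <= fitness <= 50
```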

The two evolutionary processes displayed above are identical, apart from their initial populations, and are statistically equivalent over the long term. Thus a general account of what occurs in one of them must apply to both of them. Surely you are not going to tell me that a search for the “target” of maximum fitness, when placed smack dab on the target, rushes away from the target, and subsequently finds it once in a blue moon. Hopefully you will allow that the occurrence of maximum fitness in an evolutionary process is an event of interest to us, not an event that evolution seeks to produce. Again, fitness is not the purpose of evolution, but instead the propensity of a type of organism to leave offspring. So why is it that, when the population is initially full of maximally fit individuals, the population does not stay that way indefinitely? In each generation, the parental population is replaced with surviving offspring, some of which are different in type (heritable traits) from their parents. The variety in offspring is due to recombination and mutation of parental traits. Even as the failure of parents to leave perfect copies of themselves contributes to the decrease of fitness in the blue process, it contributes also to the increase of fitness in the orange process.

Both of the evolutionary processes in Animation 1 settle into statistical equilibrium. That is, the effects of factors like differential reproduction and mutation on the frequencies of fitnesses in the population gradually come into balance. As the number of generations goes to infinity, the average frequencies of fitnesses cease to change (see “Wright, Fisher, and the Weasel,” by Joe Felsenstein). More precisely, the evolutionary processes converge to an equilibrium distribution, shown in Figure 1. This does not mean that the processes enter a state in which the frequencies of fitnesses in the population stay the same from one generation to the next. The equilibrium distribution is the underlying changelessness in a ceaselessly changing population. It is what your eyes would make of the flicker if I were to increase the frame rate of the animation, and show you a million generations in a minute.
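A toy example, entirely separate from the Glass model, may help with the notion of an equilibrium distribution. In the small Markov chain sketched below (my own illustration, with an arbitrary transition matrix), the state never stops changing, yet the distribution over states converges to a fixed stationary distribution:

```python
import numpy as np

# A toy 3-state Markov chain (illustrative only; not the Glass model).
# Individual trajectories keep moving forever, but the *distribution* over
# states converges to a fixed point: the equilibrium distribution.
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])

dist = np.array([1.0, 0.0, 0.0])   # all mass on one state, like the blue process
for _ in range(500):
    dist = dist @ P                # one generation of "flicker"

# At equilibrium, dist @ P == dist: changelessness beneath ceaseless change.
assert np.allclose(dist @ P, dist, atol=1e-9)
```

The point of the sketch is only that convergence of the distribution is compatible with ceaseless change of the underlying state, which is what the equilibrium of the two evolutionary processes amounts to.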

Animation 2. As the mutation rate increases, the equilibrium distribution shifts from right to left, which is to say that the long-term mean fitness of the parental population decreases. The variance of the fitnesses (spread of the equilibrium distribution) increases until the mean reaches an intermediate value, and then decreases. Note that the fine-tuned mutation rate .005 ≈ 10^−2.3 in Figure 1.

Let’s forget about the blue process now, and consider how the orange (randomly initialized) process settles into statistical equilibrium, moving from left to right in Animation 1. The mutation rate determines

  1. the location and the spread of the equilibrium distribution, and also
  2. the speed of convergence to the equilibrium distribution.

Animation 2 makes the first point clear. In visual terms, an effect of increasing the mutation rate is to move the equilibrium distribution from right to left, placing it closer to the distribution of the initial population. The second point is intuitive: the closer the equilibrium distribution is to the frequency distribution of the initial population, the faster the evolutionary process “gets there.” Not only does the evolutionary process have “less far to go” to reach equilibrium, when the mutation rate is higher, but the frequency distribution of fitnesses changes faster. Animation 3 allows you to see the differences in rate of convergence to the equilibrium distribution for evolutionary processes with different mutation rates.

Animation 3. Shown are runs of the Glass model with the mutation rate we have focused upon, .005, doubled and halved. That is, u′ = 2 × .005 = .01 for the blue process, and u′ = .005 / 2 = .0025 for the orange process.

An increase in mutation rate speeds convergence to the equilibrium distribution, and reduces the mean frequency of maximum fitness.

I have selected a mutation rate that strikes an optimal balance between the time it takes for the evolutionary process to settle into equilibrium, and the time it takes for maximum fitness to occur when the process is at (or near) equilibrium. With the mutation rate set to .005, the average wait for the first occurrence of maximum fitness, in 1001 runs of the Glass model, is 1857 generations. Over the long term, maximum fitness occurs in about 1 of 295 generations. Although it’s not entirely accurate, it’s not too terribly wrong to think in terms of waiting an average of 1562 generations for the evolutionary process to reach equilibrium, and then waiting an average of 295 generations for a maximally fit individual to emerge. Increasing the mutation rate will decrease the first wait, but the decrease will be more than offset by an increase in the second wait.

Figure 2. Regarding Glass’s algorithm (“Parameter Dependence in Cumulative Selection,” Section 3) as a problem solver, the optimal mutation rate is inversely related to the squared string length (compare to his Figure 3). We focus on the case of string length (number of heritable traits) L = 50, population size N = 500, and mutation rate u′ = .005, with scaled mutation rate u′L² = 12.5 ≈ 2^3.64. The actual rate of mutation, commonly denoted u, is 26/27 times the rate reported by Glass. Note that each point on a curve corresponds to an evolutionary process. Setting the parameters does not inform the evolutionary search, as Marks et al. would have you believe, but instead defines an evolutionary process.

Figure 2 provides another perspective on the point at which changes in the two waiting times balance. In each curve, going from left to right, the mutation rate is increasing, the mean fitness at equilibrium is decreasing, and the speed of convergence to the equilibrium distribution is increasing. The middle curve (L = 50) in the middle pane (N = 500) corresponds to Animation 2. As we slide down the curve from the left, the equilibrium distribution in the animation moves to the left. The knee of the curve is the point where the increase in speed of convergence no longer offsets the increase in expected wait for maximum fitness to occur when the process is near equilibrium. The equilibrium distribution at that point is the one shown in Figure 1. Continuing along the curve, we now climb steeply. And it’s easy to see why, looking again at Figure 1. A small shift of the equilibrium distribution to the left, corresponding to a slight increase in mutation rate, greatly reduces the (already low) incidence of maximum fitness. This brings us to an important question, which I’m going to punt into the comments section: why would a biologist care about the expected wait for the first appearance of a type of organism that appears rarely?

You will not make sense of what you’ve seen if you cling to the misconception that evolution searches for the “target” of maximally fit organisms, and that I must have informed the search where to look. What I actually did, by fine-tuning the parameters of the Glass model, was to determine the location and the shape of the equilibrium distribution. For the mutation rate that I selected, the long-term average fitness of the population is only 79 percent of the maximum. So I did not inform the evolutionary process to seek out individuals of maximum fitness. I selected a process that settles far away from the maximum, though not so far away as to defeat my purpose, which is to observe maximum fitness rapidly. If my objective were to observe maximum fitness often, then I would reduce the mutation rate, and expect to wait longer for the evolutionary process to settle into equilibrium. In any case, my purpose for selecting a process is not the purpose of the process itself. All that the evolutionary process “does” is to settle into statistical equilibrium.

Sanity check of some claims in the book

Unfortunately, the most important thing to know about the Glass model is something that cannot be expressed in pictures: fitness has nothing to do with an objective specified independently of the evolutionary process. Which variants of traits contribute 1 to fitness, and which contribute 0, is irrelevant. The fact of the matter is that I ignore traits entirely in my implementation of the model, and keep track of 1s and 0s instead. Yet I have replicated Glass’s results. You cannot argue that I’ve informed the computer to search for a solution to a given problem when the solution simply does not exist within my program.
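For concreteness, here is one plausible reading of the model described above, in Python, with bit strings standing in for traits. The details of pairing, recombination, and culling are my guesses at a minimal version, not a transcription of Glass’s algorithm; in particular, I overproduce a fixed 2N offspring with parents drawn in proportion to fitness, rather than literally giving each parent a fitness-determined number of offspring pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
L, N, u = 50, 500, 0.005   # traits per individual, population size, mutation rate

def generation(pop):
    """One generation: mate, recombine, mutate, then cull arbitrarily to N."""
    fitness = pop.sum(axis=1)
    total = fitness.sum()
    # Parents of each offspring pair are drawn with probability proportional
    # to fitness (uniformly, in the degenerate case of zero total fitness).
    p = fitness / total if total > 0 else None
    offspring = []
    while len(offspring) < 2 * N:              # overproduce, then cull
        mom, dad = pop[rng.choice(N, size=2, p=p)]
        mask = rng.random(L) < 0.5             # uniform recombination
        for child in (np.where(mask, mom, dad), np.where(mask, dad, mom)):
            flips = rng.random(L) < u          # per-trait mutation
            offspring.append(np.where(flips, 1 - child, child))
    offspring = np.asarray(offspring)
    keep = rng.choice(len(offspring), size=N, replace=False)  # arbitrary deaths
    return offspring[keep]

pop = rng.integers(0, 2, size=(N, L))          # random initial population
for _ in range(50):
    pop = generation(pop)
```

Tracking pop.sum(axis=1) over many generations would give frequency distributions of fitnesses of the kind shown in Animation 1, though the equilibrium of this simplified variant need not match Glass’s numbers exactly. Note that nothing in the code refers to traits, variants, or a target: only 1s and 0s.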

Let’s quickly test some assertions by Marks et al. (emphasis added by me) against the reality of the Glass model.

There have been numerous models proposed for Darwinian evolution. […] We show repeatedly that the proposed models all require inclusion of significant knowledge about the problem being solved. If a goal of a model is specified in advance, that’s not Darwinian evolution: it’s intelligent design. So ironically, these models of evolution purported to demonstrate Darwinian evolution necessitate an intelligent designer.

Chapter 1, “Introduction”


[T]he fundamentals of evolutionary models offered by Darwinists and those used by engineers and computer scientists are the same. There is always a teleological goal imposed by an omnipotent programmer, a fitness associated with the goal, a source of active information …, and stochastic updates.

Chapter 6, “Analysis of Some Biologically Motivated Evolutionary Models”


Evolution is often modeled by as [sic] a search process. Mutation, survival of the fittest and repopulation are the components of evolutionary search. Evolutionary search computer programs used by computer scientists for design are typically teleological — they have a goal in mind. This is a significant departure from the off-heard [sic] claim that Darwinian evolution has no goal in mind.

Chapter 3, “Design Search in Evolution and the Requirement of Intelligence”

My implementation of the Glass model tracks only fitnesses, not associated traits, so there cannot be a goal or problem specified independently of the evolutionary process.

Evolutionary models to date point strongly to the necessity of design. Indeed, all current models of evolution require information from an external designer in order to work. All current evolutionary models simply do not work without tapping into an external information source.

Preface to Introduction to Evolutionary Informatics


The sources of information in the fundamental Darwinian evolutionary model include (1) a large population of agents, (2) beneficial mutation, (3) survival of the fittest and (4) initialization.

Chapter 5, “Conservation of Information in Computer Search”

The enumerated items are attributes of an evolutionary process. Change the attributes, and you do not inform the process to search, but instead define a different process. Fitness is the probabilistic propensity of a type of organism to leave offspring, not search guidance coming from an “external information source.” The components of evolution in the Glass model are differential reproduction of individuals as a consequence of their differences in heritable traits, variety in the heritable traits of offspring resulting from recombination and mutation of parental traits, and a greater number of offspring than available resources permit to survive and reproduce. That, and nothing you will find in Introduction to Evolutionary Informatics, is a fundamental Darwinian account.

1,439 thoughts on “Evo-Info 3: Evolution is not search”

  1. dazz,

    How the hell do you conclude that there should be tons of fossils everywhere?

    Yeah, I blinked, opened the combox then thought ‘screw it’.

  2. J-Mac: Some thought for the longest time that Einstein was never going to be proven wrong…therefore considered him a god…

    If you never make a claim ,therefore you are considered a God?

    He was proven wrong about entangled particles “communicating” faster than the speed of light, cosmological constant, time in relation to theory of relativity at least in some aspects…etc…though he did say more than once time was an illusion…I have to agree…

    I guess “some” were wrong as well.

  3. dazz: How the hell do you conclude from the above that there should be tons of fossils everywhere?

    Tons of thousands. Can’t you read?

  4. dazz: How the hell do you conclude from the above that there should be tons of fossils everywhere?

    You’ve never provided one piece of evidence that 10 billion species on earth are evolving into other species…

    Now you are saying they don’t leave any fossils behind either?

    What do you base your assumptions on then?
    “To be honest” … it seems like another mystery of evolution to me or more precisely miraclevolution

  5. phoodoo: If math doesn’t equate to reality, why must one bother to understand the math?

    When we express observations in mathematical language, our observations are not reality itself, and our mathematical expressions of our observations are not reality itself. There might be something wrong with our observational apparatus, and there might be something wrong with what we say about — our expressions of — the observations made with the apparatus.

    We use mathematics to reason formally about the expressions, not the observations, and not whatever it is in reality that gave rise to the observations. If we can prove that the expressions are logically consistent with one another, then new expressions that we derive, as logical consequences of the expressions we began with, sometimes can be interpreted as expressions of observations yet to be made, i.e., as predictions, or as characterizations of the observations implicit in the original expressions.

    If you haven’t bothered to understand the math, then you cannot judge in the beginning whether the mathematical expressions indeed describe the observations, and you cannot judge in the end whether the initial expressions indeed entail the new expressions. Nor can you judge what the new expressions say about observations we have yet to make.

  6. J-Mac: You’ve never provided one piece of evidence that 10 billion species on earth are evolving into other species…

    It simply follows from the tenets of evolution. If all species evolved from a common ancestor, every step in the way was a transition to the present state, and all living forms were transitional forms. There’s no reason to believe evolutionary processes came to a halt just today, so it follows that all extant species are also transitioning.

    J-Mac: Now you are saying they don’t leave any fossils behind either?

    Of course not. We have plenty fossils… and they’re all transitional too

  7. dazz: We have plenty fossils… and they’re all transitional too

    They are all dead. Transitioning to dust.

    Fossil to fossil, dust to dust.

  8. dazz: It simply follows from the tenets of evolution. If all species evolved from a common ancestor, every step in the way was a transition to the present state, and all living forms were transitional forms. There’s no reason to believe evolutionary processes came to a halt just today, so it follows that all extant species are also transitioning.

    Of course not. We have plenty fossils… and they’re all transitional too

    These all ASSUMPTIONS dazz…or rather EXCUSES…
    With 10 billion species transitioning into other species you can’t point one to prove that your claim is true…

    Why?

    I think this goes beyond faith… unless it is another example of miraclevolution…

  9. Tom English: Rumors of my death have been slightly exaggerated.

    I am sincerely pleased to hear that. At least we know the stench here isn’t coming from you. Right?

    😉

  10. J-Mac: These all ASSUMPTIONS dazz…or rather EXCUSES…

    Not really, it simply follows logically: It makes no sense to speak of “non transitional forms” if evolution is true, and we know it’s true because of the huge body of evidence in support of evolution

  11. Mung: Every time I crack open a book on evolution it presents evolution as a problem solver., a designer, an engineer, a search.

    Every time I ask you to specify a problem independently of the ostensible solution process, you deflect with a “clever” remark. Respecting your intelligence more than most of your adversaries do, I have to believe at this point that you know that you cannot produce the specification.

    So why do you keep claiming that there’s a problem, when you cannot write out the problem?

  12. Tom English: So why do you keep claiming that there’s a problem, when you cannot write out the problem?

    One recent example is the problem of how to get at a grub [or other insect larvae] in wood, labelled by the author as “the grub-feasting niche.” The author even asks, “Why has natural selection produced different solutions to the same grubby problem?”

    ETA: Got a problem? Natural selection can solve it!

  13. Mung: One recent example is the problem of how to get at a grub [or other insect larvae] in wood, labelled by the author as “the grub-feasting niche.” The author even asks, “Why has natural selection produced different solutions to the same grubby problem?”

    ETA: Got a problem? Natural selection can solve it!

    Didn’t solve it for the Dinosaurs.

  14. Mung: Every time I crack open a book on evolution it presents evolution as a problem solver., a designer, an engineer, a search.

    Tom English: Every time I ask you to specify a problem independently of the ostensible solution process, you deflect with a “clever” remark. Respecting your intelligence more than most of your adversaries do, I have to believe at this point that you know that you cannot produce the specification.

    So why do you keep claiming that there’s a problem, when you cannot write out the problem?

    Emphasis added.

    Mung: One recent example is the problem of how to get at a grub [or other insect larvae] in wood, labelled by the author as “the grub-feasting niche.” The author even asks, “Why has natural selection produced different solutions to the same grubby problem?”

    ETA: Got a problem? Natural selection can solve it!

    Let’s try something novel: return to the OP. You pointed out, early in the thread, that Glass indicates that the “target” string is an input to his simulation algorithm. I responded that the “target” string is merely a parameter of the simulated process. That is, it is intrinsic to the evolutionary process, not an extrinsically specified objective. If you believe otherwise, then write an explicit specification of the problem that Glass wrote his algorithm to solve. Of course, you cannot define the problem as being solved by whatever it is that tends to occur in the simulated evolutionary process.

    I’m guessing that your response is, “Oh, but look what this high-powered scientist wrote in a work of popular science. He’s done what you just told me not to do.” The reason that my post has been featured is that I put a lot of work into a presentation that goes beyond what you ordinarily see in pop-sci writing, but stays within reach (due to the animations) of non-scientists with a serious interest in evolution. I was exceedingly careful about how I put things (and still see some problems with what I wrote). Now you want to call me to answer for the casual language that popularizers commonly use? I am most definitely not going there with you. Introduction to Evolutionary Informatics is supposed to be rigorous analysis of evolutionary models. The unrigorous remarks in pop-sci books are utterly irrelevant.

  15. dazz: With 10 billion species transitioning into other species you can’t point one to prove that your claim is true…

    You missed to answer the best part!

  16. Tom English: The reason that my post has been featured is that I put a lot of work into a presentation that goes beyond what you ordinarily see in pop-sci writing…

    It had been sitting for almost three weeks without any activity whatsoever.

  17. Mung: It had been sitting for almost three weeks without any activity whatsoever.

    I recognize how unpleasant it is to see the gross failure of the “evolutionary informatics” strain of ID featured. However, the “Charles Darwin of intelligent design” has been invited to comment. And hell hasn’t yet frozen over. Why don’t you contact Marks, and ask how to …

    Tom English: … write an explicit specification of the problem that Glass wrote his algorithm to solve.

    Do you suppose that he can write an explicit specification of the problem that Ofria (joined by others) wrote Avida to solve? Repeating my quotation of Chapter 1 of the book (online),

    There have been numerous models proposed for Darwinian evolution. […] We show repeatedly that the proposed models all require inclusion of significant knowledge about the problem being solved. If a goal of a model is specified in advance, that’s not Darwinian evolution: it’s intelligent design. So ironically, these models of evolution purported to demonstrate Darwinian evolution necessitate an intelligent designer.

    From the preface of the book (online):

    Since Avida is attempting to solve a moderately hard problem, the writer of the program must have infused domain expertise into the code.

    If Marks, Dembski, and Ewert cannot specify the problem that Ofria et al. wrote Avida to solve and the problem that Glass wrote his simulation algorithm to solve, then the project of evolutionary informatics is a total failure.

    Un-sticking my post from the top of the page will not make the matter go away. You would do better to urge Marks to take leave from the culture war, and address the technical issues with someone who understands them well. As you know, I am not an ideologue. Furthermore, I have a history of questioning my own claims, and of admitting to errors. If Marks can specify the problem formally — he and his colleagues have never done so, but have instead referred to the solution set (target), i.e., without indicating how it is specified — then there won’t be much doubt as to whether he has tailored the specification to the evolutionary process.

  18. In addition to that issue, Marks and co. need to be more explicit about their assertions that the computer programs that they discuss use information about the desired solution. In some cases they argue that the parameters of the simulation have been adjusted to get the program to work more efficiently. They make it sound as if this is equivalent to having in the program detailed information about the solution of the problem. This is far from the case.

    Adjusting the constants of (say) a Traveling Salesman Problem solver is a very long way from providing it with the best solution. They never grapple with this seriously, making any adjusting of parameters sound just like feeding a description of a target to the program. It is a rhetorical device for making a targetless program sound like it has been fed a target.

  19. I sent Professor Marks an email back in July, alerting him to Tom’s articles critiquing Introduction to Evolutionary Informatics, receiving an automated response that he was on holiday. I just forwarded the email again in case he missed my earlier heads-up.

  20. Joe Felsenstein,

    Disjointed responses along with some free association:

    0. Of course, I don’t mean to leave you out. Marks can get away with saying that interaction with me is beneath his dignity. But he can’t get away with saying the same about you. (Of course, the “dignity” objection does not play very well in the Zone, where a number of very bright people, including you, interact with all comers.)

    1. Perhaps you’ve listened to the DI podcast in which Ewert seems pleased as punch with himself, having come up with the phrase “teleological fine tuning.” Admittedly, it is great progress, for someone who has just published a book referring to a “teleological goal,” to move on to “teleological fine tuning.”

    2. A parameterized algorithm is actually a family of algorithms. Tuning the parameters does not inform the algorithm about the problem, or about anything else. The parameter tuner selects a particular algorithm from the family. You might say more generally that the algorithm selector makes an informed choice of algorithm. But you cannot say that the information of an algorithm-selecting agent resides in the selected algorithm itself.

    3. If Marks et al. were actually interested in engineering analysis, then they would model the possibly-informed decision maker, i.e., the entity that selects an algorithm. They instead model a parody of naturalism as a claim that prespecified outcomes occur totally by chance, and treat rejection of the parody as evidence of design.

    4. It’s important to understand that Marks et al. address one-shot problem solving. The meaning of problem is different than it is when we speak of something like the Traveling Salesman Problem (TSP). The TSP has a number of instances, and when we address how well an algorithm performs in solving TSP, we do not focus on performance for a particular instance. We instead address, say, the worst-case performance or the mean performance over all instances. Furthermore, a TSP solver takes an instance as an input. There is no input to the “searches” of Marks et al. Otherwise, Marks et al. would not be able to model “search” as a probability distribution on the solution space. So, in an entirely concrete sense, the “search” program is not itself solving a problem: there is no input indicating what (instance of a) problem to solve. However, someone may use a no-input program to generate a solution to a problem. The selector of the program may be informed of the problem and the program, but the program is no more informed of the problem than the problem is of the program.

    I could ramble a bunch more, but will stop myself here.

  21. Tom: I am not concerned about being “left out” — in any case, this is my first attempt to grapple with this particular issue. Your example using instances of the Traveling Salesman Problem is a useful one. Imagine we have a front end for the problem which, given the number of points n, chooses and instance by choosing n points in the unit square (0,1) × (0,1) at random. We take a well-tuned TSP-solver that uses, say, a genetic algorithm.

    Now we ask what solution was found, and compare it to the best solution, which we get by exhaustive enumeration, or by branch-and-bound. We won’t have done perfectly with the GA, but we will find solutions a lot better than a tour of points in random order.

    No matter how well-tuned the algorithm is, the tuning happened before we looked at these randomly-chosen instances, and the tuning is not redone each time. So it cannot have transmitted information about the best solution, or even about very good solutions, to particular instances.

    This shows immediately that when Marks et al. imply that the tuning of the parameters constitutes transmission of information about the solution, they are talking through their hats. Let us assume that this is simply a mistake on their part, and that they will publicly admit this.

  22. Joe:

    Let us assume that this is simply a mistake on their part, and that they will publicly admit this.

    🙂

  23. Actually, Tom, I was hesitant to bring this up because I thought perhaps you already planned to cover the issue somewhere in Parts 4-10. As would be appropriate.

    PS typo in my earlier comment: should be “chooses an instance.”

  24. Joe Felsenstein: Actually, Tom, I was hesitant to bring this up because I thought perhaps you already planned to cover the issue somewhere in Parts 4-10. As would be appropriate.

    10!? Why did my right hemisphere just start “humming” the M*A*S*H theme song?

    You know I have a lot of trouble homing in on the most important points, and figuring out how to make them. This helps.

  25. Joe Felsenstein: We take a well-tuned TSP-solver that uses, say, a genetic algorithm.

    How is the solver tuned? Let’s consider an infinite sequence of uniformly random instances (each comprising n points in the Euclidean unit square). We begin with the parameters set arbitrarily. For each instance, we run the solver on the instance, and note the performance. Before solving the next instance, we tune the parameters to maximize the mean performance for all previously encountered instances.

    Ignoring some subtleties that are uninteresting here, assume that the expected performance is increasing over time. Then the parameter tuning (estimation) procedure is, in a straightforward sense, gaining information about the problem. However, at each step the probability is 0 that the current instance is identical to any of the previously encountered instances.
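    Here is a toy version of that tuning loop. The “solver” is a stub whose performance on a fresh random instance is an unknown function of one tuning parameter plus noise (purely an assumption for illustration); the tuner sees only mean past performance, and never the same instance twice.

```python
# Stub solver: performance on a fresh random instance is an unknown function
# of one tuning parameter plus noise (an assumption for illustration only).
# The tuner tracks nothing but the mean past performance of each setting.
import random

rng = random.Random(1)
params = [0.1, 0.3, 0.5, 0.7, 0.9]                        # candidate settings
true_quality = {p: 1.0 - (p - 0.7) ** 2 for p in params}  # unknown to the tuner

def run_solver(p):
    """Performance of the solver, parameterized by p, on a fresh instance."""
    return true_quality[p] + rng.gauss(0.0, 0.05)

# Tuning phase: try every setting on a stream of fresh random instances.
history = {p: [run_solver(p) for _ in range(20)] for p in params}

# Select the setting with the best mean performance on past instances.
best = max(params, key=lambda p: sum(history[p]) / len(history[p]))

# Performance on yet more fresh instances improves over the tuning phase,
# although no instance is ever seen twice.
tuning_mean = sum(sum(h) for h in history.values()) / 100
exploit_mean = sum(run_solver(best) for _ in range(100)) / 100
assert exploit_mean > tuning_mean
```

    The tuner gains information about the expected performance of the parameter settings, not about the solution to any particular instance.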

    What I’m trying to do here, following your lead, is to provide a simple example of how a procedure can learn to solve a problem better, though it never sees the same instance twice. You might say, sensibly, that the procedure searches for a good search for solutions to the TSP problem. So what do Marks et al. mean by “conservation of information” in the “search for a search,” given that there’s only one shot at solving a particular (instance of a) problem? A randomly selected program probably does not perform much better than the average over all randomly selected programs. More specifically, the probability is at most 1/\alpha that a randomly selected program has performance at least \alpha times the average. (You recognized this as Markov’s inequality, after Alan Fox helped me state it more clearly.)
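    The Markov bound is easy to check numerically, and in fact it holds exactly for any sample of nonnegative performance values, whatever their distribution:

```python
# Markov's inequality, checked on a sample: the fraction of nonnegative
# performance values at least alpha times the sample mean cannot exceed
# 1/alpha. The choice of distribution below is arbitrary; only nonnegativity
# matters.
import random

rng = random.Random(2)
perfs = [rng.expovariate(1.0) for _ in range(10_000)]  # arbitrary nonneg draws
mean = sum(perfs) / len(perfs)

for alpha in (1.5, 2.0, 5.0, 10.0):
    frac = sum(p >= alpha * mean for p in perfs) / len(perfs)
    assert frac <= 1.0 / alpha
```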

    Why refer to a randomly selected search as a search for a search? A crucial component of what we call search is trial and error. You can see the trial and error in the parameter tuning procedure above. Marks et al. have nothing of the sort in their “search for a search.”

    Now over to you (and perhaps Alan). The back-and-forth is sometimes very productive.

  26. Alan Fox:
    I sent Professor Marks an email back in July, alerting him to Tom’s articles critiquing Introduction to Evolutionary Informatics, and received an automated response that he was on holiday. I just forwarded the email again in case he missed my earlier heads-up.

    You use the term critique very broadly.

    Maybe the word stalker would have been more appropriate.

  27. Tom English,

    Tom,

    Perhaps if you are struggling so much with figuring out what you want to say, it is because you don’t know what you want to say.

    Something to consider.

    Madness takes on many forms.

  28. phoodoo: Maybe the word stalker would have been more appropriate.

    Dr. Marks and his co-authors have put forward an argument in public, in an openly-published book, coming to a very dramatic conclusion that, if true, would invalidate most of evolutionary theory.

    They also make podcasts, give Youtube lectures, and their work is gloated about at Uncommon Descent and at Evolution News and Views.

    So asking questions about this and asking them to come discuss the logic of this in an open forum is … “stalking”?

  29. Joe Felsenstein,

    You all haven’t even figured out what questions you want to ask yet. You just know you want to ask something, anything, ….please.

    “Guys, guys, I need your help, help me to find something wrong. My academic reputation is at stake! ”

    Crowdsourcing angst is a better description.

  30. @phoodoo:

    Your characterization of our arguments is utter nonsense. We have very specific questions — and can’t seem to get the other folks to answer them.

  31. Tom English: Then the parameter tuning (estimation) procedure is, in a straightforward sense, gaining information about the problem. However, at each step the probability is 0 that the current instance is identical to any of the previously encountered instances.

    To use familiar terminology, we have a “training set” and get the algorithm tuned, then it does pretty well on more examples from the same distribution. But even if it does that, it cannot get information from the training set about the solutions for the new examples, can it?

    I take it you are not asserting that the tuned algorithm contains information about the specific examples, just about the distribution from which they come.

  32. To be more specific about what Marks et al. say: in the book Introduction to Evolutionary Informatics, section 5.2.3.1 (“Back room tuning”), pages 119–120, they discuss Dave Thomas’s genetic algorithm that comes up with good solutions for a Steiner Tree problem, which is to connect a set of points with a branching tree of minimum length (I have been calling this, incorrectly, a Traveling Salesman Problem).

    Here’s a specific example. David Thomas, an intelligent design critic, offered a software implementation of an evolutionary algorithm he claimed toppled intelligent design.^{31} Thomas wrote^{32}:

    “If you contend that this algorithm works only by sneaking in the answer into the fitness test, please identify the precise code snippet where this frontloading is being performed.”

    So we did.^{33} Many sources of active information were identified. One was code showing that Thomas was using a man-in-the-loop. We identified a smoking-gun snippet of code. Here’s an excerpt from our critique:

    They then quote an extended passage on Thomas’s program from their 2012 paper in BIO-Complexity, including the statement that “The claim that no design was involved in the production of this algorithm is very hard to maintain given this section of code.”

    However, the sections of code discussed do not contain any coding of Steiner Tree solutions into the program. They involve decisions about search strategy in the design of the program, but definitely do not involve coding of the desired Steiner Tree solution into the program. Marks, Dembski and Ewert respond to Thomas’s challenge that they show where in the program Thomas is “sneaking in the answer” by “frontloading” it. They say that they did this.

    They did not.

    (The three footnotes 31, 32, and 33 are respectively to Thomas’s 2010 article in Skeptical Inquirer, to a 2006 post by Thomas at Panda’s Thumb, and to Ewert, Dembski, and Marks’s 2012 paper in BIO-Complexity.)

  33. Tom English: Now over to you (and perhaps Alan). The back-and-forth is sometimes very productive.

    Well, I’m not sure I can help, being a sceptic on genetic drift. I have been thinking it over and now have this model of the roulette wheel where same sized slots represent neutral alleles and the ball settling in one slot is fixation due to drift. Vary the slot sizes and you have a model of biased sampling, adaptation in action. Another poor analogy, perhaps.

  34. phoodoo: Madness takes on many forms.

    Most forms of madness are called religion. Most of the religiously mad recognize the madness of others, but cannot broach the question of their own madness.

    As for me,

    I am but mad north-north-west: when the wind is southerly, I know a hawk from a handsaw.

  35. Mung: Conservation of Information in Coevolutionary Searches

    Ewert and Marks have ever-so-cleverly redefined the problem. So there is “conservation of information” in what they have decided to address, but not for what is called coevolutionary search in all papers but their own.

    They sorta, kinda, almost admit to what they have done, in the introduction to the paper. It’s hard to believe that they intend for people like you to understand what they’ve done. In fact, the title of the paper is a strong indication that they do not want you to understand. But when someone like me points out that the paper is misleading, they can quote the passage, and then make their meaning clear. They nonetheless have created the appearance of having made good on a claim they made years ago, when they actually have not.

    By the way, if Wolpert and Macready were wrong in “Coevolutionary Free Lunches,” then so was I. You can sorta, kinda, almost guess what I mean by that.

  36. Joe Felsenstein: Dave Thomas’s genetic algorithm that comes up with good solutions for a Steiner Tree problem, which is to connect a set of points with a branching tree of minimum length (I have been calling this, incorrectly, a Traveling Salesman Problem).

    I’d planned on not addressing that. The point I emphasized in my brief review of the book, and need to continue emphasizing, is that the engineering analysis of search does not apply to scientific models like ev, Avida-EQU, and Glass’s model (which is essentially scientific, irrespective of whether it is particularly useful). Lumping the models with problem solvers, as Marks et al. have done, is exactly the wrong thing for me to do. I’m of course interested in analyzing evolutionary algorithms for solving problems. That, and not theoretical biology, is something I am fairly well qualified to do. But I have to emphasize that evolutionary models are not evolutionary searches for solutions to problems. The biggest problem I have is conveying the difference in plain language.

    I’ve just used animations to counter the false intuition that evolution searches. Of course, that proves nothing. I had promised readers a couple of visual posts, without math. Now I need to explain what the math addresses, so I can show that what scientific modelers do is not what the math addresses. However, I find myself wanting to give the math the criticism it richly deserves. It seems designed to make simple results seem much more impressive than they actually are. Dembski and Marks actually recognized that they were addressing the Beta distribution in the Vertical No Free Lunch Theorem. But they buried the Beta distribution deep in a long and complicated argument, much as George Montañez buried Markov’s inequality deep in long and complicated arguments. And despite having recognized that they were addressing the Beta distribution, they gave an “information-theoretic” approximation to the (logarithm of the) upper-tail probability. They do not provide any sort of explanation of the significance of the expression. It certainly does look as though they were hiding the fact that there is a simple and exact expression of the tail probability, and that it does not match anything in information theory.
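    For a minimal illustration of how simple a Beta tail can be (this is only an illustration of the kind of closed form at issue, not a reconstruction of the Vertical No Free Lunch Theorem itself): if X is the maximum of K independent Uniform(0,1) draws, then X \sim \mathrm{Beta}(K, 1), and the upper-tail probability has the elementary exact form

```latex
\Pr(X \geq 1 - p) \;=\; 1 - (1 - p)^{K},
```

    with no information-theoretic apparatus required.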

    phoodoo, isn’t it all the more embarrassing for the Wizards of ID that a mental case like me should show that their math is grossly overcomplicated, and that their “information-theoretic” approximation is window dressing? You might want to reconsider your rhetorical strategy. If the Wizards of ID in fact failed to see what I have, then I’ve got to be a super-genius, or they’re not the geniuses they’re supposed to be. Perhaps your line should be, “Yeah, he’s a super-genius, but he’s also super-crazy.”

  37. Joe Felsenstein: I take it you are not asserting that the tuned algorithm contains information about the specific examples, just about the distribution from which they come.

    To connect this to what I wrote earlier, the parameterized algorithm is actually a family of algorithms. For each instance in the ensemble, the tuning procedure selects an algorithm from the family. There is a corresponding random sequence of performance values, and I stipulated that the expected performance is increasing (and, being bounded, approaching an asymptote). I’m saying that the choice of algorithm is informed, not the algorithm. It is the tuning procedure that is gaining information — information about the expected performance of the algorithms in the family. The mean performance of an algorithm on the instances of the past is an estimator of the expected performance of the algorithm on the next (random) instance. Avoiding a number of details, I am saying that the mean performance of an algorithm on past instances converges to the expected performance. I’m not versed in statistical estimation theory, but I know enough to know that information is well defined in this circumstance. The tuning procedure gains information about the expected performance of the algorithms as the sample of instances grows.

    Again, I’m trying to illustrate informed selection of an algorithm used in search of a solution to a problem — a problem with multiple instances (in this case, uncountably many instances). The procedure selects an algorithm with information about the expected performance of each algorithm in the family. I can see the sense in referring to it as a search for a search. I can’t make similar sense of what Marks et al. call the search for a search, in part because the problem is not, as it is for TSP (or the Steiner tree problem), a set of instances. The problem is just one object, and there’s only one shot at solving it.

    ETA: What we generally would see is that the expected performance is non-decreasing. I stipulated increasing for simplicity.

  38. Tom English: By the way, if Wolpert and Macready were wrong in “Coevolutionary Free Lunches,” then so was I. You can sorta, kinda, almost guess what I mean by that.

    You’re just jealous because Ewert and Marks got published in such a prestigious journal.

    #SourGrapes

  39. Tom English,

    I think we’re in agreement on the SfS. As far as the statistical theory goes, the performance in the Steiner Tree case is simple to deal with if we choose instances of the ST problem independently from the same distribution, such as sets of 100 points randomly chosen in a unit square. The measure of performance, the total length of the ST found, is bounded below (by zero) and above (loosely, by 100 \sqrt{2}) so it has no ill-behaved “heavy tails”. All moments are finite. So the statistical behavior is rather simply analyzed.
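    One way to see that good behavior concretely: the minimum spanning tree of the points is an upper bound on the Steiner tree length, and for 100 points in the unit square its 99 edges each have length at most \sqrt{2}, so the performance measure is trivially confined to a bounded interval.

```python
# The minimum spanning tree (Prim's algorithm) upper-bounds the Steiner tree
# length, and for 100 points in the unit square it is confined to
# (0, 100*sqrt(2)), so all moments of the performance measure are finite.
import math
import random

def mst_length(points):
    """Total edge length of the minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    dist = [math.inf] * n   # distance from each point to the growing tree
    dist[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]),
                key=dist.__getitem__)
        in_tree[u] = True
        total += dist[u]
        for v in range(n):
            if not in_tree[v]:
                dist[v] = min(dist[v], math.dist(points[u], points[v]))
    return total

rng = random.Random(3)
lengths = [mst_length([(rng.random(), rng.random()) for _ in range(100)])
           for _ in range(10)]
assert all(0.0 < L < 100 * math.sqrt(2) for L in lengths)
```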

  40. Mung: You’re just jealous because Ewert and Marks got published in such a prestigious journal.

    The vagueness of the admission, along with its mismatch with the title, suggests that a reviewer required that the authors explain what they actually had done, and that the authors did as little as possible to persuade the editor that they had made the required change. (The title of the manuscript probably was not subject to review. I can recall only one case in which I was asked, as a reviewer, to judge the appropriateness of the title.) Far be it from me to say that BIO-Complexity reviewers never take authors to task. I judge each article independently of the others, because the review process can vary greatly from one submission to the next.
