Review: Climbing the Steiner Tree

Overview

Winston Ewert, William Dembski, and Robert J. Marks II have made available a paper titled Climbing the Steiner Tree — Sources of Active Information in a Genetic Algorithm for Solving the Euclidean Steiner Tree Problem wherein they claim to have identified sources of “active information” in a genetic algorithm used to find solutions to a Steiner problem. The problem referenced in this paper was originally described by Dave Thomas on the Panda’s Thumb almost six years ago. I developed a GA to solve the problem that Dave posed as a challenge a few weeks later, as did a number of other people.

This paper suffers from numerous flaws, starting with a fundamental mischaracterization of the purpose of Thomas’s solution and including general misunderstanding of genetic algorithms, misapplication of the No Free Lunch theorems, spurious claims about “active information”, and incorrect and unsupported assertions regarding the impact of certain GA implementation details.

Targets

Right from the abstract, this paper mischaracterizes Thomas’s intentions:

In particular, the success of such search algorithms is said to show that intelligent design has no scientific value.

Thomas never claimed that GAs showed that ID has no scientific value. His goal was simply to demonstrate the fact that GAs don’t need explicit targets.

(As an aside, it is true that ID has no scientific value but that is because ID proponents explain no empirical observations, articulate no scientific theory, and make no testable predictions. ID is simply a vacuous argument from incredulity.)

The authors of this paper do appear to grudgingly recognize that typical GAs are not like Richard Dawkins’ well known Weasel program:

Nevertheless, many simulations do not contain a fully articulated solution hidden behind lines of code that can be readily reconstructed by inspecting the program without running it.

Speaking of weasels, this phrasing does have the mustelid nature. The authors could have been more forthright and simply noted that most GAs specify only the problem and a fitness function of some sort. The solution is completely unknown to the programmer in most cases.

Even this rhetorical squirming, though, is insufficient for the authors who add a caveat immediately:

Rather, its code is front-loaded with specialized information for how to find that solution. Instead of the explicit answer, the means by which the eventual product can be found is front-loaded.

The authors are inadvertently correct here. The evolutionary mechanisms being modeled by the GA (“the means by which the eventual product can be found”) are indeed useful in finding solutions to some problems.

Properties of Genetic Algorithms

There are several statements in the paper that suggests the authors have some fundamental misunderstandings of genetic algorithms and the observed phenomena on which they are based. As early as page 2, for example, is this:

The Darwinist claim is that no such assistance is required. Rather, natural selection is innately capable of solving any biological problem that it faces.

No “Darwinist” (a term that reflects ID’s creationist roots) claims this. Extinction is known to happen.

Another, from page 4:

This genome is of a fixed length, always containing the coordinates and connections for the maximum number of interchanges. This simplifies the genetic algorithm because it not [sic] necessary to implement deletion or insertion mutations.

While the design of a GA engine is somewhat simplified by limiting genomes to a fixed length, such a constraint doesn’t prevent modeling insertions and deletions. Further, the lack of insertion and deletion mutations actually makes finding a solution more difficult by limiting the number of evolutionary mechanisms that can be brought to bear.

On page 5 we find:

Does the genetic algorithm have the ability to solve the problem given only a description of the same?

This question demonstrates a profound confusion about what is being modeled. The authors almost seem to be suggesting that they expect a GA to find a solution without any feedback whatsoever. That would certainly fail to reflect the biological situation.

In fact, the fitness function is a description of the problem domain, corresponding to the environment of biological evolution. In that sense a GA does have the ability to solve some types of problems “given only a description of the same.”

No Free Lunch Theorems

No paper authored by Dembski would be complete without some mention, however gratuitous, of the No Free Lunch theorems:

Conservation of information theorems, such as the No Free Lunch Theorems, place limitations on the abilities of search algorithms. Any search algorithm uses resources to try to find its target. Different algorithms exist, but as long as they avoid searching the same place multiple times, given the same resources they will have the same average performance. As a result, excluding algorithms that visit the same place multiple times, no algorithm can be claimed to be superior to any other algorithm. At best, it can be superior over some subset of all fitness functions.

First, the NFL theorems have nothing to do with “conservation of information.” Dembski’s purported “Law of Conservation of Information” has been demonstrated to be bogus.

Second, this is a typical gross misapplication of the NFL theorems by Dembski et al. The NFL theorems say that averaged over all fitness landscapes, no search algorithm is superior to random search. Over a specific fitness landscape, however, some algorithms may well be significantly better than others. Given that GAs, like the natural systems on which they are based, operate within a single fitness landscape (albeit one that is constantly changing, particularly in the real world case), the NFL theorems are simply a red herring in this discussion.

Active Information

The authors define “active information” as:

I+ = -log2(p/q)

where p is probability of a “baseline” search finding a solution and q is the probability of the GA finding a solution.

A baseline search is typically taken to be a single random guess at the answer.

“Active information” is thus simply a comparison of the efficacy of a particular search algorithm to that of a random search using an equiprobable distribution. The term is misleading in that it doesn’t correspond to any commonly used definitions of “information” nor does it demonstrate the “active” involvement of any external actor.

In order to determine how Thomas’s GA compares to a random search algorithm, the authors ran a relatively large number of tests:

In what follows we will be comparing Thomas’s genetic algorithm to that of a random query search. Data on the performance of the search algorithm was obtained by Monte Carlo simulation. In the case of genetic algorithms, we ran 5615 distinct simulations, keeping track of how many times a particular cost was found. In the case of random query searches, we ran 11,228,106,000 random queries to determine the empirical probability distribution for the cost of a random solution. The actual performance of a random query search was calculated by using probability theory and this empirical distribution.

“Probability theory and this empirical distribution”? With over eleven billion data points, why did the authors not simply use the observed values?

The results reported on page 5 of the paper are surprising:

Figure 4 shows the performance of the genetic algorithm presented by Thomas as compared to a random query algorithm. For Thomas’s algorithm, each generation is run for 1000 generations giving a total of 2,000,000 queries or cost evaluations performed. That is, the algorithm considers two million distinct solutions and measures the cost of each solution. At the end of that process the best solution found during those queries is produced. The random query algorithm makes 2,000,000 random queries and reports the best of all of those queries. Figure 4 shows that obtaining a low cost solution to the Steiner tree problem is not as hard as it would at first appear.

The random query algorithm produces a cost of 1246 with probability slightly lower than 0.7.

The authors seem to be claiming that two million random searches will yield a reasonable solution to the Steiner problem nearly 70% of the time. That’s hard to reconcile with further details provided on the very next page:

Baseline for good and adequate thresholds.
We empirically estimated the probability of both the good and adequate thresholds by running Monte Carlo experiments that tested random solutions. 11,228,106,000 random queries were performed to obtain an empirical estimation of the probability of finding the target. The probabilities for finding the good and adequate targets were 1.7812e-09 and 1.9905e-07, respectively. The empirically determined probability for finding the optimal target was zero, because none of the random queries was successful in finding the target.

The 70% chance of success of the random search algorithm seems extremely high. The authors provide no details of their calculations or code, so I have attempted to replicate their findings based on the description provided and my own GA solution to the Steiner problem.

My Steiner solution is based on my GA engine that operates on binary strings. The number of bits required by the genome for the five node Steiner problem is 199. I use a tournement selection mechanism to populate the next generation:

  1. Select two genomes at random, making the most fit parent one
  2. Select two more genomes at random, making the most fit parent two
  3. Mate the two parents by crossing over at a single point to
    generate two offspring
  4. Mutate each of the offspring

My simulation routinely finds a near optimal solution with a fitness cost of 1220 or less using a population size of 1000 within 250 generations.

Each generation of my simulation therefore requires 4000 fitness evaluations, or one million in total for a typical run. Based on the results reported in this paper, I should find that nearly 70% of the time one million random genomes should contain at least one fairly good solution.

Here’s the code to create a million random genomes and return the most fit, using my implementation:

(let* ((genome-length (genome-length *five-point-problem*))
       (best-fitness
        (fitness *five-point-problem* (make-genome genome-length))))
  (format t "~A~%"
          (dotimes (i 10000 best-fitness)
            (let ((random-fitness
                   (fitness *five-point-problem* (make-genome genome-length))))
              (when (< random-fitness best-fitness)
                (setf best-fitness random-fitness))))))

My simulation adds a penalty of 2000 for solutions that don’t connect all fixed nodes. The results of running this forty times are:

4500.08   4394.203  4466.51   4707.275
4430.624  4690.22   4847.232  4833.6436
4889.9487 5147.8467 4929.9795 4677.0977
5248.128  4940.031  4542.566  4555.242
4993.9985 4707.127  4450.148  4792.4023
4647.6304 4800.055  5224.4985 4616.3096
5000.876  4788.144  4961.3896 4383.1436
4888.9614 4577.821  3791.8147 4739.0806
4846.8315 4572.504  4951.1255 4481.5996
4690.18   4426.3213 5097.7544 4948.6284

The best fitness found was 3791.8147, not remotely close to the optimum of 1212 or the 1220 routinely generated by the GA.

The authors of the paper do not provide a description of the algorithm they used or the code, so it is not possible to determine how to replicate their results. As demonstrated here, though, their claim that a collection of random strings, similar in size to the number of fitness calculations used by a GA, has a 70% likelihood of containing a near optiomal solution is not supported.

On page 6 of the paper, the authors plug this dubious value into some poorly explained forumlae and claim to thereby have derived “the integral cost of the solution produced by a search algorithm making 2,000,000 random queries.” This section could use considerably more justification, to say the least.

Ultimately, aside from demonstrating a distinct lack of rigor on the part of the authors, this topic is as much of a red herring as the No Free Lunch theorems. At most, when calculated carefully, “active information” is nothing more than an indication of how a particular algorithm performs relative to an equiprobable random search in a particular fitness landscape. Despite the misleading name applied to this concept, it does not indicate that any information was provided to the non-random algorithm, actively or otherwise.

Implementation Details

Beginning on page 7 the authors discuss implementation details that, they claim, provide information about the solution to Thomas’s GA.

Interestingly, this is very similar to an approach that Dembski used unsuccessfully a few years ago. When discussing Thomas Schneider’s ev simulation, Dembski noticed an implementation detail related to tie breaking. He claimed that:

Schneider is here fine-tuning his evolutionary algorithm to obtain the results he wants. All such fine-tuning amounts to investigator interference smuggling in complex specified information.

Schneider put Dembski’s claim to the test. The result was “For the condition where both organisms survive a tie, the ev program still produces the same graph as previously published, given the same parameters, demonstrating that the code was not significantly altered.”

One might think that Dembski would have learned from that experience to test his claims before making them public. One would be incorrect.

“Effect of count of interchanges”

The authors claim that having an explicit count of the number of non-fixed nodes in the virtual genome “shifts the distribution of solutions.” They also note that Thomas’s code ensures that a minimum number of non-fixed nodes (“interchanges”) are present.

My implementation contains no such explicit count nor does it specify any minimum number of non-fixed nodes, yet it routinely converges on an optimal or near-optimal solution.

“Effect of restricted initialization”

According to the authors, Thomas’s code restricts the initial location of non-fixed nodes to a smaller area than the full 1000×1000 grid allowed by the simulation. Again, my implementation has no such restriction and yet it routinely converges on an optimal or near-optimal solution.

“Effect of selection skew”

The authors begin this subsection with:

As noted, it is easy to obtain a solution with a score of 1246.

In fact, as demonstrated above, the claim that near-optimal genomes are common enough to be frequently present in collections of millions of random genomes is false.

The authors go on to discuss a “skew parameter” used in Thomas’s code. They make the claim, without any logical or empirical support, that:

Without the skew parameter there isn’t enough selective pressure to effectively separate scores of close value.

Not only is this unsupported, it is demonstrably false since my implementation contains no such parameter and yet readily distinguishes between “scores of close value.” This is easily seen when monitoring its progress generation by generation.

“Effects of crossover”

In this section the authors go to some effort to score an own goal. Rather than limiting themselves to the single point crossover implemented by Thomas, they also discuss two point crossover and, bizarrely, what they call “uniform” crossover. This variant, which consists of randomly picking an element from each parent string in order to generate a child string, would not be recognized as crossover by anyone familiar with GAs.

Be that as it may, they then provide a table of crossover related “active information” in Table 3. Precious little detail is provided on how these values were calculated.

The own goal is apparent when one steps back from the detailed, albeit odd, discussion and notes that crossover is observed in nature. The authors are essentially admitting that known evolutionary mechanisms can create “active information.”

Alternative Representations

On page 10 the authors raise their only pertinent point in the entire paper:

Crossover works well because of the way the genome is structured. The genome is built using a structure consisting of a list of locations as well as a connection map indicating which cities and interchanges are connected. However, the ways in which the genome could be structured are limited only by the developer’s creativity. A number of other possibilities also exist:

  • Every city and interchange could have a list of other
    cities/interchanges to which it is connected.
  • The genome could be a sequence of road segments specified by start and stop points
  • The genome could have been represented as a bitmap where each bit indicates whether or not an integral coordinate was on a road.

None of these methods were chosen, and none of them would seem to be useful representations. But they are rejected, not because something is intrinsically wrong with them, but because the developer uses his knowledge of the problem to come up with the best way to represent the solution. The programmer chooses the representation that he believes will be effective and part of that is choosing a representation that works well with crossover.

As with their other claims, the authors provide no suggestion that they implemented a fitness function using any alternative representation. In fact, I suspect that a GA could successfully operate on any of the three alternatives. Despite the authors’ incorrect reasoning, though, this is an interesting point.

When I implemented my solution to the Steiner problem I wasn’t thinking in terms of a fitness function that could easily be traversed by point mutations and crossover. I simply wrote down the first binary format that came to mind — my primary concern was my own ability to decipher it. That aside, the authors make a claim that is almost correct:

The crossover and genome structure work together. As such, their selection by the developer constitutes the use of prior knowledge about the problem in order to help identify better solutions.

Even if the fitness landscape were deliberately chosen to be easily traversed by the subset of known evolutionary mechanisms modeled by the GA, this would not be a way of smuggling in information about any possible solution. No additional information about the problem or the particular solution would be provided. Such a deliberate choice would simply be modeling a landscape that, like what we observe in nature, has characteristics that make known evolutionary mechanisms effective search techniques.

When modeling evolution as a search (which has its own issues outside the scope of this discussion), what the No Free Lunch theorems tell us, contra Dembski, is that the evolutionary mechanisms we implement will be far more effective at finding a solution in some types of fitness landscapes than in others. Because we observe these mechanisms working in the real world, it is a reasonable inference that the incredibly complex and ever changing fitness landscape we call reality is one of those in which those mechanisms are efficacious. If this were not the case, we would either observe other mechanisms or would not be here to observe at all.

Despite themselves, the authors are recognizing that even a small subset of the mechanisms we observe in the real world is more than capable of generating solutions that are highly fit in certain landscapes.

Summary

The authors completely fail to support their conclusion:

The claim is not that the solution found by a genetic algorithm is hidden inside the algorithm, but rather that the algorithm contains the necessary information sources in order to find a low probability target. This is because in order to develop a successful genetic algorithm, the developer makes many decisions based on knowledge of the problem he is attempting to solve. As such, information is derived from this prior knowledge of the search problem.

Every implementation detail alleged by the authors to smuggle in information about possible solutions is trivially shown to be unnecessary to the success of the GA. Whether they understand it themselves or not, all the authors have shown is that known evolutionary mechanisms are quite capable of generating solutions to problems that are not amenable to random search.

It is no surprise that this paper was published in BIO-Complexity, the Intelligent Design vanity website masquerading as a journal. Given the overall poor quality, particularly with respect to the authors’ failure to provide sufficient information to replicate their results and their further failure to subject any of their claims to testing, it is unlikely in the extreme that it would pass peer review in an actual scientific journal.

118 thoughts on “Review: Climbing the Steiner Tree

  1. Depends on what you mean by “ARE.”

    Potential or actual? Missed opportunities?

    Evolution can only move in steps the size of mutations and insertions.

    Can you name a specific event in the history of life that crosses an unbridgeable gap, citing the before and after, the time and place?

  2. stcordova:

    If you say, yes, then well, thanks for helping creationists.

    If you say no, then, on what basis do you make that claim except by assumption.

    Interestingly asymmetric standard, there. Agree with creationists and no further info required. Disagree, and you gotta justify it!

  3. stcordova
    By the way, since I humored you on his earlier challenge, I have a chanllenge for you that you should accept in reciprocity. 🙂

    I will donate 50 to the New Mexicans skeptic organization that you are a member of if you can write a GA which will converge on the complex specified informaiton string (a password if you will) that I have wrtitten explicitly on a sheet of paper. You have 2 months to solve it and publish your results.  C'mon dave, I humored you, now it's your tun. You're officially challenged to describe it. You have no feed back mechanism to tell you if you're getting warm. But the50 should help your organization reproduce more Darwinists, so according to Darwinist logic, since it will make your organization more fit financially, it ought to imply a GA will find the solution on behalf of your organization. (I’ve heard such perverse reasoning before from Darwinists, that something necessarily evolved because it will make it more fit!).

    Think I’m being too hard. Numerous systems in a biological organism are like lock-and-key, password login systems. They are deeply coordinated. A random password is unlikely to hit a random login. There is no reason a GA would solve this unless it were trivial since there is no feedback mechanism and since password/login’s irreducibly complex.

    And since the absence of some proteins, like insulin or other critical proteins, will result in death — one shot at a solution is like what it might be in real life biology. 🙂 So you have one shot at a solution.

    I look forward to hearing from you. I humored you, now it’s your turn.

    Sorry, Salvador, but I don’t think I’m interested in your “challenge.”

    It has NOTHING to do with “evolution”, after all.

    Recall what Dawkins himself said about his “Weasel” program back in 1986 (Blind Watchmaker

    Although the monkey/Shakespeare model is useful for explaining the distinction between single-step selection and cumulative selection, it is misleading in important ways. One of these is that, in each generation of selective ‘breeding’, the mutant ‘progeny’ phrases were judged according to the criterion of resemblance to a distant ideal target, the phrase METHINKS IT IS LIKE A WEASEL. Life isn’t like that. Evolution has no long-term goal. There is no long-distance target, no final perfection to serve as a criterion for selection, although human vanity cherishes the absurd notion that our species is the final goal of evolution. In real life, the criterion for selection is always short-term, either simple survival or, more generally, reproductive success.

    Show me anywhere in biology where life must adapt to a hidden phrase created by Sal Cordova.

    I have a secret message for you to decipher, if you wish. I am pointing at the screen and your post with one single finger. Guess which one it is!

    😀

  4. stcordova: Think I’m being too hard. Numerous systems in a biological organism are like lock-and-key, password login systems. They are deeply coordinated. A random password is unlikely to hit a random login.

    So apparently you would argue that the nucleons, nuclei, atoms, molecules, solids, crystals, polycrystals, and quasi-crystals are all “lock-and-key password login systems?”

    What about superconductivity and superfluidity? These are highly coordinated systems.

    Why do you apply this characterization to biological systems and not to other complex, highly coordinated systems? Specifically where along the line of increasing complexity of condensed matter does this kind of characterization of “lock-and-key password login systems” begin to apply?

    ID/creationism can’t even explain the existence of solids and liquids, let alone their behaviors and properties within various temperature ranges?

  5. The idea that a GA could converge on an isolated target like a lock combination or cryptogram exhibits a misunderstanding of evolution that is so deep and so profound that it can only be a black hole of misunderstanding, sucking in and dismembering all possibility of communication.

    At the very least it illustrates that you haven’t read or understood anything in these discussions regarding connectability of functional spaces.

  6. stcordova: I will donate $50 to the New Mexicans skeptic organization that you are a member of if you can write a GA which will converge on the complex specified informaiton string (a password if you will) that I have wrtitten explicitly on a sheet of paper. You have 2 months to solve it and publish your results.

    This could only be written by someone who either does not understand how a genetic algorithm works or is not interested in having a productive dialogue. Possibly both.

  7. stcordova: Patrick,

    You were intelligently designing the fitness function weren’t you?

    I’ll be happy to discuss this issue with you if you can explain how it applies to the errors I pointed out in the paper.

  8. stcordova:

    I will donate $50 to the New Mexicans skeptic organization that you are a member of if you can write a GA which will converge on the complex specified informaiton string (a password if you will) that I have wrtitten explicitly ona sheet of paper.

    You write this as though the term “complex specified information” has a referent in the real world. Whenever I have asked an ID proponent to define CSI with sufficient mathematical rigor to allow different individuals to calculate it for the same artifact and obtain the same result, I have not received an answer. Can you do so?

    In the context of this topic, what is the CSI of the bit string representing the optimal solution to the Steiner problem posed by Dave Thomas?

  9. Dave,

    Your programs don’t have anything to do with natural selection nor the modern synthesis.

    When a programmer writes a program to do something, and it does it, then it did so by design, Dave.

    ID says evolution proceeds by design, Dave- as in a GA/ GP

  10. stcordova:
    Patrick wrote:

    No “Darwinist” (a term that reflects ID’s creationist roots) claims this.

    So are you saying there ARE complex biological systesm which natural selection can’t create.

    If you say, yes, then well, thanks for helping creationists.

    If you say no, then, on what basis do you make that claim except by assumption.

    While I readily admit that I’m always working to improve as a writer, I don’t see how one could, in good faith, put those particular words in my mouth based on what I wrote. Here’s the full context of the line you quoted:

    There are several statements in the paper that suggests the authors have some fundamental misunderstandings of genetic algorithms and the observed phenomena on which they are based. As early as page 2, for example, is this:

    The Darwinist claim is that no such assistance is required. Rather, natural selection is innately capable of solving any biological problem that it faces.

    No “Darwinist” (a term that reflects ID’s creationist roots) claims this. Extinction is known to happen.

    Pointing out that species are known to go extinct is just one easy way of identifying the error with the authors’ claim. Some problems, significant ecological change for example, clearly aren’t solved by known evolutionary mechanisms, particularly under a time constraint.

    This statement is a baseless, clearly incorrect assertion, one of many errors in the paper.

  11. LoL! CSI only applies to that which can be broken down into bits. And all you do is see if it exists or not. IOW chalk up CSI as another ID term Patrick doesn’t understand.

  12. Some problems, significant ecological change for example, clearly aren’t solved by known evolutionary mechanisms, particularly under a time constraint.

    Evolution can only really solve the problem of making one runner faster than another when both are being chased by the tiger.

    Doesn’t do much good if there are two tigers.

    Species go extinct. So far, life hasn’t.

  13. Hmmm. The last time I checked, GAs do a great job on problems for which the ‘fitness landscape’ is reasonably smooth/continuous, and are, likewise, known to suck at problems for which the ‘fitness landscape’ is discontinuous — like, say, the one-vertical-spike-rising-out-of-a-zero-fitness-plane ‘fitness landscape’ which is the find this password problem Mr. Cordova proposes.
    Now, why did Mr. Cordova ask Dave to write a GA to solve a problem of exactly the sort which GAs are known to suck at? Mr. Cordova’s answer to this question could be instructive. I say ‘could be instructive’ because if Mr. Cordova’s past track record is at all indicative of his future behavior, the odds of Mr. Cordova answering that question are exceedingly low, if not zero. Did Mr. Cordova do this because he was ignorant of the operational characteristics of GAs, or did he do this because… how shall I put it… he was deploying a rhetorical gambit for some unspoken reason, or what?

    When faced with a commenter who quote-mines the people with whom he’s interacting, the most parsimonious explanation for that behavior is that said commenter is a deceitful son of a bitch. But our hostess’ ground rules prohibit that interpretation, so another explanation must be found. For purposes of discourse on this site, I believe the most parsimonous explanation among those which are allowed here is that quote-miners (and deceitful people in general, particularly those whose track records are filled with deceit) have severe intellectual deficits, perhaps due to brain damage.

  14. I will donate $50 to Sal’s favorite charity if he can name an even number prime greater than 2.

    Haven’t we been at this for decades? Is there no progress at all in the discussion?

  15. petrushka: Haven’t we been at this for decades? Is there no progress at all in the discussion?

    Progress in creationism, that’s an oxymoron.

  16. Sorry, Salvador, but I don’t think I’m interested in your “challenge.”

    Its not a matter of your interest, Dave, but returning a favor. You ASKED me to participate in your steiner problem. I obliged. You obviously are unwilling to reciprocate.

    Is it because you are uninterested or unable to meet the terms of the challenge? 🙂 Besides, you have a $50 incentive for the furtherance of your non-profit skeptic society.

    It seems to me the issue is not whether you are interested, but rather a GA is unable to solve the problem. Is that the case? 🙂 If that’s the case, say so.

    Given that I obliged you in the past at your request, you have to at least confess GA’s can’t solve most problems regarding a search.

    In any case, thanks for displaying your refusal to reciprocate. I suppose you don’t like it that I can pose a challenge for your GA’s which GA’s can’t solve. But you see, that’s the problem. You focus on problems which GA’s can solve and ignore the far larger set of problems which GA’s cannot solve.

    Is the complexity in biology something that biological GA’s can solve or not? Simply showing that the GA can solve a steiner problem is no proof GA’s can solve complexity in biology since I demonstrated, GA’s can’t solve a large class of search problems!

    Even supposing a GA can solve the problem, without the correct fitness function, without the continued operation of the GA (assumption of no extinction of the entire species), without replicators and genes (the assumption of OOL), etc. there is no evoution of complexity.

    Consider the case of Gammarus MInus. Real world GA’s destroyed eyes, they didn’t construct them. So why does one believe selection will select for complexity, when real world examples show that it does not (not the least of which are large examples of extinction and genetic deterioration).

    Your Steiner soluton can’t be generalized to all complexity in biology without great leaps of assumption. Assumption is fine, but it is not the same as fact. I provided a counterexample of systems which GA’s can’t solve.

    You seem disinterested, but the least you could do is say, “GA’s can’t solve many problems unless they have a target to converge on, either explicitly or implicitly described. We don’t know for a fact that biological GA’s can create the complexity we see in biology.”

    The target in the steiner problem was a representation of the physical solution. Unless the description of the constraints were provided (as in front-loaded) into the computer, the program would not converge on a representation of the real-world physics. You failed to point out, the intelligent designer of the code had to choose a correct fitness function, otherwise the GA would not converge on the desired target.

    Anyway, thanks for responding. It was informative.

  17. You seem disinterested, but the least you could do is say, “GA’s can’t solve many problems unless they have a target to converge on, either explicitly or implicitly described.

    You seem incapable of understanding what is being said.

    It is not the existence of the target that is at issue, but the characteristics of the landscape. GAs cannot find an isolated target.

    They can move a population to have more of some characteristic, but they do not need to know about targets or goals.

    If there is only one (yes or no) level of fitness, GAs are not useful.

  18. Patrick:

    known evolutionary mechanisms are quite capable of generating solutions to problems that are not amenable to random search.

    Not true. They are capable of generatings solutions for things they are capable of generating solutions for — a tautology. They are not capable of solving passwords or analogous IC systems without specialized information about the target.

    I gave an example of a search which an evolutionary algorithm won’t do very well. Biology is replete with signalling systems that are analogous to passwords, and lock-key systems. Even wiki-uses the lock-key analogy:

    http://en.wikipedia.org/wiki/Docking_(molecular)

    You can see, the GA needs a lot of computational horsepower to solve the docking problem. This is far more problematic than the steiner problem, and this level of complexity is relevant since at some point if our best computers and software can’t solve the docking problem, why should we expect biological GA’s to do better? In fact, the abundance of extinction and lack of evolving large numbers of new docking solutions suggest the GA’s in the real world didn’t create the lock-key systems in biology.

    I expect we might solve the docking problem with computational GA’s some day, but biology doesn’t work like a human made GA. Real biology destroys species and eyes and all sorts of other complexity. There aren’t real-time observations of real world biology building novelty as fast as it destroys it.

    Real world evolution is destructive not constructive.

  19. stcordova: Is it because you are uninterested or unable to meet the terms of the challenge?

    It may be beneficial and more productive for everybody involved if you would read the ENTIRE answer someone provides: Dave clearly told you why he is uninterested: your “challenge” has nothing to do with evolution.

    stcordova: It seems to me the issue is not whether you are interested, but rather a GA is unable to solve the problem.

    That may well be the case. But, as Dave pointed out, whether or not a GA would be able to solve the problem is uninteresting, because the problem is irrelevant to evolution.

    stcordova: You focus on problems which GA’s can solve and ignore the far larger set of problems which GA’s cannot solve.

    Of course. He focuses on problems that are relevant to evolution. You imply that it is somehow an issue for evolution that GAs can’t solve any and all manner of problems? How exactly is the abilitiy to solve any and all manner of problems in general, or your specific kind of problem in particular, relevant to evolution?

    stcordova: Even supposing a GA can solve the problem, […] without the continued operation of the GA (assumption of no extinction of the entire species), […] there is no evoution of complexity.

    Ummmm, yes. If a species goes extinct, there is no complexity evolving in it any more. There are no traits whatsoever evolving in this species anymore. Are you somehow under the impression that anyone here thinks that evolution occurs in the absence of extant specimens?

  20. stcordova: Its not a matter of your interest, Dave, but returning a favor. You ASKED me to participate in your steiner problem. I obliged. You obviously are unwilling to reciprocate.

    Yes, Dave should have known better.

  21. stcordova: I gave an example of a search which an evolutionary algorithm won’t do very well. Biology is replete with signalling systems that are analogous to passwords, and lock-key systems. Even wiki-uses the lock-key analogy:

    A scientist can read that Wiki example and understand the process and the metaphors used to describe it.

    An ID/creationist reads it literally without the slightest comprehension of the underlying chemistry and physics.

  22. olegt: John Holland, the father of genetic algorithms, thinks otherwise. Adaptation in Natural and Artificial Systems.

    This is the original book from the 70s about the theoretical basis for genetic-style adaptation, and the surprising parallels between evolution, gambling, and learning in general. foundational, clear, mathematical.

  23. stcordova: Real biology destroys species and eyes and all sorts of other complexity. There aren’t real-time observations of real world biology building novelty as fast as it destroys it.
    Real world evolution is destructive not constructive.

    This is a human projecting human aspirations and values onto nature. It is called animism. ID/creationism is a primitive throwback to those early times in the evolution of religion.

    Science has since learned about soft matter. There are no surprises here.

  24. Genetic algorithms are based on mimicking with a computer the processes and laws found in nature.

    But the impressions I get of the vast majority of ID/creationist camp followers is that they don’t comprehend even middle school science; even after some supposedly obtain a college degree.

    Here, for example, are science standards for the state of California.

    As a specific example, on page 136, under “Focus on Physical Sciences” we find some very basic concepts and terminology that are expected of eighth graders.

    I would suggest that most ID/creationist followers don’t know any of this material; nor have they had any of it reinforced with high school chemistry, physics, and biology.

    Even eighth graders are expected to know something about the states of matter. Implicit in that knowledge – and extended and reinforced in high school chemistry, physics, and biology – is the understanding that matter interacts with matter and condenses. Many elements and compounds can exist in solid, liquid, or gaseous states. Elements and compound have names. Elements and compounds have melting points and vaporization points.

    When one gets to chemistry and physics, one can find tables of the melting and vaporization temperatures of elements and compounds. Notice that elements and compounds are not called “water” when they melt. They go by their elemental or compound names. Eighth graders know this.

    This is basic middle school and high school knowledge; but it is knowledge not evident among ID/creationist followers. For a normal, curious student, this basic knowledge leads to questions about deeper properties and behaviors of matter and energy; like, “What are the underlying rules for the structure of matter?” But with ID/creationists, it just produces a blank stare and a series of snarky remarks like, “naturalism can’t explain anything.”

  25. Salvador,

    It’s certainly true that a GA cannot find a small target in a large informational void. Nor can any biological process. Nor can an intelligent designer.

    I posed essentially the same challenge as you on UD three years ago, except I was challenging the human readers to crack the password (actually to reverse a one-way hash — same principle). The point is that humans, just like nature, require “active information” to find solutions. The password example is a good illustration of the fact that Dembski’s “information accounting” framework is useless for distinguishing between intelligent and natural causes.

  26. stcordova:

    known evolutionary mechanisms are quite capable of generating solutions to problems that are not amenable to random search.

    Not true.

    My claim is clearly supported by the data and arguments in my original post and by the existence proof of my working code. Simple crossover and single point mutations converge on a solution that is not readily found by random search.

    If you disagree, please address the data and arguments presented.

  27. stcordova: Not true.They are capable of generatings solutions for things they are capable of generating solutions for — a tautology.They are not capable of solving passwords or analogous IC systems without specialized information about the target.

    I gave an example of a search which an evolutionary algorithm won’t do very well.Biology is replete with signalling systems that are analogous to passwords, and lock-key systems.Even wiki-uses the lock-key analogy:

    http://en.wikipedia.org/wiki/Docking_(molecular)

    You can see, the GA needs a lot of computational horsepower to solve the docking problem.This is far more problematic than the steiner problem, and this level of complexity is relevant since at some point if our best computers and software can’t solve the docking problem, why should we expect biological GA’s to do better?In fact, the abundance of extinction and lack of evolving large numbers of new docking solutions suggest the GA’s in the real world didn’t create the lock-key systems in biology.

    I expect we might solve the docking problem with computational GA’s some day, but biology doesn’t work like a human made GA.Real biology destroys species and eyes and all sorts of other complexity.There aren’t real-time observations of real world biology building novelty as fast as it destroys it.

    Real world evolution is destructive not constructive.

    Ah, so “Real world” evolution does occur but it only destroys things, eh? Gee whiz, your omnipotent, omniscient designer/creator god sure is a nice guy for designing/creating evolution so that it would only destroy things. That’s so loving and merciful.

    According to the other IDiots who accept that evolution occurs, “the designer” designed and programmed the ability to evolve into living things. They claim that “the designer” created, designed, and programmed complex specified information into living things and that it’s that programming that drives everything in living things, which of course would include survival success or failure (are you paying attention joe?).

    So, if evolution is driven by the programming (algorithm) that was designed, created, and installed by “the designer” (and either sometimes, regularly, or constantly tinkered with by “the designer” according to various IDiots), why is it that evolution only destroys but doesn’t construct? How can there be millions/billions of years of life and evolution if the programming was designed only to destroy? How can there be the diversity of life, and its long and diverse history, if “the designer’s” goal was and is to program only destruction into the evolutionary process of living things?

    Did your all knowing, all powerful designer/creator god ignorantly screw up the intentionally destructive programming, and is that how many living things have managed to survive and evolve successfully, or is your designer god just playing a long drawn out sadistic game with living things by letting some of them survive and evolve successfully for awhile until it decides to adjust the programming so that they die out, with the ultimate goal being the destruction of every living thing everywhere?

  28. Joe G:
    GAs are design mechanisms and having nothing to do with natural selection nor evolutionism.

    Even though your entire comment is just the usual nonsensical gibberish, I’ll take a stab at responding to it.

    I notice that you chose the word “evolutionism”. Why did you add “ism”? Why not just ‘evolution’?

    I seriously doubt that any evolutionary scientist would claim that GAs have anything to do with “evolutionism”. They just might claim that the GA programs they use in their computers help them to understand the processes of evolution though. 🙂

    YOU and some other IDiots are the ones who claim that all evolutionary processes themselves (in nature) are completely dependent on and driven by the alleged GAs that you claim were/are designed, created, and installed in living things and the whole universe by “the designer”, and some of you IDiots claim that those GAs are sometimes, regularly, or constantly tinkered with by “the designer”. In other words, according to you and some other IDiots, evolution has everything to do with GAs.

    So, since your comment about “evolutionism” is entirely irrelevant, you must have meant to say the word ‘evolution’ instead, which of course makes you look like the fool you are because YOU constantly claim that GAs are what drives evolution.

    Contradict yourself much?

    One more thing for now joe, going by your definitions, and since you accept that evolution occurs, you must be an evolutionist, which means that you practice “evolutionism”, and since you claim that “the designer” designed and installed necessary GAs in evolutionary processes, you’re actually saying that your “evolutionism” has everything to do with GAs.

    I’d suggest that you make up your mind but as “we” all know……

  29. stcordova: Its not a matter of your interest, Dave, but returning a favor.You ASKED me to participate in your steiner problem.I obliged. You obviously are unwilling to reciprocate.

    Is it because you are uninterested or unable to meet the terms of the challenge?:-)Besides, you have a $50 incentive for the furtherance of your non-profit skeptic society.

    It seems tome the issue is not whether you are interested, but rather a GA is unable to solve the problem.Is that the case? :-) If that’s the case, say so.

    Given that I obliged you in the past at your request, you have to at least confess GA’s can’t solve most problems regarding a search.[

    Nobody disputes this, Salvador. The No Free Lunch theorems are perfectly good theorems. Averaged over all problems, GAs will perform no better than chance.

    The issue is not whether GAs can solve all problems. They can’t, and, specifically, they can’t solve problems that exist in an unconnected fitness landscape, as yours does.

    On the other hand, if you were to give feedback to the algorithm as in “warmer… no colder… warmer again…” it probably could.

    Dembski is therefore absolutely right, but also absolutely irrelevant. The question is not “Can GAs solve any problem?” but “Can GAs solve the kinds of problems solved by biological populations”?

    In any case, thanks for displaying your refusal to reciprocate.I suppose you don’t like it that I can pose a challenge for your GA’s which GA’s can’t solve.But you see, that’s the problem. You focus on problems which GA’s can solve and ignore the far larger set of problems which GA’s cannot solve.

    And this is a demonstration of just how you have missed the point. We are all perfectly happy that you have posed a challenge for GAs which GAs can’t solve. We know there are far more problems that GAs can’t solve than problems that GAs can solve. The reason we “focus on problems which GAs can solve and ignore the far larger set of problems which GAs cannot solve” is that the problems that GAs can solve are precisely the kinds of problems that biological populations solve! I.e. problems in which the similar organism have similar fitness in similar enviroments. Your problem is not one of these, clearly. In a GA that attempted to solve your problem, all possible organisms except one would have identically awful fitness, while one would have perfect fitness. Clearly the GA would no better than random search for that problem.

    Is the complexity in biology something that biological GA’s can solve or not? Simply showing that the GA can solve a steiner problem is no proof GA’s can solve complexity in biology since I demonstrated, GA’s can’t solve a large class of search problems!

    GAs can solve problems where similar genotypes have similar fitness. This is true of the Steiner problem. It’s true of biology. It’s not true of your problem.

    Even supposing a GA can solve the problem, without the correct fitness function, without the continued operation of the GA (assumption of no extinction of the entire species), without replicators and genes (the assumption of OOL), etc. there is no evoution of complexity.

    This makes no sense.

    Consider the case of Gammarus MInus.Real world GA’s destroyed eyes, they didn’t construct them.So why does one believe selection will select for complexity, when real world examples show that it does not (not the least of which are large examples of extinction and genetic deterioration).

    Selection doesn’t “select for complexity”, at least not complexity in the sense in which you seem to be using the term (by definition, a selected solution will have “specified complexity”) i.e. a complex structure. Selection “selects for” reproductive success. In fact that’s tautological – “selection” simply is the phenomena whereby genotypes that tend to give rise to more reproductively successful phenotypes become more prevalent. If simplicity contributes to reproductive success then simplicity will be “selected for”. The simple solution may nonetheless have “specified complexity”, just as a row of 100* heads has “specified complexity” despite the fact that it is a far “simpler” sequence than a pattern of heads and tails.

    Indeed, Dembski’s informal description of a “specified” pattern is an “easily described” pattern. So I think you have fallen foul of an inadvertent equivocation here.

    Your Steiner soluton can’t be generalized to all complexity in biology without great leaps of assumption.Assumption is fine, but it is not the same as fact.I provided a counterexample of systems which GA’s can’t solve.

    Indeed you did. And I’m sure any of us could readily come up with thousands more. We could even point to the NFL theorems to demonstrate that they must vastly outnumber the systems that GAs can solve! You are preaching to the choir, here!

    You seem disinterested, but the least you could do is say, “GA’s can’t solve many problems unless they have a target to converge on, either explicitly or implicitly described.We don’t know for a fact that biological GA’s can create the complexity we see in biology.”

    No, because that would not be true. The issue is not whether “they have a target to converge on” but whether fitness space is connected. There could be many solutions to a problem, and yet GAs could reliably find some but not others, depending on which solutions are connected in fitness space. For example, wheels would be a great solution to mobility for multicellular animals, but none of us have wheels. That is probably because wheels are in an unconnected part of fitness space, and can only be reached by intelligent agents (e.g. us). For unicellular animals, however, they seem to be connected, and indeed, bacteria have evolved rotors.

    The target in the steiner problem was a representation of the physical solution.

    There was no target. There was only a set of criteria – a problem to solve – which the Steiner solution happens to meet/solve better than any other. Sometimes the solution reached was the Steiner solution, sometimes it wasn’t. What happened, reliably, was that final solutions were better than “starting” solutions. Exactly as in biology, which has no “target”, merely problems to solve (reproducing within the current environment, its resources and hazards), and solutions that are better than other solutions.

    Unless the description of the constraints were provided (as in front-loaded) into the computer, the program would not converge on a representation of the real-world physics. You failed to point out, the intelligent designer of the code had to choose a correct fitness function, otherwise the GA would not converge on the desired target.

    You have got this backwards, and as a result, your argument is circular. Clearly a program cannot solve a problem unless you present it with the problem!

    You have identified three entities where there are only two . You think there is a Problem, a Fitness Function and a Target. All there is the Problem and the Solutions. The problem IS the fitness function, and the “Target” is any solution that solves the problem better than some other solution.

    In Patrick’s GA, the problem is minimising the lengths of the connectors between the nodes. Therefore the fitness function is “small lengths of the connection between the nodes”. The solutions (“targets”) are solutions with small lengths of connection between the nodes.

    In biology, the problem is something that enhances reproduction (which could be minimising something like surface area to volume). The fitness function is therefore “small surface area to volume”. The solutions are anything that results in small surface area to volume.

    The “active information” lies simply in the problem. It doesn’t need to be put there “intelligently” (although clearly, in a computer GA it is). It just needs to be a set of environmental resources and hazards in which a population of self-replicators is situated.

    Anyway, thanks for responding.It was informative.

    It was, but I think you misunderstood the information 🙂

    ETA: *500 if it is to fall within Dembski’s rejection region for his chance hypothesis. Thanks to Joe F for pointing this out.

  30. Elizabeth, that was a very clear, very appropriate, and very telling criticism of Sal Cordova’s statements about genetic algorithms and of his “challenge”. Sal has avoided dealing with the original post, which showed clearly s csse where GAs do far better than random search.

    I’d modify your statements in two ways, however. First, Dembski’s Complex Specified Information is achieved by tossing 500 heads in a row, not 100. Second, Dembski is not “absolutely right” about the No Free Lunch Theorem. He jumps from behavior averaged over all possible fitness surfaces to the unsupported conclusion that this behavior is relevant to the fitness surfaces found in biology. He has never said “I guess I was wrong about that”.

    Will Sal Cordova address the points in your comment? The point In the original post?

  31. I notice that you chose the word “evolutionism”. Why did you add “ism”? Why not just ‘evolution’?

    Because I do not want to equivocate- ID is OK with ecvolution. Evolutionism refers to the untestable premise that necessity and chance produced the diversity of life starting from some population(s) of prokaryote-like organisms.

    They just might claim that the GA programs they use in their computers help them to understand the processes of evolution though.

    Yup, evolution by design.

    YOU and some other IDiots are the ones who claim that all evolutionary processes themselves (in nature) are completely dependent on and driven by the alleged GAs that you claim were/are designed, created, and installed in living things and the whole universe by “the designer”, and some of you IDiots claim that those GAs are sometimes, regularly, or constantly tinkered with by “the designer”. In other words, according to you and some other IDiots, evolution has everything to do with GAs.

    Man you are dense. GAs = evolution by design.

    Obvioulsy you have serious issues…

  32. GAs are design mechanisms, Elizabeth. They do not mimic nature. They do not mimic natural selection.

    BTW the target is that problem to be solved…

  33. Joe G:
    Well reality says otherwise oleg. There are no targets/ goals in nature, duh.

    You’d think that by now Joe G would have figured out that genetic algorithms don’t have targets. Amazingly, he hasn’t. Simple ideas have trouble penetrating six inches of bone.

  34. Joe G:
    LoL! Of course GAs have a target/ goal- they are written to solve problems.

    No, Joe. Solving a problem is the goal of the person who writes a genetic algorithm. Not of the algorithm itself.

  35. Joe G:
    GAs are design mechanisms, Elizabeth. They do not mimic nature. They do not mimic natural selection.

    This is simply incorrect, Joe. GAs do mimic nature. That’s why they are called GAs.

    BTW the target is that problem to be solved…

    And if you define “target” as “the problem to be solved” then you agree with my comment to Sal that there are only two entities here, not three: the problem aka target; and the solution.

    GAs find solutions to problems. You do not “smuggle in” to a GA the solution by providing the problem, any more than an exam paper “smuggles in” the answers in the form of the questions!

  36. Again if a programmer writes a GA to solve a problem and the GA soves it, then it solved it by design.

    Solving a problem is the goal of the person who writes a genetic algorithm. Not of the algorithm itself.

    They are one in the same. The GA is an extension of the programmer.

  37. Yes, if you use his definition that includes his UPB, it will take 500 heads in a row to hit the jackpot 🙂

    Thanks.

  38. GAs might be said to mimic nature, but they do not. They mimic engineering trial and error.

    Yes GAs find solutions to problems because they are designed to do so. And we have already been over the other stuff- why do you bring it up, again?

    No need to have the specified target nor the specified path- geez read the paper Patrick is trying to refute.

  39. Once again, from the paper-

    The above quote shows that Thomas is under the misapprehension that intelligent design advocates claim that the actual answer is encoded into the algorithm. This is not in fact what intelligent design advocates claim, as will be shown later. Rather, we say that success is due to prior knowledge being exploited to produce active information in the search algorithm. Although the code does not include the actual Steiner shape, it does include a tuned algorithm for how to find the Steiner shape.

  40. Joe G:
    If a programmer writes a GA to solve a problem, and it solves it, then it did so by design, period.

    He does, but that is beside the point. The designer employs a genetic algorithm to achieve that goal. The algorithm has no goal. The algorithm relies on heritable variation and selection. This is what a genetic algorithm does. By definition.

    A word of advice, Joe. Learn definitions before attempting an argument. 🙂

  41. The genetic algorithm has the same goal as the designer as the GA is just an extension of the designer.

    But I understand that you have to engage in semantic quibbling seeing that you have nothing else…

Leave a Reply