Search space

A curious couple of days for me at Uncommon Descent!

I’ve always wondered why people who are skeptical about Darwinian evolution aren’t persuaded by the power of evolutionary algorithms to find creative solutions.

There seems to me to be a deep misunderstanding of a) the nature of search space (and its structure) and b) the role a Designer plays in a GA.

The oddest objection I find is the objection that the Designer designs the fitness function.  If the ID of ID is the fitness function, then ID is the environment!  Because that’s the analog surely.

Which is nicely Earth Motherish, I guess.

But almost equally odd is the apparent objection that because the Darwinian algorithm is programmed into the computer, that makes it designed.

A bit bemused.
But this notion of the structure of the search space seems to me to be important.  I’m trying to think of a search space that wouldn’t be easily traversed by a Darwinian algorithm.  I guess one that consisted entirely of irreducibly complex features, where nothing worked at all unless it was complete.

OK, that helps.  But in that case the whole needle-in-a-haystack argument depends on most things being IC.  Or at least crucial things being IC.  Including, of course, self-replication itself.  Or at least Darwinian-capable self-replication.

109 thoughts on “Search space”

  1. It’s hard to tell whether the objection that GAs, being components of a computer program, are therefore designed is the result of genuinely not understanding the difference between a program and the output of a program, or whether it is just an oft-repeated (and now accepted by most who don’t think about it) talking point. I tend to presume the former given that I don’t like to think people are that lazy/intentionally evasive, but given some of the statements made I could be wrong.

    I seem to recall a similar argument being made by a doctor (I don’t recall who, but perhaps Dr. Egnor) associated with Uncommon Descent: that experiments concerning evolution done in the lab could not be used to support evolution because the labs were designed. I really couldn’t understand how a doctor could confuse the design of the lab, or even the design of an experiment, with the outcome of the experiment, yet here was one doing so. Would he equally insist that studying cadavers in a lab offered no understanding of treating living folks in a hospital?

    It really boggles my mind that there are folks who don’t get the distinction.

  2. I’ve always wondered why people who are skeptical about Darwinian evolution aren’t persuaded by the power of evolutionary algorithms to find creative solutions.

    I don’t find that at all surprising. I don’t have doubts on evolution, but I remain undecided about the algorithms. The trouble with that kind of algorithm, is that it depends on a simulated model. So there are always questions about whether the model is right.

    There seems to me to be a deep misunderstanding of a) the nature of search space (and its structure) and b) the role a Designer plays in a GA.

    That illustrates why I am undecided. I sometimes say that my view of evolution is non-Darwinian. I am skeptical that “search space” properly describes things.

  3. Neil Rickert: I don’t find that at all surprising. I don’t have doubts on evolution, but I remain undecided about the algorithms. The trouble with that kind of algorithm, is that it depends on a simulated model. So there are always questions about whether the model is right.

    That illustrates why I am undecided. I sometimes say that my view of evolution is non-Darwinian. I am skeptical that “search space” properly describes things.

    Can you explain?

  4. I’m trying to think of a search space that wouldn’t be easily traversed by a Darwinian algorithm.

    The classic such space I’ve seen referenced is an encryption key. Someone described it as “a needle standing up in a flat desert.” One unique solution with no hill climbing possible.
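The “needle standing up in a flat desert” can be contrasted with a graded landscape in a few lines. This is only a minimal sketch under stated assumptions: the 40-bit key, the single-bit-flip climber, and the names `climb`, `graded` and `needle` are all illustrative inventions, not anything proposed in the thread. On the graded landscape each correct bit is rewarded, so the climber has a slope to follow; on the needle landscape every wrong key scores the same, so the same climber is reduced to a random walk.

```python
import random

def climb(fitness, key, max_steps, rng):
    """Flip one random bit per step; keep the flip unless fitness drops.
    Returns the number of steps needed to hit `key`, or None on failure."""
    n = len(key)
    genome = [rng.randint(0, 1) for _ in range(n)]
    score = fitness(genome, key)
    for step in range(max_steps):
        if genome == key:
            return step
        i = rng.randrange(n)
        genome[i] ^= 1                  # propose a single-bit mutation
        new_score = fitness(genome, key)
        if new_score >= score:
            score = new_score           # keep neutral or better changes
        else:
            genome[i] ^= 1              # revert harmful changes
    return max_steps if genome == key else None

def graded(genome, key):
    """Graded landscape: count of positions that already match the key."""
    return sum(g == k for g, k in zip(genome, key))

def needle(genome, key):
    """Needle landscape: all-or-nothing, like testing an encryption key."""
    return 1 if genome == key else 0

rng = random.Random(0)                  # fixed seed, hypothetical choice
key = [rng.randint(0, 1) for _ in range(40)]
steps_graded = climb(graded, key, 5000, rng)
steps_needle = climb(needle, key, 5000, rng)
print("graded:", steps_graded, "needle:", steps_needle)
```

With 2^40 possible keys and only 5000 steps, the needle run should essentially never succeed, while the graded run typically finishes within a few hundred steps. Which of the two landscapes biology resembles is, of course, the question under dispute in the thread; the sketch does not settle it.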

  5. You’ve done it now!

    Now you have suggested the possibility that a search space might consist entirely of irreducibly complex features, I’ll bet you’ll find that that is the position a number of the IDists will now adopt, whilst claiming that it’s obvious that it’s exactly what they meant all along.
    I see you’re none too popular over there. “Look – the Emperor has no clothes!” you cry. In doing so, you’ve accomplished what everyone thought was near impossible – you’ve made Uncommon Descent funny.

  6. Elizabeth: Can you explain?

    Explain which part?

    I’ll assume you are asking about my comments on “search.” I’m not a biologist, so I’m not exactly sure how the term “search” is being used. Part of what I disagree with about neo-Darwinism, is the idea that it is based on optimizing fitness. There’s no need to optimize. Suboptimal is fine, as long as the replacement rate is sufficient to maintain a population.

    Here’s some background. I have been studying learning. I started with an interest in human learning, though I don’t restrict myself to humans. Having developed a theory of learning, I tried to apply it to evolution, treated as a learning system. And it seemed to fit rather well.

    If I look at learning theories in AI (artificial intelligence), they are mostly based on search. There are lots of research papers on machine learning. But it doesn’t work very well at all. In particular, it isn’t anything like human learning. That’s partly why I am dubious about search.

    My ideas on evolution are possibly in a similar direction to those of James Shapiro (at U. of Chicago), though I base this on limited knowledge of his work.

    I’ll stop there for now, with the expectation that there might be questions about what I have just posted.

  7. Neil Rickert,

    “Suboptimal is fine…”
    I think that’s an important point. The IDists seem to assume that the molecules we find in organisms today are a) somehow fully “optimised”, and b) have the only structure that will do whatever it is they do.
    Neither of these things is true much of the time, and the fact that neither is true defenestrates all those silly calculations involving humungous improbabilities of a particular 300-aa protein (or nucleotide sequence) having to appear from an amino-acid (or nucleotide) soup before evolution can happen.
    Weasels rule! – but so do wesels, weseals, and weasals
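The arithmetic behind the “humungous improbabilities” is easy to reproduce, and doing so shows how strongly the figure depends on assuming a single acceptable sequence. The sketch below is a rough illustration only: the count of 10^300 functional sequences is an arbitrary, hypothetical number chosen to show the effect, not a measured value, and the whole calculation still treats the protein as a one-shot random draw with no selection.

```python
import math

SEQ_LEN = 300     # residues, as in the comment above
ALPHABET = 20     # standard amino acids

# log10 of the number of distinct sequences of this length: 300 * log10(20)
log10_space = SEQ_LEN * math.log10(ALPHABET)

def log10_hit_probability(log10_functional):
    """log10 chance that one uniformly random sequence is functional,
    given log10 of the (assumed) number of functional sequences."""
    return log10_functional - log10_space

print(round(log10_space, 1))                 # 390.3
print(round(log10_hit_probability(0), 1))    # -390.3: exactly one target
print(round(log10_hit_probability(300), 1))  # -90.3: hypothetical 10^300 targets
```

The “one particular sequence” assumption is what produces the 10^-390 figure; any estimate of how many variants would also do the job moves the exponent directly.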

  8. Neil Rickert: Explain which part?

    I’ll assume you are asking about my comments on “search.” I’m not a biologist, so I’m not exactly sure how the term “search” is being used. Part of what I disagree with about neo-Darwinism, is the idea that it is based on optimizing fitness. There’s no need to optimize. Suboptimal is fine, as long as the replacement rate is sufficient to maintain a population.

    Here’s some background. I have been studying learning. I started with an interest in human learning, though I don’t restrict myself to humans. Having developed a theory of learning, I tried to apply it to evolution, treated as a learning system. And it seemed to fit rather well.

    If I look at learning theories in AI (artificial intelligence), they are mostly based on search. There are lots of research papers on machine learning. But it doesn’t work very well at all. In particular, it isn’t anything like human learning. That’s partly why I am dubious about search.

    My ideas on evolution are possibly in a similar direction to those of James Shapiro (at U. of Chicago), though I base this on limited knowledge of his work.

    I’ll stop there for now, with the expectation that there might be questions about what I have just posted.

    Aha!

    Me too. It’s why I’m interested in Intelligent Design (and why I was originally banned from UD!) – Darwinian evolution is a learning algorithm (hence Edelman’s term “neural Darwinism”), and has its counterparts in neural systems.

    That’s why I get a bit annoyed at some of the counter-arguments to ID. I do think that the products of learning (or, if you like, systems of deeply nested contingencies) have a characteristic signature, and sometimes it seems like the ID proponents, going “look! look! look! you blind Darwinists! Obviously this isn’t chance!!!!” have a point, and it’s not a point that can be simply rebutted by shrugging and saying, oh, but chance and material processes can do that stuff.

    Yes, of course they can (IMO), but only by being really quite brain-like, and intelligence-like.

    In other words, the ID proponents are pointing out a very real property of biological objects and the anti-IDists are often (IMO) massively missing the point of their objection.

  9. Elizabeth,

    Do you mean that the products of a (biological/evolutionary) learning process have a characteristic signature when viewed as a body of evidence; or that any individual product of such a process has that characteristic signature when examined alone? If the latter, what is that signature?

    I think that IDists do tend to look at an individual case and say “Look! Look how complex it is! How can that possibly have been NOT designed?” – by which THEY mean that some external agency made it so.

    They absolutely refuse to admit the possibility of any plausible process by which a complex molecule, system, or organism may have evolved without the participation of an “active agent”

  10. damitall:
    Elizabeth,

    Do you mean that the products of a (biological/evolutionary) learning process have a characteristic signature when viewed as a body of evidence; or that any individual product of such a process has that characteristic signature when examined alone? If the latter, what is that signature?

    I think that IDists do tend to look at an individual case and say “Look! Look how complex it is! How can that possibly have been NOT designed?” – by which THEY mean that some external agency made it so.

    They absolutely refuse to admit the possibility of any plausible process by which a complex molecule, system, or organism may have evolved without the participation of an “active agent”

    Well, I think that the entire system of biological evolution in some ways resembles the development of a neural system, from a simple beginning (single cell, in both cases! But that’s just neat, not important) to a vast, self-regulating, homeostatic, plastic, adaptive entity of which we are a somewhat maladaptive part!

    Not meaning to go all Gaia-ish (well, maybe a little) – I’m deadly serious. What I think the system lacks (and we, though mere parts of it, possess) is “forward modeling”, which means it lacks the ability to intend anything. It’s a vegetable intelligence of extremely high order, not an animal one, even though it contains animals.

  11. A GA does not create its solutions; it merely explores the space of possible solutions inherent in its rule set. At every stage you can point to the changes and rules that led to those changes. Different runs may reveal different parts of the solution space because of the random factors introduced into its selections at each operational step.

    The theory of the Darwinian evolution of life on earth proposes, like a GA, that at every step of evolution we could, in theory, write down the current state and previous state of the genetic information and factors, and trace the changes between the two in terms of some form of mutation. In effect, through the actions of the environment, the mutation of genetic information, and the effects of natural selection (and whatever other processes might apply), life traces a path through the “solution space” made possible within our universe.

    The conjecture of ID is that at critical points in the tree of life as it has evolved, for instance at its very start, and perhaps at other places, there are no step-by-step intermediate “solutions”, e.g. viable living states. The best examples would be those systems described as irreducibly complex: the production of most proteins, or the origin of ATP, or the genetic code, along with its transcription elements.

    GAs, by their design, have the wonderful property of searching nicely connected solution spaces, and it is for that reason, not any other, that I find them an inadequate model for the evolution of life on earth.

    Here is a better challenge for a GA that I would say is much more like the problem of life’s evolution. Begin with any English sentence, no more than say, 20 words long. Create a list of possible mutations; these could involve changes in letters, duplications, rearrangements, and deletions, even wholesale imports from other GAs running the same model with different starting points and development paths. Mutate until you produce a chapter of text of 1000 words or more. The order of mutations must of course be done without intelligent intervention, but can be ordered based on their own rule set, as long as such rules do not “smuggle in” extra information (such as resort to an English dictionary or a library of books).

    The requirement for “survival” is that at every stage the text is recognizable as English speech, and conveys some coherent idea, no matter how nonsensical.

    The English need not be perfect; spelling errors are allowed as long as you can still recognize 95% of the words.

    See how far you get with that. My expectation is that, unlike the usual GAs, this solution space is highly disconnected, and “progress” will quickly bog down as the “mistakes” of mutation quickly overwhelm the appearance of new words and meanings.

    I may be away for the next two days, so I’ll try to respond as well as I can in the meantime.
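The opening point of the comment above, that a GA explores the space defined by its rule set and that different runs can end up in different parts of it, can be sketched with a toy problem. Everything here is an assumption made for illustration: the two-peak fitness function, the population size, and the GA-style loop itself (single-bit mutation plus truncation selection, no crossover) are hypothetical choices, not a model of any GA discussed in the thread.

```python
import random

def two_peak_fitness(genome):
    """Two equally good peaks: all ones or all zeros."""
    ones = sum(genome)
    return max(ones, len(genome) - ones)

def run_ga(seed, n_bits=20, pop_size=20, max_generations=5000):
    """Mutation plus truncation selection; stops once a peak is reached."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(max_generations):
        best = max(pop, key=two_peak_fitness)
        if two_peak_fitness(best) == n_bits:
            return best
        children = []
        for parent in pop:
            child = parent[:]
            child[rng.randrange(n_bits)] ^= 1   # one random bit flip
            children.append(child)
        # Keep the fittest half of parents plus children (ties favour parents).
        pop = sorted(pop + children, key=two_peak_fitness, reverse=True)[:pop_size]
    return max(pop, key=two_peak_fitness)

for seed in range(5):
    winner = run_ga(seed)
    print(seed, "".join(map(str, winner)))
```

Each run follows the same rules and every step can be traced, yet which peak a run reaches depends only on the random draws. This toy landscape is deliberately well connected; it says nothing about whether the English-sentence task proposed above would be.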

  12. damitall:

    They absolutely refuse to admit the possibility of any plausible process by which a complex molecule, system, or organism may have evolved without the participation of an “active agent”

    I would be happy to admit the possibility that complex molecules can evolve without the input of an active agent. I just haven’t seen one yet that has any particular relevance to the origin and evolution of life on earth. Perhaps you can give an example, and we can discuss it.

  13. Thank you SCheesman and welcome!

    I have a question for you: if GA’s do not “create” a solution, merely find them in a solution space, how would the “creation” of a solution differ from that?

    For example, how would getting a GA to come up with a novel antenna design differ from someone creating one?

  14. Hello Elizabeth

    I am not familiar with all the “ins” and “outs” of antenna design through GAs (and I do understand such methods are highly successful), but I do write numerical analysis software, and do know that such problems provide excellent rules for combining elements, variations of angles and lengths, as well as all the mathematics for calculating radiation patterns etc. The number of possible solutions is of course so large that it is impractical, even impossible, for an engineer to investigate even a small number of them. He does, however, know the ingredients and parameters necessary to create an antenna in general. Computers are excellent at varying the parameters, finding local minima in the penalty functions. And good GAs are adept at “jumping” out in one or more dimensions when they become “stuck” to explore other portions of solution space.

    The creative process, however, looks down at the problem as a whole; it can break it up into portions and come up with solutions to individual parts, and combine them in ways that a process without foresight could not, for instance creating a part and keeping it ready until it is required, or drilling a hole in part #4 that you know that part #12 will need after parts 5-11 have been added. It is able to step far, far away from a local minimum, and then, only when everything else is finally completed, home in on the final solution. Those are a few of the ways that a creative solution is different.

  15. Elizabeth: What I think the system lacks (and we, though mere parts of it, possess) is “forward modeling”, which means it lacks the ability to intend anything.

    Yes, that seems about right.

    Let me compare with human planning ahead. I can plan ahead for a particular career by seeking the appropriate education. I can also plan ahead for unexpected contingencies by having a savings account or rainy day fund. I see biological systems as able to do the second of those (in some sense), but not the first. That is, they can keep up some genetic variation in the population to deal with unanticipated contingencies that change the environment in which they must survive. But they cannot prepare for specific futures.

  16. Elizabeth,

    I might add, antenna GAs are a good example of a “well-connected” solution space. Put a current through virtually any possible configuration of wire, and it will radiate! Change any little thing (subject to keeping it all electrically continuous, which is one of the rules) and it will continue to radiate, a bit more or less, with a slightly different pattern.

  17. SCheesman,

    SCheesman:
    damitall:

    I would be happy to admit the possibility that complex molecules can evolve without the input of an active agent. I just haven’t seen one yet that has any particular relevance to the origin and evolution of life on earth. Perhaps you can give an example, and we can discuss it.

    I think DNA probably evolved. Is that a suitable example to discuss? Sufficient relevance to the origin and evolution of life on earth?

  18. damitall,

    I think DNA probably evolved. Is that a suitable example to discuss? Sufficient relevance to the origin and evolution of life on earth?

    Yes, excellent, that would be an excellent example. How does DNA, and the genetic code it embodies, along with the transcription machinery it requires, evolve from basic elements? I will grant, that, once it exists, it has the ability to evolve.

  19. SCheesman,

    The problem with this challenge is that the results fail to provide any reason for preferring your English sentence task as an analog of the phylogenetic distribution of complex characters over one that clearly does entail a connected solution space. Therefore the result provides no guidance as to whether the task you specify is “more like evolution.”

    One may, for example, devise a task that is inherently untraversable by a GA. We set GAs about the task of solving the problem and they fail to do so. What have we learned about the appropriateness of the task as an analog of evolution? Nothing.

    Alternatively, we construct a task that is traversable – for example, antenna design. GAs traverse it. What have we learned about the appropriateness of that task as an analog of evolution? Again, nothing.

    In the instance of the task you specify, the task may be of the first kind, or it may be of the second. GAs can tell us which. But what they disclose tells us nothing about whether or not that task is a better analog of the evolutionary problem space.

    What GAs do show us is that under many circumstances selectionist causation can attain solutions that look to all the world “designed,” yet are not. That was the heart of Darwin’s incredible insight: the appearance of design, regardless of how compelling, does not necessarily reflect design. GAs demonstrate that he was right.

  20. damitall: SCheesman, I think DNA probably evolved. Is that a suitable example to discuss? Sufficient relevance to the origin and evolution of life on earth?

    One more point. Take a strand of DNA and put it in a test tube. It does not evolve, except perhaps to degrade over time. In fact, it is just one part (albeit an important one) of a highly complex system which can evolve.

    So the challenge of coming up with a complex molecule that can undergo any type of evolution relevant to the origin of life is, almost certainly, linked to the development of the system and environment in which that evolution is possible. I don’t think that’s controversial. I just think it’s highly improbable. Of course we disagree on the number of zeros in the denominator.

    Sorry, am off for a few days…

  21. Reciprocating Bill,

    Hello Bill

    I agree with pretty well everything you said. My suggestion of the “English GA” was more to try to provide an example of the type of “solution space” that ID would assert that the universe provides, as opposed to the type it is assumed to have by Darwinian evolution.

    Substantive work to determine what that space is like would be, like Douglas Axe, to try to come up with the frequency of viable proteins given the possible proteins based on the ordering of the constituent amino acids, or observing, as did Michael Behe, the frequency of beneficial mutations in response to malaria.

  22. SCheesman,

    Well, let me know when you’re back…. I think we can cope with moving the goalposts from a life-and-origins-relevant molecule to a complete replicating system. Incidentally, when you say you think something is “highly improbable”, I hope you’re going to back that up with some calculations of the odds.

  23. Regarding Gil Dodgen and simulations (only meaningful if you toss your computer out of the plane 🙂 ), I recall something similar cropping up at Francis Collins’ Biologos site in its early days before the more fundy fundies were weeded out or gave up. Lenski’s E. coli experiments were denigrated by some for being “artificial” and not natural selection.

    Professor Lenski replied in a clarifying email:

    I would describe selection in the context of my long-term E. coli experiment as “natural selection in an artificial environment” for the following reasons:

    It is an artificial environment that was constructed to be simple, reproducible, and tractable. It allows us to address basic questions about the process of evolution, rather than to understand how this particular species makes its living in nature.

    It is natural selection because we do not choose which individual cells survive and reproduce based on their properties. Rather, the interaction between the organisms’ phenotypes and their environment causes some genotypes to be more successful, on average, than others.

    (“Genetic drift” – that is, the effects of random sampling of cells during the serial transfers – also influences the fates of genotypes. Selection, though, acts through the systematic tendencies of different genotypes and their resulting phenotypes toward greater or lesser success.)

    Best wishes,
    Richard Lenski

  24. “He does, however, know the ingredients and parameters necessary to create an antenna in general.”

    In the simplest form the known ingredients are ‘a bit of metal’, the parameters are ‘the shape of the bit of metal’, and then there is a fitness function, ‘signal strength’.

    The process then searches the solution space. The creative process is indeed different as you described, but the solutions come from the same space of possibilities. Some solutions are easier to come by with one process as compared to the other. Wheels are an interesting example – they don’t really crop up in Biology and I would hypothesise that it relates to the way biological organisms grow – how do you feed nutrients and oxygen from one side of a free rotating joint to another? Sounds like a genuinely irreducibly complex piece of engineering, and very hard for evolution to generate a solution to in an incremental fashion.

  25. The oddest objection I find is the objection that the Designer designs the fitness function. If the ID of ID is the fitness function, then ID is the environment! Because that’s the analog surely.

    Cynic that I am, I think that objection has more to do with trying to avoid having to deal with GAs as evidence that evolution can do certain things. GAs can be dismissed as just an example of ID and not relevant to biological evolution by demanding that only a GA that wasn’t designed in any way would count as evidence – or, to put it another way, the model is only good if it was not designed.

  26. damitall,

    Incidentally, when you say you think something is “highly improbable”, I hope you’re going to back that up with some calculations of the odds.

    If I may, I’ll back up one step and try to address the issue of probability as it applies to the “ID” and the “Darwinian” sorts of solution spaces discussed above. In the “ID” conception, viable “solutions” (e.g. proteins able to accomplish a specific function) are extremely rare in comparison to the number of possible combinations of amino acids for a given length – just as for an English sentence of N letters the number of readable and meaningful sentences is a tiny fraction of the possible combinations of 26 letters (plus punctuation marks and space). Such solutions are isolated and extremely local – it doesn’t take many errors to destroy the meaning, even if we can, to a limited extent, recognize misspelled words or garbled grammar and syntax. They are needle-like “maxima” in an otherwise featureless flat solution space. Over most of the space you cannot find any gradient of fitness to follow to a viable solution – it’s just gibberish, like a protein with zero activity for the required action. The search for a good solution then becomes no better than a random search among the possible solutions, and it makes sense (as Douglas Axe has done) to attempt to calculate odds for the production of a protein using a fairly simple-minded calculation of the possible orders of amino acids, in comparison to the number of proteins that can do the job required. It reduces to basically a blind search if you are relying on mutations in the DNA at the “letter” level to produce a viable “sentence” to do the job.

    However, in a Darwinian-style GA, at a substantial fraction of the locations in solution space (substantial enough, that is to make this process viable), some gradient in the fitness function can be located, and the complexity of the protein becomes largely irrelevant. Given any such starting point, you lock in the path toward increasing fitness, and it is only a matter of time before you will find a functional protein.

    I will say right now, I am not competent to provide any calculation of the odds, but I do think there are some real, experimental and observed situations where we can measure real numbers, and real mutations and see their effect, such as what Michael Behe has done with malaria. Despite criticisms of “The Edge of Evolution”, I have yet to see any cogent arguments to contradict his findings in regards to malaria. Perhaps that case is not truly relevant to the evolution of new proteins, but at least it is an attempt to put some actual calculations to real, observed cases of mutations that lead to required new solutions. I see a real reluctance to support any attempt that would try to limit (or at least recognize that there might be limits to) the “creative” powers of evolution.

  27. To summarize, I would assert the following:

    GAs work very well in problems where variations in the input parameters of interest are continuous, and produce continuous, well-connected solutions where local gradients exist in nearly all cases, and local fitness maxima are not an obstacle to obtaining a global optimum because “jumps” (using, for instance, Monte Carlo-type parameter-space searches) stand a substantial chance of locating alternate and better solutions. Hence the success of GAs in solving things like antenna design problems.

    But Darwinian evolutionary creativity relies on mutations in DNA, a symbolic language and code, where inputs are highly discretized (limited to 20 amino acids) and outputs cannot be related through any sort of continuous variable, and where “successful” outputs (e.g. proteins) represent needle-like islands in a sea of non-functionality. In such a problem, a GA is no more successful in finding a novel solution than a random search because it can no longer “follow the trail”. This is why GAs will never be employed by clever authors to produce better literature.

    This does not mean that in limited cases evolution does not behave like a GA. But I expect then we are looking at epigenetic effects, where the degree of methylation might alter in a fairly continuous manner the expression of a certain gene in response to an environmental condition. Darwin’s finches’ beaks?

  28. damitall:

    Well, let me know when you’re back…. I think we can cope with moving the goalposts from a life-and-origins-relevant molecule to a complete replicating system. Incidentally, when you say you think something is “highly improbable”, I hope you’re going to back that up with some calculations of the odds.

    Well, again, I must apologize for not providing calculations. That’s something I just can’t do with any confidence. It wouldn’t really help or hurt my arguments (I hope) in any case.

    Am I correct in wondering if we are trying to determine if (or that you are suggesting that) a GA could, in theory, “design” a self-replicating molecule?

    My answer to that question would be “no”; and that the reason is that the properties which make GAs successful are not to be found in the chemistry of biological reproductive systems, at least one based on a symbolic code such as DNA.

    So, perhaps a GA could produce a self-replicating molecule, but I think its form or chemistry would be something quite different from what we observe in life. And I think that would be a really cool thing if we did!

  29. SCheesman: I can see several issues you might want to consider before you pronounce that protein spaces “represent needle-like islands in a sea of non-functionality”.
    Firstly, and in general, visualizing the fitness function as a 2-dimensional surface embedded in a 3-dimensional space, when in fact protein evolution has many, many more than 2 degrees of freedom, leads to oversimplification of the complexity of the fitness function. Consider even a few more dimensions and the fitness “surface” starts to look more like foam, with wormholes and tunnels connecting seemingly distant points in the fitness space.
    As an analogy, consider what a 2-dimensional cross-section of a bush would look like: lots of isolated “needle” islands of negligible area (the cross sections of the twigs) embedded in a much larger area with seemingly no connections between the islands. Yet when observed with just one more dimension, we see that all the islands are connected with each other through a branching tree-like structure embedded in the higher dimension.
    Now this is just an analogy, and overly simple, but now try to extend this analogy to the even higher dimensionality represented by all the degrees of freedom available in protein evolution, e.g. transitions between shorter and longer aa chains, addition of branching side chains, polar-vs-non-polar substitutions, point mutations that alter or preserve folds, alter or preserve functional pockets, frame shifts, recombination, reversals, etc., etc., etc. (none of which were considered in Axe’s study) and the simplistic low-dimensional visualization tool becomes woefully inadequate. What at first appear to be isolated islands may in fact be interconnected through an enormously, richly, more complex structure that defies visualization.
    I have other issues regarding the sensitivity of fitness to point mutations, and inappropriateness of using blind search or “needle-in-a-haystack” approaches for estimating probabilities in either protein or flagellar evolution, but I’ll leave those for another time.
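The bush analogy can be made concrete with a toy grid. The sketch below is purely illustrative and rests on invented details: the tiny “bush” (a short trunk that forks into two diagonal branches), the cell coordinates, and the helper name `components` are all hypothetical. It counts connected pieces first in a 2-dimensional cross-section through the branch tips, then in the full 3-dimensional shape.

```python
from itertools import product

def components(cells):
    """Count connected components; cells that touch, including
    diagonally, are treated as neighbours."""
    cells = set(cells)
    dim = len(next(iter(cells)))
    offsets = [d for d in product((-1, 0, 1), repeat=dim) if any(d)]
    seen, count = set(), 0
    for start in cells:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:                      # depth-first flood fill
            cell = stack.pop()
            for d in offsets:
                nxt = tuple(a + b for a, b in zip(cell, d))
                if nxt in cells and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return count

# A toy bush: a vertical trunk that forks into two diagonal branches.
trunk = [(0, 0, z) for z in range(3)]
branches = [(side * k, 0, 2 + k) for k in range(1, 4) for side in (-1, 1)]
bush = trunk + branches

# Horizontal cross-section through the branch tips (z == 5).
top_slice = [(x, y) for (x, y, z) in bush if z == 5]

print(components(top_slice))  # 2: two isolated "islands" in the slice
print(components(bush))       # 1: a single connected structure in 3-D
```

The two tips look like unrelated islands in the slice, yet they belong to one connected object once the extra dimension is included. Whether protein sequence space has this kind of hidden connectivity is an empirical question the sketch cannot answer.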

  30. SCheesman,

    Axe published the following:
    Proc. Natl. Acad. Sci. USA, Vol. 93, pp. 5590-5594, May 1996
    Active barnase variants with completely random hydrophobic cores

    “ABSTRACT The central structural feature of natural proteins is a tightly packed and highly ordered hydrophobic core. If some measure of exquisite, native-like core packing is necessary for enzymatic function, this would constitute a significant obstacle to the development of novel enzymes, either by design or by natural or experimental evolution. To test the minimum requirements for a core to provide sufficient structural integrity for enzymatic activity, we have produced mutants of the ribonuclease barnase in which 12 of the 13 core residues have together been randomly replaced by hydrophobic alternatives. Using a sensitive biological screen, we find that a strikingly high proportion of these mutants (23%) retain enzymatic activity in vivo. Further substitution at the 13th core position shows that a similar proportion of completely random hydrophobic cores supports enzyme function. Of the active mutants produced, several have no wild-type core residues. These results imply that hydrophobicity is nearly a sufficient criterion for the construction of a functional core and, in conjunction with previous studies, that refinement of a crudely functional core entails more stringent sequence constraints than does the initial attainment of crude core function. Since attainment of crude function is the critical initial step in evolutionary innovation, the relatively scant requirements contributed by the hydrophobic core would greatly reduce the initial hurdle on the evolutionary pathway to novel enzymes. Similarly, experimental development of novel functional proteins might be simplified by limiting core design to mere specification of hydrophobicity and using iterative mutation-selection to optimize core structure.”

    This is just one of the papers in the literature that indicate that it is naive to think (as many seem to think) that the structure and sequence of a particular protein we see today in a particular organism is anything like the only one to permit its particular function.

    Less a needle on a plain, more a range of low hills with gentle slopes

  31. damitall,

    This is just one of the papers in the literature that indicate that it is naive to think (as many seem to think) that the structure and sequence of a particular protein we see today in a particular organism is anything like the only one to permit its particular function.

    I’ve actually tried to allow for this type of redundancy in the substitution of amino acids in my English-language illustration. An interesting report published in the last couple of years related how the general meaning of a sentence can still be understood even when a substantial number of the letters are replaced at random. Though I don’t want to push the analogy too far, the letters most tolerant of substitution are in the interiors of words, because when we read we tend to look to the starting and ending letters to determine meaning.

    The requirement for having specific amino acids is certainly variable. Some are quite critical, whereas others can sustain a fair amount of variability. The probability of getting a “working” protein for a given task is NOT as high as 1 in 20^N, where N = length of the protein. Still, I think our experience in studying actual mutations and genetic disease shows that the allowable variation can be rather limited at other locations. But even if we grant that we can substitute any amino acid at half the locations and still get some activity, we only get a square root improvement in the odds of locating a viable combination in a blind search; in other words instead of 1 in 20^N you have 1 in 20^(N/2). For N=200, that’s still pretty small.
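
    The arithmetic in that last step can be checked directly. This is a minimal sketch that assumes, as the paragraph does, a 20-letter amino acid alphabet, N = 200, and that exactly half of the positions accept any residue; the function name `log10_sequence_count` is made up for the illustration.

```python
import math


def log10_sequence_count(constrained_positions: int, alphabet: int = 20) -> float:
    """Return log10 of alphabet ** constrained_positions."""
    return constrained_positions * math.log10(alphabet)


N = 200
full = log10_sequence_count(N)       # every position must be exactly right
half = log10_sequence_count(N // 2)  # only half the positions are constrained

print(f"1 in 20^{N} is roughly 1 in 10^{full:.0f}")       # exponent 260
print(f"1 in 20^{N // 2} is roughly 1 in 10^{half:.0f}")  # exponent 130
```

    So the "square root improvement" halves the exponent, from about 260 to about 130. The figure applies only to a blind, single-target search under these assumptions; it says nothing about searches that proceed stepwise or that can succeed by hitting any of many functions.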

    What it does mean, however, is that once you have a solution it is somewhat tolerant of small errors. I don’t think it will get you anywhere close to the smooth, continuous landscape necessary to produce a needed function from an unrelated protein (e.g. through gene duplication and mutation of the duplicate to something new).

  32. SCheesman: I’ve actually tried to allow for this type of redundancy in the substitution of amino acids in my English-language illustration. An interesting report published in the last couple of years relayed how the general meaning of a sentence can still be understood, even when a substantial number of the letters are replaced at random.

    That isn’t really a good analogy. When you talk of letters and meaning, you have a specific meaning that you want to encode. However, biological organisms might become more fit by improving efficiency at digesting grass. Alternatively, they might gain the ability to digest alfalfa. There are many possible alternatives. So it isn’t just redundancy in ways of solving a particular problem. Rather, there are many problems such that it might suffice to solve any one of them.

  33. Hello Neil

    Actually, I think the analogy is pretty good. Your assertion that “there are many problems such that it might suffice to solve any one of them” is more speculation than fact. I’m no expert, but of all the different proteins we know of, how many are multi-functional, serving different functions?

    Think of the cap proteins used in constructing the flagellum. They require a particular shape (star), with particular charges to anchor them in the lengthening flagellum, and their undersides must be the proper dimension to allow rocking in the open tube while allowing the passage of the flagellar proteins as they let one by at a time.

    Now maybe that cap could be used for something else, but it’s not. And maybe you could find a variation on that cap that would do the job nearly as well, or better, but both of those are tiny numbers to balance the near-infinitude of forms which don’t do the job at all.

    There are lots of ways of putting words together into a sentence, but when it comes to expressing a necessary idea, only a tiny subset of all possible orderings will do the job, and even tiny changes in order can reverse or destroy the intended meaning. There may be variations possible due to synonyms (or spelling variations), but that does not change the overall balance of meaning vs meaninglessness.

    Hmm, this analogy gets better and better, I say.

  34. So let’s go back to the original post’s question: why people who are skeptical about Darwinian evolution aren’t persuaded by the power of evolutionary algorithms to find creative solutions

    As simply as I can put it, that is because those evolutionary algorithms do not describe the evolution of life.

    GAs as they are used and successfully employed deal with continuous variables. The “forward model” is highly linear and differentiable.

    The evolution of DNA involves the mutation of a code. The variables cannot be ordered any more than you can order the pieces in a box of chocolates or the tools in a toolbox; there are no lower and upper bounds. You get only the “Monte-Carlo” part of the search. The forward model is frequently highly non-linear; changing one amino acid (the minimum “step”) can completely alter the folding and activity of the entire protein. You cannot, even in principle, come up with the equivalent of the Jacobian matrix used to invert multi-dimensional data.

    The last difference, as I’ve tried to explain above, is the solution space. If the space is not continuous, and you are forced to do Monte-Carlo type searches, the number of viable, useful solutions had better be large enough in comparison to the non-viable ones to give the search a chance of finding one with non-zero fitness, and it then must be connected enough to lead you to one with sufficient fitness for the purposes required.

  35. Sledgehammer,

    Firstly, and in general, visualizing the fitness function as a 2-dimensional surface embedded in a 3-dimensional space, when in fact protein evolution has many, many more than 2 degrees of freedom, leads to oversimplification of the complexity of the fitness function. Consider even a few more dimensions, and the fitness “surface” starts to look more like foam, with wormholes and tunnels connecting seemingly distant points in the fitness space.

    Hello Sledgehammer

    Your point is well taken. When I speak of the “needle”, I am considering the fitness function as plotted perpendicular to the parameter space, in the same way that for a single variable X the fitness is plotted on Y, or for two variables X and Y the fitness is plotted as Z. For the purposes of my analogy, all the N-parameter space is flattened to an N-dimensional surface with the fitness of the viable solutions appearing as a needle, or needles in the sea of non-solutions. So for that we are looking parallel to the surface.

    But if you then look straight down at the top of the needle(s) you might find the “foam” that you are talking about, with connections and filaments linking them. This is quite analogous to small changes in lettering or use of synonyms preserving the original meaning of a sentence.

    However, the requirement that a substantial fraction of the parameters are fixed isolates the solutions into needle-like foamy areas — the needle parts correspond to fixed parameters, the foam to those that can vary. The chance of locating the fixed parameters still makes this problem intractable for a blind search (at least that’s my assertion!), even as other parameters could be connected.

  36. SCheesman: Think of the cap proteins used in constructing the flagellum.

    You are already assuming that the goal is to invent a flagellum. But maybe the “goal” is merely to survive as a breeding population, and inventing a flagellum is only one of many possible changes that could enhance survival.

  37. SCheesman,

    …or needles in the sea of non-solutions.

    But nobody knows what quantity of (taking as a real example) theoretically possible protein sequences may be a solution (functional protein) in some situation (association with other proteins in a cell).

  38. Alan Fox,

    But nobody knows what quantity of (taking as a real example) theoretically possible protein sequences may be a solution (functional protein) in some situation (association with other proteins in a cell).

    Correct. We can at present really only investigate how much you can alter an existing protein to see how mutatable it is before it loses its function. Maybe in the future we will be able to do that, and we can actually decide between the options with some certainty. The car analogy might be a good one here. Given the same basic problem, design a way to transport several people from A to B. You get lots of variations, but some features seem to be pretty well indispensable or optimal (e.g. wheels). I wish I knew the field a bit better… how many proteins have been discovered? How many functions? Are there very different proteins doing the same things? If so, what percentage of the total? That sort of info might be helpful.

  39. Neil Rickert,

    You are already assuming that the goal is to invent a flagellum. But maybe the “goal” is merely to survive as a breeding population, and inventing a flagellum is only one of many possible changes that could enhance survival.

    I didn’t think I was implying that; I was really only interested in discussing this specific function and whether it could be performed by widely varying forms of proteins. My only point was that, due to its specificity, apparently not. If it could, that might make it easier to evolve in the first place. Complicating this is the fact that, as it stands, this protein is integrated into an entire system, including signals to activate it at the proper time so it performs its task as required. But that’s another issue.

  40. There’s a nice article on OOL in New Scientist today. Not available to non subscribers unfortunately.

  41. Thanks for the response, SCheesman.

    Am I understanding your description of the fitness dimension as being orthogonal to all the others, as in the mathematical meaning of orthogonality in a vector space, i.e. any given vector projects onto all other dimensions as a point? Your “needle” language seems to imply this:

    SCheesman: For the purposes of my analogy, all the N-parameter space is flattened to an N-dimensional surface with the fitness of the viable solutions appearing as a needle, or needles in the sea of non-solutions. So for that we are looking parallel to the surface.

    But if you then look straight down at the top of the needle(s) you might find the “foam” that you are talking about, with connections and filaments linking them.

    So I’m confused because the last quoted sentence seems to acknowledge paths in parameter space linking the functional “needle” islands, representing proteins with different functions.

    So am I to understand that when you use the “needle” metaphor, you are suggesting that there are no other paths through the parameter space (degrees of freedom) connecting the point representing a functional protein to any other protein with a similar function, and it is truly an isolated island of functionality, or do you allow for other points of varying fitness within the same “needle”?

    If it is the former that you are arguing, I can point out that this is unlikely to be the case for most biologically important proteins, and point to damitall’s excerpt from Axe, or to your own statement:

    SCheesman: The requirement for having specific amino acids is certainly variable. Some are quite critical, whereas others can sustain a fair amount of variability. The probability of getting a “working” protein for a given task is NOT as high as 1 in 20^N, where N = length of the protein.

    It seems to me that these all describe paths in just one of the available parameters (aa substitution) that connect points of varying fitness all within the same functional island and therefore the probability would be greater than 20^-N in a blind search, and with more degrees of freedom, possibly much greater.

    I can think of very few biologists who seriously suggest that protein evolution follows such a blind search, and most don’t even consider evolution to be a search, in the sense that there is a specific target function that evolution is trying to find. It’s more like a shotgun approach: launch loads of pellets in every which direction, and keep the ones that hit functional targets, any function, and then start from there with another sortie.

    Another thing that puzzles me:

    SCheesman: However, the requirement that a substantial fraction of the parameters are fixed isolates the solutions into needle-like foamy areas — the needle parts correspond to fixed parameters, the foam to those that can vary. The chance of locating the fixed parameters still makes this problem intractable for a blind search (at least that’s my assertion!), even as other parameters could be connected.

    Please explain the “requirement that a substantial fraction of the parameters are fixed”. Where does this requirement come from?

    My understanding is that with more degrees of freedom (such as starting with a shorter chain, a homologue, or even a structurally similar protein) the chances of finding an incremental path between proteins with different functions increase exponentially.

    I also fail to appreciate the distinction you made earlier between a continuous fitness landscape and one that has incremental steps. My view is that the important distinction is whether a path or paths exist linking points on the fitness function, not whether they are discrete or continuous, and clearly aa substitution is fundamentally discrete, as are all genetic manipulations, for that matter.

    Another argument often made (not implying that you are making this) is that any path linking functional proteins must necessarily entail functionality at every stage, otherwise, the argument goes, negative selection would weed out the non-functional intermediates. Yet if one considers all the ways known genetic processes can re-arrange nucleotides, the requirement that the starting sequence be functional need not be satisfied.

    Consider the vast resource represented by non-coding DNA (not “junk”, just not translated into proteins) that could be potentially co-opted through splicing mutations.

    Although these scenarios might be mere possibilities (“just so stories?”), to not even consider them in any probability estimates would lead to the “tornado in a junkyard” scenarios so often used in arguments from improbability.

  42. SCheesman,

    The business of mutating a sentence can be and has been done. In its simplest form it is the game of telephone, played by children. But it has also been done with written sentences.

    What is required is not a miracle, but a sufficiently powerful oracle. I’ve done it with individual words, and I’m pretty much an amateur at programming. Click my link.

    Many, if not most, living organisms are genetically unique. Humans, on average, carry several unique, new alleles. Life tolerates a lot of slop in the specifications. Changes are not usually catastrophic.

    One reason they are not catastrophic is that deadly mutations are weeded out by overproduction. In animal reproduction, billions of sperm cells are produced for every one that successfully engages in reproduction. There is a fecundity at least equivalent to that of bacterial populations. That is why the argument from low birth rate (compared to bacteria) is specious.

  43. David Fogel (1997) would disagree that evolutionary computation is limited to working on simple fitness functions.

    Real-world function optimization problems often (1) impose nonlinear constraints, (2) require payoff functions that are not concerned with least-squared error, (3) involve nonstationary conditions, (4) incorporate noisy observations or random processing, or include other vagaries that do not conform well to the prerequisites of classic optimization techniques. The response surfaces posed in real-world problems are often multimodal, and gradient-based methods converge rapidly to local optima (or perhaps saddle points), which may yield insufficient performance. For simpler problems, where the response surface is, say, strongly convex, evolutionary algorithms do not perform as well as traditional optimization methods (Back [4]). But this is to be expected as these traditional techniques were designed to take advantage of the convex property of such surfaces. Schwefel [22] has shown in a series of empirical comparisons that in the obverse condition of applying classic methods to multimodal functions, evolutionary algorithms offer a significant advantage. In addition, in the often-encountered case of applying linear programming to problems with nonlinear constraints, this offers an almost certainly incorrect result because the assumptions required for the technique are violated. In contrast, evolutionary computation can directly incorporate arbitrary linear and nonlinear constraints (Michalewicz [5]).

    That same year, I challenged William Dembski to show where the specific information of good approximate solutions to the TSP found by GA came from. He has never taken that up, other than his on-the-spot dismissal that his “explanatory filter” logic was fine, empirical evidence to the contrary be damned. The TSP is NP-hard, which has the import that no one has yet discovered an efficient method to find optimal solutions. Evolutionary computation is not limited to “easy” problems for its utility.
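
    For readers who have not seen evolutionary search applied to the TSP, here is a minimal sketch. It is not the method from Fogel's paper or from any particular GA library: it is a mutation-only evolutionary loop (segment reversal plus truncation selection, no crossover), and the city layout (12 points on a circle), the function names, the population size, and the generation count are all assumptions chosen to keep the example small.

```python
import math
import random


def tour_length(tour, cities):
    """Total length of the closed tour that visits cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def mutate(tour, rng):
    """Reverse a randomly chosen segment of the tour (a 2-opt style move)."""
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def evolve_tour(cities, pop_size=30, generations=300, seed=0):
    """Mutation-plus-selection search; returns (best tour, best initial length)."""
    rng = random.Random(seed)
    base = list(range(len(cities)))
    population = [rng.sample(base, len(base)) for _ in range(pop_size)]
    start_best = min(tour_length(t, cities) for t in population)
    for _ in range(generations):
        children = [mutate(t, rng) for t in population]
        # Parents and children compete; only the shortest tours survive.
        ranked = sorted(population + children, key=lambda t: tour_length(t, cities))
        population = ranked[:pop_size]
    return population[0], start_best


# Hypothetical test case: 12 cities evenly spaced on the unit circle.
cities = [
    (math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
    for k in range(12)
]
best, start_best = evolve_tour(cities)
print(f"best random starting tour: {start_best:.3f}")
print(f"best evolved tour:         {tour_length(best, cities):.3f}")
```

    The fitness function here only measures tour length; it contains no description of any particular good tour. The shorter tours are produced by the loop of variation and selection, which is the distinction between a program and its output that this thread keeps returning to.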

  44. Hello Petrushka

    I accept all you have stated above. Sequences have a fair amount of wiggle-room once located. Not all mutations are fatal. Bad errors are eliminated. The post, however, was asking why GAs were not accepted as a valid model for evolution of life, which involves locating entirely new solutions. Start your GA off with the initial condition of a working protein. Will it come up with a different one?

  45. I too always find it amazing that an IDer can say that the results of GAs are designed. When using this argument, they are basically using the same frontloading argument that they use for the products of evolution over time. “Oh, legs, those were frontloaded into the genetic code of some fish by the designer.”

    If you consider that GAs always produce results more quickly, and better, than systems designed by teams of engineers, then the whole front-loading thing is obviously silly.

    What ID proponents are saying is that a programmer can design a system that is better than a team of experts in the field (and in at least one case, that teams of experts in the field can’t even understand the resulting system). In everything from aerospace craft design, optics, scheduling, stock market analysis, etc… a few programmers are smarter in that field than all the experts in that field.

    This obviously makes no sense. I think I have an explanation that helps explain why GAs work. It’s here: http://ogremk5.wordpress.com/2011/08/08/why-genetic-algorithms-are-not-examples-of-intelligent-design/

    P.S. I have enjoyed the reports of your efforts on UD. I avoid the place myself. I’ll also add that the tactics currently on display seem to be common among pro-ID people. It’s unfortunate, but thank you for showing them what losers they are.

  46. Hello Mr. Elsberry

    None of what I have stated previously excluded use of any of the 4 “enhancements” you have noted. GAs can be extremely clever in navigating fitness surfaces, no matter how complex. I readily accept that once a surface is located, it is only a matter of time and computation until you find the optimum (or a near-optimum) solution to the problem at hand.

    My objection to using GAs as evidence for the viability of macro-evolution is that successful GAs work on continuous variables with locally differentiable surfaces. The evolution of life, however, requires the random mutation of a code in a space where individual solutions are separated by vast expanses of zero fitness (no surface texture at all), where it is impossible to extract directional information, and where the resources for blind search through a Monte-Carlo style parameter sweep are hopelessly inadequate.

  47. Elizabeth: I’m trying to think of a search space that wouldn’t be easily traversed by a Darwinian algorithm.

    Most of them.

    Chaotic landscapes are usually not traversable in any meaningful sense. On the other hand, the natural environment is highly ordered. For instance, if there is one richly-energized photon coming from a particular direction, there is a high likelihood of another coming from the same direction (sunlight).
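
    The contrast between ordered and chaotic landscapes can be sketched with a toy mutate-and-select climber on bit strings. Everything here is an assumption made for illustration: the 20-bit genome, the "ordered" fitness (fraction of 1 bits, so neighbours have similar fitness), the "chaotic" fitness (an unrelated pseudo-random value per genome), and the names `ordered_fitness`, `chaotic_fitness`, and `climb`.

```python
import random

BITS = 20


def ordered_fitness(genome):
    """Fraction of 1 bits: flipping one bit changes fitness only slightly."""
    return sum(genome) / BITS


def chaotic_fitness(genome):
    """An unrelated pseudo-random value per genome (repeatable, no structure)."""
    as_int = int("".join(map(str, genome)), 2)
    return random.Random(as_int).random()


def climb(fitness, steps=2000, seed=1):
    """Flip one random bit per step; keep the child if it is not worse."""
    rng = random.Random(seed)
    genome = [0] * BITS
    start = fitness(genome)
    current = start
    for _ in range(steps):
        child = genome.copy()
        child[rng.randrange(BITS)] ^= 1  # single point mutation
        child_fitness = fitness(child)
        if child_fitness >= current:     # selection step
            genome, current = child, child_fitness
    return start, current


ordered_start, ordered_end = climb(ordered_fitness)
chaotic_start, chaotic_end = climb(chaotic_fitness)
print(f"ordered landscape: {ordered_start:.2f} -> {ordered_end:.2f}")
print(f"chaotic landscape: {chaotic_start:.2f} -> {chaotic_end:.2f}")
```

    On the ordered landscape every improvement leaves further improvements one step away, so the climber keeps going to the top. On the chaotic one a neighbour's fitness says nothing about what lies beyond it, so the climber can only stop at whatever local peak it happens to reach. Which of the two natural environments resemble is the empirical question.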
