My husband, mother, father, myself, and my four-year-old son were going out for a walk. It was raining. My son refused (as usual) to wear his raincoat. Instead, he carried a cup, which he held out in front of him. He argued that he was going to catch the raindrops in the cup so that by the time he got to the place the raindrops had been, they’d be in the cup and he’d be dry. Half an hour later, four adults were still standing around, drawing diagrams on the backs of envelopes, arguing about Pythagoras and trigonometry, all to no avail. We went out, with cup, sans raincoat. My son got wet. He insisted he remained dry.
I’ve got as far as Chapter 5 of Dembski’s book No Free Lunch, the chapter called Evolutionary Algorithms, about which he says in his Preface: “This chapter is the climax of the book”. In it, he claims: “An elementary combinatorial analysis shows that evolutionary algorithms can no more generate specified complexity than can five letters fill ten mailboxes.”
I think he’s making the same kind of error as my son made.
Many people, including people with far more mathematical expertise than I (real mathematicians among them), have had a go at Dembski’s NFL arguments, but I’m going to wade in anyway.
What I do like about Dembski’s writing is that he commits himself to clear operational definitions. He defines “evolutionary algorithms” as “any well-defined mathematical procedure that generates contingency via some chance process and then sifts it via some law-like process.” It’s a shame that he is so hyperfocussed on Dawkins’ Weasel, which is a highly atypical example of evolutionary algorithms, and differs from those postulated to be applicable to biology in many important ways, but I’ll start there anyway.
The first confusion (apart from the latching issue, which I’ll ignore) that I think Dembski introduces is on page 188 in my edition, where he is describing “phase space”, the “space” of all possible strings of given length (28 in the case of Weasel, 500 in my own example) consisting of an alphabet of given size (26 letters + space in Weasel; Heads and Tails in mine):
If you think of the phase space as a giant (but not infinite) plane…
On the previous page, Dembski says:
The univalent measure defined with reference to the phase space and needing to be optimized is known as the fitness function (also fitness measure or fitness landscape).
implying that these three terms are interchangeable. This is important, because if “fitness landscape” means the same as “fitness function” it would imply a fitness topography that is independent of the topology of “phase space”.
On the topology of “phase space”, he writes
…(1) any point in phase space has zero distance from itself; (2) the distance between two points does not depend on the order in which one considers them (e.g. flying from Atlanta to Dallas is the same distance as flying from Dallas to Atlanta); and (3) the direct distance between two points is never bigger than the distance of going through some intermediate point.
However, if we are talking about every possible linear sequence of N characters drawn from an alphabet of M letters, there is more to the topology of phase space than that, and “plane” seems to me a misleading image. I’m going to use my example, rather than Weasel, to explain what I mean, because it’s simpler (only two “letters” in its “alphabet”, namely Heads and Tails).
Let’s (imaginatively) plot every possible sequence of Heads and Tails in a 500 coin-toss series (i.e. 2^500 possible sequences) on the surface of a globe. First we put 500 Heads at the North Pole and 500 Tails at the South Pole. Going to the South Pole, and heading North, we will first of all meet a latitude in which all the sequences have 499 Tails and one Heads, and there will be 500 of these, each with the single Heads at a different position. Heading North again, we will next meet a latitude in which all the sequences have 498 Tails and two Heads, and so on. All sequences with 250 Heads and 250 Tails will be systematically arranged around the equator.
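To make the sizes of those latitudes concrete, here is a minimal sketch that counts how many sequences sit on each one. The function name latitude_size is my own invention for illustration; the only thing it relies on is the standard binomial coefficient.

```python
from math import comb

N = 500  # number of coin tosses in the example

def latitude_size(heads: int, n: int = N) -> int:
    """Number of length-n sequences with exactly `heads` Heads."""
    return comb(n, heads)

# South Pole, then the first two latitudes heading north.
print(latitude_size(0), latitude_size(1), latitude_size(2))  # 1 500 124750

# Every sequence sits on exactly one latitude, so the sizes add up to 2**N.
total = sum(latitude_size(k) for k in range(N + 1))
print(total == 2 ** N)  # True
```

The equator (250 Heads) is by far the most crowded latitude, and the poles hold a single sequence each.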
Now if we regard each sequence as being at a node on a graph, and connect with a line (an “edge”) all points that only differ at one position, we will find that the sequence at the South Pole is connected to every node (500 sequences) in the latitude immediately to the north of it, and each of these nodes in turn is connected to 499 of the (500*499)/2 nodes at the latitude one step more northerly still, and so on. This means that we can travel from every node to every other node by a series of “edges”. If we want to travel from the South Pole to the North Pole, we can do so by any one of a vast number of routes, the shortest of which will use steps that are always to a more northerly latitude. In mutation terms, this means that every point in each latitude is a single point mutation away from many points on the next more northerly or southerly latitude.
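The connectivity just described can be checked directly. This is a small sketch, not anything from Dembski’s book: the helper names point_neighbours and hamming are hypothetical, and sequences are represented as strings of "H" and "T".

```python
def point_neighbours(seq: str) -> list[str]:
    """All sequences reachable from seq by flipping exactly one coin."""
    flip = {"H": "T", "T": "H"}
    return [seq[:i] + flip[seq[i]] + seq[i + 1:] for i in range(len(seq))]

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

south = "T" * 500
north = "H" * 500

first_ring = point_neighbours(south)
print(len(first_ring))  # 500 neighbours, each with a single Heads

# From any one-Heads node, 499 edges lead north and one leads back south.
northward = [s for s in point_neighbours(first_ring[0]) if s.count("H") == 2]
print(len(northward))  # 499

# The shortest pole-to-pole route needs one flip per position.
print(hamming(south, north))  # 500
```

The shortest route between any two nodes under this mutation type is simply the number of positions at which they differ.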
We also use a “spring embedding” algorithm in order to arrange the nodes at each latitude in a way that physically minimises the length of each edge. This will make it easier to envisage the shortest routes between any pair of nodes.
So that is how I am envisaging the topology of phase space – as nodes on a globe, each of which is connected by a single point mutation to 500 other nodes (one for each position that can be flipped).
So what about the Fitness Function/Landscape? In Weasel this is dead easy, and the equivalent in my example would be simply specifying a target sequence and a starting location. Let’s say I choose as my target sequence, the sequence at the North Pole (500 Heads) and I make life as difficult for my Traveller as possible by starting at the South Pole (500 Tails). Now, for this target, my fitness function can simply be the sum of Heads (as in Weasel – the sum of letters in the correct position). But what of the fitness landscape? The topography?
If we imaginatively represent the sums of Heads (i.e. the fitness function) for each node as shades of grey, in which all-Tails is white, and all-Heads is black, then when we look at our globe, we will see a white South Pole, a black North Pole, and a mid grey equator, with all other latitudes smoothly graded in between. Or, if we liked, we could represent the fitness function as metres below sea level, and simply let our Traveller (let’s put her on a skate-board) roll straight downhill from South to North (hence the terms “landscape” and “topography”).
We could make it more difficult by making the target sequence some sequence on the equator, and the starting sequence some randomly chosen node anywhere on the globe. Now the fitness function cannot simply be “sums of heads”. This time we have to sum the number of positions in each sequence that are shared with the target sequence. Now the dark or deep nodes (high fitness) will be clustered around the target-sequence in a roundish dark patch extending north-south as well as east and west from the target. And again, there will be a smooth gradation of grey over the globe, and a smooth downhill run from any point on the globe towards the target.
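Here is a minimal sketch of that downhill run, assuming single point mutations and the “count the shared positions” fitness function just described. The names matches and climb, the fixed random seed, and the rule of accepting only strict improvements are all my own choices for illustration, not anything specified by Weasel or by Dembski.

```python
import random

def matches(seq, target):
    """Fitness: number of positions where seq agrees with target."""
    return sum(a == b for a, b in zip(seq, target))

def climb(start, target, rng, max_steps=100_000):
    """Flip one random coin per step; keep the flip only if fitness rises."""
    current = list(start)
    score = matches(current, target)
    for step in range(max_steps):
        if score == len(target):
            return "".join(current), step
        i = rng.randrange(len(current))
        candidate = current.copy()
        candidate[i] = "T" if candidate[i] == "H" else "H"
        new_score = matches(candidate, target)
        if new_score > score:
            current, score = candidate, new_score
    return "".join(current), max_steps

rng = random.Random(0)  # fixed seed so the run is repeatable

# A target on the equator: 250 Heads and 250 Tails in shuffled order.
coins = list("H" * 250 + "T" * 250)
rng.shuffle(coins)
target = "".join(coins)

# A start chosen at random from anywhere on the globe.
start = "".join(rng.choice("HT") for _ in range(500))

found, steps = climb(start, target, rng)
print(found == target, steps)
```

Because every mismatched position can be fixed by one flip, there is always an uphill (or, on the skate-board picture, downhill) neighbour until the target is reached, which is exactly what makes this landscape smooth.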
And, as Dembski justifiably says, this is cheating. At one level it’s cheating because we’ve specified the target in the fitness function (the “information” has been “smuggled in”) as a specific sequence against which intermediate sequences are evaluated. At a more important level, it’s cheating because we’ve designed the fitness function so that when applied to the phase-space topology there is a straight downhill run to the target.
But what I want to show is that it is perfectly possible to a) specify the target in the fitness function, and NOT have the evolutionary algorithm find it, and b) not specify the target in the fitness function and STILL have the evolutionary algorithm find it. The issue, in other words, I would argue, is not whether you specify the target in the fitness function, but the topography of the fitness landscape, which is, in turn, a function of the topology of the phase space – the very aspect of its topology that Dembski ignores.
To take a) first. Let’s suppose that the target is some sequence with approximately equal numbers of Heads and Tails (perhaps a Shakespearean phrase rendered in Morse, with runs-of-heads equal to sound and runs-of-tails equal to silence). Using our old phase space, this target will be a dark patch somewhere on the globe representing a low point to which all things tend to fall.
However, let’s change the mutation system. This time, instead of single point mutations, the only mutations allowed will be single change-of-place mutations (the value H/T at place i trades with the value H/T at place j). If we now reconnect our nodes that are one-mutation away from each other, and reapply the spring-embedding, we will have something quite different from the single-point-mutation plotting. Our poles will be completely disconnected from the rest of the globe and from each other, and worse, all latitudes will be disconnected from each other as well. Worse still, what was a nice focussed dark patch representing our Shakespearean phrase in Morse will be a scattered mess of dark spots. This means that even if, by chance, you start at the same latitude as the target, there won’t be a nice straight downhill ride towards it, but a roller-coaster. In other words, there is a very small probability that a series of mutations, even if selected for their similarity to the target (in terms of how many locations in the string have a value shared with the target) will actually get there. The target, despite being specifically encoded within the fitness function, is now Irreducibly Complex. That doesn’t mean it can’t be found, but the probability is far lower than it would be when situated within a smooth fitness landscape.
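The disconnection is easy to verify on a toy globe. This is a sketch under my own assumptions: six coins instead of 500 so that everything can be enumerated, and hypothetical helper names swap_neighbours and reachable. It checks only the claim that swap mutations never cross latitudes; it says nothing about how rough the ride is within a latitude.

```python
from itertools import combinations

def swap_neighbours(seq: str) -> set[str]:
    """Sequences reachable by trading the values at two positions."""
    out = set()
    for i, j in combinations(range(len(seq)), 2):
        if seq[i] != seq[j]:  # swapping equal values changes nothing
            s = list(seq)
            s[i], s[j] = s[j], s[i]
            out.add("".join(s))
    return out

def reachable(start: str) -> set[str]:
    """Every sequence reachable from start by any number of swaps."""
    seen = {start}
    frontier = [start]
    while frontier:
        nxt = []
        for s in frontier:
            for t in swap_neighbours(s):
                if t not in seen:
                    seen.add(t)
                    nxt.append(t)
        frontier = nxt
    return seen

# The poles have no neighbours at all under swap mutations.
print(swap_neighbours("TTTTTT"))  # set()

# From the equator of a six-coin globe we reach all 20 equatorial
# sequences, and nothing on any other latitude.
equator = reachable("HHHTTT")
print(len(equator), {s.count("H") for s in equator})  # 20 {3}
```

A swap never changes the number of Heads, so a Traveller who starts on the wrong latitude can never reach the target, however the fitness function scores her.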
And this is why fitness landscape and fitness function are not, contra Dembski, the same thing.
To take the second scenario, b: this time the target is as in my exercise: it’s a subset of phase space, not a single target, but the subset has been chosen to be of a size that means that, according to Dembski’s paper Specification: the Pattern that Signifies Intelligence, it has Specified Complexity (i.e. it’s a very small subset of the whole of phase space). In other words there are several nodes on the globe that are near-black (have the target fitness), and all other nodes are shades of grey. If we take our first mutation type (the one in which the edges connect sequences a single point mutation apart), it turns out that the black target nodes are in northern temperate latitudes (more Heads than Tails), but rather more scattered as to longitude, although still clustered. However, although scattered, the whole cluster is fairly dark grey, and viewed from a distance, there is a definite east-west dark stripe with shades of grey elsewhere varying from white at the South Pole, as before, to mid grey at the North Pole, at the Equator, and so on. So while the odds of the traveller rolling into the very blackest hole are fairly small, the odds of it rolling into a supra-threshold dark hole are really quite large.
Yet none of these holes are specified in the Fitness Function – what is specified is the properties that the sequence must have, not the sequence itself.
And we can increase the probability that the traveller will roll into a dark hole still further by adding mutation types (and thus more edges), or increasing (up to a point) the likely number of mutations in a single iteration. All these will increase the connectivity between nodes, and increase the probability that there will be a downhill path (via the edges) from the starting node to one of the target nodes.
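As a rough illustration of how adding a mutation type adds edges, this sketch counts the neighbours of one node on a six-coin globe under point mutations alone, swap mutations alone, and both together. The helper names are hypothetical, and a bigger neighbour count only makes a downhill path more likely; it does not guarantee one.

```python
from itertools import combinations

def point_neighbours(seq: str) -> set[str]:
    """Sequences one single-coin flip away from seq."""
    flip = {"H": "T", "T": "H"}
    return {seq[:i] + flip[seq[i]] + seq[i + 1:] for i in range(len(seq))}

def swap_neighbours(seq: str) -> set[str]:
    """Sequences reachable by trading the values at two positions."""
    out = set()
    for i, j in combinations(range(len(seq)), 2):
        if seq[i] != seq[j]:
            s = list(seq)
            s[i], s[j] = s[j], s[i]
            out.add("".join(s))
    return out

seq = "HHHTTT"
point = point_neighbours(seq)
swap = swap_neighbours(seq)
both = point | swap  # allow either kind of mutation in one step

print(len(point), len(swap), len(both))  # 6 9 15
```

The two neighbour sets never overlap (a flip changes the number of Heads, a swap does not), so every extra mutation type here is pure gain in connectivity.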
And if Dembski objects that I have still “smuggled in” information about the target sequence by specifying the properties it must have, I plead guilty, but respond that: All I have “smuggled in” is the problem to be solved. I have no more “smuggled in” the solution(s) to the problem than an examiner “smuggles in” the answers on a math exam in the guise of the questions asked. The fact that at least one solution exists is neither here nor there. If I had “set” an impossible problem (one to which there is no solution), no algorithm is going to find a solution, clearly. The reason we use evolutionary algorithms practically is not so that we can print a phrase of Shakespeare we already know, but to find a solution we don’t know to a problem we want to solve. To gain information.
So Dembski, it seems to me, has conflated fitness function with fitness landscape and, as a result, rather like my son, has failed to note that the fitness landscape is in part a function of an aspect of phase space topology he hasn’t even considered (how the elements of phase space are connected in any given physical system). And, presumably as a result of that, he has failed to note a second point. It may be true that, averaged across all possible fitness landscapes, evolutionary algorithms fare no better than “random search” (as the NFL theorems state). That average includes landscapes in which the dot-shades are uniformly scattered across the globe, giving a landscape rather like Bryce Canyon, pictured above, and, among those, landscapes in which all dots are either white or black, or all dots bar one are white (a uniform surface with a single deep well). But there are many naturally occurring fitness landscapes, including those in biology, where phase space has the kind of highly interconnected topology (many sequences one step apart) that will tend to make “problem space” (the fitness landscape) relatively smooth for many fitness functions, and thus make finding a solution to a target problem fairly tractable for evolutionary algorithms.
Including solutions that have “specified complexity” of chi>1 and are therefore in Dembski’s rejection region for non-design.