The Law of Conservation of Information is defunct

About a year ago, Joe Felsenstein critiqued a seminar presentation by William Dembski, “Conservation of Information in Evolutionary Search.” He subsequently discussed Dembski’s primary source with me, and devised a brilliant response, unlike any that I had considered. This led to an article at The Panda’s Thumb, due mostly to Felsenstein, though I contributed. Nine days after it appeared, Dembski was asked in a radio interview whether anyone was paying attention to his technical work. Surely a recipient of

qualifies as a someone. But Dembski changed the topic. And when the question came around again, he again changed the topic. Mind you, this isn’t how I know that Felsenstein blasted conservation of “information,” which is not information, in evolutionary “search,” which does not search. It’s how I know that Dembski knows.

Or, I should say, it’s how I first knew. The Discovery Institute has since employed Dembski’s junior coauthor, Winston Ewert, to quietly replace various claims, including the most sensational of them all (Dembski and Marks, “Life’s Conservation Law: Why Darwinian Evolution Cannot Create Biological Information,” 2010; preprint 2008):

Though not denying Darwinian evolution or even limiting its role in the history of life, the Law of Conservation of Information shows that Darwinian evolution is inherently teleological. Moreover, it shows that this teleology can be measured in precise information-theoretic terms.

Felsenstein realized that we could apply their measure to a simple model of evolution by natural selection, devoid of purpose, and obtain a large quantity. According to the model, evolution begins with a random genotype, and ends with a genotype fitter than all of its neighbors. The neighbors of a genotype are those that can arise from it by mutation at a single point. In each step of the evolutionary process, a genotype is replaced by the fittest of its neighboring genotypes. The overall result of evolution is a sequence of genotypes that is unconstrained in how it begins, and highly constrained in how it ends. Each genotype in the sequence is not only fitter than all of the genotypes that precede it, but also fitter than all of their neighbors. That is, evolution successively constrains the genotype to smaller and smaller subsets of the space of genotypes. The final genotype is at the very least fitter than all of its neighbors. Equivalently, the minimum degree of constraint is the neighborhood size. Dembski and Marks mistake this for the degree of teleology (purpose) in evolution, and refer to it as active information. The gist of “conservation of information” is that teleology comes only from teleology. As Dembski said in his seminar presentation:

If you will, the teleology of evolutionary search is to produce teleology.

Considering that the neighborhood size indicates only how many, not at all which, genotypes are eliminated in a single step of evolution, there can be no argument that constraint implies purpose.1 Ewert does not hazard an overt reply, but in fact responds by downgrading “active information” from a measure of teleology to a measure of bias. The new significance of “conservation of information” is this: if the constraint, er, bias of a natural process is not due to design, then nature itself must be constrained, er, biased.2 We have it from Ewert, writing on behalf of Dembski and Marks, that:

Of course, Darwinian evolution is not a teleological process and does not search for a goal [e.g., birds…] Whatever search or process might be in play, … it produces birds much more often than chance would otherwise lead us to predict. It is this bias towards producing a bird that we call active information. […] Having postulated Darwinian evolution, … the fact that birds exist has to be explained in terms of the initial configuration of the universe. The universe must have begun with a large amount of active information with respect to the target of birds.

Although “information” stumbles on, searching for brains to eat, the vital principle has departed from the Law of Conservation of Information (LCI). No more does LCI show what it shows. The credit for dispatching teleology goes entirely to Joe Felsenstein. You should have a look at his latest, “Why ID Advocates Downplay Our Disagreement With Them,” before watching me deliver a round to the frontal lobe of the Conservation of Information Theorem.

The credit for keeping things surreal goes entirely to the Discovery Institute. Replacing Dembski, a full-time senior fellow, with Ewert in an exchange with a renowned evolutionary geneticist is beyond bizarre. But it is perhaps no accident that the authorship of the response serves the same purpose as its rhetorical tactics, namely, to conceal the presence of huge concessions. What Ewert does, avoiding all signs of what he’s doing, is to undertake salvage of Dembski’s treatment of LCI in Being as Communion: A Metaphysics of Information (2014). Rather than identify a source, he speaks from authority. Rather than replace terms that convey precisely the misconceptions in the book, he explains matter-of-factly that they don’t mean what they seem to say. And rather than admit that Felsenstein and I set him and his colleagues straight on the meanings, Ewert proclaims that “These Critics of Intelligent Design Agree with Us More Than They Seem to Realize.” The way he trumps up agreement is to treat a single section of our article, which merely reiterates an old point regarding the smoothness of fitness landscapes, as though it were the whole. We actually focus on an arbitrary, and hence arbitrarily rough, landscape.

LCI, putatively a law of nature, putatively has a mathematical foundation. According to Being as Communion (p. 148):

A precise theoretical justification for the claim that natural selection is inherently teleological comes from certain recent mathematical results known as Conservation of Information (CoI) theorems.

Now the claim is that natural selection is inherently biased, and that something must account for the bias — either design or the initial “configuration” of the Universe (wink wink, nudge nudge) — given that bias is conserved. In short, CoI still applies, with the understanding that I is for bIas. Dembski places his work in the context of earlier analysis of search, and mentions a sometime theorist you’ve heard of before (p. 151):

Computer scientist Thomas English, in a 1996 paper, also used the term “Conservation of Information,” though synonymously with the then recently proved results by Wolpert and Macready about No Free Lunch (NFL). In English’s version of NFL, “the information an optimizer gains about unobserved values is ultimately due to its prior information of value distributions.”

I actually proved an NFL theorem more general than that of Wolpert and Macready, and used the term “conservation of information” to characterize an auxiliary theorem. Although I got the math right, what I wrote about it in plain language was embarrassingly wrong. I happened to emend my online copy of the paper a month before Dembski’s book appeared, adding a preface titled “Sampling Bias Is Not Information.” So, while it definitely was Felsenstein who left Dembski et al. no choice but to abandon teleology, it may be that I had some influence on their choice of a new position. In any case, it falls to me to explain why they are embarrassingly wrong in what they claim about math that they have gotten right.

The right approach, for a general readership, is to address only what is most obviously wrong, and to put as much as possible into pictures. We’ll be looking at broken sticks. We’ll even watch them breaking randomly to pieces. This is how Dembski et al. see the biases of an evolutionary process being determined, in the absence of design. CoI tells us something about the random length of a particular segment, selected before the stick breaks. But Felsenstein and I selected an outcome after modeling the evolutionary process. We targeted an outcome for which the bias was large. The bias was not large because we targeted the outcome. Even if we pretend that a broken stick determined the bias of the evolutionary process, CoI does not apply. The theorem that does apply has no name. It is the solution to Exercises 666-667 in a highly respected text of the 19th Century, Choice and Chance. Given that it bears the Number of the Beast, and comes from the Reverend William Allen Whitworth, I’m tempted to call it the Revelation Theorem. But I’ll avoid giving offense, and refer instead to the Broken Stick Theorem.

Breaking sticks

Dembski et al. believe that CoI applies to all physical events that scientists target for investigation. The gist of their error is easy to understand. A scientist is free to investigate any event whatsoever after observing what actually occurs in nature. But the CoI theorem assumes that a particular event is targeted prior to the existence of a process. This is appropriate when an engineer selects a process in order to generate a prespecified event, i.e., to solve a given problem. It is no coincidence that the peer-reviewed publications of Dembski et al. are all in the engineering literature. The assumption of the theorem does not hold when a scientist works in the opposite direction, investigating an event that tends to occur in a natural process. Put simply, there is a difference between selecting a process to suit a given target and selecting a target to suit a given process. The question, then, is just how big the difference is. How badly wrong is it to say that the CoI theorem characterizes conservation of bias in nature? Fortunately, the error can be conveyed accurately with pictures. What we shall see is not conservation, but instead unbounded growth, of the maximum bias (“active information”).


ETA: Text between the horizontal rules is an improved introduction to the technical material, developed in discussion here at TSZ. It comes verbatim from a comment posted a week ago. I’ve made clear all along my intent to respond to feedback, and improve the post. However, I won’t remove any of the original content, because that’s too easily spun into a retraction.

Dembski et al. represent natural processes abstractly. In their math, they reduce the evolutionary process to nothing but the chances of its possible outcomes. The CoI theorem is indifferent to what the possible outcomes actually are, in physical reality, and to how the process actually works such that the outcomes have the chances of occurrence that they do. Here I assume that there are only 6 possible outcomes, arbitrarily named 1, 2, 3, 4, 5, 6. The possible outcomes could be anything, and their names say nothing about what they really are. Each of the possible outcomes has a chance of occurrence that is no less than 0 (sure not to occur) and no greater than 1 (sure to occur). The chances of the possible outcomes are required to add up to 1.

As far as the CoI theorem is concerned, an evolutionary process is nothing but a list of chances that sum to 1. I’ll refer to the list of chances as the description of the process. The first chance in the description is associated with the possible outcome named 1, the second chance in the description is associated with the possible outcome named 2, and so forth. The list

    \[.1, \quad .3, \quad .1, \quad .2, \quad .1, \quad .2\]

is a valid description because each of the numbers is a valid chance, lying between 0 and 1, and because the total of the chances is 1. We can picture the description of the evolutionary process as a stick of length 1, broken into 6 pieces.
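
As a minimal sketch of this bookkeeping (the function and variable names are mine, not from Dembski et al.), a list qualifies as a description exactly when every entry is a valid chance and the entries sum to 1. Exact fractions avoid floating-point rounding in the sum:

```python
from fractions import Fraction

def is_valid_description(chances):
    """A description: a list of chances, each between 0 and 1, summing to 1."""
    return all(0 <= c <= 1 for c in chances) and sum(chances) == 1

# The description pictured above: .1, .3, .1, .2, .1, .2
description = [Fraction(n, 10) for n in (1, 3, 1, 2, 1, 2)]
print(is_valid_description(description))  # True
```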

[Need a new figure here.]

Naming the segments 1, 2, 3, 4, 5, 6, from left to right, the length of each segment indicates the chance of the possible outcome with the corresponding name. Consequently, the depiction of the evolutionary process as a broken stick is equivalent to the description of the process as a list of the chances of its possible outcomes.

You perhaps wonder how I would depict the evolutionary process as a broken stick if a “possible” outcome had absolutely no chance of occurring. And the answer is that I could not. There is no segment of length 0. In the CoI theorem, however, chances precisely equal to 0 are effectively ruled out. Thus it is not misleading to say that Dembski et al. reduce the evolutionary process to a broken stick.

There are infinitely many ways to break our metaphorical stick into a given number of segments. Averaging over all of them, the lengths of the segments are

    \[\frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}.\]

That is, in the average description of an evolutionary process, the possible outcomes are uniform in their chances of occurrence. Dembski et al. usually advocate taking uniform chances as the standard of comparison for all processes (though they allow for other standards in the CoI theorem). Dembski and Marks go much further in their metaphysics, claiming that there exist default chances of outcomes in physical reality, and that we can obtain knowledge of the default chances, and that deviation of chances from the defaults is itself a real and objectively measurable phenomenon. Although I want to limit myself to illustrating how they have gone wrong in application of CoI, I must remark that their speculation is empty, and comes nowhere close to providing a foundation for an alternative science. Otherwise, I would seem to allow that they might repair their arguments with something like the Broken Stick Theorem.
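
The claim about the average is easy to check with a quick Monte Carlo sketch (my code, under the conventional model of breaking a unit stick at 5 uniformly random points):

```python
import random

random.seed(0)

def break_stick(n):
    """Break a unit stick into n pieces at n - 1 uniformly random points."""
    points = [0.0] + sorted(random.random() for _ in range(n - 1)) + [1.0]
    return [points[i + 1] - points[i] for i in range(n)]

trials = 100_000
totals = [0.0] * 6
for _ in range(trials):
    for i, length in enumerate(break_stick(6)):
        totals[i] += length

averages = [round(t / trials, 3) for t in totals]
print(averages)  # each average is close to 1/6, i.e., about 0.167
```

Individual segment lengths vary wildly from one broken stick to the next; only the averages settle to 1/6.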

Taking uniform chance as the standard to which all evolutionary processes are compared, we naturally arrive at an alternative representation. We begin by writing the standard description a bit differently, multiplying each of the chances by 1.

    \[1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}.\]

Now we can write any description whatsoever by adjusting the multipliers, while leaving the fractions 1/6 just as they are. The trick is to multiply each of the chances in the description by 1, but with 1 written as 6 \times 1/6. For instance, the description

    \[\frac{1}{24}, \quad \frac{1}{3}, \quad \frac{1}{12}, \quad \frac{1}{4}, \quad \frac{1}{6}, \quad \frac{1}{8}\]

is equivalent to

    \[\frac{6}{24} \times \frac{1}{6}, \quad \frac{6}{3} \times \frac{1}{6}, \quad \frac{6}{12} \times \frac{1}{6}, \quad \frac{6}{4} \times \frac{1}{6}, \quad \frac{6}{6} \times \frac{1}{6}, \quad \frac{6}{8} \times \frac{1}{6}.\]

The multipliers

    \[\frac{6}{24}, \quad \frac{6}{3}, \quad \frac{6}{12}, \quad \frac{6}{4}, \quad \frac{6}{6}, \quad \frac{6}{8}\]

are the biases of the process, relative to the standard in which the chances are uniformly 1/6. The process is biased in favor of an outcome when the bias is greater than 1, and biased against an outcome when the bias is less than 1. For instance, the process is biased in favor of outcome 4 by a factor of 6/4 = 1.5, meaning that the chance of the outcome is 1.5 times as great as in the standard. Similarly, the process is biased against outcome 1 by a factor of 24/6 = 4, meaning that the chance of the outcome is 6/24 = 0.25 times as great as in the standard. The uniform standard is unbiased relative to itself, with all biases equal to 1.

The general rule for obtaining the biases of an evolutionary process, relative to the uniform standard, is to multiply the chances by the number of possible outcomes. With 6 possible outcomes, this is equivalent to scaling the broken stick to a length of 6. We gain some clarity in discussion of CoI by referring to the biases, instead of the chances, of the evolutionary process. The process is metaphorically a broken stick, either way. Whether the segment lengths are biases or chances is just a matter of scale. We shall equate the length of the stick to the number of outcomes, and thus depict the biases of the process, for and against the possible outcomes corresponding to the segments.
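
In code, the conversion from chances to biases is a one-liner (a sketch in my own notation; exact fractions keep the arithmetic clean):

```python
from fractions import Fraction

def biases(chances):
    """Biases relative to the uniform standard: each chance times the number of outcomes."""
    n = len(chances)
    return [n * c for c in chances]

# The example description above: 1/24, 1/3, 1/12, 1/4, 1/6, 1/8
chances = [Fraction(1, d) for d in (24, 3, 12, 4, 6, 8)]
for b in biases(chances):
    print(b)  # 1/4, 2, 1/2, 3/2, 1, 3/4 -- the multipliers shown above
print(sum(biases(chances)))  # 6: the biases always sum to the number of outcomes
```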


To make the pictures clear, we assume that the evolutionary process has only 6 possible outcomes. Let’s name the possibilities 1, 2, 3, 4, 5, and 6. The process is unbiased if none of the possibilities has a greater chance of occurring than does any other, in which case the chance of each possible outcome is 1/6. According to Dembski et al., if we deny that the biases of the process are due to design, then we essentially say that a stick of length 6 broke randomly into 6 segments, and that the lengths of the segments determined the biases. Suppose that the length of the 3rd segment of the broken stick is 2. Then the evolutionary process is biased in favor of outcome 3 by a factor of 2. The chance of the outcome is

    \[2 \times \frac{1}{6} = \frac{1}{3}.\]

Suppose that the length of the 5th segment is 1/4. Then the process is biased against outcome 5 by a factor of 4, and the chance of the outcome is

    \[\frac{1}{4} \times \frac{1}{6} = \frac{1}{24}.\]

These biases are what Dembski et al. refer to as active information. The term, in and of itself, begs the question of whether something actively formed the process with bias in favor of a desired outcome.


ETA: Text between the horizontal rules comes from an earlier attempt at improving the introduction to the technical material, developed in discussion here at TSZ. I’ve quoted a comment posted 17 days ago.

Dembski et al. do not allow that such deviations from the supposedly “natural” default of uniform chance might be brute facts of physical reality. There must be a reason for bias. If we do not allow that bias is possibly due to design of the process to serve a purpose, then Dembski et al. force on us the view that bias itself arises by chance. (This is multifariously outrageous, but for reasons that are not clearly tied to their math.) That is, the chances of the possible outcomes of the evolutionary process are determined by an antecedent process, which is also random. Talk about the chances of chances gets very confusing, very fast. So I say instead that the evolutionary process is randomly biased by a process that occurs before it does. The biases of the evolutionary process are just the chances of the 6 possible outcomes of the evolutionary process, multiplied by 6. Setting the chances randomly is equivalent to setting the biases randomly.

The broken stick is a conventional metaphor for probabilities that are themselves set randomly. (I follow Dembski in reserving the word chance for the probability of a physically random outcome.) The random lengths of the segments of the stick are the probabilities. The stick is ordinarily of unit length, because the probabilities must sum to 1. To visualize random biases, instead of random chances, I need only multiply the length of the stick by the number of possible outcomes, 6, and randomly break the stick into 6 pieces. Then the biases sum to 6.

I stipulate that the biasing process, i.e., stick breaking, is uniform, meaning that all possible biases of the evolutionary process are equally likely to arise. A tricky point is that Dembski et al. allow for uniform biasing, but do not require it. The essential justification of my approach is that I need consider only something, not everything, that they allow in order to demonstrate that the theorem does not apply to scientific investigation. What I consider is in fact typical. The uniform biasing process is the average of all biasing processes. Thus there can be no objection to my choice of it.

Dembski et al. refer to all random processes as “searches.” The term is nothing but rhetorical assertion of the conclusion they want to draw. The stick-breaking “search” (process), which determines the biases of the evolutionary “search” (process), is a visualization of what they call a “search for a search.” Dembski et al. allow for the biasing process itself to be biased by an antecedent process, in which case there is a “search for a search for a search.” In Being as Communion, Dembski avoids committing to Big Bang cosmology, and indicates that the regress of searches for searches might go back forever in time. Fortunately, we need not enter a quasi-mystical quagmire to get at a glaring error in logic.


Animation 1. In the analysis of Dembski, Ewert, and Marks, the biases of an evolutionary process are like control knobs, either set by design, or set randomly by another process. The random biasing process is like a stick breaking into pieces. The biases of an evolutionary process are the lengths of the segments of a broken stick. Here the number of possible outcomes of the evolutionary process is 6, and a stick of length 6 breaks randomly into 6 segments. No segmentation is more likely than any other. Before the stick starts breaking, we expect any given segment to be of length 1. But when a scientist investigates an evolutionary process, the stick has already broken. The scientist may target the outcome for which the bias is greatest, i.e., the outcome corresponding to the longest segment of a broken stick. With 6 possible outcomes, the expected maximum bias is 2.45. Generalizing to n possible outcomes, the expected maximum bias of a randomly biased evolutionary process is a logarithmic function of n. The quantity grows without bound as the number of possible outcomes of evolution increases. The Conservation of Information Theorem of Dembski et al. tells us that the greater the bias in favor of an outcome specified in advance, the less likely the bias is to have arisen by breaking a stick, no matter how many possible outcomes the evolutionary process has. It depends on an assumption that does not hold in scientific study of evolution.

In the most important case of CoI, all possible segmentations of the stick have equal chances of occurring. Although the segments almost surely turn out to be different in length, they are indistinguishable in their random lengths. That is, the chance that a segment will turn out to be a given length does not depend on which segment we consider. This is far from true, however, if the segment that we consider depends on what the lengths have turned out to be. Dembski et al. neglect the difference in the two circumstances when they treat their theorem as though it were a law of nature. Here’s an example of what CoI tells us: the probability is at most 1/2 that the first segment’s length will turn out to be greater than or equal to 2. More generally, for any given segment, the probability is at most 1/b that the segment’s length will turn out to be greater than or equal to b. This holds for sticks of all lengths n, broken into n segments. Recall that the random segment lengths are the random biases of the evolutionary process. CoI says that the greater the bias in favor of an outcome specified in advance, the less likely the bias is to have arisen by breaking a stick. The result is not useful in implicating design of biological evolution, as it assumes that an outcome was targeted in advance. To apply CoI, one must know not only that an outcome was targeted prior to the formation of the evolutionary process, but also which of the possible outcomes was targeted.3
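
The bound for a prespecified segment can be made concrete. For a stick of length n broken uniformly into n pieces, a standard result on spacings (my addition here, not part of the CoI framework) gives the exact tail probability for any given segment as (1 - b/n)^{n-1}, which Markov’s inequality caps at the CoI bound of 1/b:

```python
def tail_given_segment(n, b):
    """P(a PRESPECIFIED segment of a length-n stick, broken into n pieces, is >= b)."""
    return (1 - b / n) ** (n - 1)

# Stick of length 6, broken into 6 segments, segment chosen in advance:
exact = tail_given_segment(6, 2)  # (2/3)^5, about 0.13
print(exact, exact <= 1 / 2)      # well under the CoI bound of 1/2

# The exact tail never exceeds the CoI (Markov) bound of 1/b:
print(all(tail_given_segment(20, b) <= 1 / b for b in (1.5, 2, 3, 5, 10)))
```

The point to notice is that the formula is for a segment fixed before the stick breaks; it says nothing about the longest segment of an already-broken stick.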

Figure 2. In this frame from Animation 1, the segments of 20 broken sticks are colored according to their original positions. The expected length of each segment is 1, though the random lengths are highly variable. According to CoI, the probability is at most 1/2 that the length of the blue segment will turn out to be 2 or greater. More generally, for any given segment, the probability is at most 1/b that the length of the segment will turn out to be greater than or equal to b. This does not hold if we specify a segment in terms of the outcome of the random segmentation of the stick. In particular, CoI does not apply to the longest segment.

Figure 3. In this frame from Animation 1, the segments of each of the 20 broken sticks have been sorted into ascending order of length, and recolored. The expected length of the longest (red) segment is 2.45. By the Broken Stick Theorem, the probability is .728 that at least one of the segments is of length 2 or greater. By misapplication of CoI, the probability is at most 1/2. For a stick of length n, the probability is greater than 1/2 that at least one of the n segments exceeds \ln n in length. There is no limit on the ratio of probability 1/2 to the faux bound of 1/\ln n.

The Broken Stick Theorem tells us quite a bit about the lengths of segments. What is most important here is that, for any given length, we can calculate the probability that one or more of the segments exceeds that length. For instance, the probability is 1/2 that at least one of the segments is of length 2.338 or greater. If you were to misapply CoI, then you would say that the probability would be no greater than 1/2.338, which is smaller than 1/2. A simple way to measure the discrepancy is to divide the actual probability, 1/2, by the CoI bound, 1/2.338. The result, 1.169, is small only because the illustration is small. There is no limit on how large it can be for longer sticks. Let’s say that the stick is of length n, and is broken into n segments. Then the probability is greater than 1/2 that at least one of the segments exceeds \ln n in length. Here \ln n is the natural logarithm of n. The details are not important. What matters is that we can drive the faux bound of 1 / \ln n arbitrarily close to 0 by making n large, while the correct probability remains greater than 1/2.
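
The probabilities quoted here can be computed exactly by the standard inclusion–exclusion formula for the maximum spacing (the code is my sketch of that formula, scaled to a stick of length n broken into n segments):

```python
from math import comb, floor

def p_max_at_least(n, b):
    """P(the longest of n segments of a length-n stick is >= b), by inclusion-exclusion."""
    x = b / n  # equivalent length on the unit stick
    return sum((-1) ** (k + 1) * comb(n, k) * (1 - k * x) ** (n - 1)
               for k in range(1, floor(1 / x) + 1))

print(round(p_max_at_least(6, 2), 3))      # 0.728, as stated above
print(round(p_max_at_least(6, 2.338), 3))  # 0.5, versus the misapplied CoI bound 1/2.338
```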

Cool, but nonessential: The relation of the expected length of the i-th longest segment of a broken stick to the harmonic numbers. Here E[B_{(i)}] is the expected value of B_{(i)}, the i-th greatest of the random segment lengths (biases). As it happens, the notation E[\cdot], widely used in probability and statistics, was introduced by William Allen Whitworth, who derived the Broken Stick Theorem.

    \begin{align*} E[{B}_{(6)}] &= \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6\\ E[{B}_{(5)}] &= \phantom{\frac{1}{1} +\;\,} \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_1\\ E[{B}_{(4)}] &= \phantom{\frac{1}{1} + \frac{1}{2} +\;\, } \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_2 \\ E[{B}_{(3)}] &= \phantom{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} +\;\, } \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_3 \\ E[{B}_{(2)}] &= \phantom{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} +\;\, } \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_4 \\ E[{B}_{(1)}] &= \phantom{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} +\;\, } \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_5 \\ E[{B}_{(1)}] + \cdots + E[{B}_{(6)}] &= \frac{1}{1} + \frac{2}{2} + \frac{3}{3} + \frac{4}{4} + \frac{5}{5} + \frac{6}{6} = 6 \end{align*}

For large n, \mathcal{H}_n \approx \ln n + \gamma, where \gamma \approx 0.5772 is the Euler-Mascheroni constant. So the expected maximum bias (“active information”) of a randomly biased process is logarithmic in the number of possible outcomes. For large n,

    \[P(B_{(n)} > \ln n) \approx 1 - \frac{1}{e} \approx .6321.\]

The derivation is straightforward, but not brief. I decided that the loose bound

    \[P(B_{(n)} > \ln n) > \frac{1}{2}\]

better serves present purposes.
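
Both claims — the harmonic-number expectation and the loose bound — can be checked by simulation (my sketch; n = 1000 stands in for “large,” and the trial count keeps the run quick):

```python
import math
import random

random.seed(2)

def max_segment(n):
    """Longest segment when a stick of length n breaks at n - 1 uniform points."""
    points = [0.0] + sorted(random.uniform(0, n) for _ in range(n - 1)) + [float(n)]
    return max(points[i + 1] - points[i] for i in range(n))

n, trials = 1000, 2000
maxima = [max_segment(n) for _ in range(trials)]

mean_max = sum(maxima) / trials
harmonic = sum(1 / k for k in range(1, n + 1))  # H_n = ln n + gamma + o(1), about 7.49
frequency = sum(m > math.log(n) for m in maxima) / trials

print(round(mean_max, 2))   # close to H_n, about 7.49
print(round(frequency, 2))  # close to 1 - 1/e, about 0.63, and above 1/2
```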

Rather than simply argue that the analysis of Dembski et al. does not apply, I have identified a comparable analysis that does apply, and used it to quantify the error in misapplying their analysis. The expected maximum bias (“active information”) for a randomly biased process (“search”) grows without bound as the size of the space of possible outcomes (“search space”) increases. For n possible outcomes, the probability is greater than 1/2 that the maximum bias exceeds \ln n. According to CoI, the probability is at most 1 / \ln n that the bias in favor of a given outcome is \ln n or greater. The discrepancy is entirely a matter of whether a possible outcome is targeted in advance of generating the process (“hit this”), or the most probable outcome of the process is targeted after the fact (“this is what it hits”). It should be clear that a scientist is free to do the latter, i.e., to investigate the most probable outcome of a process observed in nature.4 In Dembskian terms, the active information measure permits us to inspect the distribution of arrows shot into a wall by a blind archer, and paint a target around the region in which the density of arrows is greatest. There is no requirement that the target have the detachable specification that Dembski emphasized in his earlier writings.

Why a bug is not a weasel

In 1986, Richard Dawkins published The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design, a response to William Paley’s Natural Theology: or, Evidences of the Existence and Attributes of the Deity; Collected from the Appearances of Nature (1802). Dembski’s career in ID is largely a response to Dawkins. Indeed, the highlights are cooptations of ideas in The Blind Watchmaker. Dawkins characterizes objects that are complicated, and seemingly designed, as “statistically improbable in a direction that is specified not with hindsight.” Dembski elaborates on the self-same property in The Design Inference: Eliminating Chance through Small Probabilities (1998), taking it as information imparted to objects by design. A not-with-hindsight specification is, in his parlance, detachable from the specified event (set of objects), called the target. Dembski usually refers to complicatedness as complex specified information or specified complexity, but sometimes also as specified improbability. The last term gives the best idea of how it contrasts with active information, the elevated probability of an event not required to have a detachable specification. In No Free Lunch: Why Specified Complexity Cannot Be Purchased without Information, he states a Law of Conservation of Information for specified complexity. (As explained below, in Appendix 1, Dembski and Marks have never mentioned this LCI, since stating their LCI for active information.)

[This section is not complete. I expect to add the guts over the next day or so, which means, as Joe can tell you, that you should expect them sometime in December. The gist is that specified complexity does not apply to Dawkins’ Weasel program. Dembski has made much of the meaningfulness of the target sentence. But the fact of the matter is that the model is the same for all target sentences comprising 28 uppercase letters and spaces. The target need not be specified. The measure of active information formalizes Dawkins’ comparison of the model to the proverbial monkeys at typewriters. It does not stipulate that the target have a detachable specification. Dembski and Marks seem to have thought that it was transparently obvious that Dawkins had selected a desired outcome in advance, and had informed (programmed) the evolutionary process to “hit the target.”]

Dembski discusses the Weasel program on pp. 176-180 of Being as Communion. Here is how he describes “search,” in general, and the Weasel program, in particular:

In The Blind Watchmaker, Dawkins purports to show how natural selection creates information. In that book, he gives his famous METHINKS IT IS LIKE A WEASEL computer simulation. A historian or literary scholar, confronted with the phrase METHINKS IT IS LIKE A WEASEL, would look to its human author, William Shakespeare, to explain it (the phrase is from Hamlet). An evolutionary theorist like Dawkins, by contrast, considers what it would take for an evolutionary process, simulated by an algorithm running on a computer, to produce this target phrase. All such algorithms consist of:

  1. an initialization (i.e., a place where the algorithm starts — for Dawkins the starting point is any random string of letters and spaces the same length as METHINKS IT IS LIKE A WEASEL);
  2. a fitness landscape (i.e., a measure of the goodness of candidate solutions — for Dawkins, in this example, fitness measures proximity to the target phrase so that the closer it is to the target, the more fit it becomes);
  3. an update rule (i.e., a rule that says where to go next given where the algorithm is presently — for Dawkins this involves some randomization to existing candidate phrases already searched as well as an evaluation of fitness along with selection of those candidates with the better fitness);
  4. a stop criterion (i.e., a criterion that says when the search has gone on long enough and can reasonably be ended — for Dawkins this occurs when the search has landed on the target phrase METHINKS IT IS LIKE A WEASEL).

Note that in these four steps, natural selection is mirrored in steps (2) and (3).

It is important to note that Dembski addresses algorithms, or designs of computer programs, in engineering terms, and does not address models (implemented by computer programs) in scientific terms. This amounts to a presumption, not a demonstration, that the computational process (running program) is designed to generate a desired outcome.

[What I hope to get across here is why Dembski et al. cannot misconstrue Felsenstein’s model, called the GUC Bug, as they do Dawkins’ model. Those of you who argue with ID proponents should put the tired old Weasel out to pasture, or wherever it is that old Weasels like to go, and give Felsenstein’s Killer Bug a try.]

Figure 4. Felsenstein’s GUC Bug model contrasts starkly with Dembski’s travesty of Dawkins’ Weasel program. There can be no argument that Felsenstein designed the model to hit a target, because we define the target in terms of the modeled process. The model implemented by the Weasel program is not terribly different. But it is terribly easy to brush aside the model, and focus upon the program. Then the claim is that Dawkins designed the program to hit a specified target with its output.

ID the future

It is telling, I believe, that Dembski gave a detachable specification, “teleological system/agent,” of the target for biological evolution in his seminar talk, and that Ewert gives a detachable specification of the event that he targets, birds, in his response to Felsenstein and me. Ewert addressed active information in his master’s thesis (2010), but developed a new flavor of specified complexity for his doctoral dissertation (2013; sequestered until 2018). He, Dembski, and Marks have published several papers on algorithmic specified complexity (one critiqued here, another here). Dembski indicates, in a footnote of Being as Communion, that he and Marks are preparing the second edition of No Free Lunch (presumably without changing the subtitle, Why Specified Complexity Cannot Be Purchased without Intelligence). My best guess is that Dembski et al. plan to reintroduce specification in LCI Version 3. One thing is sure: ever mindful of the next judicial test of public-school instruction in ID, they will not breathe a hint that their publications on active information are any less weighty than gold. Ewert has demonstrated some of the revisionary tactics to come.

Appendix 1: Contradictory laws on the books

There actually have been two Laws of Conservation of Information. The first, featured in No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence (Dembski, 2002), addresses the specified complexity, also known as the specified improbability, of an event. The second, featured in Being as Communion: A Metaphysics of Information (Dembski, 2014), addresses the active information of a process, supposedly necessary for “unnatural” elevation in probability of an event. Specified improbability is loosely the opposite of elevated probability. Dembski and Marks evidently knew better than to claim that both are conserved, as they have said nothing about the first law since coming up with the second. Although Dembski opens Being as Communion by indicating that it is the last book of a trilogy that includes No Free Lunch, his only mention of specified complexity is in a footnote listing examples of “materialist-refuting logic.” He also notes that he and Marks are preparing the second edition of No Free Lunch. To include both specified complexity and active information in the cutlery is to serve up free lunch. It equips the ID theorist to implicate design when an event is too improbable (relative to a probability induced by specification), and also when an event is too probable (relative to a probability asserted a priori).

Appendix 2: Remembrance of information past

Here I give ample evidence that the “search” really was supposed to search for the targeted event, and that “active information” really was supposed to account for its probability of success. I begin with two technical abstracts. If you find yourself getting bogged down, then read just the text I’ve highlighted. The first is for Dembski’s seminar talk (August 2014).

Conservation of Information (CoI) asserts that the amount of information a search outputs can equal but never exceed the amount of information it inputs. Mathematically, CoI sets limits on the information cost incurred when the probability of success of a targeted search gets raised from p to q (p < q), that cost being calculated in terms of the probability p/q. CoI builds on the No Free Lunch (NFL) theorems, which showed that average performance of any search is no better than blind search. CoI shows that when, for a given problem [targeted event], a search outperforms blind search, it does so by incorporating an amount of information determined by the increase in probability with which the search outperforms blind search. CoI applies to evolutionary search, showing that natural selection cannot create the information that enables evolution to be successful, but at best redistributes already existing information. CoI has implications for teleology in nature, consistent with natural teleological laws mooted in Thomas Nagel’s Mind & Cosmos.

Apart from hiding a law of nature under a bushel, this is not much different from the abstract of “Life’s Conservation Law: Why Darwinian Evolution Cannot Create Biological Information” [sic] (Dembski and Marks, 2010; preprint 2008).

LCI characterizes the information costs that searches incur in outperforming blind search. Searches that operate by Darwinian selection, for instance, often significantly outperform blind search. But when they do, it is because they exploit information supplied by a fitness function — information that is unavailable to blind search. Searches that have a greater probability of success than blind search do not just magically materialize. They form by some process. According to LCI, any such search-forming process must build into the search at least as much information as the search displays in raising the probability of success. More formally, LCI states that raising the probability of success of a search by a factor of q/p (> 1) incurs an information cost of at least log(q/p). [… Conservation of information] theorems provide the theoretical underpinnings for the Law of Conservation of Information. Though not denying Darwinian evolution or even limiting its role in the history of life, the Law of Conservation of Information shows that Darwinian evolution is inherently teleological. Moreover, it shows that this teleology can be measured in precise information-theoretic terms.

The putative measure of teleology is log(q/p), the active information of the evolutionary search. Dembski also says in Being as Communion that a search is informed to find a target, not merely biased in favor of it.
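The arithmetic of the measure is worth pinning down. Here is a minimal sketch in Python (the function name and the choice of base 2, giving bits, are my own; Dembski and Marks leave the base of the logarithm unspecified):

```python
import math

def active_information_bits(p: float, q: float) -> float:
    """Active information log2(q/p) in bits: how far the probability q of
    success of the given search exceeds the probability p of success of
    blind (uniform) search, on a log scale."""
    if not (0 < p <= 1 and 0 < q <= 1):
        raise ValueError("p and q must be probabilities in (0, 1]")
    return math.log2(q / p)

# Raising the probability of success from 2**-20 to 1/2 registers as
# 19 bits of "active information" on this account.
print(active_information_bits(2**-20, 0.5))  # prints 19.0
```

Note that the measure depends only on the two probabilities, not on how the target is described, which is the point developed in Note 3 below.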

A precise theoretical justification for the claim that natural selection is inherently teleological comes from certain recent mathematical results known as Conservation of Information (CoI) theorems [p. 148].

Simply put, searches, in finding targets, output information [p. 152].

CoI shows that successful search (i.e., one that locates a target) requires at least as much input of information as the search by its success outputs [p. 150].

The information that goes into formation of the search, to increase the probability that it finds the target, is active information. Returning to “Life’s Conservation Law” (Section 1, “The Creation of Information”):

Nature is a matrix for expressing already existent information. But the ultimate source of that information resides in an intelligence not reducible to nature. The Law of Conservation of Information, which we explain and justify in this paper, demonstrates that this is the case.

Dembski and Marks hold that the ultimate source of active information, which increases the probability that evolutionary search achieves a purpose, is supernatural intelligence. However, Ewert tells us that “active information,” regarded as bias instead of information, is not necessarily due to design.

The conservation of information does not imply a designer. It is not a fine-tuning argument. It is not our intent to argue that all active information derives from an intelligent source. To do any of those things, we’d have to introduce metaphysical assumptions that our critics would be unlikely to accept. Conservation of information shows only that whatever success evolutionary processes might have, it is due either to the original configuration of the universe or to design.

This reversal is not due to Ewert. He’s obviously adapting arguments in Being as Communion, though without citing a source.

Notes

1. Felsenstein and I give the bound for just the fittest of all genotypes. I’ve extended it to the set of all local maxima of the fitness landscape. We classify haploid organisms into genotypes according to their DNA bases in L positions of the genome. The neighbors of a genotype are the 3L genotypes that differ from it in exactly one of the L positions. We require that the fittest genotype of each neighborhood of K=3L+1 genotypes be unique. It follows immediately that at most one genotype per neighborhood is a local maximum of the fitness landscape, and that the ratio of the total number of genotypes to the number of local maxima is at least K. Evolution begins with a random genotype, proceeds along the path of steepest ascent on the landscape, and ends at a local maximum. The minimum degree of constraint on the final genotype in the process is K. This is also the minimum “active information” with respect to (targeting) the set of local maxima. That is, the probability is q = 1 that the final genotype is a local maximum. The uniform probability of the set of local maxima is p \leq 1/K. Finally, the active information, without conversion to a log scale, is q / p \geq K.
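The claim in this note can be illustrated by direct simulation. The following sketch in Python uses a small genome length (L = 5, so that all 4**L genotypes can be enumerated) and a random landscape of distinct fitnesses; these parameters are my choices for illustration, not part of the model:

```python
import itertools
import random

L = 5                     # genome length, small enough to enumerate all 4**L genotypes
BASES = "ACGT"
K = 3 * L + 1             # neighborhood size: a genotype plus its 3L point mutants

random.seed(0)
genotypes = ["".join(g) for g in itertools.product(BASES, repeat=L)]
# Distinct random fitnesses make the fittest genotype of every neighborhood unique.
fitness = dict(zip(genotypes, random.sample(range(len(genotypes)), len(genotypes))))

def neighbors(g):
    """Yield the 3L genotypes that differ from g in exactly one position."""
    for i, b in enumerate(g):
        for c in BASES:
            if c != b:
                yield g[:i] + c + g[i + 1:]

def steepest_ascent(g):
    """Replace the genotype by the fittest of its neighbors until none is fitter."""
    while True:
        best = max(neighbors(g), key=fitness.get)
        if fitness[best] < fitness[g]:
            return g
        g = best

end = steepest_ascent(random.choice(genotypes))
# The process always ends at a local maximum of the fitness landscape ...
assert all(fitness[end] > fitness[n] for n in neighbors(end))

# ... so q = 1, while the uniform probability p of the set of local maxima
# is roughly 1/K, making the "active information" q/p on the order of K.
maxima = [g for g in genotypes if all(fitness[g] > fitness[n] for n in neighbors(g))]
print(K, len(genotypes) / len(maxima))
```

The evolutionary process here has no target and no purpose; the large "active information" with respect to the set of local maxima falls out of nothing but hill climbing on a random landscape.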

2. Although the term bias is technically acceptable — indeed, I have used it, and will continue to use it in contexts where constraint is inappropriate — Ewert earns scorn by abusing it in the most predictable of ways. The problem with referring to the bias of a natural process is that the general reader gets the idea that the process “naturally” ought to have behaved in some other way, and deviates only because something biased it. And thus the Designer enters through the back door, not by evidence or reason, but instead by rhetorical device. Usually, the meaning of bias is only that some of the possible outcomes of a process have different chances of occurring than do others. Where that is all that is meant, I would refer instead to the non-uniformity of the probability distribution on outcomes. By the way, I am not conflating all probabilities in scientific models with physical chances, as Dembski et al. generally do. Much of what is modeled as random in biological evolution is merely uncertain, not attributed to quantum chance. The vitally important topic of interpretations of probability, which Dembski deflects with a false analogy to interpretations of quantum mechanics (Being as Communion, p. 157), will have to wait for another post.

3. CoI applies more generally to events, meaning sets of possible outcomes. But that’s irrelevant to the logic, or lack thereof. For readers familiar with Dembski’s measure of specified complexity, I should mention that the measure of active information permits us to target any event whatsoever. There is no requirement that the event have a detachable specification. Dembski’s arguments to the effect that an event with a detachable specification might as well have been prespecified are irrelevant here.

4. What it means to investigate the most probable outcome of a process observed in nature is highly problematic. In particular, we generally cannot say anything sensible about the chances of possible outcomes of a one-shot process. Complex processes that have occurred once, and cannot be repeated, are what commonly interest evolutionary biologists. I should make it clear that I don’t agree with Dembski et al. that evolutionary biologists should make claims about the chances of this, that, and the other. I’m essentially playing along, to show that their math is not even applicable.

228 thoughts on “The Law of Conservation of Information is defunct”

  1. Marlin: Hi Tom, please let me know if I’m understanding your article correctly. Are you saying the probability of randomly choosing a biased process is relatively high? E.g., in the stick analogy, if you select a broken stick at random, you’re very likely to have a stick with non-uniform breaks. So based on this we can conclude that the mere presence of bias without a specified target is not a good indicator of design?

    You’ve summarized it better than I have. However, I want to tack onto the end, “… even if we accept the model of Dembski et al.”

    Dembski et al. are the source of the notion that an undesigned process is randomly biased, as though the chances of its outcomes were the lengths of the segments of a broken stick. I don’t agree, but don’t argue the point here. Even if they were right in the model that they foist on the design denier, CoI would be the wrong theorem to apply, because it assumes that the target is specified in advance. We may target the most probable outcome of an evolutionary process, corresponding to the longest segment of a stick that has already broken randomly to pieces. The Broken Stick Theorem tells us that, for a stick of length n broken randomly into n segments, the expected length of the longest segment is approximately the natural logarithm of n. That is, the expected maximum bias is about \ln n.
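The theorem’s estimate is easy to check by Monte Carlo. A sketch in Python (the stick length n = 1000 and the trial count are arbitrary choices of mine):

```python
import math
import random

def longest_piece(n, rng):
    """Break a stick of length n at n-1 uniformly random points and
    return the length of the longest of the n resulting segments."""
    cuts = sorted(rng.uniform(0, n) for _ in range(n - 1))
    points = [0.0] + cuts + [float(n)]
    return max(b - a for a, b in zip(points, points[1:]))

rng = random.Random(0)
n = 1000
trials = 400
estimate = sum(longest_piece(n, rng) for _ in range(trials)) / trials

# The exact expectation is the harmonic number H_n = 1 + 1/2 + ... + 1/n,
# which is approximately ln(n) + 0.577.
print(estimate, math.log(n) + 0.577)
```

For n = 1000 the expectation is about 7.49, and the simulated mean lands close by; the “approximately ln n” in the theorem statement hides the Euler constant, which is why 1387 rather than 1386 appears in the 4^1000 calculation below.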

    The bias of Felsenstein’s GUC Bug in favor of the fittest genotype is quite improbably large according to the Broken Stick Theorem. It is at least 3001, and the expected maximum bias, applying the right theorem to the wrong model of an undesigned process (with 4^{1000} possible outcomes), is only 1387. If the model of Dembski et al. were right, then we would infer design. I’m not interested in arguing against their model here. For one thing, they have never offered any sort of justification for it. For another, I expect them to abandon active information, and pursue algorithmic specified complexity. They have a relatively simple “high ASC is improbable” theorem (because I corrected, perhaps unwisely, what they published).

  2. keiths:

    But of course that hasn’t gone very well for them, with at least 20 errors in their latest ASC paper:

    Algorithmic Specified Complexity and the Game of Life

    I linked to that post in the conclusion of the OP. I’m glad to have you spell out the title. The ASC measure is not itself as execrable as Ewert’s dissertation research. My litmus test for determining whether it really is A New Day for Dembski is to see whether he and Marks discuss how ASC is related to Gurevich and Passmore, “Impugning Randomness, Convincingly.” It seems to me that they owe a lot to the paper.

    I’ll add a remark about Ewert’s response to Joe and me. When he talks about other possibilities for physics, and the initial “configuration” of the Universe, he’s clearly got the Life universe in mind. (Of course, God the Engineer would not build an infinite grid, according to the specification of John Conway the mathematician, because mindful Life would then be unable to infer that it exists to serve a supernatural purpose.)

  3. Tom,

    (Of course, God the Engineer would not build an infinite grid, according to the specification of John Conway the mathematician, because mindful Life would then be unable to infer that it exists to serve a supernatural purpose.)

    There might still be a way to infer the grid:

    In a recent paper, “Constraints on the Universe as a Numerical Simulation,” the physicists Silas R. Beane, Zohreh Davoudi and Martin J. Savage outline a possible method for detecting that our world is actually a computer simulation. Physicists have been creating their own computer simulations of the forces of nature for years — on a tiny scale, the size of an atomic nucleus. They use a three-dimensional grid to model a little chunk of the universe; then they run the program to see what happens. This way, they have been able to simulate the motion and collisions of elementary particles. But these computer simulations, Professor Beane and his colleagues observe, generate slight but distinctive anomalies — certain kinds of asymmetries. Might we be able to detect these same distinctive anomalies in the actual universe, they wondered? In their paper, they suggest that a closer look at cosmic rays, those high-energy particles coming to Earth’s atmosphere from outside the solar system, may reveal similar asymmetries. If so, this would indicate that we might — just might — ourselves be in someone else’s computer simulation.

  4. Actually Marlin, there is a clear target – getting to the next round. And getting to the next round apparently requires 6 sticks with numerous breaks.

    Is variation and selection responsible for the quantity of breaks and frequency of sticks?

    Not sure if DEM got it right but EF definitely got it wrong.

    This is the most likely reason for the rather late rebuttal of the rebuttal of the rebuttal.

    Marlin:
    Hi Tom, please let me know if I’m understanding your article correctly. Are you saying the probability of randomly choosing a biased process is relatively high? E.g., in the stick analogy, if you select a broken stick at random, you’re very likely to have a stick with non-uniform breaks. So based on this we can conclude that the mere presence of bias without a specified target is not a good indicator of design?

  5. As far as I understand the sticks analogy, the argument is that we can’t attribute special significance to a configuration based solely on its bias. After all, any process configuration is just as likely as any other in the absence of a specified target. But let’s say we do have a specified target, would CoI show that the process which produced the target is specified?

  6. Tom English:

    For another, I expect them to abandon active information, and pursue algorithmic specified complexity.

    The problem with the Algorithmic Information Theory approach to inferring Design is that it gives a high numerical value for Design when the organism is complex, but can be constructed by a very simple algorithm. I dealt with this in passing in my 2007 article critiquing Dembski’s arguments.

    The difficulty with the argument, and with the ASC definition, is that a messy developmental process producing a highly fit organism gets a lower score than a completely unfit organism that has a simple algorithm that can produce it. In short, it does not focus on highly fit adaptations.

    By the ASC criterion, an actual fish that can swim does not indicate Design nearly as much as, say, an organism whose genome spells out 1 billion digits of \pi.

  7. Tom English:

    Marlin: Hi Tom, please let me know if I’m understanding your article correctly. Are you saying the probability of randomly choosing a biased process is relatively high? E.g., in the stick analogy, if you select a broken stick at random, you’re very likely to have a stick with non-uniform breaks. So based on this we can conclude that the mere presence of bias without a specified target is not a good indicator of design?

    You’ve summarized it better than I have. However, I want to tack onto the end, “… even if we accept the model of Dembski et al.”

    Dembski et al. are the source of the notion that an undesigned process is randomly biased, as though the chances of its outcomes were the lengths of the segments of a broken stick.

    I’m still not getting the broken-stick argument.

    1. Do Dembski, Marks, and/or Ewert actually mention broken sticks somewhere? If so, I missed it. Or is this your teaching example?

    2. I think that their main theorem says that the probability of getting a preassigned target is the same under blind search as under a randomly chosen search. Is the random breaking of sticks model that you use a model for assigning fitnesses to genotypes? If so, which are the genotypes, the points or the line segments?

    I see the math of largest sticks in a broken stick model. What I am not seeing is the analogy between that process and the ones DEM discuss, or the GUC Bug that we use.

  8. Tom, ignore my questions above. I see an (the?) analogy now.

    If we have N genotypes, and a Broken Stick process assigns fitnesses to them all, by making N-1 breaks and using the lengths of the pieces as the fitnesses of the corresponding genotypes, we assume that natural selection finds the largest fitness and the population becomes that genotype. The broken stick theorems then show that this achieves a fitness of about \ln(N)/N on average (the expected longest piece of a unit stick broken into N pieces). The average fitness at the start was 1/N.
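This broken-stick assignment of fitnesses is simple to simulate. A sketch in Python (with fitnesses as pieces of a unit stick, so that the expected final fitness is the harmonic number H_N divided by N; the values of N and the trial count are mine):

```python
import random

def broken_stick_fitnesses(n, rng):
    """Assign fitnesses to n genotypes as the lengths of the pieces of a
    unit stick broken at n-1 uniformly random points."""
    cuts = sorted(rng.random() for _ in range(n - 1))
    points = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(points, points[1:])]

rng = random.Random(1)
n = 100
trials = 2000
# Selection fixes the fittest genotype, so the final fitness is the longest piece.
final = sum(max(broken_stick_fitnesses(n, rng)) for _ in range(trials)) / trials

harmonic = sum(1.0 / k for k in range(1, n + 1))
print(final, harmonic / n, 1.0 / n)  # simulated mean, exact H_n/n, initial mean 1/n
```

The simulated mean final fitness sits close to H_n/n, a factor of about ln(n) above the initial average of 1/n.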

    By CoI, we expect an average final fitness of 1/N. Does this model violate CoI? I think we both agree that it does, but perhaps we have different diagnoses.

    You say that the problem is that the Target is selected after the fitnesses are assigned (right?).

    I say that we did not use a randomly chosen DEM-search, but instead assumed that there were fitnesses that affected the genotype frequencies. By contrast, DEM searching averages over a wildly mixed bag of “searches” that include crazy ones that prefer lower fitnesses, or even ignore fitnesses altogether. So just by having genotypes that had fitnesses, this model is able to do a lot better than CoI says it could.

    So I see how you are using the Broken Stick model, though we differ on why it seems to defy the CoI theorems.

  9. I’m struggling with the step where the sticks are recolored by length. Without recoloring, the average probability for a specified segment color is 1/N. Given that there are N possible colors, CoI applies in this case.

    But if we specify the longest segment (as in the recoloring step), the probability of getting the target is now 1 (because some segment has to be longest, ignoring the possibility of equal lengths). And if we average this across an arbitrary number of sticks, the probability is still 1. So it would seem CoI applies here too. Am I missing something?

  10. The following redoes the opening of a previous attempt at clarification, but does not replace it entirely. I’ve benefited, I think, from feedback, and ask for more. Something like the following, along with the non-redundant part of the comment I’ve linked to, will go into the opening post. I’ve written and unwritten a lot for the incomplete section, but am slowly making progress. (I’ve reread and rethought quite a bit of Paley, Dawkins, and Dembski.)
    __________________________________

    Dembski et al. represent natural processes abstractly. In their math, they reduce the evolutionary process to nothing but the chances of its possible outcomes. The CoI theorem is indifferent to what the possible outcomes actually are, in physical reality, and to how the process actually works; it assumes only that the outcomes have the chances of occurrence that they do. Here I assume that there are only 6 possible outcomes, arbitrarily named 1, 2, 3, 4, 5, 6. The possible outcomes could be anything, and their names say nothing about what they really are. Each of the possible outcomes has a chance of occurrence that is no less than 0 (sure not to occur) and no greater than 1 (sure to occur). The chances of the possible outcomes are required to add up to 1.

    As far as the CoI theorem is concerned, an evolutionary process is nothing but a list of chances that sum to 1. I’ll refer to the list of chances as the description of the process. The first chance in the description is associated with the possible outcome named 1, the second chance in the description is associated with the possible outcome named 2, and so forth. The list

        \[.1, \quad .3, \quad .1, \quad .2, \quad .1, \quad .2\]

    is a valid description because each of the numbers is a valid chance, lying between 0 and 1, and because the total of the chances is 1. We can picture the description of the evolutionary process as a stick of length 1, broken into 6 pieces.

    [Need a new figure here.]

    Naming the segments 1, 2, 3, 4, 5, 6, from left to right, the length of each segment indicates the chance of the possible outcome with the corresponding name. Consequently, the depiction of the evolutionary process as a broken stick is equivalent to the description of the process as a list of the chances of its possible outcomes.

    You perhaps wonder how I would depict the evolutionary process as a broken stick if a “possible” outcome had absolutely no chance of occurring. And the answer is that I could not. There is no segment of length 0. In the CoI theorem, however, outcomes with chances precisely equal to 0 are effectively impossible. Thus it is not misleading to say that Dembski et al. reduce the evolutionary process to a broken stick.

    There are infinitely many ways to break our metaphorical stick into a given number of segments. Averaging over all of them, the lengths of the segments are

        \[\frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}.\]

    That is, in the average description of an evolutionary process, the possible outcomes are uniform in their chances of occurrence. Dembski et al. usually advocate taking uniform chances as the standard of comparison for all processes (though they allow for other standards in the CoI theorem). Dembski and Marks go much further in their metaphysics, claiming that there exist default chances of outcomes in physical reality, and that we can obtain knowledge of the default chances, and that deviation of chances from the defaults is itself a real and objectively measurable phenomenon. Although I want to limit myself to illustrating how they have gone wrong in application of CoI, I must remark that their speculation is empty, and comes nowhere close to providing a foundation for an alternative science. Otherwise, I would seem to allow that they might repair their arguments with something like the Broken Stick Theorem.

    Taking uniform chance as the standard to which all evolutionary processes are compared, we naturally arrive at an alternative representation. We begin by writing the standard description a bit differently, multiplying each of the chances by 1.

        \[1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}.\]

    Now we can write any description whatsoever by adjusting the multipliers, while leaving the fractions 1/6 just as they are. The trick is to multiply each of the chances in the description by 1, but with 1 written as 6 \times 1/6. For instance, the description

        \[\frac{1}{24}, \quad  \frac{1}{3}, \quad  \frac{1}{12}, \quad  \frac{1}{4}, \quad  \frac{1}{6}, \quad  \frac{1}{8}\]

    is equivalent to

        \[\frac{6}{24} \times \frac{1}{6}, \quad  \frac{6}{3} \times \frac{1}{6}, \quad  \frac{6}{12} \times \frac{1}{6}, \quad  \frac{6}{4} \times \frac{1}{6}, \quad  \frac{6}{6} \times \frac{1}{6}, \quad  \frac{6}{8} \times \frac{1}{6}.\]

    The multipliers

        \[\frac{6}{24}, \quad  \frac{6}{3}, \quad  \frac{6}{12}, \quad  \frac{6}{4}, \quad  \frac{6}{6}, \quad  \frac{6}{8}\]

    are the biases of the process, relative to the standard in which the chances are uniformly 1/6. The process is biased in favor of an outcome when the bias is greater than 1, and biased against an outcome when the bias is less than 1. For instance, the process is biased in favor of outcome 4 by a factor of 6/4 = 1.5, meaning that the chance of the outcome is 1.5 times as great as in the standard. Similarly, the process is biased against outcome 1 by a factor of 24/6 = 4, meaning that the chance of the outcome is 6/24 = 0.25 times as great as in the standard. The uniform standard is unbiased relative to itself, with all biases equal to 1.

    The general rule for obtaining the biases of an evolutionary process, relative to the uniform standard, is to multiply the chances by the number of possible outcomes. With 6 possible outcomes, this is equivalent to scaling the broken stick to a length of 6. We gain some clarity in discussion of CoI by referring to the biases, instead of the chances, of the evolutionary process. The process is metaphorically a broken stick, either way. Whether the segment lengths are biases or chances is just a matter of scale. We shall equate the length of the stick to the number of outcomes, and thus depict the biases of the process, for and against the possible outcomes corresponding to the segments.
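The rule is trivial to mechanize. A sketch in Python, using exact rational arithmetic on the example description above (the function name is mine):

```python
from fractions import Fraction

def biases(chances):
    """Biases of a process relative to the uniform standard: each chance
    multiplied by the number of possible outcomes."""
    assert sum(chances) == 1, "chances must sum to 1"
    n = len(chances)
    return [n * c for c in chances]

description = [Fraction(1, 24), Fraction(1, 3), Fraction(1, 12),
               Fraction(1, 4), Fraction(1, 6), Fraction(1, 8)]
print(biases(description))
# [Fraction(1, 4), Fraction(2, 1), Fraction(1, 2), Fraction(3, 2), Fraction(1, 1), Fraction(3, 4)]
```

These are exactly the multipliers 6/24, 6/3, 6/12, 6/4, 6/6, 6/8 worked out above.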

  11. Tom English,

    Dumb question, Tom. Assume a roulette wheel with six sectors. By varying the segments and playing, the chance of the pill dropping into any sector varies with the size of each sector, while the overall probability of the pill landing in some sector is 1. Is this equivalent to your sticks?

  12. Tom English: The CoI theorem is indifferent to what the possible outcomes actually are, in physical reality, and how the process actually works, that the outcomes should have the chances of occurrence that they do.

    I think a couple of words are missing from this sentence, between “works,” and “that”. They might be something like “and only assumes”.

    Anyway, I think that this means that to apply CoI to a set of DNA sequences, in a population small enough to have only one DNA sequence at a time, the biases of the sticks should then be the consequence of different probabilities of different sequences occurring and getting fixed. Since one ends up with one of them and never changes from that, we would have to have infinitely strong natural selection, which would result in the final result always being the most fit sequence. Thus the biases would have to be something like 0, 0, 0, 6, 0, 0 since the probabilities would be 0, 0, 0, 1, 0, 0.

    Or else (1) this process could have a chance of no mutation occurring, in which case the biases would be less extreme since there would be a chance of the initial sequence remaining instead of being replaced by the most fit sequence. Or (2) the sequences are not all adjacent in sequence space so one ends up at the most-fit sequence adjacent to one’s starting sequence. Or (3) both of the above, so one ends up either at one’s starting sequence, or at its most fit neighbor.

  13. Marlin:

    As far as I understand the sticks analogy, the argument is that we can’t attribute special significance to a configuration based solely on its bias. After all, any process configuration is just as likely as any other in the absence of a specified target. But lets say we do have a specified target, would CoI show that the process which produced the target is specified?

    Preliminary remark: We have to keep in mind (I have not done such a great job of it) that Dembski’s basic distinction is (was?) between specification and fabrication of targets. He regards an event known to have been prespecified as specified, not fabricated, irrespective of whether it has a detachable specification. (Specification in algorithmic specified complexity is a different ball of wax.)

    I need to find a good way of clearing this up. There are two distinct components of what Dembski et al. are doing:

    (1) They reduce the evolutionary process to nothing but a chance distribution. This amounts to saying that the process is a stick of unit length, broken into segments. The segment lengths correspond to the chances of the possible outcomes. Equate the length of the stick to the number of possible outcomes, and presto change-o, the chances become the biases of the process, relative to the uniform distribution.

    (2) They foist on the design denier the claim that the biases are the chance outcome of an antecedent process. This amounts to saying not only that the evolutionary process is a broken stick, but also that the stick has broken randomly.

    Although I regard both (1) and (2) as ludicrous, I do not need to oppose them. I in fact play along with them, in order to show that CoI is the wrong theorem. The glaring error in applying it is that it assumes a prespecified event, though we can measure bias for any event whatsoever. I don’t believe that conclusions about bias as general as yours are warranted by my argument. If one actually agrees with both (1) and (2), then high bias is indeed suspicious. (And the fact that classical physics works as well as it does, treating all processes as deterministic, is the wonder of all wonders.)

    I’m not clear on what you mean by “process configuration” in this context.

    You’ve helped me get some things straighter. Thanks for that. And I of course would appreciate further comments.

  14. Alan Fox:

    Dumb question, Tom. Assume a roulette wheel with six sectors. By varying the segments and playing, the chance of the pill dropping into any sector varies as to the size of each sector, while the overall probability of the pill landing in a sector is 1. Is this equivalent to your sticks?

    As I said early in the thread (not that anyone should remember anymore), “Bad writer, no dumb questions.” And you’ve pretty much nailed it there. The arc lengths of the sectors are essentially the same as the lengths of the segments of the broken stick.

    I’ve developed an aversion to such examples, however, in responding to Dembski. (And that was a huge problem for me, in writing the post.) The reason is that he’s committed to physical chance. Processes like coin tosses and dice rolls and roulette wheel spins are governed by classical mechanics, and are unpredictable due to poor control, not due to anything that passes for intrinsic physical randomness. The probabilities we talk about are not chances. (Even the uniform fragmentation of the stick bugs me, because I think fragmentation processes in the real world are deterministic chaos.) The only halfway-decent approach I have is to say that the spin of your roulette wheel is simulated by a computer drawing on a source of quantum randomness (assuming that quantum randomness is real).

    Your illustration is no worse than that of uniform fragmentation of the stick.

  15. Marlin: I’m not clear on what you mean by “process configuration” in this context.

    Sorry, I should have explained that a bit better. Imagine an N-dimensional space where all possible processes exist (or if you like, all possible codified descriptions of processes). By “process configuration” I just mean the coordinate in that space which represents a particular process.

    According to CoI, the chance of randomly selecting a prescribed target within the space of all possible targets (which I’ll call artifacts) is not greater than the chance of randomly selecting the process which produces the target.

    My point was that if we decide to change the specification to say ‘the fittest artifact is specified’, that is equivalent to saying ‘one artifact will be randomly chosen and we don’t care which’. The probability of this event is 1. According to CoI, this probability should be conserved in process configuration space too. And it turns out, no matter which process we randomly select, that process will produce an artifact which matches the specification. So the probability of this event is also 1. Therefore CoI still applies in this case, even though the specification is so general as to be useless.

  16. Alan Fox:

    Assume a roulette wheel with six sectors. By varying the segments and playing, the chance of the pill dropping into any sector varies as to the size of each sector, while the overall probability of the pill landing in a sector is 1. Is this equivalent to your sticks?

    Thinking about how I might use this, I noticed that the pill (we call it a ball, over here) would have to be infinitesimal, because there’s no limit to how narrow a sector might be. Tacking that proviso onto the illustration doesn’t bother me.

    I’m thinking (out loud, which is usually a very bad idea) that I need to get over my chance-versus-contingency persnicketiness when illustrating concepts. There’s a reason that my stick breaks into 6 segments, and not 5. I had considered talking about biasing (loading) a die according to the segment lengths. The bias of a roulette wheel with pockets unequal in width is plainly visible, and is much easier to wrap your brain around than the bias of a die.

    The sum of the numbers on the wheel is 666, matching Exercise 666 in Choice and Chance. Must be a sign.

    To get a roulette wheel with pockets of random widths, visually, all you do is to curl the broken stick into a circle. At the moment, I can’t think of how to illustrate the inapplicability of CoI as well with a randomly sliced pie as with a randomly broken stick. There’s no obvious way to juxtapose pie slices as I do stick segments. The best I can think to do is to stack pie slices, smaller atop bigger. I’ve actually got a segment-stacking phase in my animation program. I stopped using it because I thought it made things less, not more, clear. At any rate, if anyone has ideas about animation of random pie-slicing, I’d like to hear them.

    ____________________

    Dembski et al. reduce the evolutionary process to a spin of a roulette wheel. However, the widths of the pockets in which the ball may land are generally not equal. […] If we deny the possibility of design, then they foist on us the view that the pocket widths were set randomly by another random process. [Something like that.]

  17. Joe Felsenstein,

    The maximum possible bias is the reciprocal of the minimum chance in the reference (standard) distribution. With a uniform distribution on 6 possible outcomes as the standard of comparison, all of the chances are 1/6. Thus the minimum chance is 1/6, and the maximum bias is 1/(1/6) = 6. For a process with the distribution 0, 0, 0, 1, 0, 0 that you give, the bias, relative to the uniform distribution, in favor of the fourth outcome is 6. I see infinity only when considering the biases against the other five outcomes, (1/6)/0 = \infty.
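    The arithmetic can be checked with a trivial sketch (my notation):

```python
def bias(p, q):
    """Bias toward an outcome: its chance p under the process, divided by
    its chance q under the reference (here uniform) distribution."""
    return p / q

uniform = 1 / 6
# Distribution (0, 0, 0, 1, 0, 0): bias in favor of the fourth outcome
print(bias(1, uniform))  # about 6, the maximum possible relative to uniform
# The biases against the other five outcomes, (1/6)/0, are infinite,
# so they fall outside this computation
```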

    Free association: It’s bugged me that Dembski et al. see nothing special in

        \[\frac{1}{5}, \frac{1}{5}, \frac{1}{5}, 0, \frac{1}{5}, \frac{1}{5}.\]

    The case in which a process is designed to ensure that something does NOT occur seems important to me.

  18. Tom English,

    You’re right, the biases should be (0, 0, 0, 6, 0, 0).

    To get those in a model of natural selection, we need fitnesses that have one genotype infinitely more fit than its neighbors.

    So, as to the difference between us, the model gets an outcome very different from that expected from the CoI. Given that their theorem is mathematically OK, which of these is the reason?

    1. (TE) It is because the Target is chosen after the lengths of the sticks are determined, or

    2. (JF) it is because we are starting with a population that has genotypes that have fitnesses, whereas DEM assume we start with a gemisch of mostly-crazy processes that are much worse than reproducing organisms.

    I’d be interested to hear your thoughts on this issue.

  19. Joe Felsenstein,

    I’m confused by infinitely strong selection and infinite fitness. I’d rather respond in terms of the GUC Bug, if that’s OK with you. For now, the short answer is that we appealed to judgment in our PT post. “This is all it takes to get lots of ‘active information.’ No one regards the mere existence of genotypes with fitness as design.” We said little about CoI. It’s evident in Ewert’s response that Dembski et al. still think that their theorem gets them somewhere. I’ve identified the simplest, not the only, reason that it does not. My objective is merely to deprive them of the theorem.

  20. I don’t think we’re saying that the theorem is wrong, but rather that the way they apply it does not show that one needs Design to get a lot of Active Information. So I’m not sure what you mean by depriving them of the theorem.

    The same holds for the No Free Lunch theorem of Wolpert and Macready. The theorem is correct, but the application of it by Dembski to show that evolution will produce little or no increase in fitness is wrong.

  21. Joe Felsenstein: So, as to the difference between us, the model gets an outcome very different from that expected from the CoI. Given that their theorem is mathematically OK, which of these is the reason?

    1. (TE) It is because the Target is chosen after the lengths of the sticks are determined, or

    2. (JF) it is because we are starting with a population that has genotypes that have fitnesses, whereas DEM assume we start with a gemisch of mostly-crazy processes that are much worse than reproducing organisms.

    I worked hard at responding to this, and realized eventually that I do not frame the issue as you do.

    (JF) Nature is constrained to non-crazy processes. Most mathematically describable processes are physically impossible. Physically and biologically realistic constraint on the evolutionary process registers as active information.

    (WE) Nature is biased in favor of non-crazy processes. Bias is conserved, so the bias must be due either to design or to bias in the initial configuration of the Universe. Darwinian evolutionary theory is incomplete, inasmuch as it describes bias, but doesn’t account for the bias.

    (TE) The argument for conservation of bias in nature is based on an assumption that does not hold in scientific investigation, namely, that the bias of a process is measured with respect to a prespecified event. JF in fact violates that assumption.

    You shut down the claim that natural selection implies purpose. You did not shut down the claim that something must bias nature for phenomena to be constrained. Neither did I. What I did was to shut down the particular argument for it, in the simplest way possible.

  22. Tom English:
    (JF) Nature is constrained to non-crazy processes. Most mathematically describable processes are physically impossible. Physically and biologically realistic constraint on the evolutionary process registers as active information.

    That is a fair-enough summary of my view.

    (WE) Nature is biased in favor of non-crazy processes. Bias is conserved, so the bias must be due either to design or to bias in the initial configuration of the Universe. Darwinian evolutionary theory is incomplete, inasmuch as it describes bias, but doesn’t account for the bias.

    Well, OK, but the incompleteness is that various parameters such as the fitnesses of individual genotypes are not easily derivable from our knowledge of physics and chemistry. That is because they depend on physiology, developmental biology, and ecology, none of which is precisely known to us. What will a particular change of genotype do to the length of the wing of a bird? And what will that do to the fitness?

    (TE) The argument for conservation of bias in nature is based on an assumption that does not hold in scientific investigation, namely, that the bias of a process is measured with respect to a prespecified event. JF in fact violates that assumption.

    Not sure I understand that.

    You shut down the claim that natural selection implies purpose. You did not shut down the claim that something must bias nature for phenomena to be constrained. Neither did I. What I did was to shut down the particular argument for it, in the simplest way possible.

    OK, for the reasons I gave above.

    I could understand onlookers saying that, look, you two know each other’s email addresses, so why have this discussion here? I am doing this because I hope that the points will be of interest to some readers.

  23. They are of interest. They will be even more interesting as you develop accessible summaries.

  24. Joe Felsenstein,

    I could understand onlookers saying that, look, you two know each other’s email addresses, so why have this discussion here? I am doing this because I hope that the points will be of interest to some readers.

    Please keep it going here. You have at least one avid reader.

    Thanks for the vote of confidence in our discussion. While we’re discussing these points, let me detour for a moment to discuss how fitnesses come into this, and my assertion that a process that always chooses the genotype of highest fitness is one in which that genotype is, in effect, infinitely more fit than the others.

    Suppose we have two (say, haploid) genotypes that have fitnesses in the ratio of F:1. If their frequencies are initially p : 1-p then, after selection, among an infinite number of their offspring the survivors are in the ratio of pF : 1-p. If we sample N individuals to survive, they will not all be of the superior genotype unless F is extremely large.

    So any model in which the fittest offspring is guaranteed to survive has in effect infinite ratio of fitness of that genotype to the others.
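    The F:1 ratio can be sketched numerically (my illustration): the post-selection frequency is pF/(pF + 1 − p), and only an enormous F makes it likely that all N sampled survivors are the fitter genotype.

```python
def post_selection_freq(p, F):
    """Frequency of the fitter haploid genotype after one round of selection,
    given fitness ratio F:1 and initial frequency p."""
    return p * F / (p * F + (1 - p))

p, N = 0.5, 100
for F in (2.0, 10.0, 1e6):
    q = post_selection_freq(p, F)
    # q ** N is the chance that all N sampled survivors are the fitter type
    print(F, q, q ** N)
```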

  26. petrushka: They are of interest. They will be even more interesting as you develop accessible summaries.

    Joe is attempting kindly to draw it out of me. From what I’ve seen of his process, he starts with clear summaries.

  27. Joe Felsenstein:

    Tom English:

    (JF) Nature is constrained to non-crazy processes. Most mathematically describable processes are physically impossible. Physically and biologically realistic constraint on the evolutionary process registers as active information.

    That is a fair-enough summary of my view.

    I’m glad you don’t mind seeing it labeled (JF), because it’s a quick stab at summarizing my take on our view (JF-TE). I’m not aware of having said anything contradictory in the post. I emphasized in the introduction that the “active information” of evolution by natural selection is constraint, mistaken for purpose by ID proponents until recently, and now downgraded by them to bias (as in “nature would not be so heavily biased unless something had biased it”). I don’t know that I should have reviewed (JF-TE) in the post. However, if something I’ve written seems to contradict it, then I should add some clarification.

    (WE) Nature is biased in favor of non-crazy processes. Bias is conserved, so the bias must be due either to design or to bias in the initial configuration of the Universe. Darwinian evolutionary theory is incomplete, inasmuch as it describes bias, but doesn’t account for the bias.

    Well, OK, but the incompleteness is that various parameters such as the fitnesses of individual genotypes are not easily derivable from our knowledge of physics and chemistry. That is because they depend on the physiology, developmental biology, and ecology, all of which is not precisely known to us. What will a particular change of genotype do the the length of the wing of a bird? And what will that do the the fitness?

    It’s often observed that scientific theories are incomplete. The ID movement has coopted some of the verbiage, and given “incomplete” a different meaning: all naturalistic, materialistic accounts of nature are incomplete in the sense that they cannot explain why nature is the way it is. The rhetorical strategy is to insinuate that a complete explanation of nature necessarily includes supernatural cause, while avoiding outright mention of the supernatural, which is deemed inherently religious in U.S. case law.

    The message of “conservation of information” is that there is a mathematical proof, not “just a theory,” telling us that wondrous events like birds, improbable by default, occur only because nature is in-formed to make them much more probable. If we deny that Intelligence dynamically informs material processes (creates information) to serve purposes, then we must accept that the Universe was front-loaded with spectacular quantities of active information. Otherwise it would not have hit all of the targets that it has.

    Now, “conservation of information” says nothing about the source of information in the origin of the Universe. But it establishes that natural selection does not create the information of biological systems. Darwinism is incomplete because it does not account for the information of life. Where the information came from is for children in public schools to decide.

    P.S.–What we call active information is bias.

    (TE) The argument for conservation of bias in nature is based on an assumption that does not hold in scientific investigation, namely, that the bias of a process is measured with respect to a prespecified event. JF in fact violates that assumption.

    Not sure I understand that.

    I should not have written JF, because I’m referring to how Joe Felsenstein targeted an event when measuring the bias of the GUC Bug process, not the view (JF). Dembski et al. regard the fitness landscape as part of the evolutionary process. All that you stipulated regarding the landscape was that no two genotypes have precisely the same fitness. That made the global maximum in fitness unique, but allowed it to be any one of the genotypes. You said, whichever of the genotypes is the fittest of all, that’s the target. In other words, you defined the target in terms of the evolutionary process, whatever it might be.

    I’ll admit here to whoever is bothering to read that I delayed our PT post for a long time, arguing that Dembski would dismiss your approach. You were right, and I was wrong. (I would put that in writing, and leave it on your desk — even sign it without being asked. But, considering that you’ve already got a note like that from Dick Lewontin, I don’t know why you’d want one from Tom English.) We can measure “active information” on any event we choose. I knew that the CoI theorem required that a particular event be targeted in advance of formation of the process, so it didn’t seem to me that you were violating “conservation of information.” It took me an embarrassingly long time, after Ewert responded, to home in on what is obviously the most important observation: what you did, measuring active information with respect to the fittest genotype, is the right sort of thing for a scientist to do, and the CoI theorem, no matter that it is mathematically correct, is the wrong theorem to invoke in response.

    There are various things that make the CoI theorem objectionable. But I need only address the most obvious of them to stop it cold. And that is the requirement that the target be prespecified. I have responded very narrowly to the theorem, and not to everything that Dembski et al. are claiming about conservation of bias. My neglect of the other points here is neither a contradiction nor an abandonment of (JF-TE). In retrospect, I’m glad that I’m not making all of my objections public. Let Ewert and Marks, or a not-so-retired Dembski, sink a lot of time into producing a replacement CoI theorem, without any guidance from me. Then they can find out what I’m holding in reserve. (I’m sure there are folks who see more than I do, and keep it to themselves.) Better yet, let them realize that a law of conservation of active information is a lost cause, and stick to algorithmic specified complexity.

  28. I still have the feeling that we are somehow disagreeing about something.

    I think we are agreed that, as a theorem in mathematics, the relevant DEM theorem is correct, and that the issue we have raised is whether applying it to biological evolution establishes that the successes of the evolutionary process in bringing about adaptation mean that the Specified Information is frontloaded into the system. We have argued, using the GUC Bug, that all it need mean is that there are organisms that reproduce, and that the ordinary laws of physics apply. Evolution does much better than DEM’s typical blind search because those blind searches include so many crazy, unbiological, and unphysical searches.

    How do you use the Broken-Stick Model as an example of a process that makes the same point? I think there is a part of your explanation missing that I did not understand. It is something like:

    Consider a process that breaks a stick at random places into 6 pieces. A mark was placed at a random point on the original stick, and we designate as our outcome the piece that contains that mark. Here is some math showing how large the resulting piece of stick is. [Math upthread inserted here]. Such a choice of a stick gets a piece that is nonrandomly longer than a random one of the 6 pieces, chosen uniformly. DEM describe this bias as “active information”. Surely that is not what most people would consider to be reasonably described as information that was frontloaded into the process.

    Are those sentences that could reasonably appear in your argument?
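    The marked-stick description can be checked by simulation (my sketch): the piece containing a uniformly placed mark averages 2/7 of the stick, versus 1/6 for a uniformly chosen piece; the excess length is just length-biased sampling.

```python
import random

def marked_piece_length(n=6, rng=random):
    """Break a unit stick into n pieces at uniform random points, then
    return the length of the piece containing a uniform random mark."""
    cuts = sorted(rng.random() for _ in range(n - 1))
    points = [0.0] + cuts + [1.0]
    mark = rng.random()
    for a, b in zip(points, points[1:]):
        if a <= mark < b:
            return b - a
    return points[-1] - points[-2]  # mark fell in the last piece

rng = random.Random(2015)
trials = 100_000
avg = sum(marked_piece_length(rng=rng) for _ in range(trials)) / trials
print(avg)  # near 2/7, versus 1/6 for a uniformly chosen piece
```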

  29. Joe Felsenstein,

    Well, how much information is it fair to give a computer program, to come up with a solution, before one can say that the program has information preloaded into the equation?

  30. Joe Felsenstein,

    Because you replied to the question here. So your criticism of Frankie deserves to be right here. How much information can an algorithm be given, before we can say that it has been front-loaded with information?

    How’s about you just answer. Or just admit that in order to make any evolutionary algorithm it is absolutely necessary to first front-load it with all kinds of information, and just save time.

  31. Mung,

    Winston Ewert responds:

    Thanks for the link. Interesting that he chooses to respond in a venue where comments are not allowed, rather than engaging here or at The Panda’s Thumb.

  32. Tom English,

    Please don’t feel pressured to respond when not well. I’m re-reading Ewert’s article Specified Complexity — Like Déjà Vu All Over Again and it’s not too incomprehensible. He still doesn’t seem to take on board that evolutionary processes such as natural selection are not searches. He also writes:

    Essentially, Felsenstein presents specified complexity as circular. It is true that specified complexity does not in any way help establish that the probability of complex life is low under natural selection. You must have another way of showing that, for example Michael Behe’s irreducible complexity, Doug Axe’s work on proteins, or Stephen Meyer’s work on the Cambrian explosion.

    As if Behe, Axe and Meyer have demonstrated any such thing, which is far from the case.

    Although Winston Ewert’s 12/2/2015 ENV post mentions both Tom and me, that post is a reply to my post about Jonathan McLatchie’s 2-minute “1-minute” video. It is not a reply to the posts Tom and I made about the DEM paper. So it’s up to me — I will reply to it in a day or two here, in a new thread that I will create. Kind of busy today.

  34. Alan Fox,

    No, they have demonstrated something about irreducible complexity. Small incremental mutations are not going to add up to irreducibly complex systems, if all of the parts needed for the systems don’t provide anything useful until the entire system is complete. It’s a pretty obvious observation, really.

    Kenneth Miller’s silly little analogy about the pieces all first being used by the organism for something else, like a mousetrap being used for a tie clip, notwithstanding: when you want to build a car, it’s pretty ridiculous to claim that the vacuum hose and the drive shaft and the speedometer were all used for something else sometime earlier. I think this is one of the Darwinists’ funniest, completely unevidenced apologetics.

  35. phoodoo: Kenneth Miller’s silly little analogy…

    The analogy is pretty silly, I agree but wasn’t it Michael Behe who first talked about mousetraps as an analogy for “irreducibly complex” biological systems?

    …about the pieces all first being used by the organism for something else, like a mousetrap being used for a tie clip, notwithstanding: when you want to build a car, it’s pretty ridiculous to claim that the vacuum hose and the drive shaft and the speedometer were all used for something else sometime earlier. I think this is one of the Darwinists’ funniest, completely unevidenced apologetics.

    And if Darwinian evolution had anything to do with the car industry you might have a point. On the other hand, Ken Miller has a neat article on his website about a biological system that might help you understand how a biological system is not like a car assembly plant.

  36. Barry has (sort of) replied:

    The Statue of Liberty is OK. And so is ID

    Although as usual, he’s wrong:

    Barry says, “Anyone who has been following the ID debate for any length of time knows that reports of ID’s demise are issued by our opponents with monotonous regularity. And just as monotonously, those reports turn out to be false.”

    No, Barry, we’re actually lamenting its failure to start. Scientifically, anyway. That’s the bit you don’t understand, and haven’t attempted to critique, in your article about the Statue of Liberty.

Leave a Reply