The Law of Conservation of Information is defunct

About a year ago, Joe Felsenstein critiqued a seminar presentation by William Dembski, “Conservation of Information in Evolutionary Search.” He subsequently discussed Dembski’s primary source with me, and devised a brilliant response, unlike any that I had considered. This led to an article, due mostly to Felsenstein, though I contributed, at The Panda’s Thumb. Nine days after it appeared, Dembski was asked in a radio interview whether anyone was paying attention to his technical work. Surely a recipient of

qualifies as a someone. But Dembski changed the topic. And when the question came around again, he again changed the topic. Mind you, this isn’t how I know that Felsenstein blasted conservation of “information,” which is not information, in evolutionary “search,” which does not search. It’s how I know that Dembski knows.

Or, I should say, it’s how I first knew. The Discovery Institute has since employed Dembski’s junior coauthor, Winston Ewert, to quietly replace various claims, including the most sensational of them all (Dembski and Marks, “Life’s Conservation Law: Why Darwinian Evolution Cannot Create Biological Information,” 2010; preprint 2008):

Though not denying Darwinian evolution or even limiting its role in the history of life, the Law of Conservation of Information shows that Darwinian evolution is inherently teleological. Moreover, it shows that this teleology can be measured in precise information-theoretic terms.

Felsenstein realized that we could apply their measure to a simple model of evolution by natural selection, devoid of purpose, and obtain a large quantity. According to the model, evolution begins with a random genotype, and ends with a genotype fitter than all of its neighbors. The neighbors of a genotype are those that can arise from it by mutation in a single point. In each step of the evolutionary process, a genotype is replaced by the fittest of its neighboring genotypes. The overall result of evolution is a sequence of genotypes that is unconstrained in how it begins, and highly constrained in how it ends. Each genotype in the sequence is not only fitter than all of the genotypes that precede it, but also fitter than all of their neighbors. That is, evolution successively constrains the genotype to smaller and smaller subsets of the space of genotypes. The final genotype is at the very least fitter than all of its neighbors. Equivalently, the minimum degree of constraint is the neighborhood size. Dembski and Marks mistake this for the degree of teleology (purpose) in evolution, and refer to it as active information. The gist of “conservation of information” is that teleology comes only from teleology. As Dembski said in his seminar presentation:

If you will, the teleology of evolutionary search is to produce teleology.
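To make the model described before the quotation concrete, here is a minimal Python sketch. It is not the implementation behind our article: the four-letter alphabet, the genotype length, and the hash-seeded random fitness values (`arbitrary_landscape`) are assumptions chosen only for illustration. What it does take from the description is the rule itself: begin at a random genotype, step to the fittest single-point neighbor, and stop when no neighbor is fitter.

```python
import random

def neighbors(genotype, alphabet):
    """Yield every genotype that differs from `genotype` at exactly one position."""
    for i, current in enumerate(genotype):
        for symbol in alphabet:
            if symbol != current:
                yield genotype[:i] + symbol + genotype[i + 1:]

def arbitrary_landscape(seed):
    """Hypothetical arbitrary landscape: an unrelated random fitness per genotype."""
    def fitness(genotype):
        # Seeding with a string makes the value repeatable for each genotype.
        return random.Random(f"{seed}:{genotype}").random()
    return fitness

def evolve(fitness, length, alphabet="ACGU", rng=random):
    """Return the sequence of genotypes from a random start to a local peak."""
    genotype = "".join(rng.choice(alphabet) for _ in range(length))
    path = [genotype]
    while True:
        best = max(neighbors(genotype, alphabet), key=fitness)
        if fitness(best) <= fitness(genotype):
            return path  # no neighbor is fitter, so evolution stops here
        genotype = best
        path.append(genotype)

fitness = arbitrary_landscape(seed=1)
path = evolve(fitness, length=6, rng=random.Random(2))
print(path[0], "->", path[-1], "in", len(path) - 1, "steps")
```

Nothing in the sketch names a desired outcome. The final genotype is fitter than its 18 neighbors only because of how the process runs on the landscape it is given.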

Considering that the neighborhood size indicates only how many, not at all which, genotypes are eliminated in a single step of evolution, there can be no argument that constraint implies purpose.1 Ewert does not hazard an overt reply, but in fact responds by downgrading “active information” from a measure of teleology to a measure of bias. The new significance of “conservation of information” is this: if the constraint, er, bias of a natural process is not due to design, then nature itself must be constrained, er, biased.2 We have it from Ewert, writing on behalf of Dembski and Marks, that:

Of course, Darwinian evolution is not a teleological process and does not search for a goal [e.g., birds…] Whatever search or process might be in play, … it produces birds much more often than chance would otherwise lead us to predict. It is this bias towards producing a bird that we call active information. […] Having postulated Darwinian evolution, … the fact that birds exist has to be explained in terms of the initial configuration of the universe. The universe must have begun with a large amount of active information with respect to the target of birds.

Although “information” stumbles on, searching for brains to eat, the vital principle has departed from the Law of Conservation of Information (LCI). No more does LCI show what it shows. The credit for dispatching teleology goes entirely to Joe Felsenstein. You should have a look at his latest, “Why ID Advocates Downplay Our Disagreement With Them,” before watching me deliver a round to the frontal lobe of the Conservation of Information Theorem.

The credit for keeping things surreal goes entirely to the Discovery Institute. Replacing Dembski, a full-time senior fellow, with Ewert in an exchange with a renowned evolutionary geneticist is beyond bizarre. But it is perhaps no accident that the authorship of the response serves the same purpose as its rhetorical tactics, namely, to conceal the presence of huge concessions. What Ewert does, avoiding all signs of what he’s doing, is to undertake salvage of Dembski’s treatment of LCI in Being as Communion: A Metaphysics of Information (2014). Rather than identify a source, he speaks from authority. Rather than replace terms that convey precisely the misconceptions in the book, he explains matter-of-factly that they don’t mean what they seem to say. And rather than admit that Felsenstein and I set him and his colleagues straight on the meanings, Ewert proclaims that “These Critics of Intelligent Design Agree with Us More Than They Seem to Realize.” The way he trumps up agreement is to treat a single section of our article, which merely reiterates an old point regarding the smoothness of fitness landscapes, as though it were the whole. We actually focus on an arbitrary, and hence arbitrarily rough, landscape.

LCI, putatively a law of nature, putatively has a mathematical foundation. According to Being as Communion (p. 148):

A precise theoretical justification for the claim that natural selection is inherently teleological comes from certain recent mathematical results known as Conservation of Information (CoI) theorems.

Now the claim is that natural selection is inherently biased, and that something must account for the bias — either design or the initial “configuration” of the Universe (wink wink, nudge nudge) — given that bias is conserved. In short, CoI still applies, with the understanding that I is for bIas. Dembski places his work in the context of earlier analysis of search, and mentions a sometime theorist you’ve heard of before (p. 151):

Computer scientist Thomas English, in a 1996 paper, also used the term “Conservation of Information,” though synonymously with the then recently proved results by Wolpert and Macready about No Free Lunch (NFL). In English’s version of NFL, “the information an optimizer gains about unobserved values is ultimately due to its prior information of value distributions.”

I actually proved an NFL theorem more general than that of Wolpert and Macready, and used the term “conservation of information” to characterize an auxiliary theorem. Although I got the math right, what I wrote about it in plain language was embarrassingly wrong. I happened to emend my online copy of the paper a month before Dembski’s book appeared, adding a preface titled “Sampling Bias Is Not Information.” So, while it definitely was Felsenstein who left Dembski et al. no choice but to abandon teleology, it may be that I had some influence on their choice of a new position. In any case, it falls to me to explain why they are embarrassingly wrong in what they claim about math that they have gotten right.

The right approach, for a general readership, is to address only what is most obviously wrong, and to put as much as possible into pictures. We’ll be looking at broken sticks. We’ll even watch them breaking randomly to pieces. This is how Dembski et al. see the biases of an evolutionary process being determined, in the absence of design. CoI tells us something about the random length of a particular segment, selected before the stick breaks. But Felsenstein and I selected an outcome after modeling the evolutionary process. We targeted an outcome for which the bias was large. The bias was not large because we targeted the outcome. Even if we pretend that a broken stick determined the bias of the evolutionary process, CoI does not apply. The theorem that does apply has no name. It is the solution to Exercises 666-667 in a highly respected text of the 19th Century, Choice and Chance. Given that it bears the Number of the Beast, and comes from the Reverend William Allen Whitworth, I’m tempted to call it the Revelation Theorem. But I’ll avoid giving offense, and refer instead to the Broken Stick Theorem.

Breaking sticks

Dembski et al. believe that CoI applies to all physical events that scientists target for investigation. The gist of their error is easy to understand. A scientist is free to investigate any event whatsoever after observing what actually occurs in nature. But the CoI theorem assumes that a particular event is targeted prior to the existence of a process. This is appropriate when an engineer selects a process in order to generate a prespecified event, i.e., to solve a given problem. It is no coincidence that the peer-reviewed publications of Dembski et al. are all in the engineering literature. The assumption of the theorem does not hold when a scientist works in the opposite direction, investigating an event that tends to occur in a natural process. Put simply, there is a difference between selecting a process to suit a given target and selecting a target to suit a given process. The question, then, is just how big the difference is. How badly wrong is it to say that the CoI theorem characterizes conservation of bias in nature? Fortunately, the error can be conveyed accurately with pictures. What we shall see is not conservation, but instead unbounded growth, of the maximum bias (“active information”).


ETA: Text between the horizontal rules is an improved introduction to the technical material, developed in discussion here at TSZ. It comes verbatim from a comment posted a week ago. I’ve made clear all along my intent to respond to feedback, and improve the post. However, I won’t remove any of the original content, because that’s too easily spun into a retraction.

Dembski et al. represent natural processes abstractly. In their math, they reduce the evolutionary process to nothing but the chances of its possible outcomes. The CoI theorem is indifferent to what the possible outcomes actually are, in physical reality, and to how the process actually works such that the outcomes have the chances of occurrence that they do. Here I assume that there are only 6 possible outcomes, arbitrarily named 1, 2, 3, 4, 5, 6. The possible outcomes could be anything, and their names say nothing about what they really are. Each of the possible outcomes has a chance of occurrence that is no less than 0 (sure not to occur) and no greater than 1 (sure to occur). The chances of the possible outcomes are required to add up to 1.

As far as the CoI theorem is concerned, an evolutionary process is nothing but a list of chances that sum to 1. I’ll refer to the list of chances as the description of the process. The first chance in the description is associated with the possible outcome named 1, the second chance in the description is associated with the possible outcome named 2, and so forth. The list

    \[.1, \quad .3, \quad .1, \quad .2, \quad .1, \quad .2\]

is a valid description because each of the numbers is a valid chance, lying between 0 and 1, and because the total of the chances is 1. We can picture the description of the evolutionary process as a stick of length 1, broken into 6 pieces.

[Need a new figure here.]

Naming the segments 1, 2, 3, 4, 5, 6, from left to right, the length of each segment indicates the chance of the possible outcome with the corresponding name. Consequently, the depiction of the evolutionary process as a broken stick is equivalent to the description of the process as a list of the chances of its possible outcomes.
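The equivalence between a description and a broken stick can be checked mechanically. The following Python sketch uses the example list of chances given above; the function names are my own, introduced only for illustration. It verifies that the list is a valid description, and computes where the breaks fall on a stick of length 1.

```python
from fractions import Fraction
from itertools import accumulate

def is_valid_description(chances):
    """True if every chance lies between 0 and 1 and the chances sum to 1."""
    return all(0 <= c <= 1 for c in chances) and sum(chances) == 1

def break_points(chances):
    """Right-hand end of each segment when the chances are laid end to end."""
    return list(accumulate(chances))

# The example description .1, .3, .1, .2, .1, .2, held as exact fractions.
description = [Fraction(k, 10) for k in (1, 3, 1, 2, 1, 2)]

print(is_valid_description(description))
print([str(p) for p in break_points(description)])
```

Exact fractions are used instead of floating-point numbers so that the test for a total of exactly 1 is not upset by rounding.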

You perhaps wonder how I would depict the evolutionary process as a broken stick if a “possible” outcome had absolutely no chance of occurring. And the answer is that I could not. There is no segment of length 0. In the CoI theorem, however, chances precisely equal to 0 are effectively impossible. Thus it is not misleading to say that Dembski et al. reduce the evolutionary process to a broken stick.

There are infinitely many ways to break our metaphorical stick into a given number of segments. Averaging over all of them, the lengths of the segments are

    \[\frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}, \quad \frac{1}{6}.\]

That is, in the average description of an evolutionary process, the possible outcomes are uniform in their chances of occurrence. Dembski et al. usually advocate taking uniform chances as the standard of comparison for all processes (though they allow for other standards in the CoI theorem). Dembski and Marks go much further in their metaphysics, claiming that there exist default chances of outcomes in physical reality, and that we can obtain knowledge of the default chances, and that deviation of chances from the defaults is itself a real and objectively measurable phenomenon. Although I want to limit myself to illustrating how they have gone wrong in application of CoI, I must remark that their speculation is empty, and comes nowhere close to providing a foundation for an alternative science. Otherwise, I would seem to allow that they might repair their arguments with something like the Broken Stick Theorem.

Taking uniform chance as the standard to which all evolutionary processes are compared, we naturally arrive at an alternative representation. We begin by writing the standard description a bit differently, multiplying each of the chances by 1.

    \[1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}, \quad 1 \times \frac{1}{6}.\]

Now we can write any description whatsoever by adjusting the multipliers, while leaving the fractions 1/6 just as they are. The trick is to multiply each of the chances in the description by 1, but with 1 written as 6 \times 1/6. For instance, the description

    \[\frac{1}{24}, \quad \frac{1}{3}, \quad \frac{1}{12}, \quad \frac{1}{4}, \quad \frac{1}{6}, \quad \frac{1}{8}\]

is equivalent to

    \[\frac{6}{24} \times \frac{1}{6}, \quad \frac{6}{3} \times \frac{1}{6}, \quad \frac{6}{12} \times \frac{1}{6}, \quad \frac{6}{4} \times \frac{1}{6}, \quad \frac{6}{6} \times \frac{1}{6}, \quad \frac{6}{8} \times \frac{1}{6}.\]

The multipliers

    \[\frac{6}{24}, \quad \frac{6}{3}, \quad \frac{6}{12}, \quad \frac{6}{4}, \quad \frac{6}{6}, \quad \frac{6}{8}\]

are the biases of the process, relative to the standard in which the chances are uniformly 1/6. The process is biased in favor of an outcome when the bias is greater than 1, and biased against an outcome when the bias is less than 1. For instance, the process is biased in favor of outcome 4 by a factor of 6/4 = 1.5, meaning that the chance of the outcome is 1.5 times as great as in the standard. Similarly, the process is biased against outcome 1 by a factor of 24/6 = 4, meaning that the chance of the outcome is 6/24 = 0.25 times as great as in the standard. The uniform standard is unbiased relative to itself, with all biases equal to 1.

The general rule for obtaining the biases of an evolutionary process, relative to the uniform standard, is to multiply the chances by the number of possible outcomes. With 6 possible outcomes, this is equivalent to scaling the broken stick to a length of 6. We gain some clarity in discussion of CoI by referring to the biases, instead of the chances, of the evolutionary process. The process is metaphorically a broken stick, either way. Whether the segment lengths are biases or chances is just a matter of scale. We shall equate the length of the stick to the number of outcomes, and thus depict the biases of the process, for and against the possible outcomes corresponding to the segments.
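The general rule is short enough to state as code. This Python sketch (function names are my own, for illustration) converts the example description with chances 1/24, 1/3, 1/12, 1/4, 1/6, 1/8 into biases and back again.

```python
from fractions import Fraction

def biases(chances):
    """Biases relative to the uniform standard: chance times number of outcomes."""
    n = len(chances)
    return [n * c for c in chances]

def chances_from_biases(bias_list):
    """Inverse of `biases`: shrink the stick of length n back to length 1."""
    n = len(bias_list)
    return [b / n for b in bias_list]

description = [Fraction(1, d) for d in (24, 3, 12, 4, 6, 8)]
process_biases = biases(description)

print([str(b) for b in process_biases])
print(sum(process_biases))
```

The biases come out as 1/4, 2, 1/2, 3/2, 1, 3/4, matching the multipliers 6/24, 6/3, 6/12, 6/4, 6/6, 6/8, and they sum to the number of outcomes, 6.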


To make the pictures clear, we assume that the evolutionary process has only 6 possible outcomes. Let’s name the possibilities 1, 2, 3, 4, 5, and 6. The process is unbiased if none of the possibilities has a greater chance of occurring than does any other, in which case the chance of each possible outcome is 1/6. According to Dembski et al., if we deny that the biases of the process are due to design, then we essentially say that a stick of length 6 broke randomly into 6 segments, and that the lengths of the segments determined the biases. Suppose that the length of the 3rd segment of the broken stick is 2. Then the evolutionary process is biased in favor of outcome 3 by a factor of 2. The chance of the outcome is

    \[2 \times \frac{1}{6} = \frac{1}{3}.\]

Suppose that the length of the 5th segment is 1/4. Then the process is biased against outcome 5 by a factor of 4, and the chance of the outcome is

    \[\frac{1}{4} \times \frac{1}{6} = \frac{1}{24}.\]

These biases are what Dembski et al. refer to as active information. The term, in and of itself, begs the question of whether something actively formed the process with bias in favor of a desired outcome.


ETA: Text between the horizontal rules comes from an earlier attempt at improving the introduction to the technical material, developed in discussion here at TSZ. I’ve quoted a comment posted 17 days ago.

Dembski et al. do not allow that such deviations from the supposedly “natural” default of uniform chance might be brute facts of physical reality. There must be a reason for bias. If we do not allow that bias is possibly due to design of the process to serve a purpose, then Dembski et al. force on us the view that bias itself arises by chance. (This is multifariously outrageous, but for reasons that are not clearly tied to their math.) That is, the chances of the possible outcomes of the evolutionary process are determined by an antecedent process, which is also random. Talk about the chances of chances gets very confusing, very fast. So I say instead that the evolutionary process is randomly biased by a process that occurs before it does. The biases of the evolutionary process are just the chances of the 6 possible outcomes of the evolutionary process, multiplied by 6. Setting the chances randomly is equivalent to setting the biases randomly.

The broken stick is a conventional metaphor for probabilities that are themselves set randomly. (I follow Dembski in reserving the word chance for the probability of a physically random outcome.) The random lengths of the segments of the stick are the probabilities. The stick is ordinarily of unit length, because the probabilities must sum to 1. To visualize random biases, instead of random chances, I need only multiply the length of the stick by the number of possible outcomes, 6, and randomly break the stick into 6 pieces. Then the biases sum to 6.

I stipulate that the biasing process, i.e., stick breaking, is uniform, meaning that all possible biases of the evolutionary process are equally likely to arise. A tricky point is that Dembski et al. allow for uniform biasing, but do not require it. The essential justification of my approach is that I need consider only something, not everything, that they allow in order to demonstrate that the theorem does not apply to scientific investigation. What I consider is in fact typical. The uniform biasing process is the average of all biasing processes. Thus there can be no objection to my choice of it.
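For readers who want to experiment, here is one standard way to simulate uniform stick breaking in Python. It is a sketch of the biasing process as I have stipulated it, not code from Dembski et al. The construction (cut at n - 1 independent uniform points and take the gaps) and the function names are my own choices.

```python
import random

def random_biases(n, rng=random):
    """Break a stick of length n into n segments, all segmentations equally likely.

    The gaps between n - 1 sorted uniform cut points are uniformly
    distributed over the possible segmentations of the stick.
    """
    cuts = sorted(rng.uniform(0.0, n) for _ in range(n - 1))
    edges = [0.0] + cuts + [float(n)]
    return [right - left for left, right in zip(edges, edges[1:])]

def mean_biases(n, trials, rng=random):
    """Average each segment's length over many independently broken sticks."""
    totals = [0.0] * n
    for _ in range(trials):
        for i, b in enumerate(random_biases(n, rng)):
            totals[i] += b
    return [t / trials for t in totals]

rng = random.Random(0)
print(random_biases(6, rng))        # one randomly biased process
print(mean_biases(6, 20_000, rng))  # each average should be close to 1
```

Each broken stick sums to 6, and the average length of every segment settles near 1, which is the unbiased standard.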

Dembski et al. refer to all random processes as “searches.” The term is nothing but rhetorical assertion of the conclusion they want to draw. The stick-breaking “search” (process), which determines the biases of the evolutionary “search” (process), is a visualization of what they call a “search for a search.” Dembski et al. allow for the biasing process itself to be biased by an antecedent process, in which case there is a “search for a search for a search.” In Being as Communion, Dembski avoids committing to Big Bang cosmology, and indicates that the regress of searches for searches might go back forever in time. Fortunately, we need not enter a quasi-mystical quagmire to get at a glaring error in logic.


Animation 1. In the analysis of Dembski, Ewert, and Marks, the biases of an evolutionary process are like control knobs, either set by design, or set randomly by another process. The random biasing process is like a stick breaking into pieces. The biases of an evolutionary process are the lengths of the segments of a broken stick. Here the number of possible outcomes of the evolutionary process is 6, and a stick of length 6 breaks randomly into 6 segments. No segmentation is more likely than any other. Before the stick starts breaking, we expect any given segment to be of length 1. But when a scientist investigates an evolutionary process, the stick has already broken. The scientist may target the outcome for which the bias is greatest, i.e., the outcome corresponding to the longest segment of a broken stick. With 6 possible outcomes, the expected maximum bias is 2.45. Generalizing to n possible outcomes, the expected maximum bias of a randomly biased evolutionary process is a logarithmic function of n. The quantity grows without bound as the number of possible outcomes of evolution increases. The Conservation of Information Theorem of Dembski et al. tells us that the greater the bias in favor of an outcome specified in advance, the less likely the bias is to have arisen by breaking a stick, no matter how many the possible outcomes of the evolutionary process. It depends on an assumption that does not hold in scientific study of evolution.

In the most important case of CoI, all possible segmentations of the stick have equal chances of occurring. Although the segments almost surely turn out to be different in length, they are indistinguishable in their random lengths. That is, the chance that a segment will turn out to be a given length does not depend on which segment we consider. This is far from true, however, if the segment that we consider depends on what the lengths have turned out to be. Dembski et al. neglect the difference in the two circumstances when they treat their theorem as though it were a law of nature. Here’s an example of what CoI tells us: the probability is at most 1/2 that the first segment’s length will turn out to be greater than or equal to 2. More generally, for any given segment, the probability is at most 1/b that the segment’s length will turn out to be greater than or equal to b. This holds for sticks of all lengths n, broken into n segments. Recall that the random segment lengths are the random biases of the evolutionary process. CoI says that the greater the bias in favor of an outcome specified in advance, the less likely the bias is to have arisen by breaking a stick. The result is not useful in implicating design of biological evolution, as it assumes that an outcome was targeted in advance. To apply CoI, one must know not only that an outcome was targeted prior to the formation of the evolutionary process, but also which of the possible outcomes was targeted.3
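The difference between the two circumstances shows up plainly in simulation. The Python sketch below is illustrative only; the trial count, the seed, and the function names are arbitrary assumptions. It breaks many sticks of length 6 and counts how often the first segment, chosen in advance, is of length 2 or greater, and how often the longest segment, chosen after the break, is.

```python
import random

def random_biases(n, rng):
    """Break a stick of length n into n segments, all segmentations equally likely."""
    cuts = sorted(rng.uniform(0.0, n) for _ in range(n - 1))
    edges = [0.0] + cuts + [float(n)]
    return [right - left for left, right in zip(edges, edges[1:])]

def estimate(n, b, trials, rng):
    """Estimate P[first segment >= b] and P[longest segment >= b]."""
    first_hits = longest_hits = 0
    for _ in range(trials):
        segments = random_biases(n, rng)
        first_hits += segments[0] >= b       # target named before the break
        longest_hits += max(segments) >= b   # target named after the break
    return first_hits / trials, longest_hits / trials

rng = random.Random(2015)
p_first, p_longest = estimate(n=6, b=2, trials=50_000, rng=rng)
print("first segment >= 2:  ", p_first)
print("longest segment >= 2:", p_longest)
```

The first estimate should fall well under the CoI bound of 1/2. The second should exceed it, which is the whole point: the bound says nothing about a segment selected by looking at the broken stick.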

Figure 2. In this frame from Animation 1, the segments of 20 broken sticks are colored according to their original positions. The expected length of each segment is 1, though the random lengths are highly variable. According to CoI, the probability is at most 1/2 that the length of the blue segment will turn out to be 2 or greater. More generally, for any given segment, the probability is at most 1/b that the length of the segment will turn out to be greater than or equal to b. This does not hold if we specify a segment in terms of the outcome of the random segmentation of the stick. In particular, CoI does not apply to the longest segment.

Figure 3. In this frame from Animation 1, the segments of each of the 20 broken sticks have been sorted into ascending order of length, and recolored. The expected length of the longest (red) segment is 2.45. By the Broken Stick Theorem, the probability is .728 that at least one of the segments is of length 2 or greater. By misapplication of CoI, the probability is at most 1/2. For a stick of length n, the probability is greater than 1/2 that at least one of the n segments exceeds \ln n in length. There is no limit on the ratio of probability 1/2 to the faux bound of 1/\ln n.

The Broken Stick Theorem tells us quite a bit about the lengths of segments. What is most important here is that, for any given length, we can calculate the probability that one or more of the segments exceeds that length. For instance, the probability is 1/2 that at least one of the segments is of length 2.338 or greater. If you were to misapply CoI, then you would say that the probability would be no greater than 1/2.338, which is smaller than 1/2. A simple way to measure the discrepancy is to divide the actual probability, 1/2, by the CoI bound, 1/2.338. The result, 1.169, is small only because the illustration is small. There is no limit on how large it can be for longer sticks. Let’s say that the stick is of length n, and is broken into n segments. Then the probability is greater than 1/2 that at least one of the segments exceeds \ln n in length. Here \ln n is the natural logarithm of n. The details are not important. What matters is that we can drive the faux bound of 1 / \ln n arbitrarily close to 0 by making n large, while the correct probability remains greater than 1/2.
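The figures quoted above can be reproduced exactly. The sketch below uses the classical inclusion-exclusion formula for the longest segment of a uniformly broken stick, which I take to be the form of Whitworth's result. The function name and the use of exact fractions are my own choices, and the code should be read as an illustration, not as a transcription of Choice and Chance.

```python
from fractions import Fraction
from math import comb, log

def prob_longest_at_least(n, b):
    """P[longest of n segments >= b], stick of length n broken uniformly.

    Inclusion-exclusion over the number k of segments that reach length b:
    sum over k of (-1)^(k+1) * C(n, k) * (1 - k*b/n)^(n-1), while k*b < n.
    """
    t = Fraction(b) / n  # threshold as a fraction of the whole stick
    total = Fraction(0)
    k = 1
    while k <= n and k * t < 1:
        total += (-1) ** (k + 1) * comb(n, k) * (1 - k * t) ** (n - 1)
        k += 1
    return total

print(float(prob_longest_at_least(6, 2)))      # the .728 of Figure 3
print(float(prob_longest_at_least(6, 2.338)))  # very nearly 1/2
for n in (10, 100):
    # The faux bound 1 / ln n shrinks, but the true probability stays above 1/2.
    print(n, 1 / log(n), float(prob_longest_at_least(n, log(n))))
```

For the stick of length 6, the formula gives exactly 177/243 for a threshold of 2, which rounds to .728.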

Cool, but nonessential: The relation of the expected lengths of the sorted segments of a broken stick to the harmonic numbers. Here E[B_{(i)}] is the expected value of B_{(i)}, the i-th smallest of the random segment lengths (biases), so that B_{(6)} is the longest segment and B_{(1)} the shortest. As it happens, the notation E[\cdot], widely used in probability and statistics, was introduced by William Allen Whitworth, who derived the Broken Stick Theorem.

    \begin{align*} E[{B}_{(6)}] &= \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6\\ E[{B}_{(5)}] &= \phantom{\frac{1}{1} +\;\,} \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_1\\ E[{B}_{(4)}] &= \phantom{\frac{1}{1} + \frac{1}{2} +\;\, } \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_2 \\ E[{B}_{(3)}] &= \phantom{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} +\;\, } \frac{1}{4} + \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_3 \\ E[{B}_{(2)}] &= \phantom{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} +\;\, } \frac{1}{5} + \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_4 \\ E[{B}_{(1)}] &= \phantom{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} +\;\, } \frac{1}{6} = \mathcal{H}_6 - \mathcal{H}_5 \\ E[{B}_{(1)}] + \cdots + E[{B}_{(6)}] &= \frac{1}{1} + \frac{2}{2} + \frac{3}{3} + \frac{4}{4} + \frac{5}{5} + \frac{6}{6} = 6 \end{align*}

For large n, \mathcal{H}_n \approx \ln n + \gamma, where \gamma \approx 0.5772 is the Euler-Mascheroni constant. So the expected maximum bias (“active information”) of a randomly biased process is logarithmic in the number of possible outcomes. For large n,

    \[P(B_{(n)} > \ln n) \approx 1 - \frac{1}{e} \approx .6321.\]

The derivation is straightforward, but not brief. I decided that the loose bound

    \[P(B_{(n)} > \ln n) > \frac{1}{2}\]

better serves present purposes.
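The harmonic-number relations are easy to verify numerically. In this Python sketch (function names are my own, for illustration), the expected sorted segment lengths are computed as differences of harmonic numbers, and the harmonic number is compared with its logarithmic approximation.

```python
from fractions import Fraction
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def harmonic(n):
    """The n-th harmonic number, 1/1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def expected_sorted_biases(n):
    """Expected lengths from shortest to longest: H_n - H_(n-i) for i = 1..n."""
    h_n = harmonic(n)
    return [h_n - harmonic(n - i) for i in range(1, n + 1)]

expected = expected_sorted_biases(6)
print([str(e) for e in expected])  # shortest first, longest last
print(float(expected[-1]))         # expected maximum bias, H_6
print(sum(expected))               # the expectations sum to the stick length

n = 1000
print(float(harmonic(n)), log(n) + EULER_GAMMA)  # close for large n
```

The last expectation is 49/20 = 2.45, the expected maximum bias reported in the caption of Animation 1.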

Rather than simply argue that the analysis of Dembski et al. does not apply, I have identified a comparable analysis that does apply, and used it to quantify the error in misapplying their analysis. The expected maximum bias (“active information”) for a randomly biased process (“search”) grows without bound as the size of the space of possible outcomes (“search space”) increases. For n possible outcomes, the probability is greater than 1/2 that the maximum bias exceeds \ln n. According to CoI, the probability is at most 1 / \ln n that the bias in favor of a given outcome is \ln n or greater. The discrepancy is entirely a matter of whether a possible outcome is targeted in advance of generating the process (“hit this”), or the most probable outcome of the process is targeted after the fact (“this is what it hits”). It should be clear that a scientist is free to do the latter, i.e., to investigate the most probable outcome of a process observed in nature.4 In Dembskian terms, the active information measure permits us to inspect the distribution of arrows shot into a wall by a blind archer, and paint a target around the region in which the density of arrows is greatest. There is no requirement that the target have the detachable specification that Dembski emphasized in his earlier writings.

Why a bug is not a weasel

In 1986, Richard Dawkins published The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design, a response to William Paley’s Natural Theology: or, Evidences of the Existence and Attributes of the Deity; Collected from the Appearances of Nature (1802). Dembski’s career in ID is largely a response to Dawkins. Indeed, the highlights are cooptations of ideas in The Blind Watchmaker. Dawkins characterizes objects that are complicated, and seemingly designed, as “statistically improbable in a direction that is specified not with hindsight.” Dembski elaborates on the self-same property in The Design Inference: Eliminating Chance through Small Probabilities (1998), taking it as information imparted to objects by design. A not-with-hindsight specification is, in his parlance, detachable from the specified event (set of objects), called the target. Dembski usually refers to complicatedness as complex specified information or specified complexity, but sometimes also as specified improbability. The last term gives the best idea of how it contrasts with active information, the elevated probability of an event not required to have a detachable specification. In No Free Lunch: Why Specified Complexity Cannot Be Purchased without Information, he states a Law of Conservation of Information for specified complexity. (As explained below, in Appendix 1, Dembski and Marks have never mentioned this LCI, since stating their LCI for active information.)

[This section is not complete. I expect to add the guts over the next day or so, which means, as Joe can tell you, that you should expect them sometime in December. The gist is that specified complexity does not apply to Dawkins’ Weasel program. Dembski has made much of the meaningfulness of the target sentence. But the fact of the matter is that the model is the same for all target sentences comprising 28 uppercase letters and spaces. The target need not be specified. The measure of active information formalizes Dawkins’ comparison of the model to the proverbial monkeys at typewriters. It does not stipulate that the target have a detachable specification. Dembski and Marks seem to have thought that it was transparently obvious that Dawkins had selected a desired outcome in advance, and had informed (programmed) the evolutionary process to “hit the target.”]

Dembski discusses the Weasel program on pp. 176-180 of Being as Communion. Here is how he describes “search,” in general, and the Weasel program, in particular:

In The Blind Watchmaker, Dawkins purports to show how natural selection creates information. In that book, he gives his famous METHINKS IT IS LIKE A WEASEL computer simulation. A historian or literary scholar, confronted with the phrase METHINKS IT IS LIKE A WEASEL, would look to its human author, William Shakespeare, to explain it (the phrase is from Hamlet). An evolutionary theorist like Dawkins, by contrast, considers what it would take for an evolutionary process, simulated by an algorithm running on a computer, to produce this target phrase. All such algorithms consist of:

  1. an initialization (i.e., a place where the algorithm starts — for Dawkins the starting point is any random string of letters and spaces the same length as METHINKS IT IS LIKE A WEASEL);
  2. a fitness landscape (i.e., a measure of the goodness of candidate solutions — for Dawkins, in this example, fitness measures proximity to the target phrase so that the closer it is to the target, the more fit it becomes);
  3. an update rule (i.e., a rule that says where to go next given where the algorithm is presently — for Dawkins this involves some randomization to existing candidate phrases already searched as well as an evaluation of fitness along with selection of those candidates with the better fitness);
  4. a stop criterion (i.e., a criterion that says when the search has gone on long enough and can reasonably be ended — for Dawkins this occurs when the search has landed on the target phrase METHINKS IT IS LIKE A WEASEL).

Note that in these four steps, natural selection is mirrored in steps (2) and (3).

It is important to note that Dembski addresses algorithms, or designs of computer programs, in engineering terms, and does not address models (implemented by computer programs) in scientific terms. This amounts to a presumption, not a demonstration, that the computational process (running program) is designed to generate a desired outcome.
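For concreteness, the four components in the quoted passage can be sketched as a short program. This is a hedged illustration, not Dawkins’ code: the function names, the brood size of 100, the per-character mutation rate of 0.05, the choice to keep the parent in the running, and the generation cap are all assumptions of mine. The target is an ordinary parameter, so the same process runs unchanged for any string of 28 uppercase letters and spaces.

```python
import random
import string

ALPHABET = string.ascii_uppercase + " "  # 26 uppercase letters plus the space


def fitness(candidate, target):
    """Count the positions at which candidate agrees with target."""
    return sum(c == t for c, t in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy parent, redrawing each character at random with probability rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else c for c in parent
    )


def weasel(target, offspring=100, rate=0.05, seed=0, max_generations=100_000):
    """Run a Weasel-style process; all default values are illustrative assumptions."""
    rng = random.Random(seed)
    # (1) Initialization: a random string of the same length as the target.
    current = "".join(rng.choice(ALPHABET) for _ in target)
    generations = 0
    # (4) Stop criterion: the current string equals the target
    #     (the generation cap is only a safety net).
    while current != target and generations < max_generations:
        # (3) Update rule: make mutated copies, then keep the fittest string.
        #     The parent stays in the running (an assumption of this sketch),
        #     so fitness never decreases.
        brood = [current] + [mutate(current, rate, rng) for _ in range(offspring)]
        # (2) Fitness: agreement with the target, position by position.
        current = max(brood, key=lambda s: fitness(s, target))
        generations += 1
    return current, generations


if __name__ == "__main__":
    sentence = "METHINKS IT IS LIKE A WEASEL"
    for tgt in (sentence, sentence[::-1]):  # a meaningful and a meaningless target
        result, gens = weasel(tgt)
        print(repr(result), "after", gens, "generations")
```

Nothing in the sketch depends on the target being meaningful. Reversing the sentence, or scrambling it, changes only the argument passed to weasel.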

[What I hope to get across here is why Dembski et al. cannot misconstrue Felsenstein’s model, called the GUC Bug, as Dembski does Dawkins’ model. Those of you who argue with ID proponents should put the tired old Weasel out to pasture, or wherever it is that old Weasels like to go, and give Felsenstein’s Killer Bug a try.]

Figure 4. Felsenstein’s GUC Bug model contrasts starkly with Dembski’s travesty of Dawkins’ Weasel program. There can be no argument that Felsenstein designed the model to hit a target, because we define the target in terms of the modeled process. The model implemented by the Weasel program is not terribly different. But it is terribly easy to brush aside the model, and focus upon the program. Then the claim is that Dawkins designed the program to hit a specified target with its output.

ID the future

It is telling, I believe, that Dembski gave a detachable specification, “teleological system/agent,” of the target for biological evolution in his seminar talk, and that Ewert gives a detachable specification of the event that he targets, birds, in his response to Felsenstein and me. Ewert addressed active information in his master’s thesis (2010), but developed a new flavor of specified complexity for his doctoral dissertation (2013; sequestered until 2018). He, Dembski, and Marks have published several papers on algorithmic specified complexity (one critiqued here, another here). Dembski indicates, in a footnote of Being as Communion, that he and Marks are preparing the second edition of No Free Lunch (presumably without changing the subtitle, Why Specified Complexity Cannot Be Purchased without Intelligence). My best guess as to what to make of this is that Dembski et al. plan to reintroduce specification in LCI Version 3. One thing is sure: ever mindful of the next judicial test of public-school instruction in ID, they will not breathe a hint that their publications on active information are any less weighty than gold. Ewert has demonstrated some of the revisionary tactics to come.

Appendix 1: Contradictory laws on the books

There actually have been two Laws of Conservation of Information. The first, featured in No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence (Dembski, 2002), addresses the specified complexity, also known as the specified improbability, of an event. The second, featured in Being as Communion: A Metaphysics of Information (Dembski, 2014), addresses the active information of a process, supposedly necessary for “unnatural” elevation in probability of an event. Specified improbability is loosely the opposite of elevated probability. Dembski and Marks evidently saw better than to claim that both are conserved, as they have said nothing about the first law since coming up with the second. Although Dembski opens Being as Communion by indicating that it is the last book of a trilogy that includes No Free Lunch, his only mention of specified complexity is in a footnote listing examples of “materialist-refuting logic.” He also notes that he and Marks are preparing the second edition of No Free Lunch. To include both specified complexity and active information in the cutlery is to serve up free lunch. It equips the ID theorist to implicate design when an event is too improbable (relative to a probability induced by specification), and also when an event is too probable (relative to a probability asserted a priori).

Appendix 2: Remembrance of information past

Here I give ample evidence that the “search” really was supposed to search for the targeted event, and that “active information” really was supposed to account for its probability of success. I begin with two technical abstracts. If you find yourself getting bogged down, then read just the text I’ve highlighted. The first is for Dembski’s seminar talk (August 2014).

Conservation of Information (CoI) asserts that the amount of information a search outputs can equal but never exceed the amount of information it inputs. Mathematically, CoI sets limits on the information cost incurred when the probability of success of a targeted search gets raised from p to q (p < q), that cost being calculated in terms of the probability p/q. CoI builds on the No Free Lunch (NFL) theorems, which showed that average performance of any search is no better than blind search. CoI shows that when, for a given problem [targeted event], a search outperforms blind search, it does so by incorporating an amount of information determined by the increase in probability with which the search outperforms blind search. CoI applies to evolutionary search, showing that natural selection cannot create the information that enables evolution to be successful, but at best redistributes already existing information. CoI has implications for teleology in nature, consistent with natural teleological laws mooted in Thomas Nagel’s Mind & Cosmos.

Apart from hiding a law of nature under a bushel, this is not much different from the abstract of “Life’s Conservation Law: Why Darwinian Evolution Cannot Create Biological Information” [sic] (Dembski and Marks, 2010; preprint 2008).

LCI characterizes the information costs that searches incur in outperforming blind search. Searches that operate by Darwinian selection, for instance, often significantly outperform blind search. But when they do, it is because they exploit information supplied by a fitness function — information that is unavailable to blind search. Searches that have a greater probability of success than blind search do not just magically materialize. They form by some process. According to LCI, any such search-forming process must build into the search at least as much information as the search displays in raising the probability of success. More formally, LCI states that raising the probability of success of a search by a factor of q/p (> 1) incurs an information cost of at least log(q/p). [… Conservation of information] theorems provide the theoretical underpinnings for the Law of Conservation of Information. Though not denying Darwinian evolution or even limiting its role in the history of life, the Law of Conservation of Information shows that Darwinian evolution is inherently teleological. Moreover, it shows that this teleology can be measured in precise information-theoretic terms.

The putative measure of teleology is log(q/p), the active information of the evolutionary search. Dembski also says in Being as Communion that a search is informed to find a target, not merely biased in favor of it.

A precise theoretical justification for the claim that natural selection is inherently teleological comes from certain recent mathematical results known as Conservation of Information (CoI) theorems [p. 148].

Simply put, searches, in finding targets output information. At the same time, to find targets, searches need to input information [p. 152].

CoI shows that successful search (i.e., one that locates a target) requires at least as much input of information as the search by its success outputs [p. 150].
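The measure log(q/p) is simple to compute once p and q are given. The following is a minimal sketch under stated assumptions: the function name is mine, the logarithm is taken base 2 so that the result is in bits, and the Weasel figures assume that blind search is a single uniform draw from the 27^28 possible strings and that the program, run until it stops, hits the target with probability q = 1.

```python
import math


def active_information(p, q):
    """Return log2(q / p): p is the baseline probability, q the elevated one."""
    if not (0.0 < p <= 1.0 and 0.0 < q <= 1.0):
        raise ValueError("p and q must be probabilities in (0, 1]")
    return math.log2(q / p)


# Assumed Weasel figures: one target among 27**28 equally likely strings,
# and a process that reaches the target with probability 1.
p_weasel = 27.0 ** -28
weasel_bits = active_information(p_weasel, 1.0)

if __name__ == "__main__":
    print(f"Weasel: {weasel_bits:.1f} bits")
```

The computation asks only for the two probabilities. It never asks how the target was chosen, or whether it has a detachable specification.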

The information that goes into formation of the search, to increase the probability that it finds the target, is active information. Returning to “Life’s Conservation Law” (Section 1, “The Creation of Information”):

Nature is a matrix for expressing already existent information. But the ultimate source of that information resides in an intelligence not reducible to nature. The Law of Conservation of Information, which we explain and justify in this paper, demonstrates that this is the case.

Dembski and Marks hold that the ultimate source of active information, which increases the probability that evolutionary search achieves a purpose, is supernatural intelligence. However, Ewert tells us that “active information,” regarded as bias instead of information, is not necessarily due to design.

The conservation of information does not imply a designer. It is not a fine-tuning argument. It is not our intent to argue that all active information derives from an intelligent source. To do any of those things, we’d have to introduce metaphysical assumptions that our critics would be unlikely to accept. Conservation of information shows only that whatever success evolutionary processes might have, it is due either to the original configuration of the universe or to design.

This reversal is not due to Ewert. He’s obviously adapting arguments in Being as Communion, though without citing a source.

Notes

1. Felsenstein and I give the bound for just the fittest of all genotypes. I’ve extended it to the set of all local maxima of the fitness landscape. We classify haploid organisms into genotypes according to their DNA bases in L positions of the genome. The neighbors of a genotype are the 3L genotypes that differ from it in exactly one of the L positions. We require that the fittest genotype of each neighborhood of K = 3L + 1 genotypes be unique. It follows immediately that at most one genotype per neighborhood is a local maximum of the fitness landscape, and that the ratio of the total number of genotypes to the number of local maxima is at least K. Evolution begins with a random genotype, proceeds along the path of steepest ascent on the landscape, and ends at a local maximum. The minimum degree of constraint on the final genotype in the process is K. This is also the minimum “active information” with respect to (targeting) the set of local maxima. That is, the probability is q = 1 that the final genotype is a local maximum. The uniform probability of the set of local maxima is p ≤ 1/K. Finally, the active information, without conversion to a log scale, is q / p ≥ K.

2. Although the term bias is technically acceptable — indeed, I have used it, and will continue to use it in contexts where constraint is inappropriate — Ewert earns scorn by abusing it in the most predictable of ways. The problem with referring to the bias of a natural process is that the general reader gets the idea that the process “naturally” ought to have behaved in some other way, and deviates only because something biased it. And thus the Designer enters through the back door, not by evidence or reason, but instead by rhetorical device. Usually, the meaning of bias is only that some of the possible outcomes of a process have different chances of occurring than do others. If this were always the case, then I would refer instead to the non-uniformity of the probability distribution on outcomes. By the way, I am not conflating all probabilities in scientific models with physical chances, as Dembski et al. generally do. Much of what is modeled as random in biological evolution is merely uncertain, not attributed to quantum chance. The vitally important topic of interpretations of probability, which Dembski deflects with a false analogy to interpretations of quantum mechanics (Being as Communion, p. 157), will have to wait for another post.

3. CoI applies more generally to events, meaning sets of possible outcomes. But that’s irrelevant to the logic, or lack thereof. For readers familiar with Dembski’s measure of specified complexity, I should mention that the measure of active information permits us to target any event whatsoever. There is no requirement that the event have a detachable specification. Dembski’s arguments to the effect that an event with a detachable specification might as well have been prespecified are irrelevant here.

4. What it means to investigate the most probable outcome of a process observed in nature is highly problematic. In particular, we generally cannot say anything sensible about the chances of possible outcomes of a one-shot process. Complex processes that have occurred once, and cannot be repeated, are what commonly interest evolutionary biologists. I should make it clear that I don’t agree with Dembski et al. that evolutionary biologists should make claims about the chances of this, that, and the other. I’m essentially playing along, to show that their math is not even applicable.
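The quantities named in Note 1 can be computed directly for a small landscape. The sketch below is an illustration under assumptions of mine: fitness is a random ranking of the genotypes (so that all fitness values are distinct), L is kept small enough to enumerate every genotype, and all function names are hypothetical. It computes these quantities for one randomly drawn landscape, and does not by itself establish the bound.

```python
import itertools
import math
import random

BASES = "ACGT"


def neighbors(genotype):
    """Yield the 3L genotypes that differ from genotype in exactly one position."""
    for i, base in enumerate(genotype):
        for b in BASES:
            if b != base:
                yield genotype[:i] + b + genotype[i + 1:]


def random_landscape(L, seed=0):
    """Assign a distinct fitness (a random rank) to each of the 4**L genotypes."""
    genotypes = ["".join(g) for g in itertools.product(BASES, repeat=L)]
    ranks = list(range(len(genotypes)))
    random.Random(seed).shuffle(ranks)
    return dict(zip(genotypes, ranks))


def steepest_ascent(start, fitness):
    """Replace the genotype by its fittest neighbor until no neighbor is fitter."""
    current = start
    while True:
        best = max(neighbors(current), key=fitness.__getitem__)
        if fitness[best] <= fitness[current]:
            return current  # fitter than every neighbor: a local maximum
        current = best


def summarize(L, seed=0):
    """Compute p, q, and q/p with respect to the set of local maxima."""
    fitness = random_landscape(L, seed)
    total = len(fitness)
    maxima = {
        g for g in fitness if all(fitness[g] > fitness[n] for n in neighbors(g))
    }
    # q: chance that evolution from a uniformly random start ends in the target set.
    endpoints = [steepest_ascent(g, fitness) for g in fitness]
    q = sum(e in maxima for e in endpoints) / total
    # p: chance that a single uniformly random genotype lies in the target set.
    p = len(maxima) / total
    return {
        "K": 3 * L + 1,
        "genotypes": total,
        "maxima": len(maxima),
        "p": p,
        "q": q,
        "ratio": q / p,
        "bits": math.log2(q / p),
    }


if __name__ == "__main__":
    print(summarize(L=4))
```

The target set here is defined entirely by the modeled process, as the set of genotypes at which steepest ascent can stop. Nothing outside the model picks it.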

228 thoughts on “The Law of Conservation of Information is defunct”

  1. Mung:
    Since there is no such thing as genetic information, who cares?

    You keep making cryptic comments in this vein, Mung. I don’t know why. Could uncloak and say what you actually mean?

  2. Elizabeth: So “Active Information” turns out to be the information that what might have happened did happen?

    LOL. How about “Stuff like that really does just happen”?

    You’re an expert, literally, on the kind of problems I have with attention. So you’ll understand that I have to try to focus on completing the post for now.

  3. Elizabeth: You keep making cryptic comments in this vein, Mung. I don’t know why. Could uncloak and say what you actually mean?

    I think Mung might be referring to something I said.

    I said I don’t think you can quantify biological information, and I said that because I don’t think you can tell (except through selection) whether a sequence is functional or anti-functional. And if flipping a bit provides a selection advantage in a specific context, it might be neutral or detrimental in another context.
    I just don’t think the analogy with data length and data storage works with bio-chemistry. Or at least the extent to which it works is very limited.

  4. GlenDavidson:
    Only one problem, to say that COI is “defunct” suggests that it was once at least somewhat functional.

    Belly laugh! I wish you’d say more. I always pay close attention to your remarks.

  5. There is a law that says “If you don’t want your findings disputed, write it long enough that no one would bother.”

  6. phoodoo: And be exactly right, it turns out!

    I’m betting that happens a lot in your life. You are exactly right about everything.

  7. phoodoo:
    There is a law that says “If you don’t want your findings disputed, write it long enough that no one would bother.”

    Most bibles are hundreds of pages long. So is Stephen Meyer’s book “Darwin’s Doubt”. Does your “law” apply there too?

  8. Adding my thanks to Tom for the huge effort he has expended in producing this post – it’s going to take a while for me to absorb.

  9. Goodness me, following Tom’s link to Panda’s Thumb I find an article by Professor Joe Felsenstein which reads like a less mathematically challenging introduction to Tom’s article. Worth a read for those of us who are a bit mathematically challenged. 😉

  10. Rumraket: Most bibles are hundreds of pages long. So is Stephen Meyer’s book “Darwin’s Doubt”. Does your “law” apply there too?

    “Twenty months out since the release of Darwin’s Doubt, the book continues to rank as a #1 bestseller in Paleontology at Amazon.com! …”

    I guess not.

  11. phoodoo: “Twenty months out since the release of Darwin’s Doubt, the book continues to rank as a #1 bestseller in Paleontology at Amazon.com! …”

    I guess not.

    Was that meant as a serious response to my question? Because I don’t see how the apparent selling power of the book impacts the question of its length being aimed at avoiding rebuttal. Could you clarify?

  12. Actually number 6, but phoodoo can be excused for slinging a bit of phoodoo.

    Oddly, it’s also number 6 in Creationism, behind Bill Nye’s Undeniable.

  13. I have yet another opportunity to make myself yet more odious to the ID community by suggesting information theory should be dropped or at least de-emphasized as a means of making the design argument,

    CSI-free Explanatory Filter…

    Simple is better.

    Example: 5 dominos standing on their edge rather than laying flat on a table.

    It’s at least suggestive of design on an intuitive level without appeal to the notion of information. That works for me. All the fancy math in these discussions muddles the issues, it doesn’t clarify them.

    Whichever side is right, I don’t care at this point. This is going down paths that don’t make the issues clearer.

    The OOL problem is about whether the first life is from a process typical or exceptional.

    The evolution of complexity after OOL is about whether the Rube Goldberg modifications are from events that are typical or exceptional.

    Those are the real issues, imho.

  14. stcordova: Those are the real issues, imho.

    Oddly enough there are scientists who take OOL seriously. Why does ID treat it like a fairy tale?

  15. petrushka,

    You mean it’s number six now? Still number six? How long has it been out? Wow, thanks for the perspective. That’s a hit!

    It was number one 20 months after its release!

  16. What’s the correlation between book length and sales and what does that tell you? And what does that mean for your original point, Phoodoo?

  17. phoodoo:
    petrushka,
    You mean it’s number six now? Still number six? How long has it been out? Wow, thanks for the perspective. That’s a hit!
    It was number one 20 months after its release!

    Darwin’s Doubt is two years old.

    For some perspective, Origin of Species is number 4 in Evolution after 155 years. Dawkins’ Selfish Gene is number 7 in Evolution after 30 years.

    Darwin’s Doubt is number 35 in Evolution.

  18. Oddly enough there are scientists who take OOL seriously. Why does ID treat it like a fairy tale?

    ID doesn’t treat it like a fairy tale, OOL would have to be an exceptional event by most standards of science. Or do you think complex machines like life arising by themselves are as typical as 100 dominos standing on their edge?

    The OP was about Conservation of Information (CoI). From my vantage point, I don’t make the CoI argument, and I suggest ID proponents trying to teach High Schoolers and College Freshmen in whatever venue (like the kids I teach in church), not to use CoI or the 2nd law of Thermodynamics.

    I teach ID to kids as young as six years old. After a two hour session, even the six year old can say, “law of large numbers” and he can comprehend 20 fair coins won’t all be 100% heads if he pours them out of a cup onto a table.

    I’m confident also he’ll never have to defend the law of large numbers since it is well accepted in science and math. I wouldn’t want him to try to defend CoI because I sure as heck wouldn’t know where to start and I have modest math skills, at least more than the average person who sits in my ID talks.

    The approach using the notion of expected outcomes or expected evolution of a system (like dropping dominos on a table), are good enough to get the notion across of what are typical vs. exceptional events given assumed boundary conditions.

  19. stcordova: Or do you think complex machines like life arising by themselves are as typical as 100 dominos standing on their edge?

    Atoms and molecules are not dominoes and do not have the same dynamics as dominoes, as you would know if you had devoted as much effort to understanding chemistry as you did thermodynamics.

    Until you make that effort there is no point in talking to you.

  20. Guys, this off-topic banter does something of a disservice to Dr. English’s efforts in offering a rebuttal to DEM that considers and criticises their ideas in a scholarly manner.

    I’m particularly disappointed that ID proponents here don’t feel inclined to respond to the substance. Perhaps the problem here is maybe there is no substantive response that can be made!

  21. Alan Fox: I’m particularly disappointed that ID proponents here don’t feel inclined to respond to the substance. Perhaps the problem here is maybe there is no substantive response that can be made!

    Ignorant twaddle is a response, and it reveals much. There will be no substantive response.

  22. Sal,

    Why not author a thread on OOL? Dr English is criticising DEM and their claims about evolutionary models. The thread is not about OOL

  23. stcordova,

    I teach ID to kids as young as six years old. After a two hour session, even the six year old can say, “law of large numbers” and he can comprehend 20 fair coins won’t all be 100% heads if he pours them out of a cup onto a table.

    Do you then explain that this has nothing whatsoever to do with modern evolutionary theory or do you just continue to spread the lies and deliberate misconceptions of intelligent design creationism?

  24. Alan,

    Will do. I’ll give it a few days (or weeks) so that Dr. English and Dr. Felsenstein can have the floor. Good suggestion.

    Regarding your point about substance. I thought I pretty much stated my case, ID proponents should drop information arguments altogether or at least de-emphasize them. I wouldn’t want to parry with Dr. English on the issue.

    My silence on the matter is because I have no substantive rebuttal to what he said.

    Sal

  25. Patrick,

    Ask ten different scientists what the modern evolutionary theory is, and you will get ten different answers.

    That is why it is such a vacuous theory.

  26. phoodoo:
    Patrick,

    Ask ten different scientists what the modern evolutionary theory is, and you will get ten different answers.

    That is why it is such a vacuous theory.

    That’s a bit like saying: “ask 10 different doctors what the modern medical theory is and you’ll get 10 different answers”.

    That doesn’t make modern medical theory “vacuous”. It makes it full of detail and continuously growing.

  27. stcordova: I teach ID to kids as young as six years old. After a two hour session, even the six year old can say, “law of large numbers” and he can comprehend 20 fair coins won’t all be 100% heads if he pours them out of a cup onto a table.

    This strikes me as very odd, Sal. Aside from the fact that it wouldn’t take most six year olds 2 hours to learn to say “law of large numbers” – and presuming that the six year old understood it (and I’ve done the same) – what has it got to do with ID?

  28. From the OP:

    It depends on an assumption that does not hold in scientific study of evolution.

    You seem to have been arguing that it does not hold for any of the sciences, period.

    It depends on an assumption that does not hold in scientific study of evolution.

    Is that because evolutionary theory is not predictive? What about population genetics?

  29. Mung: You seem to have been arguing that it does not hold for any of the sciences, period.

    Is this addressed to me? If so, could you explain what it refers to?

  30. phoodoo:
    petrushka,
    The question should be, WHY do atoms have dynamics, if no one gave them that?

    Why is there air?

    Bread goes down; toast comes up. You can’t explain that.

  31. If addressed to Tom, the context is this:

    The quantity grows without bound as the number of possible outcomes of evolution increases. The Conservation of Information Theorem of Dembski et al. tells us that the greater the bias in favor of an outcome specified in advance, the less likely the bias is to have arisen by breaking a stick, no matter how many the possible outcomes of the evolutionary process. It depends on an assumption that does not hold in scientific study of evolution.

    I’m not quite sure what Tom means here, but I think he’s saying that Dembski’s law depends on the assumption that the outcome has been specified in advance.

    Perhaps Tom could clarify.

  32. Mung: Is that because evolutionary theory is not predictive? What about population genetics?

    What do you think population genetics predicts?

  33. Folks, I’m just up from a 12-hour nap, and perhaps will do better today at finishing the incomplete section (which is commentary, not the heart of the matter).

    The post is long for two reasons. First, when you make the tightly coupled claims that Dembski and Marks replaced one LCI with another in 2008, and that LCI Version 2 is a misinterpretation of its supposed mathematical foundation (CoI), and that Dembski et al. have silently abandoned that misinterpretation, and that their new interpretation is wrong, you’ve got a lot of explaining to do. Second, you have to assume that most people understand none of what’s involved, and lay out the basic concepts that give them a fighting chance at grasping some of the substance.

    What you see here is a small fraction of what I have written. Some of what I wrote, I discarded because it took as correct the incorrect interpretations of probability by Dembski et al. Most of it, I put aside because it would make sense to few readers. Homing in on a relatively small subset of claims that are undeniably correct, and that might be expressed clearly (not to say that I have succeeded in doing so), is the hardest thing I’ve ever done. The task was perhaps not inherently hard, but it was incredibly hard for me, given the way my brain works (and does not work).

    Over the past six months, my thinking about how to reach the general reader has changed enormously. And I want to hear from you how I might have done better. One thing I believe is that making material accessible is not the same thing as making it simple. No matter what Dembski says, CoI is not simple. Only a “trust me” apologist would claim otherwise. What I’ve posted here is largely tutorial in character, and you have to work as a student would at making sense of it. If it actually were a tutorial, however, I’d have made it much longer than I did.

    Now, it is a given that I have not done the best I might have at explaining CoI. Thus there are no dumb questions. So please post questions as they come to mind. You’ll get interaction from me that is just as honest and open as this comment, not pontifical tripe like “Dr. Ewert Answers.” My goal is that we should learn from one another. I have never claimed to give the final answer on anything.

    I have to close this tab now, and try to focus on completing the post. Thanks in advance for your patience.
