Algorithmic Specified Complexity and the Game of Life

Ewert, Dembski, and Marks have a forthcoming paper: “Algorithmic Specified Complexity and the Game of Life” – It appears to be behind paywalls though. Can anyone provide a copy?

Please note: any comments not directly addressing the math or mechanics of this post will be moved to Guano (thanks Neil and Alan).

My earlier post:

1. In Conway’s Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life
2. There is the Glider-Producing Switch Engine http://conwaylife.com/wiki/Glider-producing_switch_engine
3. It is coded by 123 “On Cells” but requires a space of 67×60 in a specific configuration.
4. That’s 4,020 bits, which exceeds the Universal Probability Bound (UPB).
5. It contains well-matched parts: 4 blinkers, 3 blocks, 2 beehives, 1 boat, 1 loaf, 1 ship, 1 glider: http://wwwhomes.uni-bielefeld.de/achim/moving.html
6. It occurs naturally out of randomly configured dust: http://wwwhomes.uni-bielefeld.de/achim/moving.html
7. It can evolve from a much smaller entity (“time bomb” – 17 active cells): http://conwaylife.appspot.com/pattern/timebomb

Possible criticisms:

Information is hidden somewhere
This is under “standard” Life rules (B3/S23), which means there is precious little exogenous information; the full rule set is below, with a minimal implementation sketched just after it:

1. Any live cell with fewer than two live neighbours dies, as if caused by under-population.
2. Any live cell with two or three live neighbours lives on to the next generation.
3. Any live cell with more than three live neighbours dies, as if by overcrowding.
4. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

(Source: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life#Rules)
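
As a sanity check that nothing is hidden in the rules, the entire update fits in a few lines. A minimal sketch (the set-of-live-cells representation is just one convenient choice; any B3/S23 implementation will do):

```python
from collections import Counter

def life_step(live):
    """Advance a Life pattern one generation under B3/S23. `live` is a set of (x, y) cells."""
    # Count live neighbours for every cell adjacent to at least one live cell.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}

# Example: a blinker returns to itself after two generations.
blinker = {(0, 1), (1, 1), (2, 1)}
assert life_step(life_step(blinker)) == blinker
```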

These are not self-replicating
Self-replication is not actually a requirement of Specified Complexity, and the glider-producing switch engine does send some of its parts (gliders) off into the Life universe.

Also interesting – some musings on how big a life universe might have to be to support self-replicating life: http://ieet.org/index.php/IEET/more/prisco20140915

106 thoughts on “Algorithmic Specified Complexity and the Game of Life”

  1. Tom,

    Let’s write c(X) for EDM’s description of object X. They use the description length ℓ(c(X)) in place of K(X|C) in calculating OASC. They are ignoring an additive constant corresponding to the length of the program that contains c(X) and arranges for program C to run on c(X) as input.

    That’s why they should have used unconditional KC, as I argue above, even if you “allow” the direct execution of input by the machine.

    If you use unconditional KC with a machine that directly executes their description language, the length of the description becomes a valid lower bound on the KC.

    Of course they still run into this issue, but in regards to the choice of machine rather than the choice of C.

  2. Richardthughes, the second link’s for you.

    keiths:

    9. EDM look at only the ten most common ash objects from Flammenkamp’s list of 100. Anyone want to guess why?

    Top 100 of Game-of-Life Ash Objects

    Spontaneous appeared Spaceships out of Random Dust

    I was surprised by the CP-pulsar and the pentadecathlon when I first saw them, back in September. Looking more carefully now, I notice that the lower probability patterns generally seem less orderly than the higher probability ones.

  3. Tom,

    In response to your comment from a couple of days ago:

    Dembski has in fact subtracted apple bits from orange bits in some instantiations of specified complexity. I raised this issue in my first (bleary-eyed) response to ASC, back in the summer. After sleeping on it, I realized that K(X|C) corresponds to an algorithmic probability distribution…

    Ah, I see what you mean. So if you interpret K(X|C) as an algorithmic probability, then at least both terms in the ASC equation correspond to probabilities. I’m still doubtful that they’re commensurable, though.

    ASC is a log-ratio of unnormalized probabilities. The normalizing constants are additive under log transformation. I don’t know what to make of the exponentiation implied by your multiplicative constants.

    My point there was simply that if the two quantities are incommensurable, then there’s nothing special about subtracting one from the other. We could just as easily subtract a scaled version of one from a (differently) scaled version of the other, or use some other function entirely.

    To me it seems that EDM use the subtraction simply to capture the idea “lots of complexity (i.e. improbability) along with specification (low conditional KC) means high ASC -> design.” Subtracting a scaled KC from a scaled I(X) would seem to accomplish the same thing. Is there a motivation I’m overlooking for setting the scaling constants to 1 and just taking the difference between I(X) and K(X|C), which as you point out is a log ratio of unnormalized probabilities?

    In other words, do you see the log ratio as having special significance in capturing “meaningfulness”, versus other possible functions?
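
    For reference, the identity I have in mind is nothing beyond the definitions, just rearranged:

    $$\mathrm{ASC}(X) \;=\; I(X) - K(X\mid C) \;=\; -\log_2 P(X) - K(X\mid C) \;=\; \log_2 \frac{2^{-K(X\mid C)}}{P(X)},$$

    i.e. ASC is the log of how much more weight the (unnormalized) description-based quantity 2^(-K(X|C)) gives X than the chance distribution P(X) does.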

  4. Tom,

    I was surprised by the CP-pulsar and the pentadecathlon when I first saw them, back in September.

    Yes, the CP-pulsar caught my eye, too. I’m in the midst of an analysis of it.

  5. keiths:
    Tom,

    Yes, the CP-pulsar caught my eye, too. I’m in the midst of an analysis of it.

    Outside of the math: incredible symmetry, and with it an associated beauty. I can’t find any ancestors that evolve into it; there is the “Pre-pulsar”, but that is a different entity entirely.

  6. keiths:

    If you use unconditional KC with a machine that directly executes their description language, the length of the description becomes a valid lower bound on the KC.

    I get you now. The problem is that their description language cannot express all computable functions. It’s not Turing-complete, as Section III-D points out. There’s an easy fix, but I think it’s a cheat. You can get a new prefix-free set of programs by prepending 0 to all of the existing programs, prepending 1 to all of EDM’s descriptions, and taking the union of the set of modified programs and the set of modified descriptions. If the old computer is U and the new computer is U’, then U'(0x, y) = U(x, y) and U'(1x, y) = U(C, x).
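
    A toy rendering of that construction, just to make the dispatch concrete (run_U and C_PROGRAM below are stand-ins for the universal machine and for EDM’s description interpreter, neither of which I am supplying):

```python
def run_U(program: str, data: str) -> str:
    """Stand-in for the universal prefix machine U(program, data)."""
    raise NotImplementedError("hypothetical: a real universal machine goes here")

C_PROGRAM = "..."  # stand-in for the program C that interprets EDM's description language

def run_U_prime(program: str, data: str) -> str:
    """The modified machine:  U'(0x, y) = U(x, y)  and  U'(1x, y) = U(C, x).

    Prepending one flag bit keeps ordinary programs and EDM descriptions
    prefix-free with respect to one another, so U' is a prefix machine if U is,
    at a cost of one extra bit on every description length."""
    flag, rest = program[0], program[1:]
    if flag == "0":
        return run_U(rest, data)      # ordinary programs, unchanged apart from the flag bit
    return run_U(C_PROGRAM, rest)     # EDM descriptions: hand the description to C
```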

    Dazed and confused… I need to take a long break. I’ll be back. (No tornadoes likely. I’m just a few miles away from Moore, OK.)

  7. Next flaw:

    11. With no justification, EDM introduce a “compressed encoding” that alters the probability distribution established by their “standard encoding”.

    There are two main encoding schemes in the Life paper. One describes patterns directly by specifying their configuration. This scheme is independent of function, and serves as the basis for the I(X) calculation. The other scheme describes the patterns indirectly, by specifying their function, and serves as the basis for the K(X|C) calculation (or technically, the K(X|C) upper bound).

    Of the first scheme, EDM write:

    We will model the patterns as being generated by a random sequence of bits.

    By this, they mean that each cell within the bounding box has a 50% chance of being initialized as alive.

    The encoding they use is straightforward, and is known as the “standard encoding”. It specifies a) the width of the bounding box, b) the height of the bounding box, and c) the alive/dead state of each cell, with one bit per cell.

    Let’s assume that this encoding “works” — that is, that it accurately models the desired probability distribution (I don’t think it does, but I want to ponder that some more before writing about it). If so, then EDM should use this encoding to determine all of their I(X) values.

    Inexplicably, they don’t do this. Instead, in section III.B, they write:

    However, in many cases patterns consist of mostly dead cells. A lot of bits are spent indicating that a cell is empty. We can improve this situation by recording the number of live bits and then we can encode the actual pattern using less bits

    [Equation appears here]
    
    where pa is the number of alive cells. We will call this compressed encoding.

    This makes no sense to me. Why try to reduce the number of bits here? For the functional encoding, it would make sense — after all, that encoding is supposed to be as compressive as possible in order to approximate the true Kolmogorov complexity.

    But you don’t want this encoding to be compressive, because it is supposed to represent the true probability of the object arising randomly. By using a compressive encoding, EDM overstate the probability of the pattern (and understate its “complexity”), thereby messing up their OASC numbers.
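
    To see the size of the effect, here is a quick comparison. EDM’s compressed-encoding equation is omitted above, so the count-then-positions scheme below is only a stand-in for it (any scheme of that general shape behaves the same way); the point is how many fewer bits, and hence how much more implied probability, a sparse pattern gets.

```python
from math import comb, log2

def standard_cell_bits(width, height):
    """Cell-state portion of the 'standard' encoding: one bit per cell."""
    return width * height

def compressed_cell_bits(width, height, alive):
    """Cell-state portion of a count-then-positions scheme: once the number of live
    cells is known, about log2(C(width*height, alive)) bits say which cells they are.
    (A generic stand-in for EDM's omitted equation, not their exact formula.)"""
    return log2(comb(width * height, alive))

# An illustrative sparse pattern: 20 live cells in a 20x20 box.
w, h, alive = 20, 20, 20
print(standard_cell_bits(w, h))                  # 400 bits: implied P(X) = 2^-400
print(round(compressed_cell_bits(w, h, alive)))  # about 111 bits: implied P(X) near 2^-111
```

    The “every cell independently alive with probability 1/2” model assigns the 20×20 pattern a probability of exactly 2^-400; the shorter encoding corresponds to a different, far more generous distribution.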

  8. Flaw #12:

    12. EDM unnecessarily include width and height information in their “standard encoding”, which distorts both the I(X) and OASC(X) values.

    [Edited to correct the title: “distorts both the I(X) and OASC(X) values” used to be “distorts both the I(X) and K(X|C) values”.]

    Note: width and height are needed in the “compressed encoding”, but I argue in #11 above that the compressed encoding should not be used.

    I(X) is defined as -log2 P(X), where P(X) is the probability of the object X under the hypothesized distribution. An object is just a rectangular group of cells, some alive and some dead. If each cell’s probability of being alive is 50%, then the probability of getting any particular object is 1/2^n, where n is the total number of cells contained in the rectangular group.

    EDM don’t actually compute their I(X) values the correct way, however. Instead of setting P(X) equal to 1/2^n (which would give I(X) = n), they set it equal to 1/2^ℓ (giving I(X) = ℓ), where ℓ is the length of the entire standard encoding for that object. The standard encoding includes not only the n bits representing the cells, but also encodings of the width and height of the object.

    This is unnecessary and in fact detrimental. Width and height are not needed in the standard encoding, because the standard encoding does not get fed into the machine. The standard encoding is what comes out of the machine when the correct functional description is fed in.

  9. keiths,

    The description language allows for specification of arbitrary patterns. As I recall, EDM don’t say explicitly how patterns are encoded within descriptions. My guess is that they developed the pattern encoding for use in descriptions, and mindlessly reused it to calculate complexity.

    The encoding of arrays of bits is multifariously silly. The simplest, most superficial observation is that Levenshtein coding of strictly positive integers is inefficient.
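
    To make the inefficiency concrete, here is the Levenshtein code and its lengths for small integers (which is all a width or height field ever holds), plus what such a header does to a small pattern, assuming the standard encoding is Levenshtein-coded width and height followed by one bit per cell (my reading of their layout):

```python
def levenshtein_code(n: int) -> str:
    """Levenshtein universal code for a non-negative integer n."""
    if n == 0:
        return "0"
    code, ones = "", 1
    while True:
        tail = bin(n)[3:]      # binary representation of n with its leading '1' removed
        code = tail + code     # the code is assembled back-to-front
        if not tail:
            break
        n, ones = len(tail), ones + 1
    return "1" * ones + "0" + code

for n in range(1, 9):
    c = levenshtein_code(n)
    print(n, c, len(c))
# 1 -> '10' (2 bits), 2 -> '1100' (4), 3 -> '1101' (4), 4 -> '1110000' (7), ...

# Effect on a 3x3 glider: the cell data costs 9 bits, but the width/height header
# alone adds 8 more, nearly doubling the encoded length and hence the I(X) that
# flaw #12 above says EDM read off it.
cell_bits = 3 * 3
header_bits = 2 * len(levenshtein_code(3))
print(cell_bits, header_bits)   # 9, 8
```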

  10. Richardthughes: Outside of the math, incredible symmetry and with it associated beauty. I can’t find any ancestors that evolve into it; there is “Pre-pulsar” but that is a different entity entirely.

    Whoops – here’s a precursor:

  11. Tom,

    My guess is that they developed the pattern encoding for use in descriptions, and mindlessly reused it to calculate complexity.

    That’s my guess, too.

    The simplest, most superficial observation is that Levenshtein coding of strictly positive integers is inefficient.

    Right. So to summarize:

    1. The width and height information is unnecessary.
    2. The width and height information is harmful and distorts the I(X) values.
    3. The Levenshtein coding makes the problem worse.
    4. The Levenshtein coding is unnecessary because the standard encoding is never used as input to the machine.

    In other words, they’ve unnecessarily and harmfully used an inefficient code to represent information that is already unnecessary and harmful.

    These guys are really something, aren’t they?

  12. Hey Richard,

    Good job finding that CP-pulsar precursor!

    I can use that in my analysis.

  13. On to the next item on the slop list:

    13. EDM misinterpret OASC as a probability.

    Of the Gosper glider gun, EDM write:

    The complexity is 196 bits. This gives us 196−111 = 85 bits of OASC. At a probability of 2^−85, we conclude the Gosper gun is unlikely to be produced by a random configuration.

    This is wrong. It is I(X), not OASC, that represents the probability of X.

  14. keiths,

    This is wrong. It is I(X), not OASC, that represents the probability of X.

    I’m not so sure. For me the central issue in all this is whether genotypes that score high on ASC can be reached by natural selection. And there the EDM paper(s) give no hint. I think that there is some implicit — very implicit — thought that it is hard or improbable to get genotypes that have large amounts of ASC. There is certainly no even-semi-coherent argument to this effect.

    An argument like this seems to be implied:

    If we have binary strings of length 1000, each has probability 2^(-1000) if we use coin tosses to construct the strings. (Ignoring all the various constants) there are 2^100 strings that can be output by programs that are of length 100 bits. These strings will have ASC of 900 bits. If we toss coins and make strings, the chance of making one that has an ASC value of 900 bits is then 2^100/2^1000 = 2^(-900), which is 2^(-ASC).
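
    Written out a little more carefully (and still ignoring the constants), the counting step is just

    $$\#\{X : K(X\mid C) \le k\} \;<\; 2^{k+1}, \qquad\text{so}\qquad \Pr\big[\mathrm{ASC}(X) \ge 1000-k\big] \;=\; \Pr\big[K(X\mid C) \le k\big] \;\le\; \frac{2^{k+1}}{2^{1000}},$$

    which for k = 100 is the 2^(-900) above, up to a factor of 2.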

    Of course tossing coins is more like pure mutation, and it does not indicate that the strings are hard to produce by natural selection acting on mutations.

    Am I way off-base here?

  15. Joe,

    First a brief answer, then a longer explanation in a later comment.

    The brief answer:

    EDM are trying to answer the question “What is the probability of getting a Gosper glider gun?”

    You are trying to answer a different question, which is “What is the probability of getting a pattern with the same size and K(X|C) as the Gosper glider gun?”

    Or to put it in your coin-tossing context, EDM are trying to answer the question “What is the probability of getting sequence X?”

    You are asking, “What is the probability of getting a sequence with the same length and K(X|C) as X?”

  16. keiths:
    On to the next item on the slop list:

    13. EDM misinterpret OASC as a probability.

    Of the Gosper glider gun, EDM write:

    This is wrong. It is I(X), not OASC, that represents the probability of X.

    I think they’re alluding to Equation 4, which is correct, though derived incorrectly (k is not a constant). It may make more sense if you see how I reformulated and proved Equation 2. There’s a much simpler way to do it, though.

  17. Sorry, Tom — I think I misled you with a poorly worded title.

    Instead of this:

    13. EDM misinterpret OASC as a probability.

    I should have written this:

    13. EDM misinterpret OASC(X) as the probability of X.

    My description is still correct, however:

    Of the Gosper glider gun, EDM write:

    The complexity is 196 bits. This gives us 196−111 = 85 bits of OASC. At a probability of 2^−85, we conclude the Gosper gun is unlikely to be produced by a random configuration.

    This is wrong. It is I(X), not OASC, that represents the probability of X.

  18. 14. EDM make design inferences without stating or justifying an ASC threshold.

    In the abstract, EDM claim:

    To illustrate, we apply ASC to Conway’s Game of Life to differentiate patterns designed by programmers from those originating by chance.

    Nowhere do they state an ASC threshold, but let’s assume there is one and see what we can infer about it by observing when they do, and when they don’t, make design inferences. In a later comment, I’ll discuss how one might go about establishing such a threshold.

    In section IV-A (Oscillators), they make no design inferences. The largest OASC in table III is for the 29p9 oscillator, with 68.96 bits of OASC, so we can infer that their threshold is greater than that.

    In section IV-B (Spaceships), they write:

    The remaining diagonal spaceships exhibit a large amount of ASC, fitting the fact that they are all complex designs.

    The smallest OASC among those remaining spaceships is 221, so we can infer that their threshold is less than or equal to 221. We now know that the threshold is between 68.96 and 221.

    Regarding the orthogonal spaceships, they write:

    The simplest orthogonal spaceship, the lightweight spaceship, has negative bits of ASC. This matches the observation that these spaceships do arise out of random configurations. The remaining spaceships exhibit significant amounts of ASC, although not as much as the diagonal spaceships, and are not reported to have been observed arising at random.

    Here they seem to be hinting that the remaining spaceships are designed, without being willing to claim it explicitly. And since the smallest OASC values among the remaining spaceships are 39, 71, and 96, they are probably wise not to claim design, since they weren’t willing to do so for an oscillator with 68.96 bits of OASC.

    In section IV-C (Guns), they write:

    This gives us 196 – 111 = 85 bits of OASC. At a probability of 2^−85, we conclude the Gosper gun is unlikely to be produced by a random configuration.

    That’s a design inference, so we now know that their threshold is between 68.96 and 85 bits of OASC.

    In section IV-D (Eaters) and section IV-E (Ash Objects) they make no design inferences.

    All of this shows that:

    a) the paper is woefully incomplete, because it says nothing about how EDM decide, based on an ASC value, whether or not to make a design inference;

    b) EDM don’t state a threshold or show how to establish one;

    c) if the design inference isn’t based on a simple threshold comparison, they don’t show us what it is based on, and how to do it; and

    d) if it is based on a simple threshold comparison, then we can infer that the threshold is between 68.96 and 85 bits of ASC.

  19. 15. EDM’s P(X) is far more restrictive than they intend because it implicitly assumes that the Life universe is no bigger than the bounding box of X. This error artificially inflates their OASC values.

    In earlier comments, I’ve described how EDM’s P(X) values are distorted by a) the unnecessary inclusion of bounding box dimensions in their “standard encoding”, and b) the completely unwarranted use of the “compressed encoding”.

    They have made yet another mistake that causes them to massively understate P(X), leading to artificially high OASC values.

    The problem is that their standard encoding measures the probability of a particular pattern arising in a single bounding box. The size of the Life universe is much greater.

    In fact, Life is defined as occupying an infinite two-dimensional grid, which has the amusing consequence of making P(X) = 1 for any finite X! If you randomly initialize an infinite grid, every finite pattern will appear somewhere with probability 1.

    If P(X) = 1, then I(X) = 0.

    ASC(X) = I(X) – K(X|C).

    Therefore, by EDM’s logic, ASC(X) is negative for all X, and the design inference is never warranted!

    How would we go about correcting their mistake? Well, it’s clear that an infinite universe is too large, so we could restrict ourselves to a finite grid, with suitable rule modifications for handling behavior at the edges. And clearly the grid needs to be larger than the pattern’s bounding box, or P(X) will be artificially low.

    What is the “right” size? There is no definite answer. To make things worse, the probability is a function of both the pattern and the size of the universe. That is, for a given X, P(X) is higher in larger universes, meaning that ASC gets lower as the size of the universe increases.

    Thus, the ASC threshold beyond which design is inferred depends on the size of the universe. EDM have utterly failed to address this.

    I wrote earlier that the paper was “an abysmal mess”. I hope that by now, readers can see that I wasn’t exaggerating.

    This paper should never have been published.

  20. 16a. EDM’s neglect of the “physics” of the Life universe causes massive errors in their OASC values that lead to false design inferences.

    EDM’s crazy assumption has already been mentioned a few times in this thread:

    In the game, determining the probability of a pattern arising from a random configuration of cells is difficult. The complex interactions of patterns arising from such a random configuration makes it difficult to predict what types of patterns will eventually arise. It would be straightforward to calculate the probability of a pattern arising directly from some sort of random pattern generator. However, once the Game of Life rules are applied, determining what patterns would arise from the initial random pattern is nontrivial. In order to approximate the probabilities, we will assume that the probability of a pattern arising is about the same whether or not the rules of the Game of Life are applied, i.e., the rules of the Game of Life do not make interesting patterns much more probable than they would otherwise be.

    [Emphasis added]

    To demonstrate just how crazy that assumption is, let’s look at a couple of objects. Richardthughes tracked down simple precursors for two large objects, the CP-pulsar, which I’ll discuss here, and the glider-producing switch engine, which I’ll discuss in my next comment.

    The CP-pulsar is item #19 on this page, and its precursor is shown in this video at time 5:57.

    The Game of Life is deterministic, which means that a specific pattern X0 always leads to X1, which always leads to X2, and so on. In other words, once a pattern occurs, all of its descendants are inevitable.

    To determine the true value of P(X), one must therefore account not only for the probability of X arising randomly during initialization, but also the probability of all of its ancestors. EDM don’t do this because they can’t. Even in a simplified world such as the Life universe, the true probabilities are horrendously difficult to calculate, because determining all of the ancestors of an arbitrary given pattern is hopeless. Hence EDM’s crazy assumption.

    How much damage does this faulty assumption do? Consider that the simple precursor of the CP-pulsar fits into a 5×5 box. That means that in EDM’s standard encoding, the representation only requires 25 bits (after the unnecessary width and height information is stripped out — see flaw #12 above). Thus, the true P(X) value can be no less than 1/2^25.

    However, that simple precursor evolves into a 15×15 object, for which the (corrected) standard encoding requires 225 bits. Thus, EDM will attribute a probability of 1/2^225 to it.

    How big is the error? EDM’s methodology has underestimated the probability by a factor of 2^200.

    In other words, the true probability is over
    1,606,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000
    times as large as EDM’s methodology would indicate.

    That is far, far worse than mistaking the width of a human hair for the diameter of the observable universe.

    Does this translate into a bogus design inference? You bet.

    For the 15×15 object, the (corrected) I(X) is 225. Thus, K(X|C) would need to be at least 140 bits in order to prevent a design inference, given the ASC threshold that EDM are using (see flaw #14). K(X|C) is much smaller than that, so EDM will infer design.

    Yet the CP-pulsar is a “naturally occurring” object, meaning that it arises out of randomly-initialized “dust” (not surprising, because the precursor’s probability is better than 1 in 34 million).

    Therefore, EDM’s methodology leads to a bogus design inference.

  21. What is the “right” size? There is no definite answer. To make things worse, the probability is a function of both the pattern and the size of the universe. That is, for a given X, P(X) is higher in larger universes, meaning that ASC gets lower as the size of the universe increases.

    Thus, the ASC threshold beyond which design is inferred depends on the size of the universe. EDM have utterly failed to address this.

    I don’t see any calculation that asks about probabilities of arriving at a pattern by reproduction with mutation.

    Am I naïve to think that the calculation is in effect assuming just patterns arising due to mutation, with no analogue to natural selection?

  22. Joe,

    I don’t see any calculation that asks about probabilities of arriving at a pattern by reproduction with mutation.

    Am I naïve to think that the calculation is in effect assuming just patterns arising due to mutation, with no analogue to natural selection?

    It’s even worse than that.

    P(X) and K(X|C) are based only on whatever is in the bounding box. It’s as if EDM are randomly and separately generating the entire genome of one individual after another, with no interaction between individuals or between them and the world.

    There’s no analogue of reproduction, no analogue of a population, no analogue of natural selection. The real kicker is that there’s no analogue of physics.

    It’s “generate a random genome and see if you get lucky.”

    It’s ludicrous.

  23. Digital ‘tornado in a junkyard’, worse in fact; what are the odds of an XXXX being created de novo by the digital universe, despite the universe being an evolution program!

  24. If you thought flaw #16a was bad:

    16b. Using EDM’s methods, we find that “natural” causes can generate infinite amounts of ASC in the Game of Life. In other words, there is no ASC threshold that will avoid false positive design inferences.

    Like #16a, this happens because EDM ignore the effect of “physics” in the Game of Life.

    The ‘glider-producing switch engine’ is described here. It is a complicated entity that moves in one direction, spewing gliders in the direction of travel and leaving ash objects in its wake.

    The amazing thing is that this monstrosity can be produced by the tiny 15×6 “time bomb” precursor shown on the web page, which is well within the reach of EDM’s random initialization process and whose small ASC does not trigger a design inference. It has also actually been seen in “the wild”.

    The pattern grows ever bigger over time, so I(X) (according to the standard encoding) is always increasing. K(X|C) is also increasing, but at a much slower rate, because the functional encoding is barely affected. Each iteration merely requires an increment of the ⊕ operator’s parameter. Since I(X) increases indefinitely, and always faster than K(X|C), this means that ASC(X) increases without bound.
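
    A back-of-the-envelope version of that growth argument, with made-up constants (only the growth rates matter: the standard-encoding length tracks the bounding-box area, while the functional description only has to bump the ⊕ iteration count, whose integer code grows roughly logarithmically):

```python
from math import log2

def toy_I(t, base=20, rate=1.0):
    """Illustrative only: standard-encoding bits grow like the bounding-box area at generation t."""
    side = base + rate * t
    return side * side             # grows like t^2

def toy_K(t, k0=200):
    """Illustrative only: functional-description bits are a fixed part plus roughly
    log2(t) for the iteration count (a Levenshtein-style integer code grows at about
    this rate)."""
    return k0 + 2 * log2(t + 1)

for t in (10, 100, 1000, 10000):
    print(t, round(toy_I(t) - toy_K(t)))   # the toy 'ASC' keeps growing without bound
```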

    EDM not only have another false positive; they have one that they can’t get rid of, no matter how high they set their ASC threshold, unless they abandon their bogus “no physics” assumption and find a defensible replacement. Good luck to them.

    “Natural” causes can produce infinite ASC in the Game of Life, if we calculate ASC using EDM’s methods.

  25. Richardthughes:

    What’s that error in Dembskis, KeithS?

    You’ll never believe this, but it’s very close to one Dembski. No lie.

    Δ = | ln(erroneous measure) – ln(correct measure) | / 150
    = | ln(1/2^225) – ln(1/2^25)| /150
    = 0.924 Dembskis
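
    (For anyone checking, the line below reproduces the arithmetic exactly.)

```python
from math import log

# | ln(1/2^225) - ln(1/2^25) | / 150  equals  200 * ln(2) / 150
delta = abs(log(2**-225) - log(2**-25)) / 150
print(round(delta, 3))   # 0.924
```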

  26. 17. EDM overlook the smallest object: one with no live cells.

    This is not a critical error, but it’s symptomatic of the general sloppiness of the paper.

    Still life patterns are those that do not change from one iteration to the next under the rules of Life. In EDM’s functional encoding, a still life’s unchanging nature is captured by the equation

    X = X⊕

    where the ⊕ operator indicates a single application of the rules of the Game of Life to the specified state. In other words, the equation is describing a pattern X that remains exactly the same when the rules of Life are applied to it. Since the rules are deterministic, a pattern that remains the same after one iteration will remain the same forever — a still life.

    Of course, there isn’t just a single pattern X that remains the same when the rules are applied to it — there are infinitely many. To designate a specific pattern, a number is appended to the equation. For example,

    X = X⊕,#5

    designates the sixth pattern (the numbers start at #0) that satisfies the equation X = X⊕, when all possible patterns are considered from smallest to largest according to an unambiguous set of lexicographic rules.

    Thus,

    X = X⊕,#0

    designates the smallest still life.

    EDM misidentify this as the block pattern, consisting of four live cells packed together in a square. In reality it is the null pattern, consisting of no live cells at all, that satisfies

    X = X⊕,#0 .

    The block pattern is described by

    X = X⊕,#1 .
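
    In code, the defining property X = X⊕ is just “one application of the rules gives back the same pattern,” and the null pattern passes it trivially (repeating the tiny one-generation helper from the sketch near the top of the post so this runs on its own):

```python
from collections import Counter

def life_step(live):
    """One B3/S23 generation; `live` is a set of (x, y) cells."""
    counts = Counter((x + dx, y + dy) for x, y in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0))
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

def is_still_life(pattern):
    """X = X⊕ : applying the rules once leaves the pattern unchanged."""
    return life_step(set(pattern)) == set(pattern)

print(is_still_life(set()))                              # True: the null pattern (X = X⊕,#0)
print(is_still_life({(0, 0), (0, 1), (1, 0), (1, 1)}))   # True: the block (X = X⊕,#1)
print(is_still_life({(0, 0), (1, 0), (2, 0)}))           # False: a blinker oscillates instead
```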

  27. 18. EDM wrongly claim that “any pattern in the Game of Life can be constructed by colliding gliders.”

    The bogus claim is in section IV-A.

    The citation they provide ([46]) is David Salomon’s book Variable-length Codes for Data Compression. I see no entry in the index for “glider” or “Life”, and no indication that the book even discusses these topics (why would it?). EDM also provide no page numbers. Grrrrr…

    I’m guessing that this was a mis-cite, and that the Salomon book was actually their source for the Levenshtein coding and other compression ideas.

    I looked at the nearby citations to see if any of them might be the source of the claim. One of them, number [47], is Collision-based Computing, edited by Andrew Adamatzky. That sounded promising, but despite containing multiple discussions of glider collisions, it says nothing remotely like what EDM are claiming. And again EDM provide no page numbers.

    FFS. What is wrong with these people?

    After giving up on their citations, I remembered that I had seen the phrase “Garden of Eden pattern” during my Game of Life researches — a name that seems to imply “no precursors”. Sure enough, this page confirms it:

    A Garden of Eden is a pattern that has no parents and thus can only occur in generation 0. The term was first used in connection with cellular automata by John W. Tukey, many years before Conway’s Game of Life was conceived. It was known from the start that Gardens of Eden exist in Life because of a theorem by Edward Moore that guarantees their existence in a wide class of cellular automata.

    A pattern with no precursors obviously cannot be constructed via glider collisions, so EDM’s claim is false.

  28. Behe actually tries to argue that drug-resistant malaria is a Garden of Eden configuration. I like that term. Unsolvable is a more useful concept than irreducible.

  29. 19. EDM claim that “Algorithmic specified complexity (ASC) measures the degree to which an object is meaningful.”

    In the description of flaw #6, I pointed out one problem with that claim:

    Okay, let’s collect a bunch of ASC values (or OASC values, which are lower bounds for the true ASC values), then collect the meaningfulness values for the same objects, and test the strength of the correlation.

    Oops, we can’t do that, because no one knows what the “meaningfulness” values are. That’s what ASC is supposed to provide us: the first viable tool for measuring the degree of meaningfulness of an object.

    It gets worse, though, because ASC doesn’t even correspond to our intuitive sense of what is and isn’t meaningful.

    Following EDM’s example in section II, let’s consider two texts. Text A is a sentence from Hamlet:

    So you must take your husbands.

    Text B is a much longer random sequence of words selected from Hamlet:

    Hart mortised a with it weasel him my and much darest cozen’d the a command sure to doth nothing reechy mouse Rosencrantz a mayst hectic worm sir humbly…

    Under EDM’s standard assumption of a uniform distribution, the shorter Text A is vastly more probable than the longer Text B, so

    -log2 P(B) > -log2 P(A)

    and

    I(B) > I(A).

    Let’s use EDM’s suggested compression algorithm on the two texts. They propose creating an exhaustive, frequency-ordered table of all the words in Hamlet, and compressing the texts by replacing each word with the index of the corresponding entry in the table.

    Since text B is much longer than text A,

    K(B|C) > K(A|C).

    And because of the compression, K(B|C) will increase more slowly than I(B) as the length of text B increases.

    Therefore, no matter what value ASC(A) has…

    ASC(A) = I(A) – K(A|C)

    …ASC(B) will be greater if B is sufficiently long, since I(B) will grow faster than K(B|C).

    Therefore, by EDM’s faulty logic, a well-formed sentence is less meaningful than a sufficiently long random sequence of words, because the ASC of the former is less than the ASC of the latter.

    Our intuition tells us otherwise, demonstrating that ASC fails to capture what we mean by “meaningful”.
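
    To put rough numbers on that (everything here is an illustrative stand-in rather than EDM’s actual model: a uniform 27-symbol model for I, a hypothetical word table of about 5,000 distinct Hamlet words for the K bound, and an average of five letters plus a space per word):

```python
from math import log2

ALPHABET_BITS   = log2(27)    # uniform over 26 letters plus space (an assumed model)
WORD_INDEX_BITS = log2(5000)  # index into a hypothetical ~5,000-entry Hamlet word table
CHARS_PER_WORD  = 6           # roughly 5 letters plus a space, again just a round number

def I_bits(num_words):
    """'Complexity' of the raw text under the uniform character model."""
    return num_words * CHARS_PER_WORD * ALPHABET_BITS

def K_bound_bits(num_words):
    """Upper bound on K(X|C): one word-table index per word."""
    return num_words * WORD_INDEX_BITS

# Text A is six words; keep lengthening the random Text B and its 'ASC' keeps climbing.
for words in (6, 50, 500):
    print(words, round(I_bits(words) - K_bound_bits(words)))
```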

    Now, EDM could try to address the problem by proposing a context C that is sensitive to grammar and other correlates of intuitive meaningfulness, but then they would be violating the principle they laid out in Algorithmic Specified Complexity:

    4.2 Context is Subjective

    The ASC of any object will depend on the context chosen. Any object can be made to have high ASC by using a speci cally chosen context. But this appears to be the way that information works. If the authors, who do not understand Arabic, look at Arabic text, it appears to be no better then scribbling. The problem is not that Arabic lacks information content, but that we are unable to identify it without the necessary context. As a result, this subjectivity appears to capture something about the way information works in the human experience.

    As with speci cation, it is important the context be chosen independent of object under investigation. While a speci cation will rarely be independent of the object under investigation, we believe it is much easier to maintain this independence in the case of a context.

    [Typos in original; emphasis added]

    By picking a context based on the nature of the two texts, EDM would in effect be drawing the target around the head of the arrow — a sin they already commit in their particular choice of a compressed encoding for the Life patterns.

  30. I’m not going to number this one, because it isn’t substantive and doesn’t affect EDM’s arguments. It’s yet another indicator of their deplorable lack of discipline, however.

    In Section II, EDM make multiple references to “elite programs” and cite Gregory Chaitin as the source for that term’s technical definition. Yet Chaitin doesn’t use that term.

    The actual term is “elegant program”, which EDM could have discovered by opening the frikkin’ book.

    ETA: Which they didn’t do because they were too lazy to put page numbers in the citation (and a bunch of others). Nice job, guys.

  31. keiths,

    The gist of the bizarreness is that EDM exploit perfect knowledge of the physics of Life in description of patterns, and pretend that the probability of occurrence of patterns has nothing to do with physics.

    I don’t agree with taking P(X) to be the probability of generating X in total isolation. But let’s play along with EDM for now.

    Each pattern X can be described as the result of running Life for n time steps on an initial pattern X’. Choose X’ to minimize the length ℓ of an encoding of (X’, n), and let the complexity of X in isolation be I(X) = ℓ. Making I(X) computable is no problem. Approximating I(X) in polynomial time, with known bounds on error, is quite another matter.

    (Having thought more about this, I’m inclined to say that the chance of X arising in isolation has nothing to do with how long it takes for the precursor X’ to yield X. How to enumerate precursor patterns is debatable.)
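
    (For what it is worth, the minimisation is easy to sketch even though it is hopeless at scale: enumerate candidate initial patterns X′ in order of increasing bounding-box area, run each forward, and stop at the first one whose evolution reproduces X. That orders candidates by a crude proxy for encoded length rather than by the length of an encoding of (X′, n) itself, and it is exponential in the box size, so treat it strictly as a toy.)

```python
from collections import Counter
from itertools import product

def life_step(live):
    """One B3/S23 generation; `live` is a set of (x, y) cells."""
    counts = Counter((x + dx, y + dy) for x, y in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0))
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

def normalize(live):
    """Translate a pattern so its bounding box starts at (0, 0)."""
    if not live:
        return frozenset()
    mx = min(x for x, _ in live)
    my = min(y for _, y in live)
    return frozenset((x - mx, y - my) for x, y in live)

def find_precursor(target, max_box=4, max_steps=30):
    """Toy search for a small (X', n) such that X' evolves into `target` in n steps
    (up to translation). Boxes are scanned in order of increasing area, a rough
    proxy for minimising the encoded length of (X', n)."""
    target = normalize(target)
    boxes = sorted(product(range(1, max_box + 1), repeat=2), key=lambda wh: wh[0] * wh[1])
    for w, h in boxes:
        cells = list(product(range(w), range(h)))
        for bits in product((0, 1), repeat=len(cells)):
            state = {c for c, b in zip(cells, bits) if b}
            start = frozenset(state)
            for n in range(max_steps + 1):
                if normalize(state) == target:
                    return normalize(start), n
                state = life_step(state)
    return None

block = {(0, 0), (0, 1), (1, 0), (1, 1)}
print(find_precursor(block))   # finds a 3-cell predecessor that becomes the block in one step
```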

    EDM would rather calculate nonsense than admit that they cannot, as a practical matter, calculate something sensible.

  32. 20. EDM’s use of the word “complexity” is idiosyncratic, clashing both with normal usage and with our intuitions.

    In #19, I explained how algorithmic specified complexity fails to capture what we mean by “meaningful”, contrary to EDM’s claims.

    A similar complaint applies to their use of “complexity”. For EDM, an object’s “complexity” is measured by I(X). Yet I(X) is merely a log-transformed probability, so when EDM dub an object “complex”, they are merely saying that it is improbable, which is not at all the same.

    Their equivocation leads to absurd consequences:

    1. Imagine a 40×40 square in the Game of Life with no live cells. Completely blank. According to EDM, this blank square is extremely complex. After all, its probability is only 1 in 2^1600 (given the probability distribution used in this paper), meaning that its I(X) value is a whopping 1600 bits.

    2. Now imagine an intricate 40×40 object containing lots of live cells and performing a “function”. Its probability is exactly the same as for the blank grid — 1 in 2^1600. So by EDM’s logic, this intricate object is no more complex than the blank grid. The complexity is exactly the same.

    This problem dates back to Dembski’s formulation of ‘specified complexity’ and ‘complex specified information’ in the late ’90s.

    I’ve often wondered why Dembski didn’t call them ‘specified improbability’ and ‘improbable specified information’. My guess is that he considered the more accurate terms to be less marketable. Also, the circularity of using SC or CSI to infer design becomes more obvious when the word ‘improbable’ is explicit.

  33. Tom,

    The gist of the bizarreness is that EDM exploit perfect knowledge of the physics of Life in description of patterns, and pretend that the probability of occurrence of patterns has nothing to do with physics.

    It’s hard to overstate the ridiculousness of their “no-physics” assumption, isn’t it? Think of the large number of patterns that “evaporate” and leave a blank grid. The blank grid has a huge basin of attraction, yet EDM assume that it is no more probable than any of its predecessors.

    I don’t agree with taking P(X) to be the probability of generating X in total isolation. But let’s play along with EDM for now.

    Yes, that was number 15.

    EDM would rather calculate nonsense than admit that they cannot, as a practical matter, calculate something sensible.

    And it gets even worse when you leave the simplified world of Life behind. Imagine trying to obtain accurate P(X) values in a real-world evolutionary scenario! It’s Dembski’s P(T|H) problem all over again.

    ASC is as useless as CSI in the evolution debates.

  34. 21. By the logic of EDM’s paper, a large blank grid is far more “meaningful” than one of the same size in which four gliders are carefully arranged to trace out the edges of a diamond pattern.

    The I(X) of a large blank grid is high, as discussed here. Its K(X|C) is low (for EDM’s choice of C), as it can be represented by

    X = X⊕,#0 .

    High I(X) with low K(X|C) means high ASC for the blank grid.

    Meanwhile, the glider formation I describe above has the same I(X) as the blank grid, but it cannot be compressed easily using EDM’s encoding. While their encoding scheme can compactly represent a single moving object, it inefficiently represents a multiplicity of moving objects (unless they’re all moving at the same speed, in the same direction, and with the same period). Thus the K(X|C) is high.

    I(X) is the same as for the blank grid, but K(X|C) is much higher. Therefore ASC is much lower, and EDM’s methodology tells us that the blank grid is far more meaningful than the coordinated glider formation.

  35. Tom English,

    (Having thought more about this, I’m inclined to say that the chance of X arising in isolation has nothing to do with how long it takes for the precursor X’ to yield X. How to enumerate precursor patterns is debatable.)

    EDM would rather calculate nonsense than admit that they cannot, as a practical matter, calculate something sensible.

    Are you planning to compose a response to the IEEE journal where this was published? It seems that their review process could be significantly improved.

  36. Patrick,

    A “response to” note would have to be concise and obviously substantive. EDM should not be able to respond with a simple modification of ASC. Criticizing their definition of I(X) doesn’t work, because they acknowledge in the paper that it’s weak. I don’t see how to proceed at the moment.

  37. Tom,

    A “response to” note would have to be concise and obviously substantive. EDM should not be able to respond with a simple modification of ASC. Criticizing their definition of I(X) doesn’t work, because they acknowledge in the paper that it’s weak. I don’t see how to proceed at the moment.

    To my mind, any flaw that invalidates the conclusions of the paper and cannot be fixed in a way that re-validates those conclusions ought to count as ‘obviously substantive’.

    EDM’s I(X) values are flawed, their K(X|C) values are flawed, and their OASC values are flawed, so their conclusions are invalid.

    The fix for I(X) is to use the true probability, but EDM can’t compute it.

    The fix for K(X|C) is to allow some contexts but rule out others, but on what basis? The only criterion that EDM offer for this purpose is one that their own encoding violates.

    OASC can’t be fixed unless I(X) and K(X|C) are fixed.

    In my opinion, this paper is a train wreck that can’t be salvaged at all, much less by any simple modifications.

  38. Tom English,

    A “response to” note would have to be concise and obviously substantive. EDM should not be able to respond with a simple modification of ASC. Criticizing their definition of I(X) doesn’t work, because they acknowledge in the paper that it’s weak. I don’t see how to proceed at the moment.

    Isn’t the fact that they ignore the rules of Life and assume that they can calculate probabilities based on random distributions sufficient? That’s the equivalent of ignoring known evolutionary mechanisms when they make their bogus probability calculations in a biological context.

  39. 22. EDM incorrectly claim that ASC measures meaningfulness in a way that Kolmogorov complexity cannot.

    Their error is easy to see in the context of the Game of Life. Consider all possible 40×40 patterns. Each pattern Xn has an I(Xn) of 1600 bits (see #20).

    Since I(Xn) is the same for all n, any differences in ASC will be solely due to differences in K(Xn|C). Thus any differences in “meaningfulness” derive from K(Xn|C), not from ASC. Including I(Xn) does nothing to make ASC more accurate as a measure of meaning.

    What about cases in which I(Xn) differs from pattern to pattern? ASC still doesn’t work in such cases, as I explained in #19.

  40. Patrick:

    Isn’t the fact that they ignore the rules of Life and assume that they can calculate probabilities based on random distributions sufficient? That’s the equivalent of ignoring known evolutionary mechanisms when they make their bogus probability calculations in a biological context.

    EDM acknowledge weakness in that aspect of the study. There has to be an “angle” that compels the editor to allow the response to go through review. I think I have one now, but I’m not so sure that I have a knockout punch. Sorry, but I won’t be discussing this in public.

  41. Tom,

    EDM acknowledge weakness in that aspect of the study.

    What they don’t acknowledge is that the weakness is fatal to their conclusions, as this demonstrates.

    In fact, they claim that their approach “appears to work”.

    If they can’t repair the flaw — and I don’t see how they can — then their conclusions and claims can’t be salvaged. That should certainly justify the publication of a response!

  42. Tom English,

    Isn’t the fact that they ignore the rules of Life and assume that they can calculate probabilities based on random distributions sufficient? That’s the equivalent of ignoring known evolutionary mechanisms when they make their bogus probability calculations in a biological context.

    EDM acknowledge weakness in that aspect of the study. There has to be an “angle” that compels the editor to allow the response to go through review. I think I have one now, but I’m not so sure that I have a knockout punch. Sorry, but I won’t be discussing this in public.

    I’m looking forward to seeing your punch.

    With regard to ignoring the rules of Conway’s Life, my understanding is that EDM are saying the rules don’t affect the probability of seeing particular patterns. This is clearly incorrect on its face. I wonder if it would be useful to run a few simulations to empirically demonstrate the difference in patterns between random noise and Life after some number of generations (a quick sketch is at the end of this comment).

    Of course, they’d retreat to the usual creationist position of “Who are you going to believe, me or that lying evidence?” It might trigger the IEEE to review papers more carefully, though.
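
    The experiment is only a few lines (grid size, soup density, and generation count below are arbitrary choices; the one-generation helper is the same as in the sketches earlier in the thread):

```python
import random
from collections import Counter

def life_step(live):
    """One B3/S23 generation; `live` is a set of (x, y) cells."""
    counts = Counter((x + dx, y + dy) for x, y in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0))
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

def random_soup(size=64, density=0.5, seed=1):
    """A size-by-size patch of 'dust': each cell independently alive with the given probability."""
    rng = random.Random(seed)
    return {(x, y) for x in range(size) for y in range(size) if rng.random() < density}

soup = random_soup()
state = set(soup)
for _ in range(200):
    state = life_step(state)

# Under EDM's assumption the raw soup and the evolved grid should look statistically
# alike. In fact the evolved grid is much sparser, and rendering it typically shows
# the familiar ash (blocks, blinkers, beehives, the odd glider) rather than noise.
print(len(soup), len(state))
```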

  43. Patrick,

    With regard to ignoring the rules of Conway’s Life, my understanding is that EDM are saying the rules don’t affect the probability of seeing particular patterns. This is clearly incorrect on its face.

    For an empty 40×40 grid, they’re off by a factor of thousands.

    For the CP-pulsar they’re off by a factor of more than 1,606,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000.

    For the glider-producing switch engine, they’re off by an unbounded amount.
