Things That IDers Don’t Understand, Part 1 — Intelligent Design is not compatible with the evidence for common descent

Since the time of the Dover trial in 2005, I’ve made a hobby of debating Intelligent Design proponents on the Web, chiefly at the pro-ID website Uncommon Descent. During that time I’ve seen ID proponents make certain mistakes again and again. This is the first of a series of posts in which (as time permits) I’ll point out these common mistakes and the misconceptions that lie behind them.

I encourage IDers to read these posts and, if they disagree, to comment here at TSZ. Unfortunately, dissenters at Uncommon Descent are typically banned or have their comments censored, all for the ‘crime’ of criticizing ID or defending evolution effectively. Most commenters at TSZ, including our blog host Elizabeth Liddle and me, have been banned from UD. Far better to have the discussion here at TSZ, where free and open debate is encouraged and comments are not censored.

The first misconception I’ll tackle is a big one: it’s the idea that the evidence for common descent is not a serious threat to ID. As it turns out, ID is not just threatened by the evidence for common descent — it’s literally trillions of times worse than unguided evolution at explaining the evidence. No exaggeration. If you’re skeptical, read on and I’ll explain.

Continue reading

The LCI and Bernoulli’s Principle of Insufficient Reason

(Just found I can post here – I hope it is not a mistake. This is a slightly shortened version of a piece which I have published on my blog. I am sorry it is so long, but I struggle to make it any shorter. I am grateful for any comments. I will look at UD for comments as well – but I am not sure where they would appear.)

I have been rereading Bernoulli’s Principle of Insufficient Reason and Conservation of Information in Computer Search by William Dembski and Robert Marks. It is an important paper for the Intelligent Design movement, as Dembski and Marks make liberal use of Bernoulli’s Principle of Insufficient Reason (BPoIR) in their papers on the Law of Conservation of Information (LCI). For Dembski and Marks, BPoIR provides a way of determining the probability of an outcome given no prior knowledge. This is vital to the case for the LCI.

The point of Dembski and Marks’ paper is to address some fundamental criticisms of BPoIR. For example, J. M. Keynes (along with many others) pointed out that BPoIR does not give a unique result. A well-known example is applying BPoIR to the specific volume of a given substance. If we know nothing about the specific volume then someone could argue using BPoIR that all specific volumes are equally likely. But equally someone could argue using BPoIR that all specific densities are equally likely. However, as one is the reciprocal of the other, these two assumptions are incompatible. This is an example based on continuous measurements, and Dembski and Marks refer to it in the paper. However, having referred to it, they do not address it. Instead they concentrate on examples of discrete measurements, where they offer a sort of response to Keynes’ objections. What they attempt to prove is a rather limited point about discrete cases such as a pack of cards or a protein of a given length. It is hard to write their claim concisely – but I will give it a try.

Imagine you have a search space such as a normal pack of cards and a target such as finding a card which is a spade. Then it is possible to argue by BPoIR that, because all cards are equal, the probability of finding the target with one draw is 0.25. Dembski and Marks attempt to prove that, in cases like this, if you decide to do a “some to many” mapping from this search space into another space, then you have at best a 50% chance of creating a new search space where BPoIR gives a higher probability of finding a spade. A “some to many” mapping means some different way of viewing the pack of cards so that it is not necessary that all of them are considered and some of them may be considered more often than others. For example, you might take a handful out of the pack at random and then duplicate some of that handful a few times – and then select from what you have created.

There are two problems with this.

1) It does not address Keynes’ objection to BPoIR

2) The proof itself depends on an unjustified use of BPoIR.

But before that, a comment on the concept of no prior knowledge.

The Concept of No Prior Knowledge

Dembski and Marks’ case is that BPoIR gives the probability of an outcome when we have no prior knowledge. They stress that this means no prior knowledge of any kind and that it is “easy to take for granted things we have no right to take for granted”. However, there are deep problems associated with this concept. The act of defining a search space and a target implies prior knowledge. Consider finding a spade in a pack of cards. To apply BPoIR you need, at minimum, to know that a card can be one of four suits, that 25% of the cards are spades, and that the suit does not affect the chances of a card being selected. The last point is particularly important. BPoIR provides a rationale for claiming that the probabilities of two or more events are the same. But the events must differ in some respects (even if it is only a difference in when or where they happen) or they would be the same event. To apply BPoIR we have to know (or assume) that these differences are not relevant to the probability of the events happening. We must somehow judge that the suit of the card, the heads or tails symbol on the coin, or the choice of DNA base pair is irrelevant to the chances of that card, coin toss or base pair being selected. This is prior knowledge.

In addition, the more we try to dispense with assumptions and knowledge about an event, the more difficult it becomes to decide how to apply BPoIR. Another of Keynes’ examples is a bag of 100 black and white balls in an unknown ratio of black to white. Do we assume that all ratios of black to white are equally likely, or do we assume that each individual ball is equally likely to be black or white? Either assumption is equally justified by BPoIR, but they are incompatible. One results in a uniform probability distribution for the number of white balls from zero to 100; the other results in a binomial distribution which greatly favours roughly equal numbers of black and white balls.
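To make the incompatibility concrete, here is a minimal sketch (my own illustration, not from Keynes or the paper) of the two distributions over the number of white balls implied by the two applications of BPoIR:

    from math import comb

    N = 100

    # Assumption A: apply BPoIR to the ratio, so every count of white balls
    # from 0 to 100 is equally likely.
    p_uniform = {k: 1 / (N + 1) for k in range(N + 1)}

    # Assumption B: apply BPoIR to each ball, so each ball is independently
    # equally likely to be black or white (a binomial distribution).
    p_binomial = {k: comb(N, k) * 0.5 ** N for k in range(N + 1)}

    # The two applications of BPoIR disagree sharply, e.g. about "exactly 50 white":
    print(p_uniform[50])    # ~0.0099
    print(p_binomial[50])   # ~0.0796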

Now let’s look at the problems with the proof in Dembski and Marks’ paper.

The Proof does not Address Keynes’ objection to BPoIR

Even if the proof were valid, it does nothing to show that the assumption of BPoIR is correct. All it would show (if correct) is that if you do not use BPoIR then you have a 50% or less chance of improving your chances of finding the target. The fact remains that there are many other assumptions you could make, and some of them greatly increase your chances of finding the target. There is nothing in the proof that in any way justifies assuming BPoIR or giving it any kind of privileged position.

But the problem is even deeper. Keynes’ point was not that there are alternatives to using BPoIR – that’s obvious. His point was that there are different incompatible ways of applying BPoIR. For example, just as with the example of black and white balls above, we might use BPoIR to deduce that all ratios of base pairs in a string of DNA are equally likely. Dembski and Marks do not address this at all. They point out the trap of taking things for granted but fall foul of it themselves.

The Proof Relies on an Unjustified Use of BPoIR

The proof is found in Appendix A of the paper, and this is the vital line:

[image: the key equation from Appendix A of the paper – not reproduced here]

This is the probability that a new search space created from an old one will include k members which were part of the target in the original search space. The equation holds true if the new search space is created by selecting elements from the old search space at random; for example, by picking a random number of cards at random from a pack. It uses BPoIR to justify the assumption that each unique way of picking cards is equally likely. This can be made clearer with an example.

Suppose the original search space comprises just the four DNA bases, one of which is the target. Call them x, y, z and t. Using BPoIR, Dembski and Marks would argue that all of them are equally likely and therefore the probability of finding t with a single search is 0.25. They then consider all the possible ways you might take a subset of that search space. This comprises:

Subsets with:

  • no items: the empty set
  • one item: x, y, z, t
  • two items: xy, xz, yz, tx, ty, tz
  • three items: xyz, xyt, xzt, yzt
  • four items: xyzt

A total of 16 subsets.

Their point is that if you assume each of these subsets is equally likely (so the probability of one of them being selected is 1/16) then 50% of them have a probability of finding t which is greater than or equal to the probability in the original search space (i.e. 0.25). To be specific, the new search spaces where the probability of finding t is greater than or equal to 0.25 are t, tx, ty, tz, xyt, xzt, yzt and xyzt. That is 8 out of 16, which is 50%.
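This claim about the 16 subsets is easy to check mechanically. The following sketch (my own, assuming as above that each member of a given subset is equally likely to be drawn) enumerates the subsets and counts those where the probability of finding t is at least 0.25:

    from itertools import combinations

    members = ['x', 'y', 'z', 't']
    subsets = [set(c) for r in range(len(members) + 1)
               for c in combinations(members, r)]
    assert len(subsets) == 16

    def p_find_t(subset):
        # Probability of drawing t from the subset, taking each member of the
        # subset as equally likely; zero for subsets that do not contain t.
        return 1 / len(subset) if 't' in subset else 0.0

    favourable = [s for s in subsets if p_find_t(s) >= 0.25]
    print(len(favourable), 'of', len(subsets))   # prints: 8 of 16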

But what is the justification for assuming each of these subsets is equally likely? Well, it requires using BPoIR, which the proof is meant to defend. And even if you grant the use of BPoIR, Keynes’ concerns apply. There is more than one way to apply BPoIR, and not all of them support Dembski and Marks’ proof. Suppose, for example, the subset was created by the following procedure:

    • Start with one member selected at random as the subset
    • Toss a die:
      • If it shows two or less, stop and use the current set as the subset
      • If it shows higher than two, add another member selected at random to the subset
    • Continue tossing until the die shows two or less or all four members are in the subset

This gives a completely different probability distribution.

The probability of:

a single-item subset (x, y, z, or t) = 0.33/4 = 0.083

a double-item subset (xy, xz, yz, tx, ty, or tz) = 0.66*0.33/6 = 0.037

a triple-item subset (xyz, xyt, xzt, or yzt) = 0.66*0.66*0.33/4 = 0.037

a four-item subset (xyzt) = 0.66*0.66*0.66 = 0.296

So the combined probability of the subsets where the probability of selecting t is ≥ 0.25 (t, tx, ty, tz, xyt, xzt, yzt, xyzt) = 0.083 + 3*(0.037) + 3*(0.037) + 0.296 = 0.60 (to 2 decimal places), which is bigger than the 0.5 calculated using Dembski and Marks’ assumptions. In fact, using this method, the probability of getting a subset where the probability of selecting t is ≥ 0.25 can be made as close to 1 as desired by increasing the probability of adding a member. All of these methods treat all four members of the set equally and are as justified under BPoIR as Dembski and Marks’ assumption.
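A quick Monte Carlo check of this alternative procedure (my own sketch, implementing the die rule described above) reproduces the 0.60 figure rather than 0.5:

    import random

    members = ['x', 'y', 'z', 't']

    def make_subset():
        pool = members[:]
        random.shuffle(pool)
        subset = [pool.pop()]                     # one member selected at random
        while pool and random.randint(1, 6) > 2:  # keep adding while the die shows > 2
            subset.append(pool.pop())
        return subset

    trials = 100_000
    hits = 0
    for _ in range(trials):
        s = make_subset()
        p_t = 1 / len(s) if 't' in s else 0.0
        if p_t >= 0.25:
            hits += 1

    print(hits / trials)   # ~0.60, not the 0.5 given by Dembski and Marks' assumption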

Conclusion

Dembski and Marks’ paper places great stress on BPoIR being the way to calculate probabilities when there is no prior knowledge. But their proof itself includes prior knowledge. It is doubtful whether it makes sense to eliminate all prior knowledge, but if you attempt to eliminate as much prior knowledge as possible, as Keynes does, then BPoIR proves to be an illusion. It does not give a unique result, and some of the results are incompatible with their proof.

An Invitation to G Puccio

gpuccio addressed a comment to me at Uncommon Descent. Onlooker, a commenter now unable to post there,

(Added in edit 27/09/2012 – just to clarify, onlooker was banned from threads hosted by “kairosfocus” and can still post at Uncommon Descent in threads not authored by “kairosfocus”)

has expressed an interest in continuing a dialogue with gpuccio, and petrushka comments:

By all means let’s have a gpuccio thread.

There are things I’d like to know about his position.

He claims that a non-material designer could insert changes into coding sequences. I’d like to know how that works. How does an entity having no matter or energy interact with matter and energy? Sounds to me like he is saying that A can sometimes equal not A.

He claims that variation is non stochastic and that adaptive adaptations are the result of algorithmic directed mutations. Is that in addition to intervention by non-material designers? How does that work?

What is the evidence that non-stochastic variation exists or that it is even necessary, given the Lenski experiment? Could he cite some evidence from the Lenski experiment that suggests directed mutations? Could he explain why gpuccio sees this and Lenski doesn’t?

It’s been a long time since gpuccio abandoned the discussion at the Mark Frank blog. I’d like to see that continued.

So I copy gpuccio’s comment here and add a few remarks, hoping it may stimulate some interesting dialogue. Continue reading

Is ‘Design in Nature’ a Non-Starter?

A row is ready to erupt over two competing notions of ‘design in nature.’ One has been proposed under the auspices of being a natural-physical law. The other continues to clamour for public attention and respectability among natural-physical scientists, engineers and educators, but carries with it obvious religious overtones (Foundation for Thought and Ethics, Wedge Document and Dover trial 2005) and still has not achieved widespread scholarly support after almost 20 years of trying.

On the one hand is the Discovery Institute’s notion of ‘design in nature,’ which is repeated in various forms in the Intelligent Design movement. Here at TSZ many (the majority of?) people are against ID and ID proponents’ views of ‘design in nature.’ The author of this thread is likewise not an ID proponent, not an IDer. On the other hand is Duke University engineering and thermodynamics professor Adrian Bejan’s notion of ‘design in nature’ (Doubleday 2012, co-authored with journalism professor J. Peder Zane), which rejects Intelligent Design theory, but contends that ‘design’ is nevertheless a legitimate natural scientific concept. Apropos another recent thread here at TSZ, Bejan declares that his approach “solves one of the great riddles of science – design without a designer.”

Continue reading

LCI or No LCI, Information Can Appear by Chance

(Preamble: I apologize in advance for cluttering TSZ with these three posts. There are very few people on either side of the debate that actually care about the details of this “conservation of information” stuff, but these posts make good on some claims I made at UD.)

To see that active information can easily be created by chance, even when the LCI holds, we’ll return to the Bertrand’s Box example. Recall that the LCI holds for this example, and all choices are strictly random. Recall further that choosing the GG box gives us 1 bit of active information since it doubles our chance of getting a gold coin. If we conduct 100 trials, we expect to get the GG box about 33 times, which means we expect 33 bits of active information to be generated by nothing but chance.

But before we say QED, we should note a potential objection, namely that we also expect to get SS about 33 times, and each such outcome gives us negative infinity bits of active information. So if we include the SS outcomes in our tally of active information, the total is negative infinity. Be that as it may, the fact remains that in 33 of the trials, 1 bit of information was generated. This fact is not rendered false by the outcomes of other trials, so those 33 trials produced 33 bits of information.
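For anyone who wants to see the tally directly, here is a small simulation (my own sketch; the boxes and the 1-bit-per-GG figure come from the Bertrand’s Box post below) that credits the active information generated by each random box choice:

    import math
    import random

    # Probability of drawing a gold coin from each box; the baseline
    # (random box, then random coin) is 1/2.
    p_gold = {'SS': 0.0, 'GG': 1.0, 'SG': 0.5}
    baseline = 0.5

    def active_info(box):
        p = p_gold[box]
        return -math.inf if p == 0 else math.log2(p / baseline)

    picks = [random.choice(list(p_gold)) for _ in range(100)]
    gg_bits = sum(active_info(b) for b in picks if b == 'GG')
    print(gg_bits)   # roughly 33 bits of active information, one per GG pick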

A Free Lunch of Active Info, with a Side of LCI Violations

(Preamble: I apologize in advance for cluttering TSZ with these three posts. There are very few people on either side of the debate that actually care about the details of this “conservation of information” stuff, but these posts make good on some claims I made at UD.)

Given a sample space Ω and a target T ⊆ Ω, Dembski defines the following information measures:

Endogenous information: IΩ ≡ -log2( |T|/|Ω| )
Exogenous information: IS ≡ -log2( P(T) )
Active information: I+ ≡ IΩ – IS = log2( P(T) / (|T|/|Ω|) )
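As a concrete reading of these definitions, here is a small helper (my own sketch, not EIL code) that computes the three measures from the target probability P(T) and the sizes of T and Ω:

    import math

    def info_measures(p_T, size_T, size_Omega):
        endogenous = -math.log2(size_T / size_Omega)  # I_Omega: the uniform baseline
        exogenous = -math.log2(p_T)                   # I_S: the actual search
        active = endogenous - exogenous               # I_plus = log2(P(T) / (|T|/|Omega|))
        return endogenous, exogenous, active

    # Fair six-sided die, target "higher than one":
    print(info_measures(5/6, 5, 6))   # active information = 0 bits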

Active information is supposed to indicate design, but in fact, the amount of active info attributed to a process depends on how we choose to mathematically model that process. We can get as much free active info as we want simply by making certain modeling choices.

Free Active Info via Individuation of Possibilities

Dembski is in the awkward position of having impugned his own information measures before he even invented them. From his book No Free Lunch:

This requires a measure of information that is independent of the procedure used to individuate the possibilities in a reference class. Otherwise the same possibility can be assigned different amounts of information depending on how the other possibilities in the reference class are individuated (thus making the information measure ill-defined).

He used to make this point often. But two of his new information measures, “endogenous information” and “active information”, depend on the procedure used to individuate the possible outcomes, and are therefore ill-defined according to Dembski’s earlier position.

To see how this fact allows arbitrarily high measures of active information, consider how we model the rolling of a six-sided die. We would typically define Ω as the set {1, 2, 3, 4, 5, 6}. If the goal is to roll a number higher than one, then our target T is {2, 3, 4, 5, 6}. The amount of active information I+ is log2(P(T) / (|T|/|Ω|)) = log2((5/6) / (5/6)) = 0 bits.

But we could, instead, define Ω as {1, higher than 1}. In that case, I+ = log2((5/6) / (1/2)) = 0.7 bits. What we’re modeling hasn’t changed, but we’ve gained active information by making a different modeling choice.

Furthermore, borrowing an example from Dembski, we could distinguish getting a 1 with the die landing on the table from getting a 1 with the die landing on the floor. That is, Ω = { 1 on table, 1 on floor, higher than 1 }. Now I+ = log2((5/6) / (1/3)) = 1.3 bits. And we could keep changing how we individuate outcomes until we get as much active information as we desire.
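The three modelings can be compared directly; this sketch (my own, using the numbers from the text) shows how the same roll acquires more active information as we re-individuate the outcomes:

    import math

    def active_info(p_T, size_T, size_Omega):
        return math.log2(p_T / (size_T / size_Omega))

    print(active_info(5/6, 5, 6))   # Omega = {1,...,6}:                       0.0 bits
    print(active_info(5/6, 1, 2))   # Omega = {1, higher than 1}:             ~0.74 bits
    print(active_info(5/6, 1, 3))   # Omega = {1 on table, 1 on floor, >1}:   ~1.32 bits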

This may seem like cheating. Maybe if we stipulate that Ω must always be defined the “right” way, then active information will be well-defined, right? But let’s look into another modeling choice that demonstrates that there is no “right” way to define Ω in the EIL framework.

Free Active Info via Inclusion of Possibilities

Again borrowing an example from Dembski, suppose that we know that there’s a buried treasure on the island of Bora Bora, but we have no idea where on the island it is, so all we can do is randomly choose a site to dig. If we want to model this search, it would be natural to define Ω as the set of all possible dig sites on Bora Bora. Our search, then, has zero active information, since it is no more likely to succeed than randomly selecting from Ω (because randomly selecting from Ω is exactly what we’re doing).

But is this the “right” definition of Ω? Dembski asks the question, “how did we know that of all places on earth where the treasure might be hidden, we needed to look on Bora Bora?” Maybe we should define Ω, as Dembski does, to include all of the dry land on earth. In this case, randomly choosing a site on Bora Bora is a high-active-information search, because it is far more likely to succeed than randomly choosing a site from Ω, i.e. the whole earth. Again, we have changed nothing about what is being modeled, but we have gained an enormous amount of active information simply by redefining Ω.
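To put a purely hypothetical number on it: suppose there are 10,000 candidate dig sites on Bora Bora and a billion on all of the earth’s dry land (made-up counts, used only for illustration). The choice of baseline then swings the active information from zero to more than sixteen bits:

    import math

    # Hypothetical site counts, purely for illustration; the argument does not
    # depend on the actual numbers.
    sites_bora_bora = 10_000
    sites_dry_land = 1_000_000_000

    p_success = 1 / sites_bora_bora                  # random dig on Bora Bora

    # Baseline 1: Omega = dig sites on Bora Bora.
    print(math.log2(p_success * sites_bora_bora))    # 0.0 bits of active information

    # Baseline 2: Omega = dig sites on all dry land.
    print(math.log2(p_success * sites_dry_land))     # ~16.6 bits of active information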

We could also take Dembski’s question further by asking, “how did we know that of all places in the universe, we needed to look on Bora Bora?” Now it seems that we’re being ridiculous. Surely we can take for granted the knowledge that the treasure is on the earth, right? No. Dembski is quite insistent that the zero-active-information baseline must involve no prior information whatsoever:

The “absence of any prior knowledge” required for uniformity conceptually parallels the difficulty of understanding the nothing that physics says existed before the Big Bang. It’s common to picture the universe before the Big Bang is a large black void empty space. No. This is a flawed image. Before the Big Bang there was nothing. A large black void empty space is something. So space must be purged from our visualization. Our next impulse is then, mistakenly, to say, “There was nothing. Then, all of a sudden…” No. That doesn’t work either. All of a sudden presupposes there was time and modern cosmology says that time in our universe was also created at the Big Bang. The concept of nothing must exclude conditions involving time and space. Nothing is conceptually difficult because the idea is so divorced from our experience and familiarity zones.

and further:

The “no prior knowledge” cited in Bernoulli’s PrOIR is all or nothing: we have prior knowledge about the search or we don’t. Active information on the other hand, measures the degree to which prior knowledge can contribute to the solution of a search problem.

To define a search with “no prior knowledge”, we must be careful not to constrain Ω. For example, if Ω consists of permutations, it must contain all permutations:

What search space, for instance, allows for all possible permutations? Most don’t. Yet, insofar as they don’t, it’s because they exhibit structures that constrain the permissible permutations. Such constraints, however, bespeak the addition of active information.

But even if we define Ω to include all permutations of a given ordered set, we’re still constraining Ω, as we’re excluding permutations of other ordered sets. We cannot define Ω without excluding something, so it is impossible to define a search without adding active information.

Active information is always measured relative to a baseline, and there is no baseline that we can call “absolute zero”. We therefore can attribute an arbitrarily large amount of active information to any search simply by choosing a baseline with a sufficiently large Ω.

Returning to our six-sided die example, we can take the typical definition of Ω as {1, 2, 3, 4, 5, 6} and add, say, 7 and 8 to the set. Obviously our two additional outcomes each have a probability of zero, but that’s not a problem — probability distributions often include zero-probability elements. Inclusion of these zero-probability outcomes doesn’t change the mean, median, variance, etc. of the distribution, but it does change the amount of active info from 0 to log2((1/6) / (1/8)) = 0.4 bits (given a target of rolling, say, a one).
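A one-line check of that arithmetic (my own, using only the numbers above):

    from math import log2

    # Target: rolling a one with a fair die.
    print(log2((1/6) / (1/6)))   # 0.0 bits with Omega = {1,...,6}
    print(log2((1/6) / (1/8)))   # ~0.42 bits with Omega = {1,...,8}, where 7 and 8 have probability zero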

Free Violations of the LCI

Given a chain of two searches, the LCI says that the endogenous information of the first search is at least as large as the active information of the second. Since we can model the second search to have arbitrarily large active information, we can always model it such that its active information is larger than the first search’s endogenous information. Thus any chain of searches can be shown to violate the LCI. (We can also model the first search such that its endogenous information is arbitrarily large, so any chain of searches can also be shown to obey the LCI.)

The Law(?) of Conservation of Information

(Preamble: I apologize in advance for cluttering TSZ with these three posts. There are very few people on either side of the debate that actually care about the details of this “conservation of information” stuff, but these posts make good on some claims I made at UD.)

For the past three years Dembski has been promoting his Law of Conservation of Information (LCI), most recently here. The paper he most often promotes is this one, which begins as follows:

Laws of nature are universal in scope, hold with unfailing regularity, and receive support from a wide array of facts and observations. The Law of Conservation of Information (LCI) is such a law.

Dembski hasn’t proven that the LCI is universal, and in fact he claims that it can’t be proven, but he also claims that to date it has always been confirmed. He doesn’t say whether he has actually tried to find counterexamples, but the reality is that they are trivial to come up with. This post demonstrates one very simple counterexample.

Definitions

First we need to clarify Dembski’s terminology. In his LCI math, a search is described by a probability distribution over a sample space Ω. In other words, a search is nothing more than an Ω-valued random variable. Execution of the search consists of a single query, which is simply a realization of the random variable. The search is deemed successful if the realized outcome resides in the target T ⊆ Ω. (We must be careful not to read teleology into the terms search, query, and target, despite the terms’ connotations. Obviously, Dembski’s framework must not presuppose teleology if it is to be used to detect design.)

If a search’s parameters depend on the outcome of a preceding search, then the preceding search is a search for a search. It’s this hierarchy of two searches that is the subject of the LCI, which we can state as follows.

Given a search S, we define:

  • q as the probability of S succeeding
  • p2 as the probability that S would succeed if it were a uniform distribution
  • p1 as the probability that a uniformly distributed search-for-a-search would yield a search at least as good as S

The LCI says that p1 ≤ p2/q.

Counterexample

In thinking of a counterexample to the LCI, we should remember that this two-level search hierarchy is nothing more than a chain of two random variables. (Dembski’s search hierarchy is like a Markov chain, except that each transition is from one state space to another, rather than within the same state space.) One of the simplest examples of a chain of random variables is a one-dimensional random walk. Think of a system that periodically changes state, with each state transition represented by a shift to the left or to the right on a state diagram. If we know at a certain point in time that the system is in one of, say, three states, namely n-1, n, or n+1, then after the next transition it will be in n-2, n-1, n, n+1, or n+2.

Assume that the system is always equally likely to shift left as to shift right, and let the “target” be defined as the center node n. If the state at time t is, say, n-1, then the probability of success q is 1/2. Of the three original states, two (namely n-1 and n+1) yield this probability of success, so p1 is 2/3. Finally, p2 is 1/5 since the target consists of only one of the final five states. The LCI says that p1 ≤ p2/q. Plugging in our numbers for this example, we get 2/3 ≤ (1/5)/(1/2), which is clearly false.
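These numbers can be plugged into the inequality exactly (my own check, restating the example above):

    from fractions import Fraction

    q = Fraction(1, 2)    # chance of landing on n when starting from n-1 (or n+1)
    p1 = Fraction(2, 3)   # 2 of the 3 possible starting states do at least this well
    p2 = Fraction(1, 5)   # a uniform draw over the five reachable final states hits n

    print(p1 <= p2 / q)   # False: 2/3 > 2/5, so this chain violates the LCI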

Of course, the LCI does hold under certain conditions. To show that the LCI applies to biological evolution, Dembski needs to show that his mathematical model of evolution meets those conditions. This model would necessarily include the higher-level search that gave rise to the evolutionary process. As will be shown in the next post, the good news for Dembski is that any process can be modeled such that it obeys the LCI. The bad news is that any process can also be modeled such that it violates the LCI.

Is Any Form Of Atheism Rationally Justifiable?

Definition of God: First cause, prime mover, objective source of human purpose (final cause) and resulting morality, source of free will; omnipotent, omniscient and omnipresent inasmuch as principles of logic allow. I am not talking in particular about any specifically defined religious interpretation of God, such as the Christian or Islamic god.

Definition: Intellectual dishonesty occurs when (1) one deliberately mischaracterizes their position or view in order to avoid having to logically defend their actual views; and/or (2) when someone is arguing, or making statements against, a position while remaining willfully ignorant about that position; and/or (3) when someone categorically and/or pejoratively dismisses all existent and/or potential evidence in favor of a conclusion they claim to be neutral about, whether they are familiar with that evidence or not.

Continue reading

Gil Dodgen on “Transparent Lunacy”

In a recent UD post, Gil has been more specific than he often is, so I thought I would respond here:

The resolution of the debate about the creative powers of natural selection is dead simple and utterly trivial to figure out.

  1. Natural selection throws stuff out. Throwing stuff out has no creative power.
  2. Existing biological information, mixed and matched, can be filtered by natural selection, as in sexual reproduction, but nothing inherently new is created.
  3. Random errors can produce survivability quotients, but only in circumstances in which overall functional degradation supports survival in a pathological environment (e.g., bacterial antibiotic resistance), and only given massive probabilistic resources and a few trivial mutational events capable of producing the survival advantage.
  4. Random errors are inherently entropic, and the more complex a functionally-integrated system becomes, the more destructive random errors become. Anyone with any experience in even the most elementary engineering enterprise knows this.

To his first, I cite this:

Continue reading

Cornelius Hunter vs William Dembski?

Cornelius Hunter has posted an odd argument:

Is there evidence for evolution? Sure, there is plenty of evidence for evolution. But there are significant problems with evolution. There is plenty of evidence for evolution just as there is plenty of evidence for geocentrism. But the science does not bode well for either theory.

So the evidence for evolution follows this general pattern: Even at its best, it does not prove evolution to be a fact. And furthermore, the evidence reveals substantial problems with evolution.

So how can evolutionists proclaim evolution to be a fact with such fervor? There seems to be a glaring mismatch between the evidence and the truth claims of evolutionists. The answer is that evolutionists use contrastive reasoning. Evolution is not claimed to be a fact based on how well it fits the evidence, but rather on how poorly the alternative fits the evidence. Evolution is proved by the process of elimination.

In other words, Hunter is arguing directly against Dembski:

In eliminating chance and inferring design, specified complexity is not party to an argument from ignorance. Rather, it is underwriting an eliminative induction. Eliminative inductions argue for the truth of a proposition by actively refuting its competitors (and not, as in arguments from ignorance, by noting that the proposition has yet to be refuted)

Continue reading

Is purpose necessary to acquire any apparently purposeful effects?

For the purposes of this discussion:

Chance = non-teleological causes that happen to result in particular effects via regularities referred to as “lawful” and stochastic in nature.

Purpose = teleological causes that are intended to result in particular effects; the organization of causes towards a pre-defined future goal.

My question is: can chance causes generate all of the effects normally associated with purpose, but without purpose? IOW, is purpose necessary to produce all, most, or some apparently purposeful effects, or is purpose, in effect, only an associated sensation, by-product, or side-effect that isn’t necessary to the generation of any particular effect normally associated with it?

The LCI and Bertrand’s Box

Tom English has recommended that we read Dembski and Marks’ paper on their Law of Conservation of Information (not to be confused with Dembski’s previous LCI from his book No Free Lunch). Dembski also has touted the paper several times, and I too recommend it as a stark display of the authors’ thinking.

Most people won’t take the time to carefully read a 34-page paper, but I submit that the authors’ core concept of “conservation of information” is very easily understood if we avoid equivocal and misleading terms such as information, search, and target. I’ll illustrate it with a setup borrowed from Joseph Bertrand.

The “Bertrand’s box” scenario is as follows: We’re presented with three small outwardly identical boxes, each containing two coins. One has two silver coins, one has two gold coins, and one has a silver coin and a gold coin. We’ll call the boxes SS, GG, and SG. We are to randomly choose a box, and then randomly pull a coin from the chosen box.

Continue reading

More on Marks, Dembski, and No Free Lunch, by Tom English

Tom English has a great post at his blog, Bounded Science, which I have his permission to cross post here:

Bob Marks grossly misunderstands “no free lunch”

And so does Bill Dembski. But it is Marks who, in a “Darwin or Design?” interview, reveals plainly the fallacy at the core of his and Dembski’s notion of “active information.” (He gets going at 7:50. To select a time, it’s best to put the player in full-screen mode. I’ve corrected slips of the tongue in my transcript.)

[The “no free lunch” theorem of Wolpert and Macready] said that with a lack of any knowledge about anything, that one search was as good as any other search. [14:15]And what Wolpert and Macready said was, my goodness, none of these [“search”] algorithms work as well as [better than] any other one, on the average, if you have no idea what you’re doing. And so the question is… and what we’ve done here is, if indeed that is true, and an algorithm works, then that means information has been added to the search. And what we’ve been able to do is take this baseline, that all searches are the same, and we’ve been able to, in cases where searches work, measure the information that is placed into the algorithm in bits. And we have looked at some of the evolutionary algorithms, and we found out that, strikingly, they are not responsible for any creation of information. [14:40]

And according to “no free lunch” theorems, astonishingly, any search, without information about the problem that you’re looking for, will operate at the same level as blind search.” And that’s… It’s a mind-boggling result. [28:10]

Bob has read into the “no free lunch” (NFL) theorem what he believed in the first place, namely that if something works, it must have been designed to do so. Although he gets off to a good start by referring to the subjective state of the practitioner (“with a lack of knowledge,” “if you have no idea what you’re doing”), he errs catastrophically by making a claim about the objective state of affairs (“one search is as good as any other search,” “all searches are the same”).

Continue reading

A Few Comments on A Vivisection of the ev Computer Organism

I’ll follow Patrick’s lead and offer a few comments on another paper from the Evolutionary Informatics Lab. The paper analyzes Tom Schneider’s ev program, and while there are several problems with the analysis, I’ll focus on the first two sentences of the conclusions:

The success of ev is largely due to active information introduced by the Hamming oracle and from the perceptron structure. It is not due to the evolutionary algorithm used to perform the search.

To explain the authors’ terminology, active information is defined quantitatively as a measure of relative search performance — to say that something provides N bits of active information is to say that it increases the probability of success by a factor of 2^N. The Hamming oracle is a function that reports the Hamming distance between its input and a fixed target. The perceptron structure is another function whose details aren’t important to this post. Figure 1 shows how these three components are connected in an iterative feedback loop.
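For readers unfamiliar with the terminology, here is a minimal sketch of a Hamming oracle (my own toy version with a made-up binary target, not the ev or EIL code):

    # Hypothetical fixed binary target, purely for illustration.
    TARGET = (1, 0, 1, 1, 0, 0, 1, 0)

    def hamming_oracle(query):
        # Report only the number of positions at which the query differs from
        # the fixed target.
        return sum(a != b for a, b in zip(query, TARGET))

    print(hamming_oracle((1, 1, 1, 1, 0, 0, 0, 0)))   # 2 mismatches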

Continue reading

Review: Climbing the Steiner Tree

Overview

Winston Ewert, William Dembski, and Robert J. Marks II have made available a paper titled Climbing the Steiner Tree — Sources of Active Information in a Genetic Algorithm for Solving the Euclidean Steiner Tree Problem wherein they claim to have identified sources of “active information” in a genetic algorithm used to find solutions to a Steiner problem. The problem referenced in this paper was originally described by Dave Thomas on the Panda’s Thumb almost six years ago. I developed a GA to solve the problem that Dave posed as a challenge a few weeks later, as did a number of other people.

This paper suffers from numerous flaws, starting with a fundamental mischaracterization of the purpose of Thomas’s solution and including a general misunderstanding of genetic algorithms, misapplication of the No Free Lunch theorems, spurious claims about “active information”, and incorrect and unsupported assertions regarding the impact of certain GA implementation details.
Continue reading

Semiotic theory of ID

Upright BiPed has been proposing, for a while now, what he calls a “semiotic” theory of Intelligent Design, which I have found confusing, to say the least. However, he is honing his case, and asks Nick Matzke

…these three pertinent questions regarding the existence of information within a material universe:

  1. In this material universe, is it even conceivably possible to record transferable information without utilizing an arrangement of matter in order to represent that information? (by what other means could it be done?)
  2. If 1 is true, then is it even conceivably possible to transfer that information without a second arrangement of matter (a protocol) to establish the relationship between representation and what it represents? (how could such a relationship be established in any other way?)
  3. If 1 and 2 are true, then is it even conceivably possible to functionally transfer information without the irreducibly complex system of these two arrangements of matter (representations and protocols) in operation?

… which I think clarify things a little.

I think I can answer them, but would anyone else like to have a go? (I’m out all day today).

A Second Look at the Second Law…

…is the title of Granville Sewell’s manuscript that almost got published in Applied Mathematics Letters last year. It was withdrawn at the last minute by the editor, but you can still download the manuscript from Sewell’s web page. The purpose of this thread is to discuss the technical merits of Sewell’s arguments.

Continue reading

No Free Lunch

My husband, my mother, my father, my four-year-old son, and I were going out for a walk.  It was raining. My son refused (as usual) to wear his raincoat.  Instead, he carried a cup, which he held out in front of him.  He argued that he was going to catch the raindrops in the cup so that by the time he got to the place the raindrops had been, they’d be in the cup and he’d be dry. Half an hour later, four adults were still standing around, drawing diagrams on the backs of envelopes, arguing about Pythagoras and trigonometry, all to no avail.  We went out, with cup, sans raincoat.  My son got wet.  He insisted he remained dry.


I’ve got as far as Chapter 5 of Dembski’s book No Free Lunch, the chapter called Evolutionary Algorithms, about which he says in his Preface: “This chapter is the climax of the book”.  He claims that in it he shows that “An elementary combinatorial analysis shows that evolutionary algorithms can no more generate specified complexity than can five letters fill ten mailboxes.”

I think he’s making the same kind of error as my son made.

Continue reading

Intelligence and Design.

My copy of No Free Lunch arrived a few days ago, and there are a couple of posts I want to make about it, but the first thing that struck me, reading the preface, and not for the first time, is how little Dembski (and other Intelligent Design proponents) seem to know about either Intelligence or Design.

As it happens, I have a relevant background in both.  I’m a cognitive scientist, and I came into cognitive science from a background in educational psychology, so I’ve always been interested in intelligence – how it works, how it is measured, what factors affect it, etc.  And, somewhat unusually for a cognitive scientist, I also have a training in design – I trained as an architect, a design training that is specifically focussed on “problem solving”, but I also applied that training to other “design” modalities, including composing music, and writing children’s books that attempted to explain something, both to commission, and therefore with a “design brief”.

And in both areas, what is abundantly clear, is that learning is critical.

Continue reading