Does gpuccio’s argument that 500 bits of Functional Information implies Design work?

On Uncommon Descent, poster gpuccio has been discussing “functional information”. Most of gpuccio’s argument is a conventional “islands of function” argument. Not being very knowledgeable about biochemistry, I’ll happily leave that argument to others.

But I have been intrigued by gpuccio’s use of Functional Information, in particular gpuccio’s assertion that observing 500 bits of it is a reliable indicator of Design, as here, at about the 11th sentence of point (a):

… the idea is that if we observe any object that exhibits complex functional information (for example, more than 500 bits of functional information ) for an explicitly defined function (whatever it is) we can safely infer design.
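To make the 500-bit threshold concrete, here is a minimal sketch of the usual Hazen-Szostak definition, in which functional information is minus the base-2 logarithm of the fraction of sequences that perform the function at the required level. That this is the definition gpuccio intends is my assumption, and the function names are purely illustrative.

```python
import math


def functional_information(functional_fraction):
    """Bits of functional information, given the fraction of all
    sequences that carry out the function (Hazen-Szostak definition)."""
    if not 0 < functional_fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log2(functional_fraction)


def fraction_for_bits(bits):
    """Fraction of sequences that corresponds to a given number of bits."""
    return 2.0 ** -bits


if __name__ == "__main__":
    # One functional sequence in 1024 is 10 bits.
    print(functional_information(1 / 1024))
    # 500 bits corresponds to a fraction of about 3e-151.
    print(fraction_for_bits(500))
```

On this reading, "500 bits" is a statement about how rare the functional sequences are among all sequences, and nothing more.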

I wonder how this general method works. As far as I can see, it doesn’t work. There would seem to be three possible ways of arguing for it, and in the end two don’t work and one is just plain silly. Which of these is the basis for gpuccio’s statement? Let’s investigate …

Biological Information

  1. ‘Information’, ‘data’ and ‘media’ are distinct concepts. Media is the mechanical support for data and can be any material including DNA and RNA in biology. Data are the symbols that carry information and are stored and transmitted on the media. ACGT nucleotides forming strands of DNA are biological data. Information is an entity that answers a question and is represented by data encoded on a particular medium. Information is always created by an intelligent agent and used by the same or another intelligent agent. Interpreting the data to extract information requires a deciphering key such as a language. For example, proteins are made of amino acids selected based on a translation table (the deciphering key) from nucleotides.

Evolutionary Informatics catches modelers doing modeling

As Tom English and others have discussed previously, there was a book published last year called Introduction to Evolutionary Informatics, the authors of which are Marks, Dembski, and Ewert.

The main point of the book is stated as:

Indeed, all current models of evolution require information from an external designer to work.

(The “external designer” they are talking about is the modeler who created the model.)

Another way they state their position:

We show repeatedly that the proposed models all require inclusion of significant knowledge about the problem being solved.

Somehow, they think it needs to be shown that modelers put information and knowledge into their models. This displays a fundamental misunderstanding of models and modeling.

It is a simple fact that a model of any kind, in its entirety, comes from a modeler. Any information in the model, however one defines information, is put in the model by the modeler. All structures and behaviors of any model are results of modeling decisions made by the modeler. Models are the modelers’ conceptions of reality. It is expected that modelers will add the best information they think they have in order to make their models realistic. Why wouldn’t they? For people who actually build and use models, like engineers and scientists, the main issue is realism.

To see a good presentation on the fundamentals of modeling, I recommend the videos and handbooks available free online from the Society for Industrial and Applied Mathematics (SIAM).

For a good discussion on what it really means for a model to “work,” I recommend a paper called “Concepts of Model Verification and Validation”, which was put out by Los Alamos National Laboratory.

Prof. Marks gets lucky at Cracker Barrel

Introduction to Evolutionary Informatics, by Robert J. Marks II, the “Charles Darwin of Intelligent Design”; William A. Dembski, the “Isaac Newton of Information Theory”; and Winston Ewert, the “Charles Ingram of Active Information.” World Scientific, 332 pages.

Yesterday, I looked again through “Introduction to Evolutionary Informatics” and spotted the Cracker Barrel puzzle in the section “Endogenous information of the Cracker Barrel puzzle” (p. 128). The rules of this variant of triangular peg solitaire are described in the text (or can be found in Wikipedia’s article on the subject).

The humble authors¹ then describe a simulation of the game to calculate how probable it is to solve the puzzle using moves at random:
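For readers who want to try this themselves, a random-play simulation might look like the following sketch. It is not the authors' code. It assumes the standard 15-hole triangular board, the apex hole empty at the start, and each move drawn uniformly from the currently legal jumps; the variant in the book may differ on any of these. The last function takes endogenous information to be minus log base 2 of the success probability of blind play, which is how I understand the book's usage. All names are illustrative.

```python
import math
import random

ROWS = 5
# Holes are addressed as (row, column) with 0 <= column <= row.
HOLES = [(r, c) for r in range(ROWS) for c in range(r + 1)]
HOLE_SET = frozenset(HOLES)
# The six neighbour directions on a triangular board in this addressing.
DIRECTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1)]


def legal_moves(pegs):
    """List all jumps (source, jumped-over, destination) for a set of pegs."""
    moves = []
    for r, c in sorted(pegs):
        for dr, dc in DIRECTIONS:
            over = (r + dr, c + dc)
            dest = (r + 2 * dr, c + 2 * dc)
            if over in pegs and dest in HOLE_SET and dest not in pegs:
                moves.append(((r, c), over, dest))
    return moves


def play_random_game(rng, empty=(0, 0)):
    """Play uniformly random legal moves until stuck; return pegs left."""
    pegs = set(HOLES) - {empty}
    while True:
        moves = legal_moves(pegs)
        if not moves:
            return len(pegs)
        src, over, dest = rng.choice(moves)
        pegs.remove(src)
        pegs.remove(over)
        pegs.add(dest)


def estimate_win_probability(trials, seed=0):
    """Monte Carlo estimate of the chance of finishing with one peg."""
    rng = random.Random(seed)
    wins = sum(play_random_game(rng) == 1 for _ in range(trials))
    return wins / trials


def endogenous_information(p):
    """-log2(p) in bits, for a success probability p in (0, 1]."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)
```

Running `estimate_win_probability` with a large number of trials and feeding a non-zero result to `endogenous_information` gives a figure to compare against the book's; more trials tighten the estimate.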

Evolution and Functional Information

Here, one of my brilliant MD PhD students and I study one of the “information” arguments against evolution. What do you think of our study?

I recently put this preprint on bioRxiv. To be clear, this study is not yet peer-reviewed, and I do not want anyone to miss this point. This is an “experiment” too. I’m curious to see if these types of studies are publishable. If they are, you might see more from me. Currently it is under review at a very good journal. So it might actually turn the corner and get out there. And a parallel question: do you think this type of work should be published?


I’m curious what the community thinks. I hope it is clear enough for non-experts to follow too. We went to great lengths to make the source code for the simulations available in an easy-to-read, annotated format. My hope is that a college-level student could follow the details. And even if you can’t, you can weigh in on whether the scientific community should publish this type of work.

Functional Information and Evolution

“Functional Information”—estimated from the mutual information of protein sequence alignments—has been proposed as a reliable way of estimating the number of proteins with a specified function and the consequent difficulty of evolving a new function. The fantastic rarity of functional proteins computed by this approach emboldens some to argue that evolution is impossible. Random searches, it seems, would have no hope of finding new functions. Here, we use simulations to demonstrate that sequence alignments are a poor estimate of functional information. The mutual information of sequence alignments fantastically underestimates the true number of functional proteins. In addition to functional constraints, mutual information is also strongly influenced by a family’s history, mutational bias, and selection. Regardless, even if functional information could be reliably calculated, it tells us nothing about the difficulty of evolving new functions, because it does not estimate the distance between a new function and existing functions. Moreover, the pervasive observation of multifunctional proteins suggests that functions are actually very close to one another and abundant. Multifunctional proteins would be impossible if the FI argument against evolution were true.

True or false? Log-improbability is Shannon information

True or false? If p is the probability of an event, then the Shannon information of the event is -\!\log_2 p bits.

I’m quite interested in knowing what you believe, and why you believe it, even if you cannot justify your belief formally.

Formal version. Let (\Omega, 2^\Omega, P) be a discrete probability space with P(\Omega) = 1, and let event E be an arbitrary subset of \Omega. Is it the case that in Shannon’s mathematical theory of communication, the self-information of the event is equal to -\!\log_2 P(E) bits?

Dice Entropy – A Programming Challenge

Given the importance of information theory to some intelligent design arguments I thought it might be nice to have a toolkit of some basic functions related to the sorts of calculations associated with information theory, regardless of which side of the debate one is on.

What would those functions consist of?
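As one possible starting point, here is a small sketch of what such a toolkit might contain: the log-improbability of a single outcome, the Shannon entropy of a distribution, and the distribution of the sum of several fair dice. The function names and the choice of bits as the unit are my own assumptions, not part of the challenge.

```python
import math
from collections import Counter


def surprisal(p):
    """Log-improbability, -log2(p), in bits, of an outcome with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution.

    Zero-probability outcomes contribute nothing to the sum.
    """
    probs = list(probs)
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * surprisal(p) for p in probs if p > 0)


def dice_sum_distribution(n_dice, sides=6):
    """Map each possible total of n fair dice to its probability."""
    counts = Counter({0: 1})
    for _ in range(n_dice):
        nxt = Counter()
        for total, ways in counts.items():
            for face in range(1, sides + 1):
                nxt[total + face] += ways
        counts = nxt
    total_ways = sides ** n_dice
    return {s: w / total_ways for s, w in sorted(counts.items())}


if __name__ == "__main__":
    # One fair die: entropy is log2(6) bits.
    print(entropy([1 / 6] * 6))
    # Sum of two dice: lower than 2 * log2(6), because totals collide.
    print(entropy(dice_sum_distribution(2).values()))
```

A single fair die has entropy log2(6) bits, while the sum of two dice has less than twice that, since different rolls can give the same total.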

Boltzmann Brains and evolution

In the “Elon Musk” discussion, in the midst of a whole lotta epistemology goin’ on, commenter BruceS referred to the concept of a “Boltzmann Brain” and suggested that Boltzmann didn’t know about evolution. (In fact Boltzmann did know about evolution and thought Darwin’s work was hugely important). The Boltzmann Brain is a thought experiment about a conscious brain arising in a thermodynamic system which is at equilibrium. Such a thing is interesting but vastly improbable.

BruceS explained that he was thinking of a reddit post where the commenter invoked evolution to explain why we don’t need extremely improbable events to explain the existence of our brains (the comment will be found here).

What needs to be added is that all that does not happen in an isolated system at thermodynamic equilibrium, or at least it has a fantastically low probability of happening there. The earth-sun system is not at thermodynamic equilibrium. Energy is flowing outwards from the sun, at high temperature, some is hitting the earth, and some is taken up by plants and then some by animals, at lower temperatures.

The Real EleP(T|H)ant in the Room

TSZ has made much ado about P(T|H), a conditional probability based on a materialistic hypothesis. They don’t seem to realize that H pertains to their position, and that if H cannot be had, their position is untestable. The only reason the conditional probability exists in the first place is that the claims of evolutionists cannot be directly tested in a lab. If their claims could be directly tested then there wouldn’t be any need for a conditional probability.

If P(T|H) cannot be calculated it is due to the failure of evolutionists to provide H and their failure to find experimental evidence to support their claims.

I know what the complaints are going to be: “It is Dembski’s metric.” Yet it is in relation to your position, and it wouldn’t exist if you actually had something that could be scientifically tested.



Philosophy and Complexity of Rube Goldberg Machines

Michael Behe is best known for coining the phrase Irreducible Complexity, but I think his likening of biological systems to Rube Goldberg machines is a better way to frame the problem of evolving the black boxes and the other extravagances of the biological world.

Intention, Intelligence and Teleology

On the left is a photograph of a real snowflake. Most people would agree that it was not created intentionally, except possibly in the rather esoteric sense of being the foreseen result of the properties of water molecules in an intentionally designed universe in which water molecules were designed to have those properties. But I think most people here, ID proponents and ID critics alike, would consider that the “design” (in the sense of “pattern”) of this snowflake is neither random nor teleological. Nor, however, is it predictable in detail. Famously “no two snowflakes are alike”, yet all snowflakes have six-fold rotational symmetry. They are, to put it another way, the products of both “law” (the natural law that governs the crystalisation of water molecules) and “chance” (stochastic variation in humidity and temperature that affects the rate of growth of each arm of the crystal as it grows). We need not, to continue in Dembski’s “Explanatory Filter” framework, infer “Design”.

The Myth of Biosemiotics

I recently came across this book:

Biosemiotics: Information, Codes and Signs in Living Systems

This new book presents contexts and associations of the semiotic view in biology, by making a short review of the history of the trends and ideas of biosemiotics, or semiotic biology, in parallel with theoretical biology. Biosemiotics can be defined as the science of signs in living systems. A principal and distinctive characteristic of semiotic biology lies in the understanding that in living, entities do not interact like mechanical bodies, but rather as messages, the pieces of text. This means that the whole determinism is of another type.

Pardon my skepticism, but

  1. There is no information in living systems.
  2. There are no codes in living systems.
  3. There are no signs in living systems.

Biosemiotics is the study of things that just don’t exist. Theology for biologists.

What A Code Is – Code Denialism Part 3

My intent here in these recent posts on the genetic code has been to expose the absurdity of Code Denialism. The intent has not been to make the case for intelligent design based upon the existence of biological codes. I know some people find that disconcerting but that would be putting the cart before the horse. No one is going to accept a conclusion when they deny the premise. And please forgive me if I choose not to play the game of “let’s pretend it really is a code” while you continue to deny that it actually is a code.

First I’d like to thank you. It’s actually been pretty neat looking up and reading many of these resources in my attempt to see whether I could defend the thesis that the genetic code is a real code. I admit it’s also been much too much fun digging up all the reasons why code denialism is just plain silly (and irrational).

That the genetic code is a code is common usage and if “meaning is use” that alone ought to settle the matter. But this is “The Skeptical Zone” and Code Denialism is strong here. But I’m not just claiming that it’s a code because we say it’s a code in common usage. I’m claiming it is a code because it meets the definition of a code. The reason we say it is a code is because it is in fact a code.

My first two posts have been on some of the major players and how they understood they were dealing with a code and how that guided their research. I’ll have more to say on that in the future as it’s a fascinating story. But for now …

What A Code Is

Repetitive DNA and ENCODE

[Here is something I just sent Casey Luskin and friends regarding the ENCODE 2015 conference. Some editorial changes to protect the guilty…]

One thing the ENCODE consortium drove home is that DNA acts like a Dynamic Random Access Memory for methylation marks. That is to say, just as a computer’s RAM can have its electronic state modified without being physically replaced, DNA can have its methylation state modified without its sequence being changed. The repetitive DNA acts like physical hardware, so even if the repetitive sequences aren’t changed, they can still act as memory storage devices for regulatory information. ENCODE collects huge amounts of data on methylation marks during various stages of the cell. This is like trying to take a few snapshots of a computer memory to figure out how Windows 8 works. The complexity of the task is beyond description.

The Sugar Code and other -omics

[Thank you to Elizabeth Liddle, the admins and the mods for hosting this discussion.]

I’ve long suspected the 3.1 to 3.5 gigabases of human DNA (which, at 2 bits per base, equates to roughly 775 to 875 megabytes) is woefully insufficient to create something as complex as a human being. The problem is there is only limited transgenerational epigenetic inheritance, so it’s hard to assert large amounts of information are stored outside the DNA.
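The gigabase-to-megabyte conversion can be checked with a few lines. This is only a sketch of the arithmetic: it assumes 2 bits per base (four possible nucleotides, no compression) and decimal megabytes, and the function name is illustrative.

```python
BITS_PER_BASE = 2  # assumption: 4 nucleotides -> log2(4) = 2 bits each


def genome_megabytes(bases, bits_per_base=BITS_PER_BASE):
    """Uncompressed size of a sequence in decimal megabytes (10**6 bytes)."""
    return bases * bits_per_base / 8 / 1_000_000


if __name__ == "__main__":
    for bases in (3_100_000_000, 3_500_000_000):
        print(f"{bases / 1e9:.1f} gigabases -> {genome_megabytes(bases):.0f} MB")
```

Under these assumptions 3.1 gigabases comes to 775 MB and 3.5 gigabases to 875 MB.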

Further, the question arises of how this non-DNA information is stored, since it’s not easy to localize. In fact, if there is a large amount of information outside the DNA, it is in a form that is NOT localizable, but distributed and so deeply redundant that it provides the ability to self-heal and self-correct for injury and error. If so, in a sense, damage and changes to this information-bearing system are not very heritable, since bad variation in the non-DNA information source can get repaired and reset; otherwise the organism just dies. In that sense the organism is fundamentally immutable as a form, suggestive of a created kind rather than something that can evolve in the macro-evolutionary sense.

CSI-free Explanatory Filter…

…Gap Highlighter, Design Conjecture

Though I’ve continued to endear myself to the YEC community, I’ve certainly made myself odious in certain ID circles. I’ve often been the lone ID proponent to vociferously protest cumbersome, ill-conceived, ill-advised, confusing and downright wrong claims by some ID proponents. Some of the stuff said by ID proponents is of such poor quality that it is practically a gift to Charles Darwin. I teach ID to university science students in extracurricular classes, and some of the stuff floating around in ID internet circles I’d never touch because it would cause my students to impale themselves intellectually.
