Evolution and Functional Information

Here, one of my brilliant MD PhD students and I study one of the “information” arguments against evolution. What do you think of our study?

I recently posted this preprint on bioRxiv. To be clear, this study is not yet peer-reviewed, and I do not want anyone to miss this point. This is an “experiment” too. I’m curious to see if these types of studies are publishable. If they are, you might see more from me. Currently it is under review at a very good journal. So it might actually turn the corner and get out there. And a parallel question: do you think this type of work should be published?

I’m curious what the community thinks. I hope it is clear enough for non-experts to follow too. We went to great lengths to make the source code for the simulations available in an easy-to-read, annotated format. My hope is that a college-level student could follow the details. And even if you can’t, you can weigh in on whether the scientific community should publish this type of work.

Functional Information and Evolution

http://www.biorxiv.org/content/early/2017/03/06/114132

“Functional Information,” estimated from the mutual information of protein sequence alignments, has been proposed as a reliable way of estimating the number of proteins with a specified function and the consequent difficulty of evolving a new function. The fantastic rarity of functional proteins computed by this approach emboldens some to argue that evolution is impossible. Random searches, it seems, would have no hope of finding new functions. Here, we use simulations to demonstrate that sequence alignments are a poor estimate of functional information. The mutual information of sequence alignments fantastically underestimates the true number of functional proteins. In addition to functional constraints, mutual information is also strongly influenced by a family’s history, mutational bias, and selection. Regardless, even if functional information could be reliably calculated, it tells us nothing about the difficulty of evolving new functions, because it does not estimate the distance between a new function and existing functions. Moreover, the pervasive observation of multifunctional proteins suggests that functions are actually very close to one another and abundant. Multifunctional proteins would be impossible if the FI argument against evolution were true.

True or false? Log-improbability is Shannon information

True or false? If p is the probability of an event, then the Shannon information of the event is -\log_2 p bits.

I’m quite interested in knowing what you believe, and why you believe it, even if you cannot justify your belief formally.

Formal version. Let (\Omega, 2^\Omega, P) be a discrete probability space with P(\Omega) = 1, and let event E be an arbitrary subset of \Omega. Is it the case that in Shannon’s mathematical theory of communication, the self-information of the event is equal to -\log_2 P(E) bits?

Dice Entropy – A Programming Challenge

Given the importance of information theory to some intelligent design arguments I thought it might be nice to have a toolkit of some basic functions related to the sorts of calculations associated with information theory, regardless of which side of the debate one is on.

What would those functions consist of?
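One possible starting point, sketched in Python. The function names (`self_information`, `shannon_entropy`, `sum_distribution`) are placeholders chosen for this illustration, not anything taken from the post, and the dice are assumed to be fair and independent. The sketch covers only surprisal in bits, entropy of a discrete distribution, and the distribution of the total shown by several dice.

```python
import math
from collections import Counter


def self_information(p):
    """Surprisal, in bits, of an outcome that has probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    return math.log2(1 / p)


def shannon_entropy(probabilities):
    """Expected surprisal, in bits, of a discrete distribution."""
    probs = list(probabilities)
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with probability 0 contribute nothing to the entropy.
    return sum(p * self_information(p) for p in probs if p > 0)


def sum_distribution(n_dice, sides=6):
    """Distribution of the total shown by n_dice fair dice (assumed fair)."""
    ways = Counter({0: 1})
    for _ in range(n_dice):
        step = Counter()
        for total, count in ways.items():
            for face in range(1, sides + 1):
                step[total + face] += count
        ways = step
    n_outcomes = sides ** n_dice
    return {total: count / n_outcomes for total, count in sorted(ways.items())}


if __name__ == "__main__":
    one_die = [1 / 6] * 6
    print("one die:", shannon_entropy(one_die), "bits")
    two_dice = sum_distribution(2)
    print("sum of two dice:", shannon_entropy(two_dice.values()), "bits")
```

The entropy of the sum of two dice comes out lower than twice the entropy of one die, because reporting only the total discards which die showed which face.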

Continue reading

Boltzmann Brains and evolution

In the “Elon Musk” discussion, in the midst of a whole lotta epistemology goin’ on, commenter BruceS referred to the concept of a “Boltzmann Brain” and suggested that Boltzmann didn’t know about evolution. (In fact Boltzmann did know about evolution and thought Darwin’s work was hugely important.) The Boltzmann Brain is a thought experiment about a conscious brain arising in a thermodynamic system which is at equilibrium. Such a thing is interesting but vastly improbable.

BruceS explained that he was thinking of a reddit post where the commenter invoked evolution to explain why we don’t need extremely improbable events to explain the existence of our brains (the comment will be found here).

What needs to be added is that all that does not happen in an isolated system at thermodynamic equilibrium, or at least it has a fantastically low probability of happening there. The earth-sun system is not at thermodynamic equilibrium. Energy is flowing outwards from the sun, at high temperature, some is hitting the earth, and some is taken up by plants and then some by animals, at lower temperatures.
Continue reading

The Real EleP(T|H)ant in the Room

TSZ has made much ado about P(T|H), a conditional probability based on a materialistic hypothesis. They don’t seem to realize that H pertains to their position, and that if H cannot be had, their position is untestable. The only reason the conditional probability exists in the first place is that the claims of evolutionists cannot be directly tested in a lab. If their claims could be directly tested then there wouldn’t be any need for a conditional probability.

If P(T|H) cannot be calculated it is due to the failure of evolutionists to provide H and their failure to find experimental evidence to support their claims.

I know what the complaints are going to be: “It is Dembski’s metric.” Yet it is in relation to your position, and it wouldn’t exist if you actually had something that could be scientifically tested.

Philosophy and Complexity of Rube Goldberg Machines

Michael Behe is best known for coining the phrase Irreducible Complexity, but I think his likening of biological systems to Rube Goldberg machines is a better way to frame the problem of evolving the black boxes and the other extravagances of the biological world.
Continue reading

Intention, Intelligence and Teleology

On the left is a photograph of a real snowflake. Most people would agree that it was not created intentionally, except possibly in the rather esoteric sense of being the foreseen result of the properties of water molecules in an intentionally designed universe in which water molecules were designed to have those properties. But I think most people here, ID proponents and ID critics alike, would consider that the “design” (in the sense of “pattern”) of this snowflake is neither random nor teleological. Nor, however, is it predictable in detail. Famously “no two snowflakes are alike”, yet all snowflakes have six-fold rotational symmetry. They are, to put it another way, the products of both “law” (the natural law that governs the crystalisation of water molecules) and “chance” (stochastic variation in humidity and temperature that affects the rate of growth of each arm of the crystal as it grows). We need not, to continue in Dembski’s “Explanatory Filter” framework, infer “Design”.

Continue reading

The Myth of Biosemiotics

I recently came across this book:

Biosemiotics: Information, Codes and Signs in Living Systems

This new book presents contexts and associations of the semiotic view in biology, by making a short review of the history of the trends and ideas of biosemiotics, or semiotic biology, in parallel with theoretical biology. Biosemiotics can be defined as the science of signs in living systems. A principal and distinctive characteristic of semiotic biology lies in the understanding that in living, entities do not interact like mechanical bodies, but rather as messages, the pieces of text. This means that the whole determinism is of another type.

Pardon my skepticism, but

  1. There is no information in living systems.
  2. There are no codes in living systems.
  3. There are no signs in living systems.

Biosemiotics is the study of things that just don’t exist. Theology for biologists.

Continue reading

What A Code Is – Code Denialism Part 3

My intent here in these recent posts on the genetic code has been to expose the absurdity of Code Denialism. The intent has not been to make the case for intelligent design based upon the existence of biological codes. I know some people find that disconcerting but that would be putting the cart before the horse. No one is going to accept a conclusion when they deny the premise. And please forgive me if I choose not to play the game of “let’s pretend it really is a code” while you continue to deny that it actually is a code.

First I’d like to thank you. It’s actually been pretty neat looking up and reading many of these resources in my attempt to see whether I could defend the thesis that the genetic code is a real code. I admit it’s also been much too much fun digging up all the reasons why code denialism is just plain silly (and irrational).

That the genetic code is a code is common usage and if “meaning is use” that alone ought to settle the matter. But this is “The Skeptical Zone” and Code Denialism is strong here. But I’m not just claiming that it’s a code because we say it’s a code in common usage. I’m claiming it is a code because it meets the definition of a code. The reason we say it is a code is because it is in fact a code.

My first two posts have been on some of the major players and how they understood they were dealing with a code and how that guided their research. I’ll have more to say on that in the future as it’s a fascinating story. But for now …

What A Code Is

Continue reading

Repetitive DNA and ENCODE

[Here is something I just sent Casey Luskin and friends regarding the ENCODE 2015 conference. Some editorial changes to protect the guilty…]

One thing the ENCODE consortium drove home is that DNA acts like a Dynamic Random Access Memory for methylation marks. That is to say, even though the DNA sequence isn’t changed, its state can be modified, just as computer RAM isn’t physically removed when its electronic state is modified. The repetitive DNA acts like physical hardware, so even if the repetitive sequences aren’t changed, they can still act as memory storage devices for regulatory information. ENCODE collects huge amounts of data on methylation marks during various stages of the cell. This is like trying to take a few snapshots of a computer memory to figure out how Windows 8 works. The complexity of the task is beyond description.
Continue reading

The Sugar Code and other -omics

[Thank you to Elizabeth Liddle, the admins and the mods for hosting this discussion.]

I’ve long suspected the 3.1 to 3.5 gigabases of human DNA (which equates to roughly 750 to 875 megabytes) is woefully insufficient to create something as complex as a human being. The problem is there is only limited transgenerational epigenetic inheritance so it’s hard to assert large amounts of information are stored outside the DNA.
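The conversion from bases to bytes can be checked with a short Python sketch. It assumes 2 bits per base (four equally likely letters, no compression) and decimal megabytes of 10^6 bytes; the function name `genome_megabytes` is a placeholder for this illustration. Under those assumptions 3.5 gigabases comes to 875 MB, while 3.1 gigabases comes to about 775 MB, and 750 MB corresponds to an even 3.0 gigabases.

```python
BITS_PER_BASE = 2  # assumption: A, C, G, T treated as equally likely, uncompressed


def genome_megabytes(gigabases, bits_per_base=BITS_PER_BASE):
    """Raw storage, in decimal megabytes, for a genome of the given size."""
    bits = gigabases * 1e9 * bits_per_base
    return bits / 8 / 1e6  # 8 bits per byte, 10^6 bytes per megabyte


for gigabases in (3.0, 3.1, 3.5):
    print(f"{gigabases} gigabases is about {genome_megabytes(gigabases):.0f} MB")
```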

Further, the question arises of how this non-DNA information is stored, since it’s not easy to localize. In fact, if there is a large amount of information outside the DNA, it is in a form that is NOT localizable, but distributed and so deeply redundant that it provides the ability to self-heal and self-correct for injury and error. If so, in a sense, damage and changes to this information-bearing system are not very heritable, since bad variation in the non-DNA information source can get repaired and reset; otherwise the organism just dies. In that sense the organism is fundamentally immutable as a form, suggestive of a created kind rather than something that can evolve in the macro-evolutionary sense.
Continue reading

CSI-free Explanatory Filter…

…Gap Highlighter, Design Conjecture

Though I’ve continued to endear myself to the YEC community, I’ve certainly made myself odious in certain ID circles. I’ve often been the lone ID proponent to vociferously protest cumbersome, ill-conceived, ill-advised, confusing and downright wrong claims by some ID proponents. Some of the stuff said by ID proponents is of such poor quality that it is practically a gift to Charles Darwin. I teach ID to university science students in extracurricular classes, and some of the stuff floating around in ID internet circles I’d never touch because it would cause my students to impale themselves intellectually.
Continue reading

Good UD post

Good guest post at Uncommon Descent by Aurelio Smith,

SIGNAL TO NOISE: A CRITICAL ANALYSIS OF ACTIVE INFORMATION

For those who prefer to comment here, this is your thread!

For me, the argument by Ewert, Dembski and Marks reminds me of poor old Zeno and his paradox. They’ve over-thought the problem and come to a conclusion that appears mathematically valid, but actually makes no sense. Trying to figure out just the manner in which it makes no sense isn’t that easy, though I don’t think we need to invent the equivalent of differential calculus to solve it in this case. I think it’s a simple case of picking the wrong model. Evolution is not a search for anything, and information is not the same as [im]probability, whether you take log2 of it or not. Which means that you don’t need to add Active Information to an Evolutionary Search in order to find a Target, because there’s no Target, no search, and the Active Information is simply the increased probability of solving a problem if you have some sort of feedback for each attempt, and partial solutions are moderately similar to better ones.

Enjoy!

2LOT and ID entropy calculations (editorial corrections welcome)

Some may have wondered why I (a creationist) have taken the side of the ID-haters with regard to the 2nd law. It is because I am concerned for the ability of college science students in the disciplines of physics, chemistry and engineering to understand the 2nd law. The calculations I’ve provided are textbook calculations of the kind expected of these students.
Continue reading

Wagner’s Multidimensional Library of Babel (Piotr at UD)

I’ve wanted to start this discussion for several weeks, but wasn’t sure how to present Wagner’s argument. Fortunately Piotr has saved me the trouble with a post at UD.

Piotr February 24, 2015 at 1:35 pm
Gpuccio,

Do you mind if I begin with a simple illustrative example? Let’s consider all five-letter alphabetic strings (AAAAA, QWERT, HGROF, etc.). By convention, a string will be “functional” if it’s a meaningful English word (BREAD, WATER, GLASS, etc.). Functionality is therefore not a formal property of the string but something dictated by the environment. There are 26^5 = 11881376 (almost 12 million) possible five-letter strings. The number of five-letter words in English (excluding proper nouns and extremely rare, dialectal or archaic words) is about 6000, so the probability that any randomly generated string is functional is about 0.0005.

Any five-letter string S can produce 5×25 = 125 “mutants” differing from S by exactly one letter. If you represent the sequence space as a five-dimensional hypercube (26x26x26x26x26), a mutation can be defined as a translation along any of the five axes.

It would appear that the odds of finding a functional mutant for a given string should be about 125×0.0005 = 1/16 on the average. In fact, however, it depends where you start. If S is functional, the existence of at least one functional mutant is almost guaranteed (close to 90%). For most English words there is more than one functional mutant. For example, from SNARE we get {SCARE, SHARE, SPARE, STARE, SNORE, SNAKE, SNARK…}. Though some functional sequences are isolated or form small clusters in the sequence space, most of them are members of one huge, quite densely interconnected network. You can get from one to another in just a few steps (often in more than one way), which is of course what Lewis Carroll’s “word ladder” puzzle is about:

FLOUR > FLOOR > FLOOD > BLOOD > BROOD > BROAD > BREAD

You can ponder the example for a moment; I’ll return to it later.
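The arithmetic in Piotr’s example is easy to check with a short Python sketch. The figure of 6000 words is his rough estimate, and the helper names (`one_letter_mutants`, `hamming`) are placeholders for this illustration. No dictionary is loaded, so the sketch only verifies the counts and the one-letter steps; it does not test the claim that close to 90% of words have a functional neighbour.

```python
import string

ALPHABET = string.ascii_uppercase
LENGTH = 5

total_strings = len(ALPHABET) ** LENGTH               # 26^5 possible strings
approx_words = 6000                                   # Piotr's rough estimate
p_functional = approx_words / total_strings           # about 0.0005
mutants_per_string = LENGTH * (len(ALPHABET) - 1)     # 5 x 25 = 125
naive_expected_hits = mutants_per_string * p_functional  # about 1/16


def one_letter_mutants(word):
    """Yield every string that differs from word at exactly one position."""
    for i, original in enumerate(word):
        for letter in ALPHABET:
            if letter != original:
                yield word[:i] + letter + word[i + 1:]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


LADDER = ["FLOUR", "FLOOR", "FLOOD", "BLOOD", "BROOD", "BROAD", "BREAD"]
ladder_is_valid = all(hamming(a, b) == 1 for a, b in zip(LADDER, LADDER[1:]))

if __name__ == "__main__":
    print(total_strings, round(p_functional, 6), mutants_per_string)
    print(round(naive_expected_hits, 4), ladder_is_valid)
    print("SNAKE" in set(one_letter_mutants("SNARE")))
```

With a real word list, the same `one_letter_mutants` generator could be intersected with the dictionary to count functional neighbours for every word.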

The Elephant in the Room

The whole thread is worth a look.

I might add that there is a rather crude GA at http://itatsi.com that does something not entirely unlike a word ladder.

Junk DNA

Well, I just got banned again at UD, over my response to this post of Barry’s:

In a prior post I took Dr. Liddle (sorry for the misspelled name) to task for this statement:

“Darwinian hypotheses make testable predictions and ID hypotheses (so far) don’t.”

I responded that this was not true and noted that:

For years Darwinists touted “junk DNA” as not just any evidence but powerful, practically irrefutable evidence for the Darwinian hypothesis. ID proponents disagreed and argued that the evidence would ultimately demonstrate function.

Not only did both hypotheses make testable predictions, the Darwinist prediction turned out to be false and the ID prediction turned out to be confirmed.

Continue reading

Siding with Mathgrrl on a point, and offering an alternative to CSI v2.0

[cross posted from UD Siding with Mathgrrl on a point, and offering an alternative to CSI v2.0, special thanks to Dr. Liddle for her generous invitation to cross post]

There are two versions of the metric for Bill Dembski’s CSI. One version can be traced to his book No Free Lunch published in 2002. Let us call that “CSI v1.0”.

Then in 2005 Bill published Specification: The Pattern That Signifies Intelligence, where he includes the identifier “v1.22”, but perhaps it would be better to call the concepts in that paper CSI v2.0 since, like Windows 8, it has some radical differences from its predecessor and will come up with different results. Some end users of the concept of CSI prefer CSI v1.0 over v2.0.
Continue reading