Active information defined… for the fourth time in nine years?

Posted on September 30, 2016 by Tom English

Yesterday, a couple of folks let me know of a paper that crypto-creationist [ETA: perhaps under reform] George Montañez had just posted at arXiv, “The Famine of Forte: Few Search Problems Greatly Favor Your Algorithm.” Below you’ll find my response to one of them. I should explain a few things, by way of introduction.

Montañez is a former advisee of the “Charles Darwin of intelligent design,” Baylor University professor Robert J. Marks II. Last I heard, he was pursuing doctoral studies in machine learning at Carnegie Mellon University. He worked not only with Marks, but also with William A. Dembski, the “Isaac Newton of information theory,” and Winston Ewert, the “Pooh Bear of evolutionary informatics,” on applications of measures of active information. He is still affiliated with them at the Evolutionary Informatics Lab. I refer to the core of affiliates who actually contribute to the output of the Lab — Marks, Dembski, Ewert, and Montañez — as Team EIL. The first three of them have a book scheduled for release by World Scientific on January 30, 2017. The title is Introduction to Evolutionary Informatics. I am trying to pull together a series of posts with the same title.

My email note follows.

[ETA: George Montañez has kindly responded here at TSZ. Contrary to what I guess below, he is not presently collaborating with the authors of the book.]

There’s an important development in the paper. In their studies of ev, Avida, etc., Team EIL did not apply their measures of active information, for which they had “conservation of information” (CoI) theorems, but instead their egregiously misnamed measures of average active information per query (AAIPQ). In its simplest form, AAIPQ is the endogenous (not active) information $I_\Omega$ divided by the expected number of queries for the sampling process to first hit the target,

$I_\oplus = \frac{I_\Omega}{E[Q]}.$

Active information was defined as the difference of endogenous information and exogenous information,

$I_+ = I_\Omega - I_S,$

and endogenous information was thus the sum of active information and exogenous information,

$I_\Omega = I_+ + I_S.$

You don’t even need to know the definitions of endogenous information and exogenous information to see that AAIPQ is active information per expected number of queries plus exogenous information per expected number of queries,

$\begin{align*} I_\oplus &= \frac{I_\Omega}{E[Q]} \\ &= \frac{I_+ + I_S}{ E[Q]} \\ &= \frac{I_+}{ E[Q]} + \frac{I_S}{ E[Q]}. \end{align*}$
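
As a numerical sanity check on the decomposition above, here is a minimal Python sketch. It assumes the definitions I recall from the Dembski and Marks papers, $I_\Omega = -\log_2 p$ and $I_S = -\log_2 q$, where $p$ and $q$ are the success probabilities of the baseline and assisted searches; the argument above does not depend on them. The function name `aaipq_terms` and the numbers plugged in are hypothetical, chosen only for illustration.

```python
import math


def aaipq_terms(p: float, q: float, expected_queries: float):
    """Return (AAIPQ, active term, exogenous term), each in bits per query.

    p, q and expected_queries are illustrative inputs, not data from any study.
    """
    endogenous = -math.log2(p)        # I_Omega (assumed definition)
    exogenous = -math.log2(q)         # I_S (assumed definition)
    active = endogenous - exogenous   # I_+ = I_Omega - I_S
    aaipq = endogenous / expected_queries
    return aaipq, active / expected_queries, exogenous / expected_queries


# Made-up values: p = 2^-20, q = 2^-5, E[Q] = 10 queries.
total, active_part, exogenous_part = aaipq_terms(2 ** -20, 2 ** -5, 10.0)
print(total, active_part, exogenous_part)
```

With these made-up inputs, the exogenous term is a nonzero share of AAIPQ, so AAIPQ is not a measure of active information alone.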

The upshot is that Team EIL indicated falsely in papers where they applied AAIPQ to evolutionary models that they had mathematical justification for their “conservation of information” rhetoric. Their CoI theorems were in fact irrelevant. I brought this up in one of my “Ask Dr Ewert” questions [posted here in The Skeptical Zone], and Ewert responded with unadulterated bullshit at Uncommon Descent.

Montañez has just provided a new definition of active information — the fourth in nine years, and quite different from the previous ones — along with conservation theorems. There is indeed averaging in the new measure. It serves as a replacement for AAIPQ. The new theorems say nothing relevant to AAIPQ. The release of Introduction to Evolutionary Informatics has been pushed back to January 30, and I’ll hazard a guess that Montañez and Ewert have applied the new measure to the data from the old studies, and that the new measurements will appear in the book. As for the arXiv publication, you know how uncharacteristic that is of stealth creationists. But it makes perfectly good sense if Marks et al. are using the work in the book, and need to cite a source. They’re perhaps hoping to change the reference to a peer-reviewed, forthcoming paper prior to publication of the book.

Montañez says nothing to indicate that his work is revisionary, as is the norm for Team EIL. Nor does he indicate that his Famine of Forte theorem (with conservation of active information redux as a corollary) is analogous to the “Algorithmic Specified Complexity Is Rare” theorem of Ewert et al. (By the way, I proved a closely analogous “Active Information Is Rare” theorem a couple years ago, using the then-current definition of active information, but decided not to share it with Team EIL.) In all sincerity, I had hoped that the outstanding faculty in machine learning at Carnegie Mellon would set Montañez straight. No such luck.

I’ll have to adjust the part of my introduction to evolutionary informatics dealing with average active information per query. The story I have to tell now is more complicated, but also more juicy. Marks et al. are damned if they use the new measure, because it appeared in an unreviewed, self-published paper after World Scientific began taking orders for the book (advertising it as based on peer-reviewed publications). Also, their new calculations will not have passed peer review, even if the paper by Montañez has. They’re damned if they don’t use the new measure, because they obviously were wrong, as I showed above, in suggesting that their CoI theorems of the time were relevant to AAIPQ, and because Montañez has now acknowledged tacitly that they were wrong.

134 thoughts on “Active information defined… for the fourth time in nine years?”

Rumraket on October 1, 2016 at 11:11 pm said:

Patrick: I suggest you read “Arrival of the Fittest”.

He’s talking about Dembski and Marks NFL theorem, not evolution in general.
Joe Felsenstein on October 2, 2016 at 5:11 pm said:

GeorgeMontanez: Something that you may like about the Forte paper is that the set of fitness functions (or more generally, external information resources), denoted (I can’t get mathcal to work), can be any finite precision finite set, such as some set of smooth objective functions. In other words, we can exclude troublesome fitness functions from our optimization process, and the forte result still holds. You can probably give me some good feedback regarding the results, such as if they have relevance to the area you work in. I’d love to get your feedback.

My main interest here is not in the correctness of your proof, or in the definitions of active information per query, but in whether there are any implications of it for evolutionary biology. For the NFL theorem, William Dembski argued that it showed that evolutionary search could not do better than random search. This argument of his was rapidly refuted by multiple people, who pointed out that the NFL theorem described average behavior over all possible assignments of fitnesses to genotypes, not the behavior on realistic fitness functions which were much smoother than those. Dembski’s NFL argument is thus a dead letter.

I need to read your paper more carefully, but if it places a bound on what fraction of all possible fitness surfaces an algorithm can do well on, that bound would have to be very very tight for it to imply that evolution would not succeed on very smooth fitness surfaces. Simple “greedy” hill-climbers can do well on smooth fitness surfaces. On unimodal (one-peak) surfaces they always succeed.

As for restricting ourselves to a collection of smooth fitness surfaces, I don’t see what question about models of evolution is answered. We start with a bunch of surfaces on which those models do well, so I am not sure what your theorem would then imply. But I do need to read your paper more closely.
Mung on October 2, 2016 at 5:20 pm said:

Joe Felsenstein: Simple “greedy” hill-climbers can do well on smooth fitness surfaces. On unimodal (one-peak) surfaces they always succeed.

🙂

All hail the power of cumulative selection. Greedy Weasels!
Joe Felsenstein on October 2, 2016 at 5:25 pm said:

Mung: All hail the power of cumulative selection. Greedy Weasels!

You’re quite right there.
fifthmonarchyman on October 2, 2016 at 5:28 pm said:

Joe Felsenstein: I need to read your paper more carefully, but if it places a bound on what fraction of all possible fitness surfaces an algorithm can do well on, that bound would have to be very very tight for it to imply that evolution would not succeed on very smooth fitness surfaces.

Do you concede that evolution is an algorithmic process? That seems to be a point of contention here.

peace
colewd on October 2, 2016 at 5:40 pm said:

Joe Felsenstein,

Simple “greedy” hill-climbers can do well on smooth fitness surfaces. On unimodal (one-peak) surfaces they always succeed.

How can you claim they always succeed?
Mung on October 2, 2016 at 5:48 pm said:

colewd: How can you claim they always succeed?

Because if they fail to succeed, the fitness surface was not smooth and/or had more than one peak. Thus the move to other sorts of hill-climbing algorithms, which have some way to overcome those obstacles.
phoodoo on October 2, 2016 at 5:59 pm said:

Mung,

Fitness measurements in biology are never wrong. They are just recalculated with each generation.
Joe Felsenstein on October 2, 2016 at 8:26 pm said:

I may be repeating what Mung said, but I can easily prove my assertion, if I am allowed a few further minor assumptions: that the space of genotypes is finite (such as being all DNA sequences of length 1000), and that each genotype has at least one neighbor (DNA sequences of length 1000 have 3000 neighbors each). And finally, that no two genotypes have fitnesses that are exactly equal, to avoid there being perfectly flat regions.

Given that, a simple uphill climber that looks at the neighboring sequences and moves to one that has higher fitness will keep moving until it reaches a peak, and there’s only one of those.

Convinced, colewd?
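
The argument can be sketched in Python, under the stated assumptions (finite genotype space, every genotype has neighbors, no two fitnesses equal, one peak). This is a toy illustration, not a model from the literature: the 7-base `TARGET`, the additive `fitness` function, and the helper names are all hypothetical. Each site contributes a distinct base-4 digit, so no two genotypes tie and the only peak is the target.

```python
import random

BASES = "ACGT"
TARGET = "GATTACA"  # hypothetical single peak of the toy landscape


def fitness(seq: str, target: str = TARGET) -> int:
    """Additive, tie-free fitness: site i contributes a base-4 digit times 4**i."""
    total = 0
    for i, (base, best) in enumerate(zip(seq, target)):
        others = [x for x in BASES if x != best]
        digit = 3 if base == best else others.index(base)
        total += digit * 4 ** i
    return total


def neighbors(seq: str):
    """All sequences differing from seq at exactly one site (3 per site)."""
    for i, base in enumerate(seq):
        for x in BASES:
            if x != base:
                yield seq[:i] + x + seq[i + 1:]


def climb(start: str, rng: random.Random):
    """Move to any fitter neighbor until none exists; return (endpoint, steps)."""
    current, steps = start, 0
    while True:
        better = [n for n in neighbors(current) if fitness(n) > fitness(current)]
        if not better:
            return current, steps
        current = rng.choice(better)
        steps += 1


rng = random.Random(0)
starts = ["".join(rng.choice(BASES) for _ in TARGET) for _ in range(20)]
results = [climb(s, rng)[0] for s in starts]
print(all(r == TARGET for r in results))
```

Fitness strictly increases at every step and the space is finite, so the climb must stop, and the only place it can stop is the single peak.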
phoodoo on October 2, 2016 at 8:46 pm said:

Joe Felsenstein,

But what does it have to do with biology?
Joe Felsenstein on October 2, 2016 at 9:27 pm said:

phoodoo:
Joe Felsenstein,

But what does it have to do with biology?

fifthmonarchyman: Do you concede that evolution is an algorithmic process? That seems to be a point of contention here.

peace

I’ve spent 50 years working on models in biology. In those models there are algorithms. Whether evolution really “is an algorithmic process” I will leave to philosophers of science. The models and the algorithms are relevant.

William Dembski set up a model of evolutionary process with functions that got you the next generation (i.e., algorithms). He thought he could use that to prove that evolution could not succeed in making adaptations. (He was wrong and could not prove that).

I bet that both FMM and phoodoo thought those models and algorithms were relevant and to the point!
Mung on October 3, 2016 at 1:58 am said:

Joe Felsenstein: The models and the algorithms are relevant.

In what sense are they relevant if they do not reflect reality? Are you really going to leave whether or not your models and algorithms reflect reality up to the philosophers?
Richardthughes on October 3, 2016 at 2:32 am said:

Mung: In what sense are they relevant if they do not reflect reality?

Landscape / map. They demonstrate certain attributes of it.
fifthmonarchyman on October 3, 2016 at 2:45 am said:

Joe Felsenstein: I bet that both FMM and phoodoo thought those models and algorithms were relevant and to the point!

You might lose in my case. I think they are relevant but not conclusive by any means. They are good starts but much more needs to be done.

I think we will all agree that what is meant by “evolution” is difficult to pin down. As witnessed by your “Whether evolution really “is an algorithmic process” I will leave to philosophers of science.” comment.

peace
keiths on October 3, 2016 at 2:52 am said:

Joe,

I may be repeating what Mung said, but I can easily prove my assertion, if I am allowed a few further minor assumptions: that the space of genotypes is finite (such as being all DNA sequences of length 1000), and that each genotype has at least one neighbor (DNA sequences of length 1000 have 3000 neighbors each). And finally, that no two genotypes have fitnesses that are exactly equal, to avoid there being perfectly flat regions.

Given that, a simple uphill climber that looks at the neighboring sequences and moves to one that has higher fitness will keep moving until it reaches a peak, and there’s only one of those.

A simple uphill climber can succeed even without the assumption I highlighted above.

Progress will be temporarily stalled while the climber explores the flat region, but it will eventually stumble upon a sequence with higher fitness and resume its climb.

The duration of the delay will correlate with the size of the flat region. Even Mung’s brain-dead “SS Weasel”, whose entire fitness landscape is flat except for a spike at the target, would eventually stumble onto that single “peak” — if the universe lasted long enough.
Mung on October 3, 2016 at 2:53 am said:

Richardthughes: They demonstrate certain attributes of it.

Your claim is that they do reflect reality? Are you, like Joe, willing to leave it up to the philosophers to sort out which attributes of reality computer models and algorithms demonstrate?
Mung on October 3, 2016 at 2:58 am said:

keiths: Even Mung’s brain-dead “SS Weasel”, whose entire fitness landscape is flat except for a spike at the target, would eventually stumble onto that single “peak” — if the universe lasted long enough.

My “brain-dead” Weasel implements cumulative selection. If not, why not?

So should we measure “the power of cumulative selection” in units of time? Why won’t you say?
keiths on October 3, 2016 at 3:27 am said:

Mung:

My “brain-dead” Weasel implements cumulative selection. If not, why not?

See this.

So should we measure “the power of cumulative selection” in units of time? Why won’t you say?

Jesus, Mung:

We can easily demonstrate the power of cumulative selection, and that was true long before Mung started this thread.

For example, my Weasel program allows selection to be turned on or off. That can make the difference between a run that converges in seconds versus one that runs indefinitely and wouldn’t be expected to succeed if it ran for a billion years. The program is the same. The parameters are the same. The only difference is whether selection is enabled.
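
A minimal Python sketch of that comparison follows. It is not the program referred to above: the function names, the brood size of 100, the per-character mutation rate of 0.05, and the 500-generation cap are all assumptions chosen for illustration. The only thing that changes between the two runs is the `selection` flag.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "


def score(candidate: str) -> int:
    """Number of positions at which candidate matches TARGET."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def mutate(parent: str, rate: float, rng: random.Random) -> str:
    """Copy parent, redrawing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else c for c in parent
    )


def run(selection: bool, rng: random.Random, offspring: int = 100,
        rate: float = 0.05, max_generations: int = 500):
    """Return the generation at which TARGET is reached, or None if the cap is hit."""
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(max_generations):
        if parent == TARGET:
            return generation
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # Selection on: keep the best-scoring offspring. Off: keep a random one.
        parent = max(brood, key=score) if selection else rng.choice(brood)
    return max_generations if parent == TARGET else None


with_selection = run(True, random.Random(1))
without_selection = run(False, random.Random(1))
print(with_selection, without_selection)
```

With selection off, the parent simply drifts through the space of 27^28 strings, so hitting the target within any feasible cap is not to be expected.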
Richardthughes on October 3, 2016 at 3:45 am said:

Mung: Your claim is that they do reflect reality? Are you, like Joe, willing to leave it up to the philosophers to sort out which attributes of reality computer models and algorithms demonstrate?

Parts of it. Ask yourself, why is it a model and not the real thing. Moreover, what defines a model vs. something unrelated?
Mung on October 3, 2016 at 3:59 am said:

Richardthughes: Ask yourself, why is it a model and not the real thing

I see no need to ask myself that question. It’s a model.

Richardthughes: Moreover, what defines a model vs. something unrelated?

Something unrelated to what? Something unrelated to reality?

Let’s pass that one off to the philosophers!

Joe’s attempt to pass off the question of whether his models and algorithms can tell us what reality is like is a cop-out. It’s disingenuous. He wouldn’t be engaged in it if he didn’t think they represented reality.
Richardthughes on October 3, 2016 at 4:03 am said:

A model aims to capture, simulate or understand a limited subset of attributes from the real world – with lower fidelity than the real examples.

My own, clunky definition.
Joe Felsenstein on October 3, 2016 at 4:14 am said:

Mung: Joe’s attempt to pass off the question of whether his models and algorithms can tell us what reality is like is a cop-out. It’s disingenuous. He wouldn’t be engaged in it if he didn’t think they represented reality.

This misrepresents my statement. The issue was whether reality “is an algorithmic process”. My model has algorithms, and these models come close to reality (and if not, we argue about it and I try better models). But the issue I’ll leave to philosophers is whether reality really is an algorithmic process. As long as the models are pretty good, I’ll leave the “really is” to the philosophers.

The cop-out would be to waste time quibbling about “really is”.
Mung on October 3, 2016 at 4:30 am said:

Joe Felsenstein: The issue was whether reality “is an algorithmic process”.

This is simply and egregiously false.
Mung on October 3, 2016 at 4:40 am said:

Richardthughes: A model aims to capture, simulate or understand a limited subset of attributes from the real world – with lower fidelity than the real examples.

And scientists leave it to the philosophers to decide.
Joe Felsenstein on October 3, 2016 at 4:49 am said:

Mung: Joe Felsenstein: The issue was whether reality “is an algorithmic process”.

This is simply and egregiously false.

True, it wasn’t what the issue was. It was:

Joe Felsenstein: Whether evolution really “is an algorithmic process” I will leave to philosophers of science.

Thanks for the correction. It was not “reality” but “evolution” that was at issue. Presumably real evolution.
OMagain on October 3, 2016 at 8:24 am said:

Mung,

In what sense are they relevant if they do not reflect reality?

Ask those 17 books on your bookshelf….
Patrick on October 3, 2016 at 5:05 pm said:

Rumraket:

I suggest you read “Arrival of the Fittest”.

He’s talking about Dembski and Marks NFL theorem, not evolution in general.

I stand corrected, thanks.
colewd on October 3, 2016 at 5:22 pm said:

Joe Felsenstein,

Given that, a simple uphill climber that looks at the neighboring sequences and moves to one that has higher fitness will keep moving until it reaches a peak, and there’s only one of those.

Convinced, colewd?

I think so. Are you also assuming that moving up the peak directly translates to reproductive success of the organism?
Tom English on October 3, 2016 at 9:42 pm said:

colewd: Are you also assuming that moving up the peak directly translates to reproductive success of the organism?

Joe probably has a better way of putting it, but I will toss off a remark that the uphill trajectory on the fitness surface represents genotypes leaving more copies of themselves than do all of their immediate (different in exactly one base) neighbors in the space of genotypes. A genotype is a type of organism, not an organism.
Joe Felsenstein on October 3, 2016 at 11:50 pm said:

I don’t have a better way of putting it. That was spot on.
Mung on October 5, 2016 at 2:05 am said:

Joe Felsenstein: It was not “reality” but “evolution” that was at issue. Presumably real evolution.

That’s right. And do you really leave it up to the philosophers to decide? Did they choose the models you should use, or did you?

Put another way, before creating these models and algorithms, did you consult with any philosophers to get their approval?
Mung on October 5, 2016 at 2:17 am said:

Let me add a few comments here.

It’s always possible that I might be full of shit, that I don’t have a clue, etc. If people think I am, I don’t mind at all if they speak up, but they should explain how it is so, and not merely say it is so. I am capable of learning, and of being corrected, and of admitting I was wrong.

I think I am justified in despising comments that portray me and other theists here as incapable of being wrong, or wedded to certain beliefs because of our faith, or any other “explanation” that people can put forth in order to avoid actual debate on the issues.

Joe’s models and algorithms are based upon mathematics, and that lends itself to their being implemented in computers. Who validated their mathematical models against reality? Which philosophers signed off on their mathematical models?

Honestly, I don’t think science works that way.

Whether evolution is algorithmic may in fact be a philosophical question, but I deny that scientists, like Joe, leave it up to the philosophers. They [scientists] believe their models and algorithms reflect reality, and if the philosophers disagree, so much the worse for philosophy. Right KN?
fifthmonarchyman on October 5, 2016 at 2:27 am said:

Joe Felsenstein: Thanks for the correction. It was not “reality” but “evolution” that was at issue. Presumably real evolution.

So if folks like Montañez were to discover that algorithms have certain limitations would that possibly have any relevance at all to the capabilities of “real” evolution?

peace
Tom English on October 6, 2016 at 2:17 am said:

fifthmonarchyman: So if folks like Montañez were to discover that algorithms have certain limitations would that possibly have any relevance at all to the capabilities of “real” evolution?

Duh. Some folks speculate that all of physical reality can be described accurately. What I call speculation is an article of faith for some folks who have made science into a surrogate religion. Has God granted folks like you knowledge that the speculation is indeed correct?

In chaos theory, we can know perfectly the physical law governing the temporal evolution of a deterministic system, and yet be unable to predict the trajectory of the system over the long term, due to unavoidable error in measurement of initial conditions. I am not saying that reality “really” is this way (how would I confirm that a physical quantity “really” does range over an uncountable set, let alone a continuum?), but instead that nobody has laid to rest the concerns raised by chaos theory.

Given that Montañez is (still) talking about “conservation of information,” I’ll mention that chaos theorists use the term “creation of information” to mean that the evolving system provides information about its initial state that is unobtainable by direct measurement of the initial state.