Active information defined… for the fourth time in nine years?

Yesterday, a couple of folks let me know of a paper that crypto-creationist [ETA: perhaps under reform] George Montañez had just posted at arXiv, “The Famine of Forte: Few Search Problems Greatly Favor Your Algorithm.” Below you’ll find my response to one of them. I should explain a few things, by way of introduction.

Montañez is a former advisee of the “Charles Darwin of intelligent design,” Baylor University professor Robert J. Marks II. Last I heard, he was pursuing doctoral studies in machine learning at Carnegie Mellon University. He worked not only with Marks, but also with William A. Dembski, the “Isaac Newton of information theory,” and Winston Ewert, the “Pooh Bear of evolutionary informatics,” on applications of measures of active information. He is still affiliated with them at the Evolutionary Informatics Lab. I refer to the core of affiliates who actually contribute to the output of the Lab — Marks, Dembski, Ewert, and Montañez — as Team EIL. The first three of them have a book scheduled for release by World Scientific on January 30, 2017. The title is Introduction to Evolutionary Informatics. I am trying to pull together a series of posts with the same title.

My email note follows.

[ETA: George Montañez has kindly responded here at TSZ. Contrary to what I guess below, he is not presently collaborating with the authors of the book.]


There’s an important development in the paper. In their studies of ev, Avida, etc., Team EIL did not apply their measures of active information, for which they had “conservation of information” (CoI) theorems, but instead their egregiously misnamed measures of average active information per query (AAIPQ). In its simplest form, AAIPQ is the endogenous (not active) information $I_\Omega$ divided by the expected number of queries for the sampling process to first hit the target,

     $$I_\oplus = \frac{I_\Omega}{E[Q]}.$$

Active information was defined as the difference of endogenous information and exogenous information,

     $$I_+ = I_\Omega - I_S,$$

and endogenous information was thus the sum of active information and exogenous information,

     $$I_\Omega = I_+ + I_S.$$

You don’t even need to know the definitions of endogenous information and exogenous information to see that AAIPQ is active information per expected number of queries plus exogenous information per expected number of queries,

     \begin{align*} I_\oplus     &= \frac{I_\Omega}{E[Q]} \\    &= \frac{I_+ + I_S}{ E[Q]} \\     &= \frac{I_+}{ E[Q]} + \frac{I_S}{ E[Q]}. \end{align*}
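To make the decomposition concrete, here is a minimal numerical sketch. It assumes the definitions $I_\Omega = -\log_2 p$ and $I_S = -\log_2 q$, which the argument above deliberately does not rely on, and the values of `p`, `q`, and `expected_queries` are made up purely for illustration.

```python
import math

def aaipq_terms(p, q, expected_queries):
    """Split I_Omega / E[Q] into its active and exogenous parts."""
    endogenous = -math.log2(p)          # I_Omega (assumed definition)
    exogenous = -math.log2(q)           # I_S (assumed definition)
    active = endogenous - exogenous     # I_+ = I_Omega - I_S
    aaipq = endogenous / expected_queries
    return aaipq, active / expected_queries, exogenous / expected_queries

# Hypothetical numbers, chosen only for illustration.
aaipq, active_part, exogenous_part = aaipq_terms(
    p=2 ** -20, q=2 ** -5, expected_queries=100.0
)
print(aaipq, active_part, exogenous_part)
```

Whenever $I_S$ is positive, the second term is positive too, so the quantity is not active information per query, whatever it is called.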

The upshot is that Team EIL indicated falsely in papers where they applied AAIPQ to evolutionary models that they had mathematical justification for their “conservation of information” rhetoric. Their CoI theorems were in fact irrelevant. I brought this up in one of my “Ask Dr Ewert” questions [posted here in The Skeptical Zone], and Ewert responded with unadulterated bullshit at Uncommon Descent.

Montañez has just provided a new definition of active information — the fourth in nine years, and quite different from the previous ones — along with conservation theorems. There is indeed averaging in the new measure. It serves as a replacement for AAIPQ. The new theorems say nothing relevant to AAIPQ. The release of Introduction to Evolutionary Informatics has been pushed back to January 30, and I’ll hazard a guess that Montañez and Ewert have applied the new measure to the data from the old studies, and that the new measurements will appear in the book. As for the arXiv publication, you know how uncharacteristic that is of stealth creationists. But it makes perfectly good sense if Marks et al. are using the work in the book, and need to cite a source. They’re perhaps hoping to change the reference to a peer-reviewed, forthcoming paper prior to publication of the book.

Montañez says nothing to indicate that his work is revisionary, as is the norm for Team EIL. Nor does he indicate that his Famine of Forte theorem (with conservation of active information redux as a corollary) is analogous to the “Algorithmic Specified Complexity Is Rare” theorem of Ewert et al. (By the way, I proved a closely analogous “Active Information Is Rare” theorem a couple years ago, using the then-current definition of active information, but decided not to share it with Team EIL.) In all sincerity, I had hoped that the outstanding faculty in machine learning at Carnegie Mellon would set Montañez straight. No such luck.

I’ll have to adjust the part of my introduction to evolutionary informatics dealing with average active information per query. The story I have to tell now is more complicated, but also more juicy. Marks et al. are damned if they use the new measure, because it appeared in an unreviewed, self-published paper after World Scientific began taking orders for the book (advertising it as based on peer-reviewed publications). Also, their new calculations will not have passed peer review, even if the paper by Montañez has. They’re damned if they don’t use the new measure, because they obviously were wrong, as I showed above, in suggesting that their CoI theorems of the time were relevant to AAIPQ, and because Montañez has now acknowledged tacitly that they were wrong.

134 thoughts on “Active information defined… for the fourth time in nine years?”

  1. Tom English: And you cannot tell me that it does not matter what you call the quantity, because you use the name, and nothing but the name, to invoke “conservation of [active] information” in papers where you measure average endogenous information per query (and take the reciprocal of the measurement).

    Oops. Now you’ve got me doing it. That should end, “… where you measure the average number of queries per bit of endogenous information (and take the reciprocal).”

  2. petrushka: If Wolfram accomplished nothing else with his automata, he demonstrated that a very simple algorithm can produce unpredictable output.

    I don’t endorse his “new kind of science.” But his studies came to mind repeatedly when I was writing my comment.

  3. I like Sal’s response. The main players are his best friends but they never bothered to explain the math to him. Perhaps they assumed that he was qualified.

    They never explained the math to me either. But then, Sal never explained the math to me either, and Sal and I are best buddies.

  4. Alan Fox: Moderation issues can and should be discussed in the appropriate thread.

    LoL! Oh sure, go there and “discuss” moderation issues with Patrick and Neil. Or get a root canal. Get castrated. Place your brain in a vat. …

  5. Joe Felsenstein: …and Uncommon Descent should not so often ban people when they do an effective job of refuting their posts.

    Are you banned at UD, Joe, or do you just choose to not post there?

  6. GeorgeMontanez: Hi Tom and everyone else,

    First time here! First off, thanks for the interest in my work. Having people discuss your work (even critically) is better than having a stack of papers nobody reads.

    Hi George, and welcome to TSZ. You’ll find two kinds of criticism here at TSZ.

    Criticism based on ignorance or bias.
    Criticism that asserts that IDists are immune to criticism.

  7. Mung,

    How dare you, Mung? Don’t you know Joe is one of the most highly cited and well regarded biologists of the past 50 years? Who are you to question where he has or hasn’t been banned?

  8. Joe Felsenstein: Before we try to understand whether Montañez’s Expected Active Information Per Search measure is well-defined, is useful, or is new, let me make a couple of points about what Montañez’s version of the No Free Lunch theorem implies for evolution:

    Basically, nothing.

    Totally agree. The NFL work all assumes the search space has no structure (unlike DNA and protein space). No matter what they mathematically prove, it has exactly zero relevance to biology.

    I’m not sure how anyone can take this work seriously in biology. Has it ever been applied by a non-ID person usefully in biology?

  9. Mung,

    Well, and don’t forget there is also the criticism that all IDists are really just preachers in disguise.

    That is actually going to be the topic for this Sunday’s service at the Skeptics church that Patrick and Richard are organizing. Reverend Omagain will be presiding, I believe. Tom English is going to be the altar boy.

  10. Richardthughes: And yours are as if written by a syphilitic imbecile whose tiny mind is controlled by a gerbil residing in his rectum. Not a very clever rhetorical, rule-skirting trick either way.

    I just love how this gets past the mods.

  11. Tom English: There’s now a note in the OP, above the fold, telling new readers that my guess is wrong, before they read it.

    Another anti-ID prediction falsified. How many more do “the skeptics” need?

  12. Dr. Swamidass has clearly failed to do an effective job of refuting their posts [at UD], else he would be banned at UD.

  13. Joe Felsenstein has clearly failed to do an effective job of refuting their posts [at UD], else he would be banned at UD.

    Tom English has clearly failed to do an effective job of refuting their posts [at UD], else he would be banned at UD.

  14. Dave Carlson: The mind reels.

    But here is the bigger problem with your assertion Dave.

    In many fields of academia, the requirements for success are not the same as in the real world of business and performance. What is “good” in academia, being popular? Like the Kardashians maybe?

    In other aspects of career work, people are judged by results. Did they produce something useful? Were they the best at achieving the objective? But there is no such objective in most of academia (not all of course, because in some science fields, academics make discoveries, they engineer new useful items and technologies). But not in Joe’s field. He just can make claims that he is right. He can do this for 50 years. Does he have to actually be right? Of course not, that’s not a requirement of his job. Jerry Coyne is a famous academic. He is an idiot. Lawrence Krauss is a famous academic. He is a loudmouth, self-promoting idiot. That’s useful in academia. It’s useful in politics, so Donald Trump is also well known and well quoted.

    It has nothing whatsoever to do with being right, however.

    That this impresses you says more about you than me Dave.

  15. Richardthughes,

    And Kim Kardashian is even smarter than Jerry Coyne. Maybe the University of Chicago can give her a position.

    You can take classes from her to learn how to tape your nipples properly Richard.

  16. phoodoo,

    What an ignorant thing to say. His accomplishments are in fields requiring intellectual rigor. You make my point for me.

  17. Joe Felsenstein,
    Sorry to put you off so long, Joe. The problem is that you’ve packed a lot into a few words, and I have too much I might say in response.

    Joe Felsenstein: Before we try to understand whether Montañez’s Expected Active Information Per Search measure is well-defined, is useful, or is new, let me make a couple of points about what Montañez’s version of the No Free Lunch theorem…

    I’m going to half-agree with George here, and say that it’s not a version of the NFL theorem, which by convention is an obvious corollary to the first NFL theorem of Wolpert and Macready, easier to state in plain language (Wolpert and Macready did not bother to formalize it) than is the theorem itself. I’m going to half-agree with you, and say that the Famine of Forte theorem is similar to one of the theorems in “No Free Lunch Theorems for Optimization” (if memory serves — an increasingly dubious proposition).

    Joe Felsenstein: … what Montañez’s version of the No Free Lunch theorem implies for evolution:

    Basically, nothing.

    I agree. George probably does not. But I accept that he is now focused on learning, and suspect that he wants to support an argument that minds do what learning algorithms cannot. I will not accept that his latest work is irrelevant to his past criticism of evolutionary models, just because he’s now focused on learning.

    Joe Felsenstein: The original NFL was not aimed at evolution, but William Dembski argued in his 2002 book No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence that it showed that evolutionary processes could on average do no better than ones that made changes completely at random.

    Wolpert and Macready took aim at the notion that biological evolution is an “optimized optimizer,” and that we can benefit by mimicking evolution in optimization, i.e., without exploiting advance knowledge of the function to be optimized (metaphorically, treating the function as a black box). They make this clear toward the end of their 1995 technical report, “No Free Lunch Theorems for Search.” A big part of Dembski’s game was to substitute evolutionary algorithms (used to solve problems) for evolutionary models (used to understand evolution). He gets nowhere with his talk about “smuggling in information” unless he smuggles in purpose for evolutionary processes.

    Joe Felsenstein: This was immediately refuted by critics (starting with Richard Wein and Jason Rosenhouse). Here is my attempt to explain what they discovered. Basically, the NFL asks about average behavior over all ways that fitnesses could be associated with genotypes. Most of those are “white noise” fitness surfaces. Real fitness surfaces are much smoother than that, because physics.

    You’re correct in your recollection. But I should point out that George is well aware of the typical function. He just doesn’t explain himself clearly in the part of the paper where he addresses it. [George, I don’t believe that anyone who doesn’t already know what you’re talking about will know what you’re talking about.] He recognizes that the vast majority of samplers (“search algorithms”) are close to average in their performance on the typical function. I have to give him credit for understanding something that most people addressing black-box search do not.

    Joe Felsenstein: Tom and I have also addressed here whether the Conservation of Information theorems of Dembski, Ewert, and Marks imply that on average “evolutionary searches” perform no better than random. They prove this by including in evolutionary searches all sorts of crazy ones, including ones that deliberately try to do badly. Once we eliminate all of the crazy ones, and confine ourselves to ones that have genotypes that have fitnesses, their result collapses.

    As far as I can see Montañez’s results don’t change any of this. Am I right about that, Tom?

    I’m afraid my answer is trivial: Montañez changed the measure of active information, so his results do not apply to what we addressed. I don’t know yet whether someone might turn his results into an argument against evolution that the GUC Bug might be used to counter.

  18. Tom English: They make this clear toward the end of their 1995 technical report, “No Free Lunch Theorems for Search.”

    Computer programs don’t search and there is no such thing as a search algorithm. Problem solved.

  19. From the OP:

    In its simplest form, AAIPQ is the endogenous (not active) information $I_\Omega$ divided by the expected number of queries for the sampling process to first hit the target,

    There are no targets in evolution. There are no targets in programs that purport to simulate evolution.

  20. swamidass: The NFL work all assumes the search space has no structure

    I’m glad to have you commenting at TSZ, Joshua. A gentle reminder: You know better than to make sweeping claims about a field you have not surveyed.

  21. Mung: From the OP:

    In its simplest form, AAIPQ is the endogenous (not active) information $I_\Omega$ divided by the expected number of queries for the sampling process to first hit the target,

    There are no targets in evolution. There are no targets in programs that purport to simulate evolution.

    The sampling process, as modeled by Team EIL, does not depend on the target. The target is “where” in the sample space $\Omega$ the measurer chooses to measure. When the target $T \subset \Omega$ of the measurement is understood, we write $I_\Omega$ as an abbreviation of $I_\Omega(T)$, $Q$ as an abbreviation of $Q_T$, and $I_\oplus$ as an abbreviation of $I_\oplus(T)$. The target of the measurement is not the target of the sampling process.

    Let’s say that the sampling process $X_1, X_2, X_3, \ldots$ continues indefinitely. At each step $n$ in time, $X_n$ is a random function of $f(X_1), f(X_2), \ldots, f(X_{n-1})$, where $f$ is (to keep things simple) the fitness function. Now, we can measure $I_\Omega(T)$ for $T = \{x \in \Omega \mid f(x) > 1\}$. And if the random value $Q_T$ is the least $n$ such that $X_n$ is in $T$, then

        \[I_\oplus(T) = \frac{I_\Omega(T)}{E[Q_T]}.\]

    Similarly, if $T^\prime = \{x \in \Omega \mid f(x) \leq 1\}$, then

        \[I_\oplus(T^\prime) = \frac{I_\Omega(T^\prime)}{E[Q_{T^\prime}]}.\]

    Note that $T^\prime$ is the complement of $T$ relative to the sample space $\Omega$. Every element of the sample space is targeted either in the measurement $I_\oplus(T)$ or in the measurement $I_\oplus(T^\prime)$, but not both. I’m trying to drive home the point that the target of the measurement is not the target of the sampling process.
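    A minimal simulation sketch of the two measurements follows. The sample space (4-bit genotypes), the fitness function (number of 1 bits), and the sampler (uniform draws that ignore the fitness history, the simplest special case of the process described above) are all hypothetical choices made for illustration, and the endogenous information is computed under the assumption of a uniform baseline, $-\log_2(|T|/|\Omega|)$. Both first-hit times are read off the same sample path, which never consults either target.

```python
import math
import random

OMEGA = range(16)  # hypothetical sample space: 4-bit genotypes

def fitness(x):
    return bin(x).count("1")  # hypothetical fitness: number of 1 bits

def first_hit_times(rng, targets):
    """Run one sampling process; return Q_T for each target in `targets`."""
    hits = [None] * len(targets)
    n = 0
    while None in hits:
        n += 1
        x = rng.choice(OMEGA)  # the sampler never looks at the targets
        for i, target in enumerate(targets):
            if hits[i] is None and x in target:
                hits[i] = n
    return hits

def endogenous_information(target):
    # Assumes a uniform baseline: p = |T| / |Omega|.
    return -math.log2(len(target) / len(OMEGA))

T = {x for x in OMEGA if fitness(x) > 1}
T_PRIME = {x for x in OMEGA if fitness(x) <= 1}

rng = random.Random(0)
runs = 20000
totals = [0, 0]
for _ in range(runs):
    q_t, q_t_prime = first_hit_times(rng, [T, T_PRIME])
    totals[0] += q_t
    totals[1] += q_t_prime

mean_q_t = totals[0] / runs
mean_q_t_prime = totals[1] / runs
aaipq_t = endogenous_information(T) / mean_q_t
aaipq_t_prime = endogenous_information(T_PRIME) / mean_q_t_prime
print(aaipq_t, aaipq_t_prime)
```

    With this uniform sampler the first-hit times are geometric, so the estimated means should come out near 16/11 and 16/5. The only thing that changes between the two measurements is the set the measurer chooses to look at.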

  22. Mung: With Evolution, nothing is impossible.

    You don’t believe that, I don’t believe that, no person here believes that. So what was the point of your non-answer?

  23. Mung:
    Joe Felsenstein has clearly failed to do an effective job of refuting their posts [at UD], else he would be banned at UD.

    Tom English has clearly failed to do an effective job of refuting their posts [at UD], else he would be banned at UD.

    So, if they float, they must be witches, huh?

  24. Tom English: I’m glad to have you commenting at TSZ, Joshua. A gentle reminder: You know better than to make sweeping claims about a field you have not surveyed.

    Good point. I am referring to the early work by Dembski, which certainly isn’t the whole field. Thanks for your work on this.

  25. Hi Tom,

    Thank you for your reply. I appreciate your willingness to correct yourself on points you may have gotten wrong. I will try to do the same.

    You touch on a lot in your post, but I’ll begin with what I feel is the most relevant issue (and one where I can concede your main point): the difference between the active information defined in 5.2 (which has an element of randomness, stemming from the resource query constraint) and the historical active information, which as you point out, is given as a non-random quantity since the implicit resource constraint is either held constant (which seems to be the case), or marginalized out, leaving a simple probability. You were right regarding a difference. I retract the following statement:

    “If you read the manuscript, you’ll see the definition of active information (given in section 5.2) remains unchanged.”

    It is subtly different, so let’s see where this difference comes from. Basically, to say one is considering the probability of success simpliciter, you leave undefined the implicit dependence on the number of queries allowed, since that alters the probability of success. At that point, you must decide whether you want to consider a per-query probability of success, the probability of success within Q queries where Q is fixed, or probability of success within Q queries where Q is random and you have marginalized out the randomness. I used the first interpretation in my new paper (since I find it helpful to explicitly separate out the randomness), whereas Dembski & Marks use the second (or third) interpretation. So your sentence:

    “Either way, the meaning of q in your expression is different from the meaning of q in the three previous definitions of active information”

    is correct: I am considering a different interpretation of the query constraint. My definition used in Corollary 5.2 is different from what we can call the “canonical” definition. Luckily, this is a preprint, so this can be fixed. I have a couple options here. I believe conservation theorems have already been proven for the canonical active information, so no need to redo that, leaving a few options for the corollary in the preprint: 1) we can keep the form in the paper and give it a distinct name (I’m open to suggestions for naming): namely, $-\log_2(\frac{p}{q})$ where $q$ is the per-query probability of success, which depends on randomization in the number of queries taken and randomization in search histories, making it a random quantity that we take the expectation of; or 2) we can use the expected per-query probability of success $q(T,F)$, and prove the conservation theorem for $-\log_2(\frac{p}{q(T,F)})$. Which do you guys feel would be more useful? I’m open to feedback on this.

    The second main point of your reply seems to center on the use of AAIPQ, where you write: “You have no mathematical justification for saying that the quantity $I_{\oplus}$ is conserved.” I don’t believe I am using AAIPQ in the paper and am not claiming it is conserved, so this may be residual fallout from my subtle re-interpretation of active info, which may be why you seem to link the past work to my current work, such as writing:

    ‘And you cannot tell me that it does not matter what you call the quantity, because you use the name, and nothing but the name, to invoke “conservation of [active] information” in papers where you measure average endogenous information per query (and take the reciprocal of the measurement).’

    In my first response I was saying that $p/q_{\min}$ (relating to $q(T,F)$, the expected per-query probability of success) is not explicitly given a name in the paper (which I don’t think it is). It is just a quantity that arises in proving the bound.

    More to your original point, in my independent work I haven’t used AAIPQ, so perhaps your point is about the ev paper. You suggest that a measure of expected number of queries per bit of endogenous information be used, which seems reasonable and well-defined. Had I been more involved in framing the definitions used in the ev paper (and had your suggestion!) perhaps we could have measured that quantity in the experiments as well. Instead the endogenous information (measuring the inherent problem difficulty) is divided by the mean number of queries for the alternative algorithm. It is a nonrandom quantity and easy to estimate, which is probably why we used it. As you point out, this is different than the endogenous information multiplied by the expected reciprocal number of queries. Was that conflation made somewhere? If so, then it would be an error. If not, then what remains to be addressed is whether I claimed AAIPQ is conserved without showing it. I don’t believe I have, but could be missing something written in the ev paper. I am almost certain I have not said so in my forte paper, unless you were thrown off by the conflation of canonical active information with the randomized version.

    What I’m trying to accomplish in the new paper is give a framework that allows us to avoid the pitfalls of subtle dependence on query constraints (by normalizing them out) and prove and estimate things in a more straightforward manner. This is still preliminary work, but so far, so good. Feedback from my fellow ML and CS researchers is critical for doing so successfully, so again, thanks for the discussion of the work.

    Sorry for the long reply, but I didn’t want to avoid tough questions. Two last requests: don’t hold back results or suggestions! You have my email and we’ve had cordial exchanges in the past, so don’t hesitate to write. You can share new results or simply make suggestions on my work that you come across. Feedback can be incorporated to improve works-in-progress, and I can only benefit from increased feedback. Second, be your best version of you when you write: don’t resort to name-calling. You know it isn’t professional and definitely isn’t kind. So let’s raise the bar on civility (even as this thread dissolves into a pool of name-calling around us). Instead of referring to me as a “crypto-creationist”, you can say “Christian data scientist”, “budding data scientist”, “machine learning researcher”, “PhD candidate”, “grad student”, or lots of other things. You’re an adult, so it is ultimately up to you what you call me. I’ll try to have a thick skin. Just take it as a friendly request. Thanks!

  26. swamidass: Totally agree. The NFL work all assumes the search space has no structure (unlike DNA and protein space). No matter what they mathematically prove, it has exactly zero relevance to biology.

    Well, NFL allows the space to have structure, such as there being a space of DNA sequences. Then it talks about average behavior of a search algorithm over all possible ways you could assign a set of fitnesses to the sequences. On average your nearest neighbor is then just a random fitness randomly chosen from the whole set. So one step (one mutation) is typically totally disastrous.

    Real fitness surfaces are much smoother than that. But No Free Lunch (as used by Dembski) averages them in with lots of other fitness surfaces that are just “white noise”, all of which have mutation be totally disastrous.
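    The contrast can be illustrated with a small sketch. It assumes 10-bit genotypes, a “white noise” surface that assigns each genotype an independent uniform random fitness, and a smooth surface that simply counts 1 bits. All three are hypothetical stand-ins chosen for illustration, not models of any real organism. It measures how well a genotype’s fitness predicts the fitness of its one-mutation neighbors.

```python
import random

L_BITS = 10
GENOTYPES = range(2 ** L_BITS)

def neighbors(x):
    """All genotypes one point mutation (one bit flip) away from x."""
    return [x ^ (1 << i) for i in range(L_BITS)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

def neighbor_correlation(fitness):
    """Correlation between a genotype's fitness and its mutant's fitness."""
    pairs = [(fitness[x], fitness[y]) for x in GENOTYPES for y in neighbors(x)]
    return pearson([a for a, _ in pairs], [b for _, b in pairs])

rng = random.Random(0)
white_noise = [rng.random() for _ in GENOTYPES]   # independent random fitnesses
smooth = [bin(x).count("1") for x in GENOTYPES]   # hypothetical smooth surface

corr_noise = neighbor_correlation(white_noise)
corr_smooth = neighbor_correlation(smooth)
print(corr_noise, corr_smooth)
```

    On the white-noise surface the correlation is close to zero, so a mutant’s fitness is essentially a fresh draw from the whole set. On this particular smooth surface it works out to $(L-2)/L = 0.8$, so a single mutation changes fitness only slightly.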

  27. Joe Felsenstein,

    Hi Joe,

    Something that you may like about the Forte paper is that the set of fitness functions (or more generally, external information resources), denoted B_m (I can’t get mathcal to work), can be any finite precision finite set, such as some set of smooth objective functions. In other words, we can exclude troublesome fitness functions from our optimization process, and the forte result still holds. You can probably give me some good feedback regarding the results, such as if they have relevance to the area you work in. I’d love to get your feedback.

  28. GeorgeMontanez:
    As an aside, can someone tell me the proper way to add math comments here? (Double dollar sign doesn’t seem to work…)

    LaTeX is enabled for posts and comments. You should just need to preface and end with dollar signs, single for in-line and double for its own line. Test: $x^2$

    Snag is there’s no preview!

  29. GeorgeMontanez: Second, be your best version of you when you write: don’t resort to name-calling. You know it isn’t professional and definitely isn’t kind. So let’s raise the bar on civility (even as this thread dissolves into a pool of name-calling around us). Instead of referring to me as a “crypto-creationist”, you can say “Christian data scientist”, “budding data scientist”, “machine learning researcher”, “PhD candidate”, “grad student”, or lots of other things. You’re an adult, so it is ultimately up to you what you call me. I’ll try to have a thick skin. Just take it as a friendly request. Thanks!

    I agree with this point entirely, and this is why I have a big problem with Tom English. I think it is irrelevant what someone like Richardthughes or similar people write, because they aren’t intending to be serious here. But Tom is supposed to be an academic from what I understand. Can you imagine someone who writes like him on a public blog, being employed to teach a diverse group of people? This isn’t a new writing style for him, he always writes like this as if it’s cute. And he is not even good at it. At least if it was clever, but he doesn’t even rise to that level. He is more at the Trump level.

    Imagine if you attended a debate event and the participants talked like Tom. When it is completely unprovoked. What would your opinion be of them? I can’t imagine him thinking he deserves respect back.

  30. Alan Fox,

    Gracias! I still am figuring out some LaTeX things on here (like mathcal?), but the single dollar sign seems to have done the trick for general LaTeX. I was initially thrown off by the lack of preview (I assumed it wasn’t working).

  31. Alan Fox,

    Here? Initially? From whom?

    And besides, what does that even have to do with his critiquing a colleague’s paper? Is he competing with Lawrence Krauss to see who can be a bigger ass?

  32. Joe Felsenstein: Real fitness surfaces are much smoother than that. But No Free Lunch (as used by Dembski) averages them in with lots of other fitness surfaces that are just “white noise”, all of which have mutation be totally disastrous.

    Just a question for which I don’t have the answer, and which I haven’t seen asked:

    Do we have enough actual biological data to estimate what percentage of mutations are deleterious, neutral, and favorable, and if so, do we have enough data to estimate the relative magnitude of effects? It seems to me that such estimates, if based on observed facts, would provide the basis for a realistic model fitness surface.

  33. petrushka,

    In what, in bacteria? In humans?

    That is an impossibility. Deleterious when? Advantageous when, for whom? This is the whole problem with the fitness concept. Is being short an advantage or disadvantage? Fat, skinny, dumb, smart, psychotic… how can anyone say what these mutations might help or hurt?

    Existing is fitness. 1 or 0.

  34. Joe Felsenstein: Well, NFL allows the space to have structure, such as there being a space of DNA sequences. Then it talks about average behavior of a search algorithm over all possible ways you could assign a set of fitnesses to the sequences. On average your nearest neighbor is then just a random fitness randomly chosen from the whole set. So one step (one mutation) is typically totally disastrous.

    So I like how you put this. This is much more precise than my statement about “no structure.” I guess my point is that there is no real attempt to actually model the actual search space of biology.

  35. swamidass: So I like how you put this. This is much more precise than my statement about “no structure.” I guess my point is that there is no real attempt to actually model the actual search space of biology.

    I suggest you read “Arrival of the Fittest”.

  36. GeorgeMontanez: So let’s raise the bar on civility

    Now that is some practical advice we should all take. Life is too unpleasant already to add unpleasantness when it’s not necessary.

    peace

  37. phoodoo: That is an impossibility. Deleterious when? Advantageous when, for whom? This is the whole problem with the fitness concept. Is being short an advantage or disadvantage? Fat, skinny, dumb, smart, psychotic… how can anyone say what these mutations might help or hurt?

    How does it work in the immaterial world?
