Does gpuccio’s argument that 500 bits of Functional Information implies Design work?

Posted on May 21, 2018 by Joe Felsenstein

On Uncommon Descent, poster gpuccio has been discussing “functional information”. Most of gpuccio’s argument is a conventional “islands of function” argument. Not being very knowledgeable about biochemistry, I’ll happily leave that argument to others.

But I have been intrigued by gpuccio’s use of Functional Information, in particular gpuccio’s assertion that if we observe 500 bits of it, this is a reliable indicator of Design, as here, about at the 11th sentence of point (a):

… the idea is that if we observe any object that exhibits complex functional information (for example, more than 500 bits of functional information ) for an explicitly defined function (whatever it is) we can safely infer design.

I wonder how this general method works. As far as I can see, it doesn’t work. There would seem to be three possible ways of arguing for it, and in the end two don’t work and one is just plain silly. Which of these is the basis for gpuccio’s statement? Let’s investigate …

A quick summary

Let me list the three ways, briefly.

(1) The first is the argument using William Dembski’s (2002) Law of Conservation of Complex Specified Information. I have argued (2007) that this is formulated in such a way as to compare apples to oranges, and thus is not able to reject normal evolutionary processes as explanations for the “complex” functional information. In any case, I see little sign that gpuccio is using the LCCSI.

(2) The second is the argument that the functional information indicates that only an extremely small fraction of genotypes have the desired function, and the rest are all alike in totally lacking any of this function. This would prevent natural selection from following any path of increasing fitness to the function, and the rareness of the genotypes that have nonzero function would prevent mutational processes from finding them. This is, as far as I can tell, gpuccio’s islands-of-function argument. If such cases can be found, then explaining them by natural evolutionary processes would indeed be difficult. That is gpuccio’s main argument, and I leave it to others to argue with its application in the cases where gpuccio uses it. I am concerned here, not with the islands-of-function argument itself, but with whether the design inference from 500 bits of functional information is generally valid.

We are asking here whether, in general, observation of more than 500 bits of functional information is “a reliable indicator of design”. And gpuccio’s definition of functional information is not confined to cases of islands of function, but also includes cases where there would be a path along which function increases. In such cases, seeing 500 bits of functional information, we cannot conclude from this that it is extremely unlikely to have arisen by normal evolutionary processes. So the general rule that gpuccio gives fails, as it is not reliable.
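A toy sketch may make this concrete. It is not a model of any real protein, and nobody in this discussion proposed it: the function is simply assumed to be the number of 1s in a 500-bit string, a hypothetical choice made so that a path of increasing function exists. The names `reach_target` and `target_fi_bits` are likewise invented for illustration. Under that assumption the single all-1s string has 500 bits of functional information, yet one-bit mutations combined with selection against any loss of function reach it.

```python
import math
import random


def target_fi_bits(length: int, alphabet_size: int = 2, n_meeting: int = 1) -> float:
    """FI in bits when n_meeting of the alphabet_size**length strings meet the threshold."""
    # -log2(n / a^L), written as L*log2(a) - log2(n) to avoid tiny floats
    return length * math.log2(alphabet_size) - math.log2(n_meeting)


def reach_target(length: int = 500, max_steps: int = 200_000, seed: int = 1):
    """Toy mutation-plus-selection walk (assumed model, not a biological one).

    Function is assumed to be the count of 1s. Each step flips one random
    position; a flip that lowers the count is rejected, a flip that raises
    it is kept. Returns the number of steps taken to reach all 1s, or None.
    """
    rng = random.Random(seed)
    genome = [rng.randrange(2) for _ in range(length)]
    ones = sum(genome)
    for step in range(1, max_steps + 1):
        i = rng.randrange(length)
        if genome[i] == 0:  # mutation increases function: selection keeps it
            genome[i] = 1
            ones += 1
        # a 1 -> 0 mutation would lower function and is rejected
        if ones == length:
            return step
    return None


steps = reach_target()
print("FI of the all-1s target:", target_fi_bits(500), "bits")
print("steps needed in this toy run:", steps)
```

The point of the sketch is only that the number of bits, by itself, says nothing about whether such a path exists; whether real proteins have such paths is the separate islands-of-function question.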

(3) The third possibility is an additional condition that is added to the design inference. It simply declares that unless the set of genotypes is effectively unreachable by normal evolutionary processes, we don’t call the pattern “complex functional information”. It does not simply define “complex functional information” as a case where we can define a level of function that makes the probability of the set less than $2^{-500}$. That additional condition allows us to safely conclude that normal evolutionary forces can be dismissed — by definition. But it leaves the reader to do the heavy lifting, as the reader has to determine that the set of genotypes has an extremely low probability of being reached. And once they have done that, they will find that the additional step of concluding that the genotypes have “complex functional information” adds nothing to our knowledge. CFI becomes a useless add-on that sounds deep and mysterious but actually tells you nothing except what you already know. And there seems to be some indication that gpuccio does use this additional condition.

Let us go over these three possibilities in some detail. First, what is the connection of gpuccio’s “functional information” to Jack Szostak’s quantity of the same name?

Is gpuccio’s Functional Information the same as Szostak’s Functional Information?

gpuccio acknowledges that gpuccio’s definition of Functional Information is closely connected to Jack Szostak’s definition of it. gpuccio notes here:

Please, not[e] the definition of functional information as:

“the fraction of all possible configurations of the system that possess a degree of function >= Ex.”

which is identical to my definition, in particular my definition of functional information as the upper tail of the observed function, that was so much criticized by DNA_Jock.

(I have corrected gpuccio’s typo of “not” to “note”, JF)

We shall see later that there may be some ways in which gpuccio’s definition is modified from Szostak’s. Jack Szostak and his co-authors never attempted any use of his definition to infer Design. Nor did Leslie Orgel, whose Specified Information (in his 1973 book The Origins of Life) preceded Szostak’s. So the part about design inference must come from somewhere else.
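For readers who want the arithmetic, here is a minimal sketch of the upper-tail definition quoted above: take the fraction of configurations with function at or above the cutoff Ex, then take minus the base-2 logarithm. The activity values are made up for illustration, and the function names are hypothetical, not taken from Szostak or gpuccio.

```python
import math


def upper_tail_fraction(activities, threshold):
    """Fraction of configurations whose degree of function is >= threshold."""
    hits = sum(1 for a in activities if a >= threshold)
    return hits / len(activities)


def functional_information(fraction: float) -> float:
    """FI in bits for a given upper-tail fraction (must lie in (0, 1])."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log2(fraction)


# Made-up activity levels for eight configurations (illustrative only).
activities = [0.0, 0.0, 0.1, 0.2, 0.2, 0.5, 0.9, 1.0]

fi_low = functional_information(upper_tail_fraction(activities, 0.2))   # 5 of 8 qualify
fi_high = functional_information(upper_tail_fraction(activities, 0.9))  # 2 of 8 qualify
print(round(fi_low, 3), fi_high)

# Shortest sequence over a 20-letter alphabet whose whole space holds 500 bits,
# i.e. the length needed even if exactly one sequence has the function.
min_length = math.ceil(500 / math.log2(20))
print(min_length)
```

Note that nothing in this calculation asks whether configurations below the cutoff have zero function; raising the cutoff simply shrinks the fraction and raises the number of bits. On this definition, 500 bits corresponds to a fraction of $2^{-500}$.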

gpuccio seems to be making one of three possible arguments:

Possibility #1. That there is some mathematical theorem that proves that ordinary evolutionary processes cannot result in an adaptation that has 500 bits of Functional Information.

Use of such a theorem was attempted by William Dembski, in his Law of Conservation of Complex Specified Information, explained in Dembski’s book No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence (2002). But Dembski’s LCCSI theorem did not do what Dembski needed it to do. I have explained why in my own article on Dembski’s arguments (here). Dembski’s LCCSI changed the specification before and after evolutionary processes, and so he was comparing apples to oranges.

In any case, as far as I can see gpuccio has not attempted to derive gpuccio’s argument from Dembski’s, and gpuccio has not directly invoked the LCCSI, or provided a theorem to replace it. gpuccio said in a response to a comment of mine at TSZ,

Look, I will not enter the specifics of your criticism to Dembski. I agre with Dembski in most things, but not in all, and my arguments are however more focused on empirical science and in particular biology.

While thus disclaiming that the argument is Dembski’s, on the other hand gpuccio does associate the argument with Dembski here by saying that

Of course, Dembski, Abel, Durston and many others are the absolute references for any discussion about functional information. I think and hope that my ideas are absolutely derived from theirs. My only purpose is to detail some aspects of the problem.

and by saying elsewhere that

No generation of more than 500 bits has ever been observed to arise in a non design system (as you know, this is the fundamental idea in ID).

That figure being Dembski’s, this leaves it unclear whether gpuccio is or is not basing the argument on Dembski’s. But gpuccio does not directly invoke the LCCSI, or try to come up with some mathematical theorem that replaces it.

So possibility #1 can be safely ruled out.

Possibility #2. That the target region in the computation of Functional Information consists of all of the sequences that have nonzero function, while all other sequences have zero function. As there is no function elsewhere, natural selection for this function then cannot favor sequences closer and closer to the target region.

Such cases are possible, and usually gpuccio is talking about cases like this. But gpuccio does not require them in order to have Functional Information. gpuccio does not rule out that the region could be defined by a high level of function, with lower levels of function in sequences outside of the region, so that there could be paths allowing evolution to reach the target region of sequences.

An example in which gpuccio recognizes that lower levels of function can exist outside the target region is found here, where gpuccio is discussing natural and artificial selection:

Then you can ask: why have I spent a lot of time discussing how NS (and AS) can in some cases add some functional information to a sequence (see my posts #284, #285 and #287)

There is a very good reason for that, IMO.

I am arguing that:

1) It is possible for NS to add some functional information to a sequence, in a few very specific cases, but:

2) Those cases are extremely rare exceptions, with very specific features, and:

3) If we understand well what are the feature that allow, in those exceptional cases, those limited “successes” of NS, we can easily demonstrate that:

4) Because of those same features that allow the intervention of NS, those scenarios can never, never be steps to complex functional information.

Jack Szostak defined functional information by having us choose a cutoff level of function, which picks out the set of sequences with function at or above that level, without any condition that the other sequences have zero function. Neither did Durston impose such a condition. And as we’ve seen, gpuccio associates his argument with theirs.

So this second possibility could not be the source of gpuccio’s general assertion about 500 bits of functional information being a reliable indicator of design, however much gpuccio concentrates on such cases.

Possibility #3. That there is an additional condition in gpuccio’s Functional Information, one that does not allow us to declare it to be present if there is a way for evolutionary processes to achieve that high a level of function. In short, if we see 500 bits of Szostak’s functional information, and if it can be put into the genome by natural evolutionary processes such as natural selection then for that reason we declare that it is not really Functional Information. If gpuccio is doing this, then gpuccio’s Functional Information is really a very different animal than Szostak’s functional information.

Is gpuccio doing that? gpuccio does associate his argument with William Dembski’s, at least in some of his statements. And William Dembski has defined his Complex Specified Information in this way, adding the condition that it is not really CSI unless it is sufficiently improbable that it be achieved by natural evolutionary forces (see my discussion of this here in the section on “Dembski’s revised CSI argument”, which refers to Dembski’s statements here). And Dembski’s added condition renders use of his CSI a useless afterthought to the design inference.

gpuccio does seem to be making a similar condition. Dembski’s added condition comes in via the calculation of the “probability” of each genotype. In Szostak’s definition, the probabilities of sequences are simply their frequencies among all possible sequences, with each being counted equally. In Dembski’s CSI calculation, we are instead supposed to compute the probability of the sequence given all evolutionary processes, including natural selection.

gpuccio has a similar condition in the requirements for concluding that complex functional information is present. We can see it at step (6) here:

If our conclusion is yes, we must still do one thing. We observe carefully the object and what we know of the system, and we ask if there is any known and credible algorithmic explanation of the sequence in that system. Usually, that is easily done by excluding regularity, which is easily done for functional specification. However, as in the particular case of functional proteins a special algorithm has been proposed, neo darwininism, which is intended to explain non regular functional sequences by a mix of chance and regularity, for this special case we must show that such an explanation is not credible, and that it is not supported by facts. That is a part which I have not yet discussed in detail here. The necessity part of the algorithm (NS) is not analyzed by dFSCI alone, but by other approaches and considerations. dFSCI is essential to evaluate the random part of the algorithm (RV). However, the short conclusion is that neo darwinism is not a known and credible algorithm which can explain the origin of even one protein superfamily. It is neither known nor credible. And I am not aware of any other algorithm ever proposed to explain (without design) the origin of functional, non regular sequences.

In other words, you, the user of the concept, are on your own. You have to rule out that natural selection (and other evolutionary processes) could reach the target sequences. And once you have ruled it out, you have no real need for the declaration that complex functional information is present.

I have gone on long enough. I conclude that the rule that observing 500 bits of functional information allows us to conclude in favor of Design (or at any rate, to rule out normal evolutionary processes as the source of the adaptation) is simply nonexistent. Or if it does exist, it is as a useless add-on to an argument that draws that conclusion for some other reason, leaving the really hard work to the user.

Let’s end by asking gpuccio some questions:
1. Is your “functional information” the same as Szostak’s?
2. Or does it add the requirement that there be no function in sequences that are outside of the target set?
3. Does it also require us to compute the probability that the sequence arises as a result of normal evolutionary processes?

1,971 thoughts on “Does gpuccio’s argument that 500 bits of Functional Information implies Design work?”

Mung on July 3, 2018 at 8:51 pm said:

DNA_Jock: You still don’t understand FI.

I gave Rumraket an example where I calculated FI. I didn’t need to calculate the FI for any specific sequence to do that. All I needed to know was whether its level of function was at or above the threshold.

How did I manage to get my calculation of FI right? Was it a miracle? And Rumraket agreed with me. Another miracle?

You’ll need to explain how I went wrong when I calculated FI for an actual example to convince me that I don’t understand FI.
newton on July 3, 2018 at 9:02 pm said:

colewd:
newton,

I am pointing to a mechanism that we know from analogy that can create FI.

Ok, just sounded like you were smuggling in a mechanistic explanation for ID.
newton on July 3, 2018 at 9:04 pm said:

Mung: You’ll need to explain how I went wrong when I calculated FI for an actual example to convince me that I don’t understand FI.

Would you be able to understand the example?
Rumraket on July 3, 2018 at 9:16 pm said:

Mung: When calculating FI you have to consider all possible sequences

Sure, you take the fraction of all possible sequences that meet the minimum threshold of function. That’s n in the equation I wrote.

It’s -log2(n/a^L)

a^L is all possible sequences.
n is the total number of sequences that meet the minimum function.

It’s super simple. Hazen & Szostak give a simple, straightforward example that shows this in their paper, which I quote here:

“Functional information is determined by identifying the fraction of all sequences that achieve a specified outcome.

Consider, for example, sequences of 10 letters that have a high probability (Ex ≅ 1) of evoking a positive response from the fire department. Such sequences might include “FIREONMAIN,” “MAINSTFIRE,” or “MAPLENMAIN.” Additionally, some messages containing phonetic misspellings (FYRE or MANE), mistakes in grammar or usage (FIREOFMAIN), or typing errors (MAZLE or NAPLE) may also yield a significant but lower probability of response (0 ≪ Ex ˂ 1). Given these variants, on the order of 1,000 combinations of 10 letters might initiate a rapid response to the approximate location of the fire. Thus
I(1) ≈ -log2[1000/(26^10)] ≈ 36 bits.”

Look at that last equation there. 1000 sequences meet the minimum threshold, so n=1000. The total sequence space for sequences 10 letters long over the English alphabet is a^L = 26^10.
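As a quick check on the formula, here is a minimal sketch that plugs in the numbers from the quoted example. The helper name `fi_from_counts` is made up for illustration; n, a and L are as defined above.

```python
import math


def fi_from_counts(n: int, a: int, L: int) -> float:
    """-log2(n / a**L): n sequences meet the threshold, alphabet size a, length L."""
    # Written as L*log2(a) - log2(n) so that a**L never has to be formed as a float.
    return L * math.log2(a) - math.log2(n)


fire_fi = fi_from_counts(n=1000, a=26, L=10)   # the fire-department example
binary_fi = fi_from_counts(n=2, a=2, L=8)      # two of 256 eight-bit strings qualify
print(round(fire_fi, 2), binary_fi)
```

With these inputs the expression evaluates to about 37.04 bits, a little above the rounded figure in the quoted passage; the second call gives exactly 7 bits for a case where 2 of 256 binary strings meet the threshold.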

If all possible sequences fail to meet the minimum threshold for function (which is the case that Joe Felsenstein defined), then it isn’t possible for mutation to bring a sequence up to meet it. After all, that sequence it changes into will be part of the set of all possible sequences.

On another note I want to learn how to use that LaTeX stuff so I can write equations properly. Anyone know a good tutorial for how to use it on this website?
Joe Felsenstein on July 3, 2018 at 9:22 pm said:

Mung: I gave Rumraket an example where I calculated FI. I didn’t need to calculate the FI for any specific sequence to do that. All I needed to know was whether its level of function was at or above the threshold.

Sure, once you have some threshold level of function, you can do that.

So for each sequence, you can find its level of function, and from that compute an FI for that sequence. Carothers, Oestreich, Davis, and Szostak (2004) do that for GTP-binding RNAs. They tested 11 naturally-occurring RNAs — for each they of course could not test all possible other sequences. They used a large number of mutated sequences made in vitro instead. They computed the FIs of the 11 sequences.

As for Mung’s example computation … alas, I haven’t found it yet. How about a link, or the date and time of the comment?
Corneel on July 3, 2018 at 9:23 pm said:

colewd: I agree with Mung that once we define a function the FI is already accounted for. Mutation cannot create FI it can only through serendipity stumble on to the FI that exists in sequence space for the specifically defined function.

It is unclear to me where exactly Mung’s train of thought goes off the rails, but you don’t want to follow his lead. If all FI already exists in sequence space and any change to a sequence is already accounted for, and this somehow prevents mutation from ever “creating” any FI (what the hell does all that even mean!?), then neither can the Designer ever create any FI by changing a sequence. I bet you didn’t see that one coming, did you?

colewd: Mutation cannot create FI it can only through serendipity stumble on to the FI that exists in sequence space for the specifically defined function.

Oh, and will you please never ever type anything like this again? A little less wool and more clarity, if you please.
keiths on July 3, 2018 at 9:30 pm said:

Corneel, to Bill:

Oh, and will you please never ever type anything like this again? A little less wool and more clarity, if you please.

Heh.

Clarity, whether of thought or expression, is not a Bill thing.
Rumraket on July 3, 2018 at 10:00 pm said:

colewd: Mutation cannot create FI it can only through serendipity stumble on to the FI that exists in sequence space for the specifically defined function.

Then the same would be true for design. Either way, if you say the FI is inherent to sequence space itself for the function in question, then creating a functional sequence does not create FI whether it is by design or by evolution.

You could then claim that sequence space itself was designed (which is basically what Dembski et al do), but then you can’t say that evolution can’t navigate that sequence space merely from the fact that some particular ensemble of sequences exhibit 500 bits or more of FI.

You keep making all sorts of strange ad-hoc rules up for why evolution magically doesn’t count, and they keep coming back and biting you in the butt.
Joe Felsenstein on July 3, 2018 at 10:22 pm said:

Rumraket: On another note I want to learn how to use that LaTeX stuff so I can write equations properly. Anyone know a good tutorial for how to use it on this website?

Mostly, just enclose the LaTeX expressions in a pair of dollar signs. One “hidden gotcha”: when you Edit your post, any expression that starts out with a single backslash in it will have the backslash removed when the edit is Saved. So you need to make the single backslash into two backslashes before the Save. And do that again in each Edit session.
DNA_Jock on July 3, 2018 at 11:10 pm said:

Mung: I gave Rumraket an example where I calculated FI. I didn’t need to calculate the FI for any specific sequence to do that. All I needed to know was whether its level of function was at or above the threshold.

How did I manage to get my calculation of FI right? Was it a miracle? And Rumraket agreed with me. Another miracle?

You’ll need to explain how I went wrong when I calculated FI for an actual example to convince me that I don’t understand FI.

You arrived at a number that was correct, while mangling the underlying concepts badly. And, along the way, you did calculate the FI for two specific sequences. Furthermore you failed to notice what the FI was for the remaining 254 sequences, perhaps because it was zero.
I will explain.
Your wonderfully non-biological setup was as follows.
N = 8, alphabet = 0 and 1.
256 possible sequences.
You then wrote

Minimum degree of function is all characters in the sequence are the same

This conflates two separate issues: the definition of “function” and a threshold level of activity of that function. Your goal was to have only 00000000 and 11111111 qualify as ‘above threshold’. For rhetorical reasons, you defined a dichotomous function, and set the threshold to “True”. Note that you could have defined the function as “frequency of the most prevalent character” and set your threshold to 8. Either version yields the same result in terms of the FI for 00000000 and 11111111: each sequence has an FI of 7 bits (-log2(2/256)). Cool. Note, however, that the concept of “minimum degree of function” has nothing to do with FI whatsoever; it is a concept introduced during arguments about islands of function and what’s selectable.
Also note that, for your “dichotomous” function, the other 254 sequences have FI = 0. For the non-dichotomous function, there are 16 sequences with an activity level of 7, thus the FI of 00010000 would be -log2((2+16)/256) = 3.83 bits, 56 sequences with a score of 6 and an FI of 1.79 bits, 112 sequences with a score of 5 and FI of 0.46 bits, and finally 70 sequences (e.g. 11110000) with a score of 4 and an FI of zero.
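These figures can be checked by brute force. The sketch below enumerates all 256 eight-character binary sequences, scores each by the frequency of its most prevalent character, and computes the FI at each score level as minus log2 of the fraction of sequences scoring at least that high. The names `score` and `fi_at` are illustrative only.

```python
import math
from collections import Counter
from itertools import product


def score(seq) -> int:
    """Activity level: frequency of the most prevalent character in the sequence."""
    return max(Counter(seq).values())


sequences = list(product("01", repeat=8))          # all 256 sequences
level_counts = Counter(score(s) for s in sequences)


def fi_at(level: int) -> float:
    """FI in bits for sequences whose score is at or above the given level."""
    n = sum(count for lvl, count in level_counts.items() if lvl >= level)
    return math.log2(len(sequences) / n)           # same as -log2(n / 256)


for level in sorted(level_counts, reverse=True):
    print(level, level_counts[level], round(fi_at(level), 2))
```

The loop reports 2, 16, 56, 112 and 70 sequences at scores 8 down to 4, with FI values of 7.0, 3.83, 1.79, 0.46 and 0.0 bits respectively.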

FI is a function of activity level, given the context of a sequence space and a specification of “function”. It’s merely the log of the percentrank.
Your insistence on dichotomous activity levels represents a failure to comprehend. Granted, the threshold inequality is dichotomous, but the activity levels it acts upon are not necessarily. Insisting that they too are dichotomous reduces FI to something rather trivial.
I’m sure that wasn’t your intent.
DNA_Jock on July 3, 2018 at 11:34 pm said:

Mung: In my opinion, being dealt a hand of poker, being able to toss away some cards you don’t like and drawing replacement cards that may or may not improve your hand is bound to be a flawed analogy for how to calculate FI.

In my opinion, being dealt a hand, discarding the cards that don’t help and replacing them with random cards is an absolutely awesome analogy for how FI can change, especially if you get to draw multiple times. I understand precisely why you disagree. 😉
Joe Felsenstein on July 3, 2018 at 11:36 pm said:

OK, thanks DNA_Jock, and especially for including the link. So if two of the 8-bit-long sequences have function X (which is greater than 0) and the rest have function 0, those two have FI of $-\log_2(2/256) = 7$ bits. And the rest have FI of 0 bits.

So in that example an FI can be calculated for each possible sequence.
Rumraket on July 4, 2018 at 12:40 am said:

Joe Felsenstein: Mostly, just enclose the LaTeX expressions in a pair of dollar signs. One “hidden gotcha”: when you Edit your post, any expression that starts out with a single backslash in it will have the backslash removed when the edit is Saved. So you need to make the single backslash into two backslashes before the Save. And do that again in each Edit session.

Thanks, seems to work.
colewd on July 4, 2018 at 4:01 am said:

Corneel,

prevents mutation from ever “creating” any FI (what the hell does all that even mean!?),

If the system is in place the FI already exists for that protein. If it is the last gene generated for that system the system determines the functional sequences not the gene itself. The required sequences are pre determined.

If the search space is very small and the mutation rate is very high then a random search can work. An example is the adaptive immune system. Still in this case the functional sequence is pre determined by the pathogen it needs to bind to.

FI is not created step by step but is part of an overall system design. In a written story a single paragraph means very little on its own but it is an interdependent part of the overall story. A U3 protein is a tiny piece of a very large complex system. Here a single protein, and its sequence, can be part of a system that requires hundreds of proteins.

I believe that this is the message that Mung has been trying to get across regarding measuring functional information. It’s not always a straight forward process and the function of a individual protein cannot be measured independently of the system it is a part of.
keiths on July 4, 2018 at 5:59 am said:

Bill,

If the search space is very small and the mutation rate is very high then a random search can work.

As has been explained to you hundreds of times, evolution does not proceed by random search.

What is your stumbling block here? Are you truly incapable of grasping this, or is it a manifestation of the Jebus Effect?
Alan Fox on July 4, 2018 at 7:25 am said:

colewd:
Alan Fox,

Your faith is to be admired

Are you claiming that energy mass production came before the first replicators?

There were energy sources prior to life appearing on Earth so that would seem a reasonable proposition.

Gpuccio has supported it [islands of function in a sea of non-function] well with his empirical evidence including a number of highly preserved sequences over deep time. We see lots of sequences at local optimums where many AA substitutions lead to purifying selection.

All I have seen are assertions. What evidence?

You are asserting that the tiny islands don’t ever exist. Any support here or should we duly note your assertion?

*Chuckles*

Have you no shame, Bill! 🙂

Gpuccio gets a free pass on the assertion that function is rare and I have to support the assertion it is not?

OK

For starters:

Keefe and Szostak (PDF)

Now your turn – some evidence that functional proteins are rare in sequence space.
Alan Fox on July 4, 2018 at 7:34 am said:

I’m also no longer wondering about non-functional DNA. Another prediction of mine bites the dust :- “Presumably, by the same token, Bill and Guiseppe think non-functional DNA really is all junk.”

Nope! Gpuccio has great faith that there is much function to be discovered in non-coding DNA.
Rumraket on July 4, 2018 at 9:19 am said:

colewd: If the system is in place the FI already exists for that protein.

Uhh right, if the protein with the function of interest already exists, then the FI for that protein already exists. Trivial.

If it is the last gene generated for that system the system determines the functional sequences not the gene itself. The required sequences are pre determined.

What does this even mean? You are blathering as usual.

If the search space is very small and the mutation rate is very high then a random search can work.

The high mutation rate in the case of mutating antibodies is required not because functional “solutions” couldn’t emerge without it, but because they have to happen fast enough to get rid of a potentially infectious foreign agent before it spreads throughout the body.

An example is the adaptive immune system. Still in this case the functional sequence is pre determined by the pathogen it needs to bind to.

Any functional sequence is determined by its function in its environment. Is it water or fat soluble? How much salt is present? What is the temperature range in which it works? Does it bind a substrate and if so what is the size and shape of that substrate? All these factors and more influence any given sequence for a functional biological polymer. That doesn’t mean the FI for a given yet-to-exist function somehow exists out there already in some platonic realm. That’s absolutely ridiculous and nonsensical. Molecules don’t have “FI souls” floating around in the ether waiting to associate with their atoms.

FI is a human creation. It is us who decides to take $-\log_2$ of the fraction of all sequences that meet some threshold for function. This equation doesn’t somehow infuse sequence space, or protein or nucleotide polymers, or “pre exist” in some nebulous immaterial realm for yet-to-physically-exist functions.

FI is not created step by step but is part of an overall system design.

You are just blathering and saying random shit again Bill.

In a written story a single paragraph means very little on its own but it is an interdependent part of the overall story. A U3 protein is a tiny piece of a very large complex system. Here a single protein, and its sequence, can be part of a system that requires hundreds of proteins.

And yet you have been completely unable to rule out that such a system of interdependence could have emerged gradually, one loosely associating component at a time.

I believe that this is the message that Mung has been trying to get across regarding measuring functional information.

I don’t believe that is the message Mung has been trying to get across regarding measuring functional information. At all.

It’s not always a straight forward process and the function of a individual protein cannot be measured independently of the system it is a part of.

Ahh so now you can’t measure the FI for the beta subunit of F-type ATP synthases. And Mung has shown we can’t sum FI for individual components to arrive at a total FI for a whole, so I guess that’s all of your and Gpuccio’s FI-gibberish out the door since the calculation can’t even be done nor would it give a meaningful result according to you.

Look Bill, you’re honestly just making stuff up and completely out of your depth here. I’m sorry but there is no other way of putting it and I’m tired of reading you just blurting out incoherent nonsense that you make up as you type out your posts. And it’s sad to see how the people whose output you trust and rely on so much (“Mung, what do you think?”) don’t make any effort to correct your endless misunderstandings. The most charitable interpretation I can give that is that they too have realized that it probably couldn’t be done.

What’s worse is what it says about the kind of thought-processes that must operate in your head when you write your posts. Apparently you feel that you can basically just type out anything and expect that someone on “your side” will come to your aid and agree with you. That to you it is okay to just make stuff up because you feel that you can expect someone else to have your back no matter how wrong or incoherent you are. You would agree with any made up nonsense Gpuccio could come up with, and you seem to expect the same thing back. You don’t have to make sense, you just have to keep arguing and keep up the appearance that a genuine debate is taking place.

It is as if you have this need to keep posting. “omg I better say something back” and you are wholly unconcerned with whether it is true or even whether it makes sense. If in doubt about what to say, just make something up. It doesn’t matter what, or even that it completely undermines and contradicts something you wrote earlier. As long as you can stick your new favorite technical terms and abbreviations in there it’s got the appearance of being scientific, and that’s all it needs, appearance. The less coherent and more clueless version of Salvador Cordova.
Corneel on July 4, 2018 at 9:37 am said:

colewd: I believe that this is the message that Mung has been trying to get across regarding measuring functional information. It’s not always a straight forward process and the function of a individual protein cannot be measured independently of the system it is a part of.

You are reading too much of your own interpretation in Mung’s words. For one thing, functional information is just a measure of system complexity, so I actually agree with Mung that it cannot be literally “created” or “stumbled upon”. What gets created is a system configuration that is capable of fulfilling some specified function (or not). When we say that FI is “created”, “introduced” or “increased” we are just using shorthand phrases to express that.

A configuration is not complete until all the little jigsaw pieces are in place, and that means that the FI associated with it is not “already there” when we still need to add a protein/AA/gene/whatever. To spell it out clearly: even when only a single specific element is required for completing a configuration with the maximal possible degree of function (Emax), if the present configuration doesn’t work yet, its FI is zero. Tough beans. Only when you complete the functional configuration do you reach the Emax and calculate the associated FI. That quantity counts for the whole configuration. Smuggling in bits and pieces in “transitions”, like gpuccio and you do, is not allowed. When I confronted Mung with that, he did not disagree.

Mung’s confusion runs way deeper than that. He is telling us that, because the degree of function of every conceivable sequence is already implicitly present in the calculation of FI, changing a sequence never gains or loses us any FI. If Mung is correct, that implies that when the Designer creates a functional sequence by modifying some genomic sequence, this does not affect the FI of that sequence. Imagine my surprise when you actually agreed with that 🙂
Corneel on July 4, 2018 at 10:20 am said:

Alan Fox: I’m also no longer wondering about non-functional DNA. Another prediction of mine bites the dust :- “Presumably, by the same token, Bill and Guiseppe think non-functional DNA really is all junk.”

Nope! Gpuccio has great faith that there is much function to be discovered in non-coding DNA.

To repeat a rebuke that I myself once received*: Do not confuse non-coding with non-functional. All non-functional DNA is junk by definition, but non-coding DNA can be functional.

I suspect that gpuccio will dress up his claims by saying that, although function is as rare in DNA sequence space as it is in protein sequence space, all the human sequences are clearly functional.

Hey, what are the odds?!? The FI of the human genome must be HUGE!

ETA: * Yes, it was John.
colewd on July 4, 2018 at 3:50 pm said:

Alan Fox,

There were energy sources prior to life appearing on Earth so that would seem a reasonable proposition.

What do you think the first molecular energy source looked like?

All I have seen are assertions. What evidence?

Let’s start with the evidence you cited. This island that only binds one molecule sitting on the earth has a diameter of about 3 ft, which is not enough room for you and me to share a bottle of French wine 🙂
colewd on July 4, 2018 at 4:10 pm said:

Rumraket,

Uhh right, if the protein with the function of interest already exists, then the FI for that protein already exists. Trivial.

If the system is in place, not the protein. Example: All components of ATP synthase are in place but the beta chain. The sequence must form a protein that will finish the structure and allow it to function.

Any functional sequence is determined by its function in its environment.

A point of agreement 🙂

And yet you have been completely unable to rule out that such a system of interdependence could have emerged gradually, one loosely associating component at a time.

I have not been able to rule it out but the evidence has 🙂

Ahh so now you can’t measure the FI for the beta subunit of F-type ATP synthases.

Sure we can. We are measuring when the specific sequence causes system failure.
colewd on July 4, 2018 at 4:24 pm said:

Corneel,

You are reading too much of your own interpretation in Mung’s words. For one thing, functional information is just a measure of system complexity, so I actually agree with Mung that it cannot be literally “created” or “stumbled upon”. What gets created is a system configuration that is capable of fulfilling some specified function (or not). When we say that FI is “created”, “introduced” or “increased” we are just using shorthand phrases to express that.

We mostly agree here. What gpuccio is measuring is when change to the FI causes system failure. The more often this happens the higher the FI for that system component. An indirect measurement but the only one on the table at this point.

If Mung is correct, that implies that when the Designer creates a functional sequence by modifying some genomic sequence, this does not affect the FI of that sequence. Imagine my surprise when you actually agreed with that

So that’s probably not the design strategy 🙂
colewd on July 4, 2018 at 4:47 pm said:

keiths,

As has been explained to you hundreds of times, evolution does not proceed by random search.

Think it possible that you may be mistaken?
Mung on July 4, 2018 at 4:50 pm said:

keiths: Clarity, whether of thought or expression, is not a Bill thing.

Charity, whether of thought or expression, is not a keiths thing.
Mung on July 4, 2018 at 4:53 pm said:

Corneel: The FI of the human genome must be HUGE!

What is the function of the human genome and how do we measure it?
Mung on July 4, 2018 at 5:10 pm said:

Corneel: For one thing, functional information is just a measure of system complexity, so I actually agree with Mung that …

Nice to be able to agree on something. 🙂

Only when you complete the functional configuration you reach the Emax and calculate the associated FI. That quantity counts for the whole configuration.

And what is the point then of calculating the FI for each individual sequence? It’s utterly superfluous. Can we also agree on that?

Mung’s confusion runs way deeper than that. He is telling us that, because the degree of function of every conceivable sequence is already implicitly present in the calculation of FI, changing a sequence never gains or loses us any FI.

It’s not entirely clear where you think my confusion lies.

Do you think that it’s in my claim that the sequences in “n” are a subset of the sequences in “a^L”? So unless your “mutation” adds a new letter to the existing alphabet or introduces a longer or shorter sequence, that sequence is already present in the denominator.

The only decision is whether to include it in the numerator as well, and that decision is based on its degree of function and whether it meets or exceeds the minimum threshold.

This really is quite simple and straight-forward and it is a mystery to me why so many people are confused about it.
Mung on July 4, 2018 at 5:18 pm said:

keiths: As has been explained to you hundreds of times, evolution does not proceed by random search.

Evolution proceeds by design.
Mung on July 4, 2018 at 5:35 pm said:

Joe Felsenstein: So for each sequence, you can find its level of function, and from that compute an FI for that sequence.

What is the point of calculating FI for each individual sequence?

Joe Felsenstein: So in that example an FI can be calculated for each possible sequence.

So? What’s the point of doing that?

So if two of the 8-bit-long sequences have function X (which is greater than 0) and the rest have function 0

No Joe. Just because they fail to meet the minimum threshold it does not follow that they have no degree of function at all.

And the rest have FI of 0 bits.

Please show your math.
Mung on July 4, 2018 at 5:54 pm said:

DNA_Jock: You arrived at a number that was correct, while mangling the underlying concepts badly.

So a miracle then. 🙂

I must have understood something about FI to pull that off. What do you suppose it is that I actually do understand about FI?

And, along the way, you did calculate the FI for two specific sequences.

No, I calculated the FI for the system. Of what relevance is the FI of each individual sequence? I see none.

Furthermore you failed to notice what the FI was for the remaining 254 sequences, perhaps because it was zero.

I “failed to notice” it because it’s irrelevant to the calculation of the FI of the system.

Please show your math for how you arrive at an FI of 0 for the remaining sequences.

more later. nice post, btw.
OMagain on July 4, 2018 at 6:32 pm said:

Mung: No Joe. Just because they fail to meet the minimum threshold it does not follow that they have no degree of function at all.

What function do they have then?
OMagain on July 4, 2018 at 7:21 pm said:

Mung: What is the function of the human genome and how do we measure it?

According to FMM it is to worship/praise a deity. I forget the exact phrase, I think it was a quote however….

How we measure it, you’ll have to ask him I’m afraid.

But presumably we’re evolving towards peak prayer, right?
dazz on July 4, 2018 at 7:31 pm said:

Mung: What is the point of…

I wonder what is the point in arguing with knuckleheads like you and especially Billy.

Your interpretation of FI is fucking stupid.
And you really think you’re in a position to lecture Joe about it? heh
dazz on July 4, 2018 at 7:33 pm said:

Mung: Evolution proceeds by design.

in your dreams
Mung on July 4, 2018 at 8:24 pm said:

dazz: Your interpretation of FI is fucking stupid.

As if you even know what my “interpretation” of FI is. LoL.
Rumraket on July 4, 2018 at 9:27 pm said:

colewd:
Rumraket: And yet you have been completely unable to rule out that such a system of interdependence could have emerged gradually, one loosely associating component at a time.

colewd: I have not been able to rule it out but the evidence has

What evidence? I have asked you several times how you know the shape of the fitness landscape, or from what spot in sequence space extant sequences evolved, and you have just blathered. The fact is you don’t know that, and the only kind of evidence we have is from comparative genetics, which indicates the diametric opposite of what you claim.

Rumraket: Ahh so now you can’t measure the FI for the beta subunit of F-type ATP synthases.

colewd: Sure we can. We are measuring when the specific sequence causes system failure.

What the hell are you blathering about now? This is just another one in an incredibly long line of you just talking out of your arse.

Bill, that’s not how you calculate FI. There is no part of Hazen & Szostak’s method that has anything to do with “measuring when the specific sequence causes system failure”. What does that even mean? Please show me where in the equation you stick that information. What part of the equation contains “when the specific sequence causes system failure”?
J-Mac on July 4, 2018 at 9:33 pm said:

Can someone please tell me what is so appealing to both sides of the issue about the Grupcio 500?
I’m not a mathematician, but I know where to draw the line …of nonsense and reality…
Rumraket on July 4, 2018 at 9:33 pm said:

colewd: What gpuccio is measuring is when change to the FI causes system failure.

Where the hell has he measured this, and what does that even say about a possible ancestral state from which extant systems could have evolved?

Bill, again. How do you know which one of these hills the sequences we see today sit on?

And more importantly, how do you know that the extant state of any given system did not evolve from the base of a hill like C or D? You can’t know that simply by looking at the degree of conservation. All that tells you is that most changes have lower fitness, but that doesn’t tell you that those states of lower comparative fitness could not have functioned and been favored against a background of even lower average fitness for the population.
colewd on July 4, 2018 at 11:11 pm said:

Rumraket,

What does that even mean? Please show me where in the equation you stick that information.

The denominator you already know; the numerator is the non-preserved AAs over deep time.
dazz on July 4, 2018 at 11:13 pm said:

Mung: As if you even know what my “interpretation” of FI is. LoL.

I do, and it’s bullshit. You clearly misunderstand what the authors mean when they say that FI is dependent on the ensemble, drawing from that that it doesn’t make sense to calculate FI for (the level of function of) some particular sequence.

You’ve been told why you’re wrong countless times, to no avail.
colewd on July 4, 2018 at 11:20 pm said:

dazz,

I do

and it’s bullshit

Is not an argument.
J-Mac on July 4, 2018 at 11:29 pm said:

colewd: dazz,

I do

and it’s bullshit

Is not an argument.

And BS continues… as it should on the BS OP…
It boggles my mind but, hey! i’m different…I guess…
colewd on July 4, 2018 at 11:32 pm said:

Rumraket,

And more importantly, how do you know that the extant state of any given system did not evolve from the base of a hill like C or D? You can’t know that simply by looking at the degree of conservation.

The large quantity of conserved sequences plus the family members with different sequences that are also preserved is giving us a pretty big clue that we are looking at a small pimple surrounded by a very large ocean. Especially where we see 90% preservation over 100 million years and short generation times.
colewd on July 4, 2018 at 11:35 pm said:

J-Mac,

And BS continues… as it should on the BS OP…
It boggles my mind but, hey! i’m different…I guess…

If you enjoy having your mind boggled, it’s all good 🙂
Joe Felsenstein on July 5, 2018 at 3:10 am said:

Mung:

Joe Felsenstein: So for each sequence, you can find its level of function, and from that compute an FI for that sequence.

What is the point of calculating FI for each individual sequence?

We may be interested in how much information is present in naturally occurring molecular sequences, as compared to random sequences from the set of all possible sequences.

Joe Felsenstein: So in that example an FI can be calculated for each possible sequence.

So? What’s the point of doing that?

To get the idea of how far out into the tail of the distribution of function of all possible sequences are the ones that are actually seen in a species.

[me:]

So if two of the 8-bit-long sequences have function X (which is greater than 0) and the rest have function 0

No Joe. Just because they fail to meet the minimum threshold it does not follow that they have no degree of function at all.

True enough. I was supposing. So let me simply suppose instead that the rest have, not zero level of function, but function $Y$, where $Y < X$. Then it is still true that, as I said:

And the rest have FI of 0 bits.

Please show your math.

Gladly. I thought I had made all that clear but here is the math:

(a) 2 sequences have level $X$ of function.
(b) The remaining 254 of them have function $Y < X$.
(c) To compute the FI for the two, we use the formula $-\log_2(n / 256)$, which, since $n = 2$, is 7 bits.
(d) To compute the FI for the rest, we note that $254 + 2 = 256$ have function greater than or equal to $Y$ (i.e., all of them do). Then, applying the above formula with $n = 256$, we get $-\log_2(256/256) = 0$ bits.
(e) … and that is true whether $Y = 0$ or $Y$ is any other value that is less than $X$.
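
Steps (a) through (e) can be checked with a few lines of Python. This is only a sketch: the function name `functional_information`, the list `levels`, and the particular numbers chosen for $X$ and $Y$ are illustrative assumptions, not part of the original calculation. The only thing taken from the steps above is the formula $-\log_2(n/256)$, where $n$ counts the sequences whose function is at least the chosen level.

```python
import math

def functional_information(levels, threshold):
    """FI in bits for a given level of function.

    levels    -- the level of function of every possible sequence
    threshold -- the level of function being asked about
    Computes -log2(n / N), written as log2(N / n), where n is the number
    of sequences with function >= threshold and N is the total number.
    """
    n = sum(1 for level in levels if level >= threshold)
    return math.log2(len(levels) / n)

# Hypothetical values: any X and Y with Y < X give the same answer.
X, Y = 1.0, 0.25

# All 256 possible 8-bit sequences: 2 have function X, 254 have function Y.
levels = [X] * 2 + [Y] * 254

fi_top = functional_information(levels, X)   # step (c): n = 2
fi_rest = functional_information(levels, Y)  # step (d): n = 256

print(fi_top, fi_rest)
```

Changing `Y` to 0, or to any other value below `X`, leaves both results unchanged, which is point (e).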

That’s my calculation. May I assume that yours is the same as part (c), except that you call it the FI for the “system” instead of for those two sequences? If so, what does that gain you? Or do you have some other computation?
Alan Fox on July 5, 2018 at 8:06 am said:

colewd,

Thanks for the correction. Haste makes sloppy comments! 🙁
Alan Fox on July 5, 2018 at 8:09 am said:

colewd: What do you think the first molecular energy source looked like?

I’m guessing sulphur or iron as an electron donor.
Alan Fox on July 5, 2018 at 8:17 am said:

colewd: Let’s start with the evidence you cited. This island that only binds one molecule sitting on the earth has a diameter of about 3 ft, which is not enough room for you and me to share a bottle of French wine

See, this is the error (yours and gpuccio’s) writ large. Szostak looked for ATP binding. He didn’t look for any other functionality. Yet you blithely assume that the proteins he found with ATP binding activity have only that useful property.

One day it might be possible to predict the potential properties of a theoretical protein sequence just from knowing the sequence. Until then, the only way to demonstrate functionality is to test real examples, as Szostak did. But to test for a broad spectrum of functionality over a large sample of all possible protein sequences is a monumental task. But Szostak’s result hints at widespread activity.

Your turn!
Corneel on July 5, 2018 at 8:35 am said:

colewd: So that’s probably not the design strategy 🙂

According to gpuccio, that is exactly how the design strategy works. If you remember, he tried to demonstrate that there were large increases in the level of FI (“information jumps”) in pre-existing protein sequences, most of them at the base of the vertebrate lineage. Will you tell him that he is totally wrong about that, because sequence changes cannot possibly affect FI, and all the FI is “already there”? If not, will you finally retract your statement that mutations cannot “generate” FI of any amount?
Corneel on July 5, 2018 at 8:36 am said:

Alan Fox: colewd,

Thanks for the correction. Haste makes sloppy comments! 🙁

Heh, so I see 🙂