Evolution and Probability

Probabilistic thinking is pervasive in evolutionary theory. It’s not a bad thing, just something that needs to be acknowledged and appropriately handled.

Denial

Some go so far as to deny it, but in my experience these people are ideologues. These are critics of ID who complain about the lack of any numbers being attached to the probability arguments of ID proponents, and their denial is perhaps rooted in their fear of a tu quoque.

Where are their own probability calculations?

Incredulity

Another reason for their denial could be that they also love to accuse ID proponents of making arguments from incredulity, while being unwilling to face up to the fact that they are guilty of the same thing. Does evolutionary theory depend on arguments from incredulity? Almost certainly.

Take for example the idea that all extant life shares a common ancestor. It is based upon the idea that it is simply too implausible that life should arise more than once and yet share common features such as the genetic code.

We can be very sure there really is a single concestor of all surviving life forms on this planet. The evidence is that all that have ever been examined share (exactly in most cases, almost exactly in the rest) the same genetic code; and the genetic code is too detailed, in arbitrary aspects of its complexity, to have been invented twice.

Dawkins, Richard. The Ancestor’s Tale: A Pilgrimage to the Dawn of Evolution


An argument from incredulity.

Probabilities are Important

The importance of probability in evolutionary thinking might best be seen in the following text:

If there are versions of the evolution theory that deny slow gradualism, and deny the central role of natural selection, they may be true in particular cases. But they cannot be the whole truth, for they deny the very heart of the evolution theory, which gives it the power to dissolve astronomical improbabilities and explain prodigies of apparent miracle.

Dawkins, Richard. The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design

Independence of Events

Having said all this, I’d like to focus on the idea that evolutionary events are not independent and that this somehow rescues evolutionary theory from being guilty of appealing to vastly improbable outcomes, aka miracles.

Consider a toss of the dice in a game of craps. The odds of a double six are 1/36. Sure, we can roll a single die twice, and the odds of a six on each roll are now only 1/6, vastly more likely to occur by chance (not really). 1/6 x 1/6 is still 1/36. The probabilities are multiplicative because the events are independent. Suppose you have two dice: you roll the first die until you get a six and keep it (by cumulative selection), then you roll the second die until you get the second six. Now you have two sixes, but that doesn’t change the probabilities one whit. Doesn’t that demonstrate that cumulative selection is helpless in reducing probabilities?
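To put numbers on the two procedures just described, here is a minimal sketch. It assumes fair dice and uses the standard result that the expected number of attempts until a first success of probability p is 1/p; the names `expected_attempts`, `throws_all_at_once`, and `rolls_with_keeping` are illustrative only.

```python
from fractions import Fraction

P_SIX = Fraction(1, 6)  # chance of a six on one roll of a fair die (assumed)

# Two independent dice, or one die rolled twice: the probabilities multiply.
p_double_six = P_SIX * P_SIX


def expected_attempts(p):
    """Expected number of attempts until the first success (geometric distribution)."""
    return 1 / p


# Procedure A: throw both dice together until a double six appears.
throws_all_at_once = expected_attempts(p_double_six)

# Procedure B: roll the first die until it shows a six, keep it,
# then roll the second die until it shows a six.
rolls_with_keeping = expected_attempts(P_SIX) + expected_attempts(P_SIX)

print(p_double_six, throws_all_at_once, rolls_with_keeping)
```

Under these assumptions the chance of a six on any single roll is 1/6 in both procedures and the product is 1/36 either way; what differs between the procedures is the expected number of rolls spent (36 joint throws versus 12 single rolls).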

Well, you might say, you need to roll BOTH dice until only one of them shows a six and then keep that one and then roll the second die. But what in evolution is analogous to that?

Sure, if you roll two dice trying to roll a six you have a better chance of a six showing on one of the two dice than if you roll just one. It’s like rolling one die twice in an attempt to get a six rather than just once. Of course the probability of the second six would still be 1/6. But why aren’t we justified in adding a third die after our first six is rolled so that once again we are trying to get a six from two dice and not just one? And doesn’t this again demonstrate that it is not cumulative selection at all that is responsible for the reduction in probability but rather the number of trials we allot each attempt to roll a six?
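The effect of the number of dice on a single attempt can be worked out directly. A small sketch, assuming fair dice; the function name `p_at_least_one_six` is made up for this illustration.

```python
from fractions import Fraction


def p_at_least_one_six(n_dice):
    """Chance that at least one of n_dice fair dice shows a six in one throw."""
    # Complement of "no die shows a six", which has probability (5/6)^n.
    return 1 - Fraction(5, 6) ** n_dice


for n in (1, 2, 3):
    print(n, p_at_least_one_six(n))
```

With one die the chance is 1/6, with two dice 11/36, with three 91/216: more trials per attempt raise the chance of seeing at least one six, while each individual die still shows a six with probability 1/6.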

Closing

The fundamental question is why aren’t evolutionary events independent and thus multiplicative?

The secondary question is what is the true role of cumulative selection in reduction from the miraculous to the mere appearance of the miraculous?

291 thoughts on “Evolution and Probability”

  1. Allan Miller: Here is a paper by Lenski discussing and evaluating experimental methods in an arena where fitness is comparatively easy to measure, when compared to long-lived organisms – in bacteria. Now, if I read phoodoo correctly, it should not even be possible to mix bacterial strains and evaluate which has the higher rate of increase. That there is, in fact, no such thing as a differential in rate of increase, in which case everything must have the same rate of increase, which seems a bit unlikely. He seems to claim support from philosophers on this, so scientists better tread warily!

    Guess what they are measuring, anyone?

    Offspring. Anyone guess correct. Because that is what fitness is a measure of.

    The concept of fitness is central to evolutionary biology. Genotypes with higher fitness will tend to produce more offspring and thereby increase in frequency over time compared to their less-fit competitors. Fitness, however, is often difficult to measure, especially for long-lived organisms. Unlike traits such as color, fitness cannot be observed at a single point in time, but instead it must be measured and integrated across the lifespan of the individuals. Thus, researchers typically measure fitness components—such as the number of seeds produced or offspring fledged—and use them as proxies for fitness.

    Proxies for fitness, right. So what’s better than a proxy for fitness? Well, let’s see what they have to say about that:

    One of the most commonly employed methods of quantifying microbial fitness is to calculate the maximum growth rate (Vmax) of a culture growing on its own [6–10], usually by measuring the optical density of the culture over time. These measurements have the advantages of being simple and fast; a spectrophotometer can measure many samples in a multi-well plate in quick succession, and systems can be programmed to take measurements over the full growth cycle of a culture. However, maximum growth rate is typically only one component of fitness even in the simplest systems [11], and hence it provides, at best, only a proxy for fitness.

    Again, only a proxy for fitness. Hmm, interesting.

    Well, how are we going to get better than a proxy? Once again:

    another approach for quantifying fitness in microbial systems—and the approach that most closely corresponds to the meaning of fitness in evolutionary theory—uses a competition assay. The basic approach is to compete one strain or population against another and directly measure their relative contributions to future generations. This approach typically produces a measure of relative, rather than absolute, fitness. Relative fitness is more important than absolute fitness when considering the evolutionary fate of a particular genotype, provided that absolute fitness is high enough to prevent extinction of the entire population [14,15]. Competitive fitness assays, by measuring the net growth of two different populations, incorporate and integrate differences across the full culture cycle, which may include such fitness components as lag times, exponential growth rates, and stationary phase dynamics in batch culture

    Wait, I think I get it! The best approach is to

    is to compete one strain or population against another and directly measure their relative contributions to future generations.

    Wow, really!!!??? Who would have guessed?????

    I think I know what the best approach is now. Measure offspring. Unless you want a proxy for fitness that is.

    Words, if only they had meaning, and people could read them.

  2. Mung:

    Are there truly no independent events that occur in evolution? So are we never justified in multiplying the probability of two events?

    One example is Prob(“specific mutation”) and Prob(“that specific mutation is beneficial to organism in the current environment”). I think they are independent, at least to a first approximation.

    At least, that is the way I (a non-biologist!) think of “random” mutations. As I said, this is just a first approximation, but I think it is a good starting point to understanding what “random” means when used in that context of “random mutation”.

  3. phoodoo: That you think you can measure something, that can’t even be properly defined, seems pretty silly to me.

    And yet it’s being done in practice for every imaginable evolving entity every single day. Here’s a graph from the paper Allan linked where fitness was measured by different methods. Looks like it was defined and measured. That you think this somehow didn’t take place looks pretty silly to me:

  4. phoodoo: Wait, I think I get it! The best approach is to

    is to compete one strain or population against another and directly measure their relative contributions to future generations.

    Wow, really!!!??? Who would have guessed?????

    So they measured it. Now what?

  5. phoodoo:
    Rumraket,

    Try reading Rumraket.

    I did. Here’s what they say:

    Fitness measurements
    We quantify fitness in this system as the ratio of the realized growth rates of two populations while they compete for resources in the same flask and under the same environmental conditions used in the LTEE.

    And:

    We employ three different methods for measuring fitness in this study. For all three methods, we begin by removing aliquots of the competitors from the vials in which they are stored at −80°C into separate flasks containing Luria-Bertani (LB) broth. The cultures grow overnight at 37°C and reach stationary phase. We then dilute each culture 100-fold into 0.86% (w/v) saline solution and transfer 100 μL into a flask containing 9.9 mL of DM25. These cultures grow for 24 h under the same conditions as the LTEE, so that all competitors are acclimated to this environment. We then jointly inoculate 100 μL in total of the Ara-1 population sample and the Ara+ competitor into 9.9 mL of DM25. We immediately take an initial 100-μL sample of this mixture, dilute it in saline solution, and spread the cells onto a TA plate. The competition mixture is then incubated in the same conditions as the LTEE for 24 h, at which point we take a final 100-μL sample, dilute it, and spread the cells onto a TA plate. We count each competitor on the TA plates, and multiply the numbers by the appropriate dilution factor to determine their initial and final population sizes.

    They then go on to describe how they measure it for the three different methods. Somehow, mysteriously, they succeed and produce actual measurements. You think this didn’t transpire or what? They somehow discuss different fitness measuring concepts and then proceed to… not measure it anyway?

    Try reading phoodoo. Holy shit.
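    For anyone who wants to see the quoted definition as arithmetic, here is a rough sketch. It assumes the "realized growth rate" of a competitor is the natural log of its final over initial population size for the 24 h competition, and every colony count and dilution factor below is invented for illustration, not taken from the paper.

```python
import math


def population_size(colony_count, dilution_factor):
    """Scale a plate count by its dilution factor, as in the quoted methods."""
    return colony_count * dilution_factor


def realized_growth_rate(n_initial, n_final):
    """Assumed definition: log of the fold increase over the competition."""
    return math.log(n_final / n_initial)


def relative_fitness(a_initial, a_final, b_initial, b_final):
    """Ratio of the realized growth rates of competitor A and competitor B."""
    return realized_growth_rate(a_initial, a_final) / realized_growth_rate(b_initial, b_final)


# Hypothetical plate counts and dilution factors (not from the paper).
a_initial = population_size(50, 1e4)
a_final = population_size(150, 1e6)
b_initial = population_size(50, 1e4)
b_final = population_size(50, 1e6)

w = relative_fitness(a_initial, a_final, b_initial, b_final)
print(f"relative fitness of A versus B: {w:.3f}")
```

    With these made-up counts A grows 300-fold and B grows 100-fold in the same flask, so the ratio comes out above 1; a value of exactly 1 would mean the two competitors grew at the same rate.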

  6. Rumraket,

    We count each competitor on the TA plates, and multiply the numbers by the appropriate dilution factor to determine their initial and final population sizes.

    Then try comprehending.

  7. It’s funny, they’re claiming to have observed and measured something phoodoo insists can neither be defined nor measured. And here I sit and can read what they do to measure it with my own eyes, and it involves work of a kind I’ve done myself.

    I just can’t put my finger on why I don’t find phoodoo’s comical flailing compelling.

  8. phoodoo,

    Wow, really!!!??? Who would have guessed?????

    Not you, evidently.

    So your criticism of fitness is now that it is trivially true, obvious even? That if two strains have differential growth rates, one will outcompete the other? Hmmmm. That was worth about 10 threads’ worth of excitable gibbering, I’m sure.

  9. Allan Miller:
    phoodoo,

    Not you, evidently.

    So your criticism of fitness is now that it is trivially true, obvious even? That if two strains have differential growth rates, one will outcompete the other? Hmmmm. That was worth about 10 threads’ worth of excitable gibbering, I’m sure.

    So we OBVIOUSLY have to just compete two populations to measure fitness, phoodoo agrees. But it can’t be done, not even defined.

  10. Allan Miller,

    Yes, you can measure offspring. But if what you are doing when you try to measure fitness is measure offspring, then call it measuring offspring, don’t call it measuring fitness.

    Which part of this did you not understand Allan? I know why Rumraket didn’t get it, because it contained slightly more than two words, but you too?

  11. phoodoo:
    Allan Miller,

    Which part of this did you not understand Allan? I know why Rumraket didn’t get it, because it contained slightly more than two words, but you too?

    But what if fitness is defined as the capacity to produce offspring? Oh wait, it is.

  12. BruceS:
    I have no idea what biochemistry would say about the independence of two mutations related to the amino acid you name.

    Things tend to be less ‘all-or-nothing’ than the example supposes. My favourite example is the amphipathic alpha helix. Alpha helixes form very readily, from an enormous range of possible acid sequences. But each acid has a different degree of hydrophobicity. The net affinity of a helix for cytosol or membrane depends on the distribution of these various residues – they give a helix a ‘hydrophobic moment’. Changes in the asymmetry of distribution cause a helix to stick more or less strongly to a membrane, which can be beneficial or not depending on the protein. So it’s the overall pattern that matters, not any chosen pair, and independent picks don’t really capture this interaction, nor the subtlety of fine tuning available.

  13. Allan Miller,

    Here’s a question for you Allan, see if you can follow along (Rumraket don’t bother, you will just give yourself a skull ache):

    What if we measure something other than offspring, and claim that it will tell us something about the fitness. Say we decide that the tallest horses are the most fit. So we see which horses are the tallest, and we also see which horses have the most offspring over time.

    Then it turns out that the tallest horses actually have the least offspring. Which conclusion do we reach:
    a) The tallest horses are not the most fit, so we change our minds about tall meaning fit.

    b) The number of offspring is not a good indicator of fitness, so we change the meaning of fitness to be that which is closest to the sun?

    If you said A, then maybe you will see the problem I am pointing out.

    If you said B, maybe you are Rumraket, or you have just eaten too much candy and you are going into a state of diabetic shock. Or both.

  14. phoodoo:
    Allan Miller,

    Which part of this did you not understand Allan? I know why Rumraket didn’t get it, because it contained slightly more than two words, but you too?

    I hadn’t even read it, to be honest, till now. I don’t hang on your every word. But since I clearly recall telling you in the past that fitness is typically represented by offspring numbers, a statement you took exception to, I don’t see what the fuss is about.

  15. Rumraket: But what if fitness is defined as the capacity to produce offspring?

    And how do we measure that? Oh yea, we count the offspring!!

    Go have a vacation Rumraket, they sell lots of candy at carnivals!

  16. phoodoo: Then you gave that link which was a synopsis of how and why we can’t really define fitness.

    We cannot really define anything.

    That you think you can measure something, that can’t even be properly defined, seems pretty silly to me.

    We do that all the time.

  17. Neil Rickert: That you think you can measure something, that can’t even be properly defined, seems pretty silly to me.

    We do that all the time.

    What do you mean by measure?

  18. phoodoo: What do you mean by measure?

    At its core, it is a systematic way of connecting the world of physical reality to the world of abstract symbols (as in language). However we do this, we are going to do it imperfectly.

    Your problem is that you want everything to be cut and dried and purely logical. But for that, you would have to be a solipsist. To connect with reality, you’ll need pragmatics.

  19. Over time, the fitness of a stable population doesn’t change.

    If a population, for whatever reason, is suddenly more fit, the population increases to the Malthusian limit.

    If fitness declines, the population goes extinct.

    Both cases happen all the time.

  20. petrushka: If a population, for whatever reason, is suddenly more fit, the population increases to the Malthusian limit.

    So like a population of more fit human females would be able to have sixteen babies rather than eight and be able to do that once every six months?

  21. Rumraket: Fitness can’t be defined or measured. Meanwhile, in reality:

    Why are you posting articles on fitness and adaptation when Darwinism is dead in modern evolutionary science?

  22. Allan Miller: Things tend to be less ‘all-or-nothing’ than the example supposes.

    It’s an example. An illustration. Are we to deny that independent events exist at all in evolution?

    Is the reason we don’t see the probabilities involved in evolution actually calculated because they can’t be calculated?

  23. Allan Miller: So your criticism of fitness is now that it is trivially true, obvious even? That if two strains have differential growth rates, one will outcompete the other?

    If by growth rate you mean leave more offspring and by outcompete you mean leave more offspring. The strain that left the most offspring will have left the most offspring. Yes.

  24. Mung: It’s an example. An illustration. Are we to deny that independent events exist at all in evolution?

    No, I’m saying that amino acid changes in the same protein, giving an all or nothing binding energy wrt a given molecule, rarely occur in that simplistic fashion.

    Is the reason we don’t see the probabilities involved in evolution actually calculated because they can’t be calculated?

    Quite possibly.

  25. Mung: Why are you posting articles on fitness and adaptation when Darwinism is dead in modern evolutionary science?

    Define ‘Darwinism’.
    Hint: avoid equivocating between
    (1) what Darwin thought about natural selection and fitness, and

    (2) what modern biology considers in studying the relative role of natural selection compared to other mechanisms of population genetics and in studying fitness as a parameter in various models used in modern biology.

  26. Mung: If by growth rate you mean leave more offspring and by outcompete you mean leave more offspring. The strain that left the most offspring will have left the most offspring. Yes.

    Indeed. I am perennially mystified why stating things tautologically is felt to defeat them though. The candidate with the most votes won the nomination. Who won the nomination? The candidate with the most votes.

    Therefore what? No-one won? Everyone did? Voting has no effect on the election of candidates?

  27. Mung: I’ve tried [referring to reading and Phoodoo, I believe].

    C’mon, now, he/she is far from the territory inhabited by the “everything is memory” poster. Though maybe in the KS territory when it comes to sarcasm as an argumentative tool.

  28. Mung: Why are you posting articles on fitness and adaptation when Darwinism is dead in modern evolutionary science?

    Says no-one in modern evolutionary science.

  29. It’s funny, until Creationists started taking an interest, no-one would have batted an eyelid at the notion that differential growth rates led towards competitive extinction.

  30. DNA_Jock,

    In a delightful goal-post shift, you are now trying to claim that, for a doubly functional protein, the frequency is 10^-11 times 10^-11, and thus there are only 10^82 needles in the haystack. You are, as ever, WRONG. There are considerably more than 10^82 double-functional needles.

    And your argument is?

  31. Allan Miller: Things tend to be less ‘all-or-nothing’ than the example supposes. My favourite example is the amphipathic alpha helix. Alpha helixes form very readily, from an enormous range of possible acid sequences. But each acid has a different degree of hydrophobicity. The net affinity of a helix for cytosol or membrane depends on the distribution of these various residues – they give a helix a ‘hydrophobic moment’. Changes in the asymmetry of distribution cause a helix to stick more or less strongly to a membrane, which can be beneficial or not depending on the protein. So it’s the overall pattern that matters, not any chosen pair, and independent picks don’t really capture this interaction, nor the subtlety of fine tuning available.

    Thanks Allan.
    So many details in biochemistry.
    Reminds me why I stick with (am stuck in?) philosophy.

  32. colewd,
    You asked a question:

    If according to empirical science (Szostak) a randomly generated 70AA protein binding to ATP is 10^-11 what is the probability it can bind to a second molecule assuming it has the same binding probability as ATP?

    I answered both the question you asked, and the question you meant to ask.
    I noted that there are 10^93 different ways for an 80mer to bind ATP. “That’s a lot of needles!”
    You claimed, erroneously

    Yet the haystack is 100 billion times larger 🙂 I know you see the problems w the theory so I won’t pound it in any more.

    I corrected you

    Actually the haystack has not changed size. You are arguing that the number of needles is only 10^93 / 10^11 = 10^82.

    You replied

    The haystack is 20^80 or 10^104. Haystack= total sequence space. That works out to 10^93 x 10^11.

    Which is correct. And what I had explained to you in February. And completely in agreement with everything I had written.
    So I had to explain to you what was the argument that you were trying to make.

    So to answer your “And your argument is?” question, I would say my argument is “colewd does not have an argument, and colewd appears innumerate.”
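    The powers of ten traded back and forth above can be checked in a few lines. This is only a sketch: it takes the 20-letter alphabet, the 80-mer length and the 10^-11 binding frequency from the exchange, works in log10 to avoid huge integers, and treats the second binding event as independent purely to reproduce the 10^82 figure, which is the assumption in dispute.

```python
import math

ALPHABET_SIZE = 20            # amino acids
SEQUENCE_LENGTH = 80          # the 80-mer discussed in this exchange
BINDING_FRACTION_LOG10 = -11  # quoted frequency of ATP binders

# Haystack: every possible sequence, 20^80, expressed as a power of ten.
haystack_log10 = SEQUENCE_LENGTH * math.log10(ALPHABET_SIZE)

# Needles: sequences that bind ATP.
atp_needles_log10 = haystack_log10 + BINDING_FRACTION_LOG10

# Needles binding a second molecule too, IF that event were independent
# and had the same frequency (an assumption, not an established fact).
double_needles_if_independent_log10 = atp_needles_log10 + BINDING_FRACTION_LOG10

print(round(haystack_log10), round(atp_needles_log10),
      round(double_needles_if_independent_log10))
```

    The rounded exponents come out as 104, 93 and 82. The last one is a floor that holds only under independence; if both binding events are conditional on something shared, such as a stable fold, the number of double-functional sequences would be larger.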

  33. DNA_Jock,

    You are, as ever, WRONG. There are considerably more than 10^82 double-functional needles.

    I will ask you again if you can back up this assertion which contradicted your previous post that you were uncertain of the probability of the second binding molecule. You appear to be trying to make up new statistical laws
    🙂

  34. Allan Miller: Indeed. I am perennially mystified why stating things tautologically is felt to defeat them though. The candidate with the most votes won the nomination. Who won the nomination? The candidate with the most votes.

    Therefore what? No-one won? Everyone did? Voting has no effect on the election of candidates?

    We had this discussion with phoodoo almost exactly 1 year ago where I also tried a poker analogy: Defining Fitness.

    And a running analogy, and a driving analogy. It is a mystery that an otherwise cognitively normal person can’t get this.

  35. colewd: 10^82 double-functional needles.

    I will ask you again if you can back up this assertion which contradicted your previous post that you were uncertain of the probability of the second binding molecule. You appear to be trying to make up new statistical laws

    Well, this is the first time you have asked me to back up my assertion, so your “again” is incorrect. I am not making up new statistical laws, merely trying to get IDists to understand conditional probabilities.
    Agreed, I am uncertain as to the probability of a bi-functional 80mer — I haven’t seen any data on that subject. I did put a range on it though. While I do some research on this, would you answer a question for me: of the 20^80 possible 80-mers, what proportion do YOU think will consistently produce a reasonably stable secondary structure? You can give a range if you like.

  36. DNA_Jock,

    Well, this is the first time you have asked me to back up my assertion, so your “again” is incorrect. I am not making up new statistical laws, merely trying to get IDists to understand conditional probabilities.

    You first need to establish that the events are conditional. As far as a secondary fold I have no data to support any swag at this point.

  37. You really do lack the courage of your convictions, Bill. Most IDists claim that fewer than one in a million random polypeptides form folds – this is after all the basis of the whole “no paths to higher function exist” assertion and gpuccio’s argument about all those distinct protein families. Personally, I think it is a significantly larger fraction (say 1 in 100), but rather than get into an asserting contest with you, I am willing, arguendo, to accept whatever range you are comfortable with.
    Pick a number, or two.

  38. DNA_Jock,

    Personally, I think it is a significantly larger fraction (say 1 in 100), but rather than get into an asserting contest with you, I am willing, arguendo, to accept whatever range you are comfortable with.
    Pick a number, or two.

    If you’re trying to argue that RMNS can explain proteins we observe you will eventually have to explain independent events. I understand in this condition that you are using, a fold, is a condition met in the second case of binding and your point is valid. The case this does not help is protein A binding to protein B binding to protein C etc which we observe as a general case in biology. Yes against the original case that is a goal post move and I will apologize in advance 🙂

  39. Bill, You appear to be conceding that you did not have an argument and that you are pretty much innumerate. Are you really unwilling to put a range on the proportion of random 80-mers that are capable of a consistent fold?
    I mean, you could say “Between 10% and one in 100 million”. That would be kinda lame, covering 7 orders of magnitude, but would show some honesty.
    You write

    If you’re trying to argue that RMNS can explain proteins we observe you will eventually have to explain independent events.

    No, not really. I suspect that you don’t understand ‘independent events’; I was trying to help you out with the whole 80mers thing, but you forfeited. OTOH, if you are going to argue that RM+NS cannot explain the proteins we observe, you are going to have to calculate conditional probabilities.

    I understand in this condition that you are using, a fold, is a condition met in the second case of binding and your point is valid. The case this does not help is protein A binding to protein B binding to protein C etc which we observe as a general case in biology.

    Sorry, I have no clue what you are trying to say here. I have no idea what point is valid. Could you re-phrase?

    Yes against the original case that is a goal post move and I will apologize in advance

    I will gladly move on to a new topic of conversation, if you wish. Recognize that any audience will assume you are conceding defeat. Do you now wish to discuss protein-protein interactions, or regulatory networks, or something else?
