Barry gets it wrong again….

Here.

 

The null hypothesis in a drug trial is not that the drug is efficacious. The null hypothesis is that the difference between the groups is due to chance.

No, Barry.  Check any stats textbook.  The null hypothesis is certainly not that the drug is efficacious (which is not what Neil said), but more importantly, it is not that “the difference between the groups is due to chance”.

It is that “there is no difference in effects between treatment A and treatment B”.

When in a hole, stop digging, Barry!

If the drug is in fact efficacious, there will be a “real” difference between the two groups.

Wrong again! If the drug is in fact efficacious, you may or may not get a difference between the groups, but if your study has enough power to detect an effect the size of the “real” effect, you probably will.

However, whether or not your drug is efficacious you will get a perfectly “real” difference between your groups.  It just won’t necessarily be due to the drug.

How do you know if there is a real difference. By ruling out the chance explanation, as Professor Thisted says.

But he’s wrong, or rather, sloppy.  What you do is reject the null hypothesis that there is no effect, and with it the claim that any between-group difference is simply due to random sampling variance.

Fourth, the “chance” at issue is not the noise in the sampling. I mean, this statement is absurd on its face. If group A takes the treatment and group B takes the placebo, what is being measured when they report back different results?

Yes, the “chance” at issue is indeed noise in the sampling.  That’s why we call it “sampling error”.  And what is being measured when different results are reported are changes in whatever symptoms or signs are supposed to be affected by the drug. What is being tested is whether the differences between the mean changes in each group are probable under the null that there is no effect of the drug.

Ask yourself this question. If the treatment is not effective, what difference would you expect between the two groups?

The answer to this question is a probability distribution centred on zero, whose spread depends on the variability of the measure and the size of the sample.

Of course, you would expect their response to be roughly equal.

You would expect the differences to vary each time you did the study, and for those differences to have a probability distribution with a mean of zero.
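
To make that concrete, here is a minimal simulation sketch (Python, with purely illustrative sample sizes and variances, not data from any real trial) of what those study-to-study differences look like when the drug does nothing:

```python
import numpy as np

rng = np.random.default_rng(0)

n_per_group = 30      # illustrative group size
n_studies = 10_000    # number of simulated repetitions of the study

# Under the null, both groups are drawn from the same population of
# symptom-change scores (here: mean 0, standard deviation 1).
treatment = rng.normal(0.0, 1.0, size=(n_studies, n_per_group))
placebo = rng.normal(0.0, 1.0, size=(n_studies, n_per_group))

# The observed difference in group means varies from study to study...
differences = treatment.mean(axis=1) - placebo.mean(axis=1)

# ...but the distribution of those differences is centred on zero.
print("mean of the observed differences:", round(differences.mean(), 4))
print("spread (SD) of the observed differences:", round(differences.std(), 4))
```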

But no two groups are ever going to be exactly equal. Random differences between the groups will result in some difference. A statistical test starts with this assumption (the null hypothesis): There is no difference between the two groups and any difference that is reported is due to chance (i.e., the “chance explanation).

Sloppily worded, yes.  Better worded: that any difference between the two is due to unmodelled variance.  But your “chance explanation” is still not your null.  It’s the sloppy wording for the explanation for the difference between the groups.  These two things are not the same.  Your null – what you reject if your p value is low enough – is the hypothesis that there is no difference between the treatments.

The statistical analysis then determines whether that null hypothesis is rejected.

Yep.  Well, “the null”.  You’ve still got the null wrong.  It isn’t “the chance explanation”.

 

In other words, if you reject the chance explanation, you are left with the conclusion that the best explanation for the data is that the drug is efficacious.

Nope. What you reject is the null hypothesis, which is that there is no difference in the effects of the treatments.  You are still confusing the null hypothesis (the hypothesis you provisionally retain if you cannot reject it) with an explanation for the difference you observed between your samples (for which “chance” is a shorthand for “unmodelled variance between the samples”).

Finally, while Neil is wrong about the “sampling noise” being the “chance” that is tested, there is such a thing as sampling noise.

Neil is absolutely correct, not surprisingly, being a mathematician and a teacher of statistics, and of course there is such a thing as “sampling noise”.

There is a chance that the sampled population does not truly reflect the real population. Generally, the larger your sample size, the smaller this risk is but it cannot be eliminated completely. In other words, there is a “chance” that the “chance explanation” is correct even though your test says it should be rejected.

No, you have got this garbled.  Yes, indeed, “there is a chance that the sampled population does not reflect the real population” although the correct terminology is “there is a chance that the sample mean (or other summary statistic) will not reflect the mean in the population from which the sample is randomly drawn” and yes, the larger the sample, the closer the sample mean is likely to be to the population mean.  But you have garbled your “other words”, because you still have not understood what the null hypothesis is.  It is not “the chance explanation” for the difference between your two means (in this case).  It’s the hypothesis that the means of the populations from which they are drawn are not different.

That risk is measured by the “p-value” Professor Thisted is discussing in his paper. A low p-value means the chance of your analyses being wrong is low. How low is low enough to rely on the test? There is no universally accepted answer to that question. Generally, however, a p-value of 0.05 or less is said to be “statistically significant,” which means that for practical purposes the sampled group can be assumed to be reflective of the population as a whole.

No, this is completely wrong.  If your p-value is below your criterion “alpha” of .05 or whatever, that means that you can reject your null, whatever that was.  It doesn’t mean that your sample mean is close to the population mean.  It means that your sample mean is a long way from the mean you postulated your sample to be drawn from under your null hypothesis.
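
For concreteness, a minimal sketch of that logic using scipy’s two-sample t-test (the effect size, group sizes and alpha below are arbitrary illustrative choices, not a recipe for any real trial):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05   # criterion chosen in advance

# A simulated trial in which the drug genuinely shifts the mean by 0.5 SD.
treatment = rng.normal(0.5, 1.0, size=50)
placebo = rng.normal(0.0, 1.0, size=50)

# Null hypothesis: the population means are equal (no difference between treatments).
t_stat, p_value = stats.ttest_ind(treatment, placebo)

if p_value < alpha:
    print(f"p = {p_value:.3f}: reject the null of no difference in population means")
else:
    print(f"p = {p_value:.3f}: retain the null; the data are consistent with no effect")
```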

Barry, you won’t understand this until you know what a null hypothesis is, and you won’t find that out until you stop thinking that it is either “chance” or “the explanation for why your sample is different from your population”.  It isn’t either of those things.

However, you are not the only IDist to misunderstand null hypothesis testing. I know a man with two PhDs who also fails to understand it properly (though he gets it a little less wrong than you do).

60 thoughts on “Barry gets it wrong again….”

  1. Let’s take an example that more closely reflects Barry’s concerns:

    Let’s say we find a comprehensible message typed on a sheet of paper left in a typewriter in a monkey cage. Nobody but the monkeys had access to the typewriter. Was the message an intentional message from the monkeys, or “due to chance”?

    How do we frame this question as a null hypothesis?

    Our null hypothesis is: the message was typed by the monkeys who had no idea what they were typing.

    And we test it by figuring out how frequently monkeys messing about with a typewriter would come up with a comparable message. This could be done experimentally, or by some kind of simulation (a toy sketch of one such simulation appears at the end of this comment).

    If only one message in 1000 comes out intelligible, and this is the only time a typewriter has been left in a monkey cage, we can reject the null with a high degree of confidence – p<.001. Clearly the message was not typed by unintelligent monkeys.

    However, if one in 10 messages comes out intelligible, or the experiment has been done a few thousand times, we do not reject the null – we “retain” it.

    We might informally say that the message was typed “by chance” or even that the “chance hypothesis” is likely.

    But it isn’t what we formally tested. What we formally tested was the null hypothesis that the message was typed by monkeys who had no idea what they were typing. And this was retained as the explanation.

    What happened “by chance” was that we drew a random sample of monkey output in which, unusually, the message was comprehensible. But that was never the null hypothesis – it is the reason we retained the null, but it is not the null itself.

    This is not a trivial difference, although it may not matter in informal talk. But it is a fundamental difference, and conflating the two is an error that is propagated through a great deal of ID arguments, and essentially underlies the ID fallacy (or at least Dembski’s CSI fallacy).
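
    A toy Monte Carlo version of the simulation idea above, just to show the shape of the calculation. “Intelligible” has to be operationalised somehow; the tiny word list, message length and threshold below are arbitrary stand-ins, not a serious model of monkey typing:

```python
import random
import string

random.seed(0)

WORDS = {"the", "cat", "sat", "on", "a", "mat"}   # toy stand-in for "intelligible"

def looks_intelligible(text, threshold=0.5):
    """Toy proxy: at least `threshold` of the space-separated tokens are words."""
    tokens = text.split()
    return bool(tokens) and sum(t in WORDS for t in tokens) / len(tokens) >= threshold

def random_typing(length=40):
    """One monkey session: random keystrokes from lowercase letters plus space."""
    keys = string.ascii_lowercase + " "
    return "".join(random.choice(keys) for _ in range(length))

# Null model: the message was produced by aimless typing.  The p value is
# estimated as the fraction of simulated sessions that produce something
# at least as "intelligible" as the message we found.
n_sim = 100_000
hits = sum(looks_intelligible(random_typing()) for _ in range(n_sim))
print(f"estimated p under the null: {hits / n_sim:.5f}")
```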

  2. I have been looking in occasionally on these many threads with some amusement. It is clear that the people over at UD seem to be steeped in the tactic of rote memorization of definitions and simply bending definitions to fit whatever they think advances an argument for them.

    One learns about probability and statistics by doing; not by memorizing. It has to start with the notions of probability, probability distributions, and how distributions occur.

    Statistics courses and notation have become more standardized in recent years, and this has helped tremendously in keeping concepts straight.

    Most of the major professional statistical societies and the computer programs used in statistical work have been making use of standardized terminology and notation; making distinctions between things like the parameters of a population (or series of events) and sample statistics even to the point of using Greek letters for parameters and Roman letters for statistics. One makes inferences from statistics about the parameters of a population (or series of events).

    After one has dealt with elementary probability concepts and has established the various measures of central tendency and spread in a distribution, one of the more effective approaches in getting students to learn the concept of a “statistic” is to actually have them sit down and write a “research proposal” describing how they will go about determining the various parameters of a population or series of events you are about to give them. At this stage it often helps to remind them to label parameters with Greek letters and statistics with Roman letters.

    The earliest issues to come up involve the very definition of a “statistic” and how it relates to the parameters of a population. After students have established in their minds what it is they are trying to measure, the very next issue that always comes up is how to sample; in other words, how do you know if your sampling strategy is going to get what you want, and how do you know your sampling strategy is truly random.

    The next issue that comes up is the number of samples. Obviously, if one can look at every member of a population (or every event), one can simply tabulate means, medians, standard deviations, quartiles, or anything else that might be of interest. The problem comes when one can’t look at every member of a population (or every instance of a specified series of events).

    It is at this point that students begin to grasp the difference between population parameters and sample statistics and how sample statistics behave with number of samples and sample size.

    I found with the students in my statistics courses – admittedly very bright students – that a proposal writing stage of doing an experiment focuses their minds tremendously; they are going to actually go after the data they need to test some hypotheses. Knowing how their sample statistics behave with sampling strategies and size becomes a central priority in their thinking.

    It’s no longer memorized definitions and fuzzy concepts; their strategies and techniques are directed at doing some real experiments and collecting some real data.

    A big part of demonstrating the effects of sampling comes from using data loggers interfaced with computers and graphing calculators, because sampling can be done rapidly as students watch the various sampling distributions building up in real time.

  3. A serious problem seems to be that Barry believes ID rhetoric, that events are due to chance, necessity, or intelligence, and you can divine that the latter is responsible if you can rule out the other two (in tandem or singly). But there’s no cause called “chance,” it’s just what we label the partly known, or possibly the unknown. Their view is completely wrong, a modification of ancient mistaken understandings, and it is foundational for at least much of ID.

    Even that isn’t as bad as looking at regularity, like the patterns of splitting of metazoa into genetically-separated groups, and the broad outlines of the succession of life through time, and, rather than concluding “necessity” (or more correctly the limits of known processes), they conclude intelligence–which everyone knows should transcend the limits of “necessity,” in this case, evolution by known means. This is where even their faulty system could work if they were willing to use it properly, but it doesn’t because they won’t allow the proper inference.

    But, getting back to “chance” per se, Barry likely won’t give up his faulty view of chance simply because it is the improper “worldview” concocted to prop up ID. He’s not open to learning appropriate epistemology and inference.

    Glen Davidson

  4. Mike Elzinga: After one has dealt with elementary probability concepts and has established the various measures of central tendency and spread in a distribution, one of the more effective approaches in getting students to learn the concept of a “statistic” is to actually have them sit down and write a “research proposal” describing how they will go about determining the various parameters of a population or series of events you are about to give them.

    Yes, I agree. One of the main sources of trouble is poor experimental design. Having students actually design experiments, then critiquing their design, can reduce the problem.

  5. Hm, can I suggest that you’re being slightly unfair to Barry at the start of this post? The null hypothesis (as written) is that there is no difference between treatments in a particular statistic (e.g. the mean), which is usually the same in practice as “there is no difference in effects between treatment A and treatment B”. But because of stochasticity, what we measure for treatments A and B will (almost certainly) be different. Thus, one could loosely say “The null hypothesis is that the observed difference between the groups is due to chance”.

    BTW, the differences don’t just have to be due to sampling variation. I’ve got a manuscript I need to review where they have measurements from the whole population, so there is no sampling variation. But there is still stochastic variation (in this case due to evolution), which means it is reasonable to ask whether the patterns they are looking at are due to specific factors, or due to other evolutionary effects, which are (in effect) noise.

  6. I am with Bob on this. I hadn’t read Barry’s comment until just now but it doesn’t seem that bad, although some of the wording could be more precise. What strikes me is that his account amounts to conceding the point that “chance” by itself is not an explanation or a hypothesis – it only makes sense in the context of a probability model.

  7. Thanks for stepping up to the plate for Barry, Bob.

    I take your point that in some circumstances “there is no difference between treatments” is equivalent to “any observed difference between treatments is due to chance” (although I’d still maintain the second is sloppy, because by “chance” we simply mean “factors we have not modelled”), but the second is a nonsensical way of expressing the null hypothesis – because it would make nonsensical the definition of the p value, as the probability of seeing values as or more extreme if the null were true.

    It would become:

    “The p value is the probability of seeing differences as or more extreme than the observed differences if the differences you observed were due to chance.”

    Which is at best tautological.

    In other words, you cannot substitute “the differences you observe are due to chance” for “there are no differences between treatments” in a sentence about the null hypothesis. Therefore it isn’t an equivalent statement of the null.

    What is true is that if the null is true, then observed differences between samples are due to chance/unmodelled variance.

    Interestingly, if we take drug treatments, the observed differences could indeed be due to the drug, even where the null of no mean difference in the population is true. For instance, let’s say that in the population 1/3 of people have a genome that renders the drug effective, 1/3 a genome that renders it ineffective, and 1/3 a genome that renders the drug counter-productive (makes the disorder worse).

    The null is true in this instance – in the population mean improvement on the drug is zero. However, if, in your samples, you happened to get mostly people from the first two groups (which is perfectly possible with random sampling), then you’d see a difference in the effect of treatment between the two groups which was perfectly real, and genuinely due to the drug (a quick simulation at the end of this comment illustrates the point).

    In other words, saying the differences are “due to chance” isn’t really an explanation anyway, which was my original point. Chance doesn’t cause things, things do. But if we don’t control for which things are doing what, we can lump them all into the error term and call them “chance”. Chance is still the part we are not explaining, not the part we are.

    To take your second point:

    BTW, the differences don’t just have to be due to sampling variation. I’ve got a manuscript I need to review where they have measurements from the whole population, so there is no sampling variation. But there is still stochastic variation (in this case due to evolution), which means it is reasonable to ask whether the patterns they are looking at are due to specific factors, or due to other evolutionary effects, which are (in effect) noise.

    Indeed. But again “noise” is just a shorthand for “evolutionary factors we haven’t modelled”. It isn’t a causal factor in itself.
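
    A quick simulation of the genotype example above (numbers purely illustrative): the population-mean effect of the drug is exactly zero, so the null is true, yet a perfectly random sample can show a difference that is genuinely driven by the drug:

```python
import numpy as np

rng = np.random.default_rng(2)

# Three equally common genotypes with true drug effects +1, 0 and -1,
# so the mean effect of the drug in the population is exactly zero.
effects = np.array([1.0, 0.0, -1.0])

n = 40   # illustrative group size
treated_genotypes = rng.choice(effects, size=n)              # random sampling
treated_outcomes = treated_genotypes + rng.normal(0, 1, n)   # drug effect + noise
placebo_outcomes = rng.normal(0, 1, n)                       # noise only

print("observed difference in means:",
      round(treated_outcomes.mean() - placebo_outcomes.mean(), 3))
print("true mean drug effect in this treated sample:",
      round(treated_genotypes.mean(), 3))   # often nonzero, and real
```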

  8. Mark Frank:
    I am with Bob on this. I hadn’t read Barry’s comment until just now but it doesn’t seem that bad, although some of the wording could be more precise. What strikes me is that his account amounts to conceding the point that “chance” by itself is not an explanation or a hypothesis – it only makes sense in the context of a probability model.

    Well, if he is conceding the point that chance by itself is not an explanation or hypothesis, then that’s fine. I don’t think he is, however.

    And it remains the case that while “the means are not different in the population” (if that is your null) might be true, and, as a result, “any difference between sample means is due to chance (i.e. because of random sampling)” might also be true, those claims are different. One may follow from the other, but that doesn’t mean that they mean the same thing, any more than “it’s raining” means the same thing as “I won’t be able to mow the lawn”.

    And it is really important to formulate your null correctly. Failure to do so is at the heart of ID’s ele(P(T|H)ant problem.

    (Not sure why the margins are borked in this thread – anyone know enough html to fix it?)

  9. Lizzie,
    I agree that Barry’s use of language is sloppy, but he’s a lawyer, not a statistician. Poor fellow.

    I should probably have written “The null hypothesis is that any observed difference between the groups is due to chance”, which would fit more closely to classical significance tests. Although I think what I wrote would be appropriate (if still sloppy) for a member of the likelihood school.

    I don’t necessarily have a problem with using chance as an explanation, as long as it’s understood as implying “we don’t know”. It does perhaps carry “and we don’t think it’s important enough to find out” as an assumption: we can say that a coin toss came up tails because of chance, because we don’t want to work through the physics behind the toss. As you write, it’s not a causal factor of itself: there is causality, but it’s unobserved.

  10. I don’t especially mind sloppy language in a lay discussion.

    But what is odd is Barry telling me I’m wrong when I’m pointing out that it is sloppy, and that the claim is potentially misleading.

    Which it is. It’s misled Dembski, and it continues to mislead IDers.

    And stats students around the globe.

  11. Bob O’H: BTW, the differences don’t just have to be due to sampling variation. I’ve got a manuscript I need to review where they have measurements from the whole population, so there is no sampling variation.

    As used in medical research and in the social sciences, “the whole population” is an idealization that refers to all humans, including those not yet born and those long since dead. So those currently living are but a sample.

    For voting in elections, the whole population of concern consists of those who will vote in the actual election. In that case, if one has the whole population, one just tallies up the votes and does not do statistical analysis.

  12. The question Barry posed: ‘Would you regard 500 heads as suspicious?’ is actually an interesting one. Anyone care to explain why?

  13. Suspicious that the coins may not be fair, that it is a set up, that it was something other than 500 throws of a fair coin.

  14. graham2:
    The question Barry posed: ‘Would you regard 500 heads as suspicious?’ is actually an interesting one. Anyone care to explain why?

    Where does B Arrington Esq say that exactly? I’m not seeing it in that first post of his which started this discussion.

    I’ve read hundreds of the comments by the sycophants at UD, spread out over the several threads that followed, so maybe I just missed it in the chaff.

    Or are you trying to say that Barry’s question implies “suspicion” without him actually saying so?

  15. Barry (well, actually it all started with scordova) claimed that 500 heads suggests intervention by some agent, and not a fair throw of coins. I think it’s all related to the fact that genomes, protein sequences, etc. all ‘look’ designed, therefore god, or something like that. It’s a lot of smoke and mirrors, but the initial question was interesting. If we saw 500 heads we would (rightly, I think) suspect that it wasn’t a fair throw, but why do we think that way?

    If I saw it in a casino (and I lost my shirt) I would certainly think that way.

  16. graham2: Suspicious that the coins may not be fair, that it is a set up, that it was something other than 500 throws of a fair coin.

    Fairness about what? A setup for what?

    It was explicitly stated that there were no throws “(no tossing involved)”.

  17. It all goes back to sal on 15-DEC ….. The Fundamental Law of Intelligent Design ….. If I had 500 fair coins in a box all heads, I would conclude the 100% proportion of heads is far away from the expected value of 50% heads, thus we have a significant violation of the law of large numbers for random processes, thus a random process is rejected as the mechanism of creating the all-heads pattern.

    You also have to realise that sal has rejected the idea of natural selection.


  18. graham2:
    So why do we regard 500 heads as suspicious? (I’m not game to use the word ‘chance’)

    Humans have evolved amazing pattern matching skills over the last few million years as a survival trait. If you could recognize the patterns of the animals you hunted (or the ones hunting you) you survived a lot longer. Identifying a pattern meant there must have been a conscious behavior from some sentient “thing” that caused the pattern. The flip side of this skill (no pun intended) is that we also get a lot of false positives. That’s why people see faces in grilled cheese sandwiches.

    Another quick example. Ask someone to identify which is more likely to be the output from a random number generator, 389417 or 444444. Almost no layman will pick the latter because it’s obviously a “meaningful” pattern even though the probabilities of the two numbers are the same.

    IDiots like to exploit this human tendency to conclude pattern match = intelligent pattern former by trying to give fancy sciency-sounding terms and bogus probabilities to events. But no matter how they twist it it’s still “looks designed to me!” with no actual indicators of design.

  19. keiths: I agree with your linked stuff. I just wanted to see what the nerds here thought. It’s an interesting problem though: what seems so obvious (it looks suspicious) is not so obvious when you think about it.

    So now we are on a roll, let’s knock over the Monty Hall one.

  20. Meanwhile, Sal gets it completely wrong:

    So deeply hard wired is “all coins heads” that our math which resulted from our hard-wired thought processes, spits out things like expectation values and averages and thus our math automatically makes “all heads” special. As if our math were some universally immutable truth.

  21. If you ask people to write down a “random” sequence of Heads and Tails, they will write down something like HTTHTHTHHTHTTHTTH… We know that independent 50:50 tosses will result in a change from H to T or from T to H half of the time. But the above sequence has 17 tosses, so 16 possible places that it could change, among which there are 12 changes. That sequence is atypical of actual independent tosses.

    Runs of H or runs of T are far too underrepresented in the “random” sequences people write down. So when they see an actual random sequence it strikes them as having too many runs. That is why people believe too easily in “runs of luck”.

    An interesting complication on this is that people have heard of the “laws of probability” and that the fraction of Heads converges to 1/2. But they misunderstand how that works. I once asked a large undergraduate class the following question: you toss a fair coin independently 10 times, and all 10 tosses happen to come up Heads. If you toss it again, what is the most likely outcome?

    Half of the class voted for Heads, “because you have a run of Heads going”. But the other half of the class voted for Tails, “because of the laws of probability”. Almost no one was happy with my answer, which was that Heads is still 50% probable. Apparently many people think that “the laws of probability” is a force in the universe actively pushing toward 50% Heads!

  22. Joe Felsenstein,

    Great point, Joe. It is instructive to translate your made-up sequence from the heads-tails representation to a representation in terms of changes (C) and non-changes (N). The first toss resulted in an H and the second in a T, so the results changed (C). The third toss was again a T, so the results did not change (N).

    Here is the translation:
    HTTHTHTHHTHTTHTTH = CNCCCCCNCCCNCCNC
    In the new language, the sequence has 12 Cs and only 4 Ns, a rather unlikely outcome for a random process.
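
    The translation is mechanical; a short Python check of the counts above:

```python
seq = "HTTHTHTHHTHTTHTTH"

# Mark each adjacent pair of tosses as a change (C) or a non-change (N).
changes = "".join("C" if a != b else "N" for a, b in zip(seq, seq[1:]))

print(changes)                                               # CNCCCCCNCCCNCCNC
print(changes.count("C"), "Cs,", changes.count("N"), "Ns")   # 12 Cs, 4 Ns
```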

  23. graham2,

    The short answer is that getting any sequence specified in advance is suspicious because it is so unlikely.

    Any specific sequence of N coin tosses has the probability of coming up (1/2)^N. For a large N, that is highly unlikely. The probability that this particular coin sequence does not come up is 1−(1/2)^N. For a large N, this is very likely.

    Let’s try several sequences next. The probability that one of, say, 4 specific sequences comes up is 4×(1/2)^N. They won’t come up with the probability 1−4×(1/2)^N. For a large N, the former is still highly unlikely and the latter is still highly likely.

    However, some sequence will inevitably come up. The probability that any one of the sequences comes up is 2^N×(1/2)^N = 1, i.e., a certain outcome. The probability that none of them will come up is 0.

  24. Over at UD Sal Cordova tells us clearly what ID gets wrong:

    Sal: “pre-defined sequence” is referred to as “independent specification” in ID literature. Specification is not limited to sequences. For example a blueprint of a skyscraper is not really describing a sequence, but a skyscraper.

    “Independent” is used instead of pre-defined, to cover situations where the specification came about independently but after the design in question was observed by someone.

    If you come up with the details of a genome only AFTER you’ve observed the genome then it’s not independent.

    If you come up with the details of a genome only AFTER you’ve observed the genome then you have a description, not a specification.

    ID is just the Sharpshooter’s Fallacy dressed up with sciency-sounding language, nothing more, nothing less.

    Thanks for that admission Sal!

  25. olegt,

    Yes, and if each flip is independent with P(H) = P(T) = 1/2, then P(C) and P(N) are also both equal to 1/2.

    Under those conditions, the probability of a length 16 sequence with 4 Ns or fewer is less than 4 percent.
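
    For anyone who wants to check that figure, a one-line calculation with scipy (assuming it is available):

```python
from scipy.stats import binom

# 16 adjacent pairs, each independently a non-change (N) with probability 1/2.
p = binom.cdf(4, 16, 0.5)   # probability of 4 or fewer Ns out of 16
print(f"P(4 or fewer Ns in 16) = {p:.4f}")   # about 0.038
```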

  26. Neil Rickert (over at UD) explained the ‘all heads’ paradox by suggesting we label the coins with many different symbols, not just the 2, so ALL sequences look random. Then when we see the ‘all Heads’ sequence it looks just as random as the rest, and we are not suspicious in the least.

    I like that.

  27. graham2,

    That’s not what Neil is saying, and anyway it doesn’t resolve the paradox.

    Neil wrote (to Salvador):

    Instead of using heads and tail, take a marker and put an X on one side of each coin and a Y on the other side. You can do this in a mixed up way, so that the X is on the heads side of some coins and on the tails side of the others.

    The mathematics works just as well with the X and Y as it does with the heads and tails. And, in a sense, that’s the whole point of mathematics being abstract. The mathematics doesn’t make the X special, and it doesn’t make heads special. Maybe you are making them special.

    He’s only suggesting the addition of two symbols, X and Y, not many. Note that there will still be some sequences that look “suspicious” (all Xs, for example) and others that look “random”. It’s just that the ones that look “suspicious” with respect to X and Y won’t be the same as the ones that look “suspicious” with respect to heads and tails.

    No matter how the coins fall physically, there are some labeling schemes that, if we had employed them, would have caused us to be suspicious of the sequence we got. Therein lies the paradox: why should we be suspicious under some labeling schemes, but not others, when exactly the same thing is happening physically?

  28. Here’s the resolution as I expressed it in my OP on the subject.

    It still seems arbitrary and subjective to divide the 9-digit numbers into two categories, “significant to me” and “not significant to me”. Why am I suspicious when my own SSN comes up, but not when the SSN of Delbert Stevens of Osceola, Arkansas comes up? His SSN is just as unlikely as mine. Why doesn’t it make me just as suspicious?

    In fact, there are millions of different ways to carve up the 9-digit numbers into two sets, one huge and one tiny. Should we always be surprised when we get a number that belongs to a tiny set? No, because every number belongs to some tiny set, properly defined. So when I’m surprised to get my own SSN, it can’t be merely because my SSN belongs to a tiny set. Every number does.

    The answer, I think, is this: when we roll Delbert’s SSN, we don’t actually conclude that the die was fair and that the rolls were random. For all we know, we could roll the die again and get Delbert’s number a second time. The outcome might be rigged to always give Delbert’s number.

    What we really conclude when we roll Delbert’s number is that we have no way of determining whether the outcome was rigged. In a one-off experiment, there is no way for us to tell the difference between getting Delbert’s (or any other random person’s) SSN by chance versus by design or some other causal mechanism. On the other hand, rolling our own SSN does give us a reason to be suspicious, precisely because our SSN already belongs to the tiny set of “numbers that are meaningful to me”.

    In other words, when we roll our own SSN, we rightly think “seems non-random”. When we roll someone else’s SSN, we think (or should think) “no particular reason to think this was non-random, though it might be.”

  29. It’s probably worth noting that while any one sequence of coin tosses (including 500 heads) is as probable as any other, 100% Heads is very much less likely than 50% Heads. The same applies to the Xs and Ys drawn on the coins.
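
    The point is easy to check numerically (scipy assumed; 500 fair, independent tosses):

```python
from scipy.stats import binom

n = 500
print("P(exactly 250 heads):", binom.pmf(250, n, 0.5))   # about 0.036
print("P(exactly 500 heads):", binom.pmf(500, n, 0.5))   # about 3e-151
```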

  30. To keiths: Yes, I twisted Neil’s explanation a bit, and yes, just after posting, I realized that, as you say, we can still get suspicious sequences, just different ones. Groan.

  31. One interesting question then is: if you saw a sequence of 500 coins being tossed, and the order was an exact alternation of heads and tails, would you suspect that something weird was going on? The proportions would be exactly as expected under the null, but the sequence would be very different.

    So the question then involves being very specific about the probability distribution expected under the null – in this case, a binomial distribution on the number of alternations, perhaps?

  32. Lizzie: One interesting question then is: if you saw a sequence of 500 coins being tossed, and the order was an exact alternation of heads and tails, would you suspect that something weird was going on? The proportions would be exactly as expected under the null, but the sequence would be very different.

    The answer is yes.

  33. Lizzie: One interesting question then is: if you saw a sequence of 500 coins being tossed, and the order was an exact alternation of heads and tails, would you suspect that something weird was going on?

    Yes, we should be exactly as suspicious. However, we are less likely to notice this case.

  34. Lizzie,

    It’s probably worth noting that while any one sequence of coin tosses (including 500 heads) is as probable as any other, 100% Heads is very much less likely than 50% Heads. The same applies to the Xs and Ys drawn on the coins.

    Yes, but an all-heads sequence might turn out to be, say, a 47% sequence in terms of X and Y, depending on the labeling. If we pay attention to heads and tails, we’ll be suspicious of such a sequence. If we pay attention only to Xs and Ys, we won’t be suspicious. Why are we suspicious in one case but not the other, when the same thing is happening physically in either case? That’s part of the paradox, and it’s why I think we need a resolution of the kind I described earlier.

  35. Some of the confusion over this issue might be coming from the conflation of a specified sequence with a combination.

    For a binomial distribution of two mutually exclusive states 1 and 2, in which State 1 occurs with a probability p, and State 2 occurs with a probability q = 1 − p, the probability of k instances of State 1 and N − k instances of State 2, if we don’t care about order, is

    C(N, k) p^k q^(N−k),

    where C(N, k) = N!/((N−k)!k!).

    Of course, the sum

    ∑ C(N, k) p^k q^(N−k) over k,

    with k going from 0 to N, is just

    (p + q)^N = 1,

    because p + q = 1.

    However, the probability of a specified sequence of k instances of State 1 and N − k instances of State 2, where order matters, is

    p^k q^(N−k);

    much different.

    But notice that all specified sequences with the same k and N − k have the same probability. And if p = q = ½, then all specified sequences have the same probability 1/2^N; but all combinations still don’t.

    It is not hard to make the probabilities temperature dependent and watch condensation into a “ground state.”
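
    A quick numerical check of the combination-versus-sequence distinction for the 500-coin case (plain Python, no special libraries):

```python
from math import comb

N, k = 500, 250
p = q = 0.5

one_specified_sequence = p**k * q**(N - k)          # one particular ordering
any_250_250_combination = comb(N, k) * one_specified_sequence

print("P(one specified 250/250 sequence):", one_specified_sequence)   # ~3e-151
print("P(some 250/250 arrangement):      ", any_250_250_combination)  # ~0.036

# The combination probabilities sum to (p + q)^N = 1 over k = 0..N:
total = sum(comb(N, j) * p**j * q**(N - j) for j in range(N + 1))
print("sum over all k:", total)
```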

  36. Mike Elzinga: Some of the confusion over this issue might be coming from the conflation of a specified sequence with a combination.

    Yes, that’s part of the problem.

    Another problem I see in the UD comments, is with using precise language. Some people are disagreeing with one another because they are using imprecise language and interpreting it differently.

  37. I don’t see the paradox, myself. Yes, an all-heads sequence is exactly as likely as any other sequence, and so what? What makes the difference is the significance that we, as observers, choose to attach to whichever sequence. We humans, for whatever reason, attach more significance to all-heads than we do to HTTHTHHHHTHTTHTTHHH … HHTHTTTHTHTTH. Why do we attach more significance to one than to the other? That’s a matter for psychologists to unravel, methinks. It is in no way a logical paradox, it’s just how our minds work.

  38. cubist: We all seem to agree that it’s a human/psychological issue, nothing mathematical about it, but it’s still interesting … a very common cause of confusion.

  39. cubist,

    I don’t see the paradox, myself. Yes, an all-heads sequence is exactly as likely as any other sequence, and so what? What makes the difference is the significance that we, as observers, choose to attach to whichever sequence.

    It’s more complicated than that. If we see someone sit down and flip 500 heads in a row, we are genuinely justified in becoming suspicious. It isn’t merely a cognitive illusion like the Monty Hall problem.

    Yet the same person flipping a nice, random-looking sequence of heads and tails won’t arouse our suspicion. That is genuinely justified, too — otherwise we would have to be suspicious every time someone flipped a coin 500 times, regardless of the sequence produced!

    I’ll peel the layers of the paradoxical onion in a later comment or OP.

  40. Mike,

    Some of the confusion over this issue might be coming from the conflation of a specified sequence with a combination.

    I don’t think so, because everyone involved in the discussion seems to understand that there are more ways of getting a combination of 250 heads and 250 tails (zillions of ways) than there are of getting 500 heads (only one way).

    The fact that a 500-heads sequence belongs to a tiny set with one member is not, by itself, grounds for suspicion. If it were, then we would be justified in being suspicious of all sequences, because for any sequence we can define a tiny set that includes it and only it, complemented by a giant set that includes every other sequence.

  41. I’m sure one of the fine “scientists” at UD will give us the full explanation of how this 500 coin nonsense is positive evidence for the Design of biological life.

    Any day now.

    Any day now.

    Any day…

  42. keiths: The fact that a 500-heads sequence belongs to a tiny set with one member is not, by itself, grounds for suspicion. If it were, then we would be justified in being suspicious of all sequences, because for any sequence we can define a tiny set that includes it and only it, complemented by a giant set that includes every other sequence.

    Well, I think Cordova and a couple of others over at UD are in a competition to find a way of connecting this to an ID/creationist “second law of thermodynamics” from the “law of large numbers.”

    As near as I can figure out from all the confusion over there – the 500 coins is not a coincidence with respect to Dembski’s “upper probability bound” – they want to find some way of saying that an improbable arrangement of coins has something to do with abiogenesis being impossible. Cordova has already alluded to chirality.

    We still don’t have any answers from any of the ID/creationists as to why Scrabble letters, coins, dice, marbles, junkyard parts, and other such inert things are supposed to be representative of the properties and behaviors of atoms and molecules.

    Again, as near as I can tell, crystals form “naturally” according to some of their arguments. So what does that mean about other molecular assemblies? Is there some cut-off where the laws of chemistry and physics stop operating and all the atoms in complex molecules suddenly become inert?

    I have no idea where any of this is going, and I suspect none of the UD people know either; a lot of ADD going on over there. They seem to want to keep their CSI “improbability” calculations and somehow justify them with an ID/creationist 2nd law. Finding fault with their thinking about probability is driving them crazy.

  43. keiths:
    cubist,

    It’s more complicated than that. If we see someone sit down and flip 500 heads in a row, we are genuinely justified in becoming suspicious. It isn’t merely a cognitive illusion like the Monty Hall problem.

    It isn’t? Then what, exactly, is the difference? We both agree that us humans do tend to attach more significance to an all-the-same run of coin-tosses than we do to a randomly-distributed run of coin-tosses—but outside of the relative degrees of significance which we humans attach to them, what is the difference between the coin-toss runs?

  44. Neil Rickert: As best I understand it, he has refuted the strawman theory of evolution that ID proponents love to attack. And therefore ID just must be true. I guess that’s his version of null hypothesis testing.

    Ha! As I suspected, Cordova appears to be hanging his argument on homochirality because he thinks coin flips have something to do with chirality. Weird.

    Let’s see where this goes; it has the potential to be entertaining.

  45. You have to understand that sal has rejected natural selection … it just doesn’t happen. Given this, if something looks designed (life, 500 coins), and there is no such process as selection, then voila: design.

    At least I think that’s the line.
