*(slightly edited version of a comment I made at UD)*

Barry Arrington has a rather extraordinary thread at UD right now:

Jerad’s DDS Causes Him to Succumb to “Miller’s Mendacity” and Other Errors

It arose from a post of Sal’s here at TSZ, Siding with Mathgrrl on a point, and offering an alternative to CSI v2.0

Below is what I posted in the UD thread.

I don’t think I’ve ever seen a thread generate so much heat with so little actual fundamental disagreement!

Almost everyone (including Sal, Eigenstate, Neil, Shallit, Jerad, and Barry) is correct. It’s just that massive and inadvertent equivocation is going on regarding the word “probability”.

The compressibility thing is irrelevant. Where we all agree is that “special” sequences are vastly outnumbered by “non-special” sequences, however we define “special”, whether it’s the sequence I just generated yesterday in Excel, or highly compressible sequences, or sequences with extreme ratios of H:T, or whatever. It doesn’t matter in what way a sequence is “special” as long as it was either deemed special before you started, or is in a clear class of “special” numbers that anyone would agree was cool. The definition of “special” (the *Speci*fication) is not the problem.

The problem is that “probability” under a frequentist interpretation means something different than under a Bayesian interpretation, and we are sliding from frequentist interpretation (“how likely is this event?”) which we start with, to a Bayesian interpretation (“what caused this event?”) , which is what we want, but without noticing that we are doing so.

Under the frequentist interpretation of probability, a probability distribution is simply a normalised frequency distribution – if you toss enough sequences, you can plot the frequency of each sequence, and get a nice histogram which you then normalise by dividing by the total number of observations to generate a “probability distribution”. You can also compute it theoretically, but it still just gives you a normalised frequency distribution, albeit a theoretical one. In other words, a frequentist probability distribution, when applied to future events, simply tells you how *frequently* you can expect to observe that event. It therefore tells you how confident you can be (how probable it is) that the event will happen on your next try.
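As a toy illustration of that normalising step (my own sketch, not part of the original argument), you can build such an empirical distribution by simulation:

```python
import random
from collections import Counter

random.seed(1)

# Toss 10 coins at a time, many times, and count how often each
# possible number of heads turns up.
n_tosses, n_trials = 10, 100_000
counts = Counter(
    sum(random.random() < 0.5 for _ in range(n_tosses))
    for _ in range(n_trials)
)

# Normalise the frequency distribution by the total number of
# observations to get an empirical "probability distribution".
prob = {k: v / n_trials for k, v in sorted(counts.items())}

# The empirical probability of 5 heads in 10 tosses should sit close
# to the theoretical value C(10,5)/2^10 (about 0.246).
print(prob[5])
```

The more trials you run, the closer the normalised histogram gets to the theoretical distribution – which is exactly the sense in which the theoretical version is still “just a normalised frequency distribution”.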

The problem arises when we try to turn frequentist probabilities about future events into a measure of confidence about the cause of a *past* event. We are asking a frequency probability distribution to do a job it isn’t built for. We are trying to turn a normalised frequency, which tells us how much confidence we can have in a future event, given some hypothesis, into a measure of confidence in some hypothesis concerning a past event. These are NOT THE SAME THING.

So how do we convert our confidence about whether a future event will occur into a measure of confidence that a past event had a particular cause? To do so, we have to look beyond the reported event itself (the tossing of 500 heads), and include more data.

Sal has told us that the coin was fair. How great is his confidence that the coin is fair? Has Sal used the coin himself many times, and always previously got non-special sequences? If not, perhaps we should not place too much confidence in Sal’s confidence! And even if he tells us he has, do we trust his honesty? Probably, but not absolutely. In fact, is there *any way* we can be *absolutely* sure that Sal tossed a fair coin, fairly? No, there is *no way*. We can test the coin subsequently; we can subject Sal to a polygraph test; but we have no way of knowing, for sure, a priori, whether Sal tossed a fair coin fairly or not.

So, let’s say I set the prior probability that Sal is not honest, at something really very low (after all, in my experience, he seems to be a decent guy): let’s say, p=.0001. And I put the probability of getting a “special” sequence at something fairly generous – let’s say there are 1000 sequences of 500 coin tosses that I would seriously blink at, making the probability of getting one of them 1000/2^500. I’ll call the observed sequence of heads S, and the hypothesis that Sal was dishonest, D. From Bayes theorem we have:

P(D|S) = [P(S|D)*P(D)] / [P(S|D)*P(D) + P(S|~D)*P(~D)]

where P(D|S) is what we actually want to know, which is the probability of Sal being Dishonest, given the observed Sequence.

We can set the probability of P(S|D) (i.e. the probability of a Special sequence given the hypothesis that Sal was Dishonest) as 1 (there’s a tiny possibility he meant to be Dishonest, but forgot, and tossed honestly by mistake, but we can discount that for simplicity). We have already set the probability of D (Sal being Dishonest) as .0001. So we have:

P(D|S)=[1*.0001]/[1*.0001+1000/2^500*(1-.0001)]

Which is, as near as dammit, 1. In other words, despite the very low prior probability of Sal being dishonest, now that we have observed him claiming that he tossed 500 heads with a fair coin, the probability that he was being Dishonest is now a virtual certainty, even though throwing 500 Heads honestly is perfectly *possible*, entirely consistent with the Laws of Physics, and, indeed, the Laws of Statistics. Because the parameter P(S|~D) (the probability of the Special sequence given not-Dishonesty) is so tiny, any realistic evaluation of P(~D) (the probability that Sal was not Dishonest), however great, is still going to make the second term in the denominator, P(S|~D)*P(~D), negligible, and the denominator always only very slightly larger than the numerator. Only if our confidence in Sal’s integrity exceeds 500 bits will we be forced to conclude that the sequence could just as easily, or more easily, have been Just One Of Those Crazy Things that occasionally happen when a person tosses 500 fair coins honestly.
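For anyone who wants to check the arithmetic, here is the same calculation in Python, using exact fractions and the numbers assumed above:

```python
from fractions import Fraction

# Priors and likelihoods as set out above:
p_D = Fraction(1, 10_000)                # prior that Sal is Dishonest
p_S_given_D = Fraction(1)                # a Dishonest Sal reports a Special sequence
p_S_given_notD = Fraction(1000, 2**500)  # 1000 "special" sequences out of 2^500

# Bayes theorem:
numerator = p_S_given_D * p_D
denominator = numerator + p_S_given_notD * (1 - p_D)
p_D_given_S = numerator / denominator

# "As near as dammit, 1": the posterior differs from certainty only
# around the 144th decimal place.
print(float(p_D_given_S))
```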

In other words, the reason we know with near certainty that if we see 500 Heads tossed, the Tosser must have been Dishonest, is simply that Dishonest people are more common (frequent!) than tossing 500 Heads. It’s so obvious, a child can see it, as indeed we all could. It’s just that we don’t notice the intuitive Bayesian reasoning we do to get there – which involves not only computing the prior probability of 500 Heads under the null of Fair Coin, Fairly Tossed, but also the prior probability of Honest Sal. Both of which we can do using Frequentist statistics, because they tell us about the future (hence “prior”). But to get the Posterior (the probability that a past event had one cause rather than another) we need to plug them into Bayes.

The possibly unwelcome implication of this, for any inference about past events, is that when we try to estimate our confidence that a particular past event had a particular cause (whether it is a bacterial flagellum or a sequence of coin-tosses), we cannot simply estimate it from observed frequency distribution of the data. We also need to factor in our degree of confidence in various causal hypotheses.

And that degree of confidence will depend on all kinds of things, including our personal experience, for example, of an unseen Designer altering our lives in apparently meaningful and physical ways (increasing our priors for the existence of Unseen Designers), our confidence in expertise, our confidence in witness reports, our experience of running phylogenetic analyses, or writing evolutionary algorithms. In other words, it’s subjective. That doesn’t mean it isn’t valid, but it does mean that we should be wary (on all sides!) of making over-confident claims based on voodoo statistics in which frequentist predictions are transmogrified into Bayesian inferences without visible priors.

Spot on.

I would only add, because it is a common source of confusion, that the Bayesian/Classical Hypothesis Test distinction is not exactly the same as the subjective/frequentist probability distinction – although they tend to be associated. You could imagine doing a Fisher/NP type of hypothesis test where the probabilities are subjective estimates. Suppose the data was that your husband was late for dinner and H0 was that he was not having an affair. You might restrict yourself to asking yourself (subjectively) how probable it is that he would be late for dinner if he was not having an affair – you might even have a rejection zone of more than 1 hour!

Similarly, Bayesian priors can be, and sometimes are, frequentist. You may already have good frequentist data on the prior probability of a disease being present, which is then amended by the likelihood of certain symptoms which have appeared.
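That disease example can be made concrete with a quick sketch; the prevalence and symptom rates below are hypothetical numbers invented for illustration, not data from any real screening programme:

```python
# Frequentist prior: 1% of the population has the disease.
prior = 0.01

# Estimated likelihoods: the symptom appears in 90% of those with
# the disease and 5% of those without.
p_symptom_given_disease = 0.90
p_symptom_given_healthy = 0.05

# Bayes theorem: update the frequentist prior on seeing the symptom.
posterior = (p_symptom_given_disease * prior) / (
    p_symptom_given_disease * prior
    + p_symptom_given_healthy * (1 - prior)
)

print(round(posterior, 3))  # about 0.154: still unlikely, but ~15x the prior
```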

Point taken.

I guess my main point (which is my mantra for people who want “help with my stats”) is: what is the question you want to know the answer to?

Then do the test that answers that question! Mostly, null hypothesis testing gives you an answer to a different question than the one you actually want the answer to.

I don’t think about it that way.

To me, probability is a branch of mathematics. When we use it for real world problems, we are constructing a mathematical model. Looked at that way, we can ask “what does the model say?” We can also ask “Is this model a good fit to the real world problem that is being modeled?”

That all sequences have the same probability, comes from the mathematical model. But when I was asked about a hypothetical real world use, I did say that I would wonder whether there was some hanky panky, and there I was implicitly questioning whether the mathematical model fits.

I usually avoid the probability arguments about ID, precisely because the usual ID arguments depend on very poorly chosen probabilistic models. Sal’s attempt to model homochirality in terms of coin tossing is a laughably absurd example of that.

This means that unique events in the past are outside the reach of science. Then the ToE is not a scientific theory but a historical theory.

Yes, I know. That’s why I kept saying that everybody is correct, they just all meant different things by similar words, even simple claims like “the coin is fair”. From a mathematical point of view it means one thing (“let us define the coin as fair…”) , while from an empirical point of view it means something quite different (“There is a non-zero probability that the coin is fair”).

Eigenstate (and I, at first) looked at it the first way, others looked at it the second way. Actually it turns out that Sal meant it the first way, but didn’t intend us to assume that the coins had actually been tossed at all, which we assumed (or I did).

And lots of people intuitively did a Bayesian thing (as I think we evolved to do, to hark back to our previous conversation!) and said: hey, come off it, Darwinists, sheesh, what a ridiculous idea, of course you can infer skulduggery from a sequence of 500 Heads!

Which is also true.

No, it doesn’t. A historical theory can be a perfectly good scientific theory.

How do you measure the statistical confidence?

Well, I sort of explained it in the OP! You calculate it from frequencies of past events (or from a theoretical frequency distribution), and, depending what it is you want to be confident about, you might compute “confidence intervals” from those frequency distributions, or plug them into a Bayesian equation.

Neil can probably do a better job of explaining.

Here is the central problem in Sal’s post.

What possible justification does Sal have for using a binomial distribution for the emergence of homochirality in the molecules of life?

What is the probability distribution of a one-off event – as far as we humans know – that resulted in the emergence of living organisms and evolution?

Nevertheless, this is typical of ID/creationist “calculations” about what is “improbable.”

However, knowing what we do about atoms and molecules, we do have some crude estimates of probabilities, as I tried to explain here. But we don’t have a distribution, and we certainly can’t use a binomial distribution.

A Bayesian analysis can be done with this kind of information, but we still have very crude estimates for probabilities because we don’t yet know the recipe(s) for producing the molecules of life.

“And that degree of confidence will depend on all kinds of things, including our personal experience, for example, of an unseen Designer altering our lives in apparently meaningful and physical ways (increasing our priors for the existence of Unseen Designers), our confidence in expertise, our confidence in witness reports, our experience of running phylogenetic analyses, or writing evolutionary algorithms. In other words, it’s subjective.”

What is the binomial distribution of a single coin toss?

I have already thrown in my 3 cents regarding homochirality – the banalysis*** proposes some illusory mix of the 20 modern acids plus their 19 enantiomers, yet magically swept clean of all other amino-acid contaminants, and some process joining them that is exquisitely discriminatory on side-chain, and can tell alpha-acids from beta etc, but cannot even tell which side of the molecule that side chain is on! How is that proposed to work?

Suppose I have a racemic mix containing these 6 amino acid side chains 1, 2 and 3 carbon atoms in length, here arranged in stereochemical pairs:

-A A-

-BB BB-

-CCC CCC-

Suppose the sequence -A / -BB / -CCC is ‘useful’. Your peptide bonding specification process can distinguish between the 3 side chains, yet it *cannot* distinguish between -A and A-… What chemical principle underlies this curious mix of specificity and blindness? Is it distinguishing acids on molecular weight? If so, how does it exclude other molecules of the same weight while still maintaining side-chain specificity? Hydrophobicity? Charge? Same problem.

The reality is that the issue of chirality is a subset of the wider issue of discrimination, and is solved by it (granted, the problem of original specification remains unsolved by this analysis, but I am specifically addressing Sal’s ‘coin-toss chirality’). There is no plausible chemical system offering such a coin-tossing version of peptide synthesis while also offering side-chain specificity. Once you have a means of identifying an alpha-amino acid (by virtue of its -C and -N terminals existing a certain distance apart in a certain orientation), identifying its side-chain *specifically* necessarily means identifying one enantiomer only, by shape/charge distribution.

Whose peptide-synthesis scenario are you attempting to undermine by bringing coins into it? Hoyle’s?

*** A typo, but it stays!

We don’t really measure it. We make a mathematical (probabilistic) model, and we specify a confidence interval as part of that model.

In a typical case, we might take a 95% confidence level. We design a statistical test, based on our model. The mathematics then tells us that, in random sampling assuming our model, the results will fall within a particular range with probability 0.95 (or 95% of the time). If the results do not fall within that range, we conclude that our mathematical model was a poor model, with 95% confidence.

This depends on careful design, so that showing the model to be probably wrong is what you want to show. And it requires designing the model before the actual experiment.

This does not work at all well with analysis of historic data, where the experiment was done before the model was constructed.
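As a concrete sketch of such a test (a toy example of mine, not from the comment above): for 500 tosses of a fair coin, the binomial model fixes a central range containing at least 95% of the probability mass, and an observation outside it would count against the ‘fair coin, fairly tossed’ model:

```python
from math import comb

n = 500  # tosses of a fair coin, modelled as Binomial(500, 0.5)
pmf = [comb(n, k) / 2**n for k in range(n + 1)]

# Trim at most 2.5% of probability mass from each tail to find the
# central acceptance range [lo, hi].
cum, lo = 0.0, 0
while cum + pmf[lo] <= 0.025:
    cum += pmf[lo]
    lo += 1
cum, hi = 0.0, n
while cum + pmf[hi] <= 0.025:
    cum += pmf[hi]
    hi -= 1

# Roughly 250 +/- 22 heads; an all-heads result (500) is far outside.
print(lo, hi)
```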

Allan Miller,

When there is no “internal” mechanism for such a process, the most reasonable place to look is to external conditions that set a “bias” on processes that “breaks the symmetry.”

One such line of study has been to determine the effects of the spin polarized electrons from beta decay on these reactions. These studies have been going on since the 1970s. If you type “homochirality and beta decay” into your search engine, you can come up with lots of these studies.

Possibly, but you don’t have to go far from the acid to find a biasing agent: the stereospecificity of macromolecules themselves. There are plenty of problems with ‘proteins-first’ IMO, chirality the least of ’em!

The point being, which I didn’t make clear, that the antiparallel arrangement of the nucleic acid double helix is itself an enantioselective agent … of itself. Complementary sequence will hybridise, and this preferentially selects homochiral ribose polymers and pairable bases from a more general mixture. Both D and L sugar forms can arise, but each is preferentially selected by complementarity to be homochiral. You don’t need mineral surfaces, or elaborate physics, simply a pendulum that can only swing hard left or hard right, and loss of one extreme (which can happen by chance, if nothing else).

Indeed.

To my knowledge, all of these experiments with beta decay are inconclusive. Not enough sensitivity for one thing; and if there is an effect, it is swamped by the chemistry.

One of the first experiments along these lines took place back in the 1970s in Art Rich’s group at U of Michigan. I was in that group at the time; and the experiment was referred to as “The Vatican Experiment” by the guys working on it.

I have not kept up with these studies, but I think that continued studies along this line have tried to find a point in the chain of chemical events that was likely to be particularly sensitive and would thereby tip the scales in favor of downstream events. I don’t know how any of these studies have progressed; I think other studies have higher priority.

Getting funding for these experiments is difficult because they are considered very low priority in the funding agencies. I don’t know what NASA’s budget includes.

Allan,

Something similar happened to me yesterday. I was trying to type

…and it came out of my fingers as…

…which actually seems pretty accurate.

(I ended up not posting the comment, for different reasons.)

Yes, sometimes a typo can generate nifty associations or neologisms – by chance rather than design! Might see if I can get ‘banalysis’ to catch on. And I still giggle at ‘hyperklink’! 🙂

Great OP! You have an asterisk that should be a space in the first statement of the theorem.

Bouncing back and forth with UD is getting ludicrous.

Then I am right. Unique events in the past are outside of science. The ToE isn’t scientific; it is historical.

It is both. At the same time.

No, that does not follow at all. What does not work well with historic data is traditional statistical hypothesis testing. But that’s only one of many ways that science can reach conclusions.

If you cannot test your hypothesis statistically, is it science?

Consilience. Learn the word.

Right, a word from historical methods. You can also use it in science, but how do you scientifically measure each of your consilient hypotheses?

Especially when it spills onto Jeffrey Shallit’s blog too! Reminds me of this.

Thanks! Stray asterisk removed.

I reference some anecdotes from:

Holy Rollers, Pascal’s Wager, and If ID is wrong it was an honest mistake

Skilled gamblers are often called Advantage Players (APs).

In practice, in professional Advantage Player teams (of which Blackjack Teams are the most prominent), the issue of cheating comes up. As I’ve read the anecdotes of teams that threw out members for cheating, Bayes theorem was usually not the method used to determine cheating, but deviation from expectation.

Perhaps that’s from operational convenience, but it suffices because the frequentist approach often works and the statistics are less controversial because the sample sizes are enormous. In the case of professional blackjack, teams were probably dealing on the order of millions of hands played to determine if a player is skimming profits rather than honestly pooling them with his teammates.

Honesty surely had a role in the success of the Holy Rollers. The members had a sacred obligation to be honest to their teammates, especially since a player could lose $70,000 in a few hours of play while still being within expectation – a $70,000 one-standard-deviation swing to the downside! The variance on the way to their accumulated winnings of 3.5 million must have been sickening.

The team counted on honesty that the player actually lost $70,000 and didn’t skim the money. If there wasn’t honesty, the team would have easily failed.

There was a stretch where the team underperformed and it was suspected people were skimming profits. It was decided that one player could not be trusted and they fired him (curiously it was the one atheist on the team, so there is probably a cultural dynamic in the decision to let him go), but the larger explanation was quality control. The team was recruiting insufficiently skilled gamblers.

Not that I had some major point, but why choose the Bayesian approach when a simpler avenue might suffice in practice? My “siding with mathgrrl” post argued for using simplicity when it is available.

I can see using Bayes if one has a limited sample size, but in questions of OOL, it seems the frequentist approach would work just as well to establish a defensible (not absolutely proven) hypothesis.
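The “deviation from expectation” approach Sal describes can be sketched as a plain z-score test; the edge and variance figures below are hypothetical, invented for illustration, and not the Holy Rollers’ actual numbers:

```python
from math import sqrt

def skim_zscore(hands, edge_per_hand, sd_per_hand, observed_total):
    """z-score of a player's reported winnings against team expectation,
    treating hands as independent and identically distributed."""
    expected = hands * edge_per_hand
    sd_total = sd_per_hand * sqrt(hands)
    return (observed_total - expected) / sd_total

# Hypothetical: 1,000,000 hands at +0.01 units/hand expectation and
# 1.1 units/hand standard deviation, with the player reporting only
# 4,500 units of profit against an expected 10,000.
z = skim_zscore(1_000_000, 0.01, 1.1, 4_500)
print(round(z, 2))  # -5.0: a five-sigma shortfall, grounds for suspicion
```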

Sal,

Lots of confusion in that sentence.

First, your ‘expected number of heads’ criterion misses important cases, as eigenstate pointed out ten days ago:

That is an *extremely* suspicious pattern, yet your criterion says that there’s nothing fishy at all about it.

Also, it’s not true that the teams didn’t use Bayesian reasoning in deciding to kick members out for cheating. They did, whether they (or you) realize it or not. They started out believing that the members were not cheating (or at least that there was no reason to think they were cheating). As the evidence came in, they revised that belief and the degree of certainty with which they held it. Eventually they concluded that the members *were* cheating.

That’s a textbook Bayesian inference.

“That is an extremely suspicious pattern, yet your criterion says that there’s nothing fishy at all about it.”

Not true; it says nothing about it, period – it doesn’t say fishy or not. Binomial distributions don’t look at specific ordering of sequences, so any inference to fishy or not fishy is unwarranted. You’re attributing an argument to me which I didn’t make.

In the video documentary of the Holy Rollers there was clip of a conference. They discussed there was a 4 or 5 sigma deviation from expectation and therefore something had to be done.

“Also, it’s not true that the teams didn’t use Bayesian reasoning in deciding to kick members out for cheating. They did, whether they (or you) realize it or not.”

Then by that standard alternate approaches that are not explicitly Bayesian can be said to be implicitly Bayesian, hence there is not much point in framing the discussion as Fisher vs. Bayes.

Amongst my AP brethren, ideas are generally expressed and conveyed as expectation values and deviations.

In blackjack especially, Bayesian ideas apply because the observed outcomes in the past and present for a shoe imply that the statistics of future outcomes in that shoe will have a different distribution. If I see lots of (4,5,6) dealt out, the shoe is said to become “hot”. Hence the term: “HOTSHOE”. But even then, we don’t explicitly think about Bayes when we’re counting down decks…

Amongst us, there is not much about Fisher vs. Bayes, we use what they say as long as it works, no need to form dichotomies. What works operationally is more important, and the bottom line is expectation and standard deviation.

I’ve come to accept that no one might ever be able to put forward an absolutely foolproof case as to who is right or wrong regarding Darwin vs. ID, but it seems to me the issue at a personal level can be framed in terms of which bet is more favorable given one’s presumed (and possibly wrong) distribution and payoff function. How this bears out in professional science, I don’t have much to say, but in terms of a personal quest for truth, I wouldn’t wager on Darwin being right for many reasons, not the least of which is that the data supporting Darwinism seems cherry picked and misinterpreted.

It is astonishing to me that, for all the straining at gnats over ID inferences, the same standards are not applied to the cherry-picked data used to support Darwinism. In effect, camels are let through, and obviously premature if not outright wrong inferences favorable to Darwinism are made.

I’ve come to accept it’s pointless to argue whether Design or Darwin is ultimately true, any more than whether I can tell you what cards a dealer is going to deal. At a personal level, one can take one’s estimates (right or wrong) and make the wager most consistent with those estimates. I’m not betting on Darwin being right. At a personal level, for me, the potential payoff of Darwin being right is zero.

So to have a distribution of chiralities, one would need multiple examples of life – multiple systems – so you could see if any are heterochiral.

If none are, then one could hypothesize that some physical law (regularity) prevents them from forming.

One instance does not a distribution make.

KeithS,

For the record, I haven’t deleted any comments in the thread on Holy Rollers, but I can be superseded by other authors or moderators more senior than me at UD.

Though I feel we’re sometimes at each other’s throats, I’m glad you are posting in my discussions and challenging my ideas.

Best,

Sal

Sal,

I wrote:

Sal:

You’re right. Okay, let me rephrase my statement:

I’m saying that the ‘expected number of heads’ criterion is in no way the objectively ‘correct’ criterion to use as a basis for suspicion. There *is* no objectively correct criterion (a point I make in my ‘paradox’ OP).

One way of seeing this is to imagine a histogram of the binomial distribution, with each sequence assigned to a bin according to the number of heads it contains. Now suppose that you randomly reassign sequences to bins while maintaining the same number of sequences per bin as before.

The bin assignments no longer have anything to do with the number of heads in each sequence. They’re random. Yet we are still suspicious if we flip a sequence that falls into one of the bins in the tails of the distribution.

The sequences themselves don’t matter. They can be arbitrarily assigned. What matters is the size of the bins containing them.

The binomial distribution also doesn’t matter. If we create one bin holding just two sequences and another bin holding the remaining 2^500 – 2 sequences, then we will be suspicious if we flip a sequence that matches one contained in the tiny bin.

There is nothing special about the ‘expected number of heads’ criterion, or the content of the sequences, or the binomial distribution.
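That bin-size point is easy to verify numerically (a sketch of mine, using exact arithmetic): the surprise attaches to the total probability mass of the bin, regardless of which sequences were assigned to it:

```python
from fractions import Fraction

total = 2**500  # equally probable sequences of 500 fair coin tosses

# A bin holding just two arbitrarily chosen sequences - ANY two
# sequences would do, however they were picked:
tiny_bin = Fraction(2, total)

# The bin holding all the remaining sequences:
big_bin = Fraction(total - 2, total)

# The two bins exhaust the probability mass, and landing in the tiny
# one is astronomically surprising whatever its contents.
print(float(tiny_bin))
```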

Sal,

I understand. If my comments disappear, I won’t leap to the conclusion that you’re behind it.

But don’t take this as a golden opportunity to delete my comments with plausible deniability. 🙂

(For those who are wondering what this is about: link)

“I’m saying that the ‘expected number of heads’ criterion is in no way the objectively ‘correct’ criterion to use as a basis for suspicion. There is no objectively correct criterion (a point I make in my ‘paradox’ OP.)”

I may not necessarily agree with that statement, but suppose you are right. Then we can still frame the argument in terms of which hypothesis might have a better payoff – which hypothesis you’d be willing to wager on as being correct. This is sort of a utilitarian approach to truth, but in the absence of really knowing, we only have educated guesses. Unless we’re dealing with math theorems, I’m naturally uncomfortable making formal inferences of absolute truth, but I do feel comfortable saying which hypothesis I’d be willing to bet is correct.

This may also make the analysis a bit more tractable. One may use various distribution functions as a working hypothesis. Maybe they are not correct, but they might be good enough to make an educated wager. Operationally speaking, the normal distribution is used a lot in gambling; formally speaking it’s only an approximation of the binomial distribution, but it often takes precedence because of its utility, not its formal correctness. It’s good enough.

As far as the questions at hand, I think there might be statistical tests to actually see if Darwinian evolution is correct in terms of field observation in the present. The problem is tractability, but not testability in principle. If we cannot form tests even in principle for the present, then that is problematic for everyone. I have only an outline of ideas right now. If they ever come together I’ll publish them at UD or TSZ or both.

Simply put, Darwinism would be sufficiently, but not necessarily, proven if there are increases in integrated complexity in biological systems in the present, *on average*. If there are decreases, it doesn’t falsify Darwinism, but it would cast doubt on its claims. I think this is the right approach to trying to settle some of these issues: propose and carry out observations and experiments in the lab and field.

Some of the work may already be getting done, but not under the heading of ID vs. Darwin – maybe under genetically inherited diseases, environmental conservation, endangered species, reductive evolution in microbes, etc.

Statistically rigorous methods should be used in this exploration. I’m not qualified to construct such methods, but outline some ideas I hope others will seize upon.