There’s been a skirmish between Larry Moran and Barry Arrington about whether Barry understands the Theory of Evolution, and the latest salvo is a piece at UD entitled “Can a Lowly Lawyer Make a Useful Contribution? Maybe.”
Well, in a sense, Barry makes a useful contribution in that post, as he gives a very nice illustration of a common misunderstanding about the process of hypothesis testing, in this case, basic model-fitting and null hypothesis testing, the workhorse (with all its faults) of scientific research. Barry writes:
[Philip] Johnson is saying that attorneys are trained to detect baloney. And that training is very helpful in the evolution debate, because that debate is chock-full of faulty logic (especially circular reasoning), abuse of language (especially equivocations), assumptions masquerading as facts, unexamined premises, etc. etc.
Consider, to take one example of many, cladistics. It does not take a genius to know that cladistic techniques do not establish common descent; rather they assume it. But I bet if one asked, 9 out of 10 materialist evolutionists, even the trained scientists among them, would tell you that cladistics is powerful evidence for common descent. As Johnson argues, a lawyer’s training may help him understand when faulty arguments are being made, sometimes even better than those with a far superior grasp of the technical aspects of the field. This is not to say that common descent is necessarily false; only cladistics does not establish the matter one way or the other.
In summary, I am trained to evaluate arguments by stripping them down to examine the meaning of the terms used, exposing the underlying assumptions, and following the logic (or, as is often the case, exposing the lack of logic). And I think I do a pretty fair job of that, both in my legal practice and here at UD.
Barry has made two common errors here. First, he has confused the assumption of common descent with the conclusion of common descent, and thus detected circular reasoning where there is none. Second, he has confused the process of fitting a model with the broader concept of a hypothesised model.
To take a simpler, but directly analogous, example: suppose we hypothesise a correlation between, say, Body Mass Index and lifespan. We theorise that the higher your body mass, the more vulnerable you are to various pathological processes, and therefore the more likely you are to die young – in other words, we hypothesise a broadly monotonic relationship between BMI and age at death. The traditional way of testing this is to get some sample data, find the best-fit line for BMI against age at death, and test the null hypothesis that the “best fit line” in the population from which we drew our sample would have a slope of zero. It’s a stupid null, in fact, because the null, expressed thus, is always false – no slope is ever exactly zero. But because data are noisy (skinny people die young for lots of reasons, and sometimes obese people live to a ripe old age, also for lots of reasons) it’s still a good test value – if we can show that a fitted line as steep as ours is unlikely to be found in a sample drawn from a population whose true slope is zero, then we can reject the null that there is no linear relationship in the population, even though we may not be entitled to reject a different null: that the true slope is smaller than the one our data seem to indicate.
My first point is that we start off by making a prediction, based on theory. We then propose a model, based on that theory: that there will be a roughly linear relationship between BMI and age-at-death. Choosing to fit a linear model does not entail the assumption that there is a linear relationship, because we actually test the null that there is NO linear relationship. The analogy here with cladistics is: choosing to fit a tree model does not entail the assumption that a tree model will fit. What is tested is the null of “no tree”.
Going back to the linear slope example: as our null is that the true best-fit slope in the population is zero, a slope of more than a certain steepness, whether positive or negative, will require us to reject that null, even though our theory predicts a negative slope (higher BMI associated with lower age at death). So a confident rejection of our null does not necessarily support our theory, nor accord with our prediction – it allows us to get a surprise: maybe all those fries and donuts are actually helping us live longer! This is why, while we may privately (or publicly) predict a negative slope, because that is what our theory leads us to expect if true, we have to allow for a fitted slope that indicates not only that our null is false but that our theory is false too.
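The whole procedure can be sketched in a few lines of Python. The data here are entirely invented for illustration (a made-up negative BMI effect plus plenty of noise), not real epidemiology; the point is just that we fit the line and test the null of zero slope, without assuming the slope’s direction in advance:

```python
import numpy as np
from scipy import stats

# Invented data for illustration only: a weak negative relationship
# between BMI and age at death, buried in lots of noise.
rng = np.random.default_rng(0)
bmi = rng.normal(27, 5, size=300)
age_at_death = 85 - 0.8 * bmi + rng.normal(0, 8, size=300)

# Fit the linear model and test the null that the population slope is zero.
fit = stats.linregress(bmi, age_at_death)
print(f"fitted slope: {fit.slope:.2f}")
print(f"two-sided p-value under the null of zero slope: {fit.pvalue:.4g}")

# Note the test is two-sided: a sufficiently steep POSITIVE slope would
# also reject the null -- and would surprise us about the theory itself.
```

A small p-value licenses rejecting the null of “no linear relationship”; it is the sign and size of the fitted slope, not the choice to fit a line, that tells us whether the theory fared well.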
So my second point is that when a palaeontologist fits a tree model to her data, she is not only testing the null hypothesis that the data are not distributed as a tree (just as when I fit a linear model to the BMI/age-at-death data, I am testing the null that there is no linear slope), but also finding out what the tree, if there is one, actually is, and she may be surprised (just as when I fit a linear model to the BMI/age-at-death data, I may find my fitted slope is positive, not negative).
Of course palaeontologists aren’t seriously testing the null hypothesis that the data are not distributed as a tree – we know, from countless cladistics studies, that they are, and it isn’t even disputed by anyone. It’s what Linnaeus found, it’s what YECs need to get their animals on to the ark, and the only fundamental disputes are whether there is one tree or several. So a tree is a good basic model, and is the workhorse model for palaeontology, just as the linear model underlies most research in the social and health sciences (indeed, it’s called the General Linear Model, GLM to its friends). Nonetheless, we need to be aware that there are other potential models (non-linear; non-tree) that might fit better. But trees fit pretty damn well for palaeontological data.
They are, instead, interested in the model fit. Just as by now we don’t really have to test the basic hypothesis that BMI is negatively correlated with health, but rather want to find out HOW much it correlates with different aspects of health, and just how many years an extra donut will knock off our expected span. And, directly analogously, that’s what the palaeontologist does – she wants to know whether the best-fitting model supports this, or that, idea about which organisms branched off from which other organisms, and when.
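Tree-fitting can be sketched the same way. The character matrix below is invented purely for illustration, and the method (average-linkage clustering with a cophenetic correlation as a fit measure) is a toy stand-in for what real cladistic software does with parsimony or likelihood methods. The point is only that the tree, and the quality of its fit, come OUT of the analysis; they are not put in:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

# Hypothetical binary character matrix: five taxa (rows) scored on
# ten characters (columns). Invented data, for illustration only.
characters = np.array([
    [1, 1, 0, 0, 1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
    [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 1],
])

# Pairwise distances between taxa, then a fitted average-linkage tree.
dists = pdist(characters, metric="hamming")
tree = linkage(dists, method="average")

# Cophenetic correlation: how faithfully the distances implied by the
# fitted tree reproduce the observed distances (1.0 = perfect tree fit).
c, _ = cophenet(tree, dists)
print(f"cophenetic correlation: {c:.3f}")
```

Nothing in the procedure assumes the data are tree-like: feed it data with no hierarchical structure and the fit measure comes out poor, which is exactly the sense in which a tree is tested rather than presupposed.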
So, Barry: yes, you’ve helped, by articulating precisely the problem so many people have in understanding how quantitative models are tested in science. So let me summarise:
- The reasoning is not circular, because we test the null, not the hypothesis – we do not assume our hypothesis is true; rather, we figure out how likely data as extreme as ours would be if our hypothesis were false. Dembski is good on this, btw.
- There is a difference between choosing a model and fitting a model. We choose a tree model, or a slope model, because we want to test the null that there is no tree, or no slope. However, when we FIT the model, we make no prior assumptions about the degree or direction of fit. And we are often much more interested in the parameters of the fit (which tree, what slope) than in whether there is in fact a tree/slope. That’s important, but it’s neither assumed nor imposed.