Creating CSI with NS

Imagine a coin-tossing game.  On each turn, players toss a fair coin 500 times.  As they do so, they record all runs of heads, so that if they toss H T T H H H T H T T H H H H T T T, they will record: 1, 3, 1, 4, representing the number of heads in each run.

At the end of each round, each player computes the product of their runs-of-heads.  The person with the highest product wins.
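The scoring rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the function names `head_runs` and `run_product` are made up here, and scoring a series with no heads at all as 1 (the empty product) is an assumption, since the post does not cover that case.

```python
from math import prod

def head_runs(tosses):
    """Return the length of every maximal run of heads in an 'H'/'T' string."""
    runs = []
    current = 0
    for toss in tosses:
        if toss == "H":
            current += 1
        elif current:
            # A tail ends the run in progress.
            runs.append(current)
            current = 0
    if current:
        # The series ended while a run of heads was still open.
        runs.append(current)
    return runs

def run_product(tosses):
    """Product of the runs of heads (assumption: 1 if there are no heads)."""
    return prod(head_runs(tosses))

example = "HTTHHHTHTTHHHHTTT"   # the series from the post, spaces removed
print(head_runs(example))        # [1, 3, 1, 4]
print(run_product(example))      # 12
```

Python integers have arbitrary precision, so products in the region of 10^60 are computed exactly.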

In addition, there is a House jackpot.  Any person whose product exceeds 10^60 wins the House jackpot.

There are 2^500 possible series of coin-tosses.  I’m not sure exactly how many of that vast number would give a product exceeding 10^60, but if some bright mathematician can work it out for me, we can work out whether a series whose product exceeds 10^60 has CSI.  My ballpark estimate says it has.

That means, clearly, that if we randomly generate many series of 500 coin-tosses, it is exceedingly unlikely, in the history of the universe, that we will get a product that exceeds 10^60.

However, starting with a randomly generated population of, say, 100 series, I propose to subject them to random point mutations and natural selection, whereby I will cull the 50 series with the lowest products, and produce “offspring”, with random point mutations, from each of the survivors, and repeat this over many generations.
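A minimal Python sketch of that loop follows. It is not the MatLab script mentioned in the post. The population size, series length and cull-the-bottom-half rule come from the description above; the per-toss mutation rate, the choice of one offspring per survivor, and all function names are assumptions made for illustration.

```python
import random
from math import prod

LENGTH = 500            # tosses per series (from the post)
POP_SIZE = 100          # population size (from the post)
MUTATION_RATE = 0.002   # assumption: per-toss flip probability, not given in the post

def run_product(seq):
    """Product of the lengths of all runs of heads in an 'H'/'T' string."""
    return prod(len(run) for run in seq.split("T") if run)

def mutate(seq, rng):
    """Flip each toss independently with probability MUTATION_RATE."""
    flip = {"H": "T", "T": "H"}
    return "".join(flip[c] if rng.random() < MUTATION_RATE else c for c in seq)

def evolve(generations, seed=0):
    """Run the cull-and-reproduce loop; return the best product per generation."""
    rng = random.Random(seed)
    population = ["".join(rng.choice("HT") for _ in range(LENGTH))
                  for _ in range(POP_SIZE)]
    best_per_generation = []
    for _ in range(generations):
        # Rank by product and cull the 50 lowest.
        population.sort(key=run_product, reverse=True)
        survivors = population[: POP_SIZE // 2]
        best_per_generation.append(run_product(survivors[0]))
        # Assumption: each survivor stays and leaves exactly one mutated offspring.
        offspring = [mutate(parent, rng) for parent in survivors]
        population = survivors + offspring
    return best_per_generation

if __name__ == "__main__":
    history = evolve(200)
    print(f"best product, first generation: {history[0]:.3e}")
    print(f"best product, last generation:  {history[-1]:.3e}")
```

Because the survivors are carried over unchanged, the best product in this sketch can never go down from one generation to the next.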

I’ve already reliably got to products exceeding 10^58, but it’s possible that I may have got stuck in a local maximum.

However, before I go further: would an ID proponent like to tell me whether, if I succeed in hitting the jackpot, I have satisfactorily refuted Dembski’s case? And would a mathematician like to check the jackpot?

I’ve done it in MatLab, and will post the script below.  Sorry I don’t speak anything more geek-friendly than MatLab (well, a little Java, but MatLab is way easier for this).

529 thoughts on “Creating CSI with NS”

  1. Creodont: There’s nothing stopping any of you IDists from doing all the science you want to do. Millions of dollars are spent by IDists on promoting ID and bashing evolution and the people who accept and study it, but no science is done with those millions. IDists have plenty of resources AND you have just as much access to the money that real scientists have access to, but to get any of that money you have to do science, not just gripe about science.

    A lot is contributed to science by self-educated, nature loving people who have little to no college education and have no money handed to them for any of their research and discoveries. Millions of people around the world find, document, and photograph important information about nature every day and share it with professional scientists, and the vast majority of them do it without being paid in any way. Many even pay to accompany and help professional scientists who are doing their work in the wild.

    What have you contributed to science lately, if ever? What novel research have you done lately, if ever? What grants have you applied for lately, if ever? What scientific expeditions have you been on lately, if ever? What volunteer work have you done for science lately, if ever? What papers have you submitted to scientific journals lately, if ever? What novel discoveries have you made lately, if ever? What science are you actually doing? And no, bashing science doesn’t count as doing science.

    Umm IDists are doing science. OTOH your position has nothing to do with science.

    Your position can’t be tested- it makes no testable predictions- it has nothing- and that bothers you.

  2. Elizabeth: I don’t think that was the question, William. I think the question is: do ID proponents propose that the Intelligent Designer responsible for humans designed a universe in which life would evolve spontaneously, or that the ID intervened in his/her created universe at key points (maybe OOL, maybe the ribosome, maybe bacterial flagella) to make sure it occurred at all and/or took its intended form?

    No intervention required if the right program was written. But then the question would be- can such a program be written?

  3. Elizabeth:
    Well, the increments in fitness are becoming rarer and rarer. Up to 3.4286e+59 now.

    Could take another week I think to hit the jackpot!

    You inspired me to dust off my GA Engine and apply it to your problem. In addition to point mutations, I use crossover between winners of tournament selection. My population size is 1000 and my mutation rate is 0.2%.

    I was excited to see the most fit individual hit 10e59 within 1000 generations. Then the slog began. I let it run overnight and with just over 216,000 generations the best fitness is 9.2x10e59. I’ll leave it running for a bit longer before trying different population sizes and mutation rates.

    How many generations have you made it through thus far?
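One generation of a tournament-plus-crossover scheme like the one described in this comment might look as follows in Python. This is not the commenter's GA engine. Only the use of tournament selection, crossover, and a 0.2% mutation rate come from the comment; the tournament size of 3, the single-point crossover, and all names are assumptions.

```python
import random
from math import prod

def run_product(seq):
    """Product of the lengths of all runs of heads in an 'H'/'T' string."""
    return prod(len(run) for run in seq.split("T") if run)

def tournament(population, fitness, k, rng):
    """Draw k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]

def crossover(a, b, rng):
    """Single-point crossover (assumption): head of a joined to tail of b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(seq, rate, rng):
    """Flip each toss independently with probability `rate`."""
    flip = {"H": "T", "T": "H"}
    return "".join(flip[c] if rng.random() < rate else c for c in seq)

def next_generation(population, rng, k=3, rate=0.002):
    """Build a full replacement generation from tournament winners."""
    fitness = [run_product(s) for s in population]
    children = []
    for _ in range(len(population)):
        mother = tournament(population, fitness, k, rng)
        father = tournament(population, fitness, k, rng)
        children.append(mutate(crossover(mother, father, rng), rate, rng))
    return children
```

A run would start from a list of 1000 random 500-character strings and call `next_generation` repeatedly, tracking `max(run_product(s) for s in population)`.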

  4. Patrick:

    I let it run overnight and with just over 216,000 generations the best fitness is 9.2x10e59. I’ll leave it running for a bit longer before trying different population sizes and mutation rates.

    Is the best fitness still in the population, or was that a ‘high water mark’? I’m guessing that the variances in fitness are now so small that they are behaving with effective neutrality (except for the occasional detrimental mutation)? Bigger populations would presumably allow selection to stay in the driving seat a bit longer.

  5. Elizabeth:
    Well, the increments in fitness are becoming rarer and rarer. Up to 3.4286e+59 now.

    Could take another week I think to hit the jackpot!

    I’ve coded a C++ version and hit the jackpot within the hour (10,000 generations).

    I used a population size of 100,000, and a per-nucleotide mutation rate of 0.001.

    I’m using this approach to fitness: for every new offspring created, a random potential parent (PP) is drawn, and with probability prod/maxprod the PP becomes an actual parent. prod is the PP’s product of head runs and maxprod the highest prod in the population.

    If I knew how to paste a graph in the comment box, I’d show an example run.
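The parent-selection rule described in this comment (draw a potential parent at random, accept it with probability prod/maxprod) is a form of rejection sampling. A Python sketch, not the commenter's C++ code; the function names are invented here, and only the acceptance rule and the 0.001 per-position mutation rate are taken from the comment.

```python
import random
from math import prod

def run_product(seq):
    """Product of the lengths of all runs of heads in an 'H'/'T' string."""
    return prod(len(run) for run in seq.split("T") if run)

def pick_parent(population, products, max_prod, rng):
    """Draw potential parents until one is accepted with probability prod/maxprod."""
    while True:
        i = rng.randrange(len(population))
        # The fittest individual has ratio 1.0 and is always accepted,
        # so the loop is guaranteed to terminate.
        if rng.random() < products[i] / max_prod:
            return population[i]

def mutate(seq, rate, rng):
    """Flip each toss independently with probability `rate`."""
    flip = {"H": "T", "T": "H"}
    return "".join(flip[c] if rng.random() < rate else c for c in seq)

def next_generation(population, rng, rate=0.001):
    """Every offspring is a mutated copy of one accepted parent."""
    products = [run_product(s) for s in population]
    max_prod = max(products)
    return [mutate(pick_parent(population, products, max_prod, rng), rate, rng)
            for _ in population]
```

With this rule the chance of being chosen as a parent is proportional to an individual's product. When one individual is far ahead of the rest, most draws are rejected, which is a known cost of the approach.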

  6. Well, because I was storing the lineages, eventually I hit the memory buffers and the whole thing crashed, and I don’t know how many generations ran. So back to square one :(

    I’m running it again, this time without storing the lineages, and I’m running two versions in parallel – one same as before, one with a stochastic selection process – instead of selecting the 50 with the highest product, I select 50 with probability of selection weighted by how high the product is.

    I’m also saving the results every 500 generations in case it crashes again!

    I think you’ll hit the jackpot before I do. What would be really cool would be to get to HHHHT * 100

  7. But that wouldn’t be a question that science could resolve, I don’t think – or do you think it could?

    Ultimately, science doesn’t resolve anything anyway, IMO.

  8. But that wouldn’t be a question that science could resolve, I don’t think – or do you think it could?

    Ultimately, science doesn’t resolve anything anyway, IMO.

    (fixed)

  9. Ido: I’ve coded a C++ version and hit the jackpot within the hour (10,000 generations).

    I used a population size of 100,000, and a per-nucleotide mutation rate of 0.001.

    I’m using this approach to fitness: for every new offspring created, a random potential parent (PP) is drawn, and with probability prod/maxprod the PP becomes an actual parent. prod is the PP’s product of head runs and maxprod the highest prod in the population.

    If I knew how to paste a graph in the comment box, I’d show an example run.

    If you paste in the URL of the image, I’ll put the code in to make it show up. Not sure it works for peons.

  10. William J. Murray: Ultimately, science doesn’t resolve anything anyway, IMO.

    (fixed)

    Well, I disagree, but, more to the point, so do most ID proponents! Would you say that ID science claims are meaningless, then?

  11. Elizabeth: Well, I disagree, but, more to the point, so do most ID proponents! Would you say that ID science claims are meaningless, then?

    IDists agree that our claims could be meaningless. All it would take to demonstrate our claims are meaningless is to A) demonstrate an understanding of our claims and B) refute them, basically by just providing positive evidence for materialism.

    Good luck with that…

  12. I think there are very few ID proponents that would disagree with me on that one. Ultimately, what resolves things – towards any conclusion – is one’s deep worldview, not data.

  13. Here at my Google docs is an example run.

    The individuals with highest fitness have 44 HHHT 59 HHHHT and 5 HHHHHT. That seems to be a local maximum where the population gets stuck.

    The left panel shows the population mean product and maximum product. The right panel the frequencies of nH (n=2..6) over time, plus the frequency of multiple-zero (mz) “spacers” between nH’s, which quickly drops to zero.
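The arithmetic behind the composition quoted in this comment (44 HHHT, 59 HHHHT, 5 HHHHHT) can be checked directly, and compared with the HHHHT * 100 series suggested earlier in the thread. A small Python sketch; the helper names are invented for illustration.

```python
from math import log10

def log10_product(run_counts):
    """log10 of the product, given a map {run length: number of such runs}."""
    return sum(count * log10(length) for length, count in run_counts.items())

def min_tosses(run_counts):
    """Tosses needed: each run plus one separating tail, minus the unneeded final tail."""
    return sum(count * (length + 1) for length, count in run_counts.items()) - 1

stuck = {3: 44, 4: 59, 5: 5}   # composition reported in the comment
all_fours = {4: 100}           # HHHHT repeated 100 times

print(min_tosses(stuck))          # 500, so it exactly fills a 500-toss series
print(log10_product(stuck))       # about 60.01, i.e. just over 10^60
print(log10_product(all_fours))   # about 60.21, i.e. roughly 1.6 x 10^60
```

On this arithmetic the reported composition already clears the 10^60 jackpot, though only narrowly, while the all-fours series sits about 60% higher.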

  14. William J. Murray:
    I think there are very few ID proponents that would disagree with me on that one. Ultimately, what resolves things – towards any conclusion – is one’s deep worldview, not data.

    But surely all those ID proponents lamenting that ID is not taken seriously as science – who claim that the evidence indicates a designer, would disagree with you?

    From the UD FAQ:

    ID Defined

    The theory of intelligent design (ID) holds that certain features of the universe and of living things are best explained by an intelligent cause rather than an undirected process such as natural selection. ID is thus a scientific disagreement with the core claim of evolutionary theory that the apparent design of living systems is an illusion.

    In a broader sense, Intelligent Design is simply the science of design detection — how to recognize patterns arranged by an intelligent cause for a purpose. Design detection is used in a number of scientific fields, including anthropology, forensic sciences that seek to explain the cause of events such as a death or fire, cryptanalysis and the search for extraterrestrial intelligence (SETI). An inference that certain biological information may be the product of an intelligent cause can be tested or evaluated in the same manner as scientists daily test for design in other sciences.

    ID is controversial because of the implications of its evidence, rather than the significant weight of its evidence. ID proponents believe science should be conducted objectively, without regard to the implications of its findings. This is particularly necessary in origins science because of its historical (and thus very subjective) nature, and because it is a science that unavoidably impacts religion.

    Positive evidence of design in living systems consists of the semantic, meaningful or functional nature of biological information, the lack of any known law that can explain the sequence of symbols that carry the “messages,” and statistical and experimental evidence that tends to rule out chance as a plausible explanation. Other evidence challenges the adequacy of natural or material causes to explain both the origin and diversity of life.

  15. Ido:

    The individuals with highest fitness have 44 HHHT 59 HHHHT and 5 HHHHHT. That seems to be a local maximum where the population gets stuck.

    Crying out for inversion, gene conversion, an alignment mutation and/or a transposon!

  16. Elizabeth,

    How about a post about the dangers to education, in the event of the election of ID friendly politicians?

    Mike Elzinga: “And that is the dramatic difference between science and ID/creationist pseudoscience. Science converges and produces definitive results; ID/creationist pseudoscience mud-wrestles, feigns and evades endlessly.”

  17. Allan Miller:
    Ido:

    Crying out for inversion, gene conversion, an alignment mutation and/or a transposon!

    Yes, that would be cool! If I find the time I’ll try and add some of those options to the simulation. Especially the transposon would be seriously cool, although it would slow things down considerably I suspect.

  18. Eric Anderson writes:

    Good luck. Elizabeth is convinced that evolutionary algorithms create something from nothing, the informational equivalent of perpetual motion. Incidentally, I’m not sure why she is still trying to demonstrate this, given the passionate arguments made that evolutionary algorithms such as Avida have already demonstrated this. Or maybe she finally realized they haven’t demonstrated any such thing.

    Well, no, Eric. I’m trying to do two things: first is to take an example that is simple enough that we can readily compute the CSI using Dembski’s formula, and demonstrate that it is increased (i.e. generated by) Natural selection, and that it reaches the threshold at which he claims we must reject “no design”; and second is to make it simple enough that even a non-mathematician can follow it.

    AVIDA has certainly demonstrated the same thing, and it is not the equivalent of “perpetual motion” unless you buy the argument that Information is Conserved, which is precisely the claim at issue!

    Anyway, if you are reading this: would be delighted to see you over here. There is already a thread here in your honour.

  19. Ido: Ido:
    Crying out for inversion, gene conversion, an alignment mutation and/or a transposon!
    Yes, that would be cool! If I find the time I’ll try and add some of those options to the simulation. Especially the transposon would be seriously cool, although it would slow things down considerably I suspect.

    One of the problems with this kind of modeling is that, although there is the analog of a physical quantity that is being maximized, there is not anything that makes use of a gradient in the vicinity of that maximum. That is not a realistic portrayal of how nature works. It is less restrictive.

    There is another issue as well. Although one is selecting the fittest at each step, the fittest go back into the same chaotic environment with only a slight probability of producing offspring from themselves that are more fit.

    The analog to this would be forming molecules in an extremely energetic environment where they get torn apart as fast as they form. In reality, that is not how things happen. Complex molecules that form may indeed get torn apart immediately, but there are also those that get transported out of that energetic environment before they are destroyed.

    They end up in a less energetic environment where they “anneal” to become stable molecules even though they could not have been formed in that less energetic environment.

    The remarkable thing about these little demonstrations is the fact that, despite the harsh environment in which these fittest results are being selected, they still manage to produce something slightly more fit as they approach the peak of fitness.

    So the demonstration, though not being as restrictive as nature would be, is nevertheless producing results contrary to what ID/creationists say about nature.

  20. Allan Miller,

    That one is still in the population. I also thought that perhaps it had taken over, but I’ve seen jumps in fitness after several 10s of thousands of generations, from 7x10e59 to 8x10e59 to the current best.

    I may try a parallel run with a population of 5 or 10 thousand.

  21. William J Murray: “ID was the default perspective of virtually everyone conducting science of any kind prior to about the 1950’s ( a few isolated cases notwithstanding). IDists invented modern science. Non-IDists are the johnny-come-lately’s that have, apparently, entirely forgotten history.”

    You’re confusing the work done, (i.e. science), with the worker, (i.e. scientist).

    Real scientists took off their theistic robes when they put on their lab coats.

    They didn’t allow one world to interfere with another.

    That’s what needs to be done in education.

    There has to be a separation of church and science.

  22. And to pull us back to the OP….

    This little exercise here is a good example of a model being tested. I set out to falsify a very specific hypothesis of Dembski’s – he claims only intelligent agents can generate CSI of chi>1. I am attempting to demonstrate that using only Darwinian processes, we can raise the CSI of randomly generated sequences of 500 coin tosses from what is accessible by chance (because that’s how I generated the starting population!) to what is, according to Dembski, inaccessible. Yet all I am doing is randomly mutating my sequences (each locus in the sequence has an equal chance of being flipped the other way in any given offspring), and letting the sequences with the highest product-of-runs-of-heads have the best chance of reproducing, exactly as Darwin proposed.

  23. Mike Elzinga: One of the problems with this kind of modeling is that, although there is the analog of a physical quantity that is being maximized, there is not anything that makes use of a gradient in the vicinity of that maximum. That is not a realistic portrayal of how nature works. It is less restrictive.

    I’m not sure I understand what you mean, Mike. Fitness landscapes are not necessarily smooth, so I don’t see why a more realistic model should include a gradient. It seems to me that the more unrealistic idealized models make use of gradients, such as the Shashahani gradient in replicator dynamics.

  24. Well, I’m not an American! But if an American would like to write one, cool.

  25. Ido: I’m not sure I understand what you mean, Mike. Fitness landscapes are not necessarily smooth, so I don’t see why a more realistic model should include a gradient.

    It goes back to the fundamentals of physics and chemistry. If it is a physical phenomenon, especially if it is something involving complexity, then it has a distribution. If it has a distribution, it is a phenomenon in the vicinity of a “potential well” that is smooth.

    Square wells are approximations of situations that are locked. You are either in the well or you are not. Frozen crystals (atoms or molecules locked tightly together) could be approximated with such a square well as long as one is not dealing with crystals near their melting temperatures.

    The soft matter chemistry of living organisms is concerned with compounds near their melting temperatures. On the other hand, relative to this, the genes and other molecules associated with DNA are chemical bonds on the order of ten times greater.

    This may be what you are thinking about; those changes are more like square well changes relative to other changes that could take place in an organism that respond to selection. (Fitness landscapes are the negative of a landscape of wells).

    I realize that biologists don’t normally think in terms of potential wells because fitness is something that makes more sense if it increases. But ultimately it comes down to systems finding stability in a local minimum; and in reality, especially for complex systems, the wells are smooth.

    It is remarkable, however, that selection works as well as it does in these little programs in which the current fittest takes its chances in the same environment. What the programs are demonstrating is more the improbability of changing everything that is already set before something better is produced. That is a property of combinatorics without the effects of physics and chemistry. Nature is generally a bit more restrictive.

  26. Good point. But the great thing about Bayesian analysis is that you have to make your priors explicit. So your conclusions come with your priors tagged to them.

    Or should.

  27. I realize that biologists don’t normally think in terms of potential wells because fitness is something that makes more sense if it increases.

    .

    I prefer an inverted metaphor myself – although ‘hill-climbing’ is so embedded in the biological and GA worlds generally. But (bearing in mind that either way up it is only a metaphor) the idea of populations wandering about some plane***, avoiding walls that block paths, rolled off hills of detriment or ‘caught’ by the pull of a nearby well (with gravitation taking the role of differential offspring gradients) seems more intuitive.

    *** and, indeed, the surface moving about under the population.

  28. Mike Elzinga: It goes back to the fundamentals of physics and chemistry. If it is a physical phenomenon, especially if it is something involving complexity, then it has a distribution. If it has a distribution, it is a phenomenon in the vicinity of a “potential well” that is smooth.

    I think I see what you mean now. You’re thinking in terms of “ensembles”, where averaged over many “replicates” of the same stochastic process (such as our simple simulations), the distribution becomes smooth.

  29. Allan Miller: .

    I prefer an inverted metaphor myself – although ‘hill-climbing’ is so embedded in the biological and GA worlds generally. But (bearing in mind that either way up it is only a metaphor) the idea of populations wandering about some plane***, avoiding walls that block paths, rolled off hills of detriment or ‘caught’ by the pull of a nearby well (with gravitation taking the role of differential offspring gradients) seems more intuitive.

    *** and, indeed, the surface moving about under the population.

    Oh, me too! It’s so much easier to imagine things rolling downhill, getting stuck in a shallow basin, occasionally falling off a cliff, having to negotiate an obstructive ridge….

  30. Elizabeth:
    And to pull us back to the OP….

    This little exercise here is a good example of a model being tested. I set out to falsify a very specific hypothesis of Dembski’s – he claims only intelligent agents can generate CSI of chi>1. I am attempting to demonstrate that using only Darwinian processes, we can raise the CSI of randomly generated sequences of 500 coin tosses from what is accessible by chance (because that’s how I generated the starting population!) to what is, according to Dembski, inaccessible. Yet all I am doing is randomly mutating my sequences (each locus in the sequence has an equal chance of being flipped the other way in any given offspring), and letting the sequences with the highest product-of-runs-of-heads have the best chance of reproducing, exactly as Darwin proposed.

    As I said I await your paper in a peer-reviewed journal.

    Or you could just send it to Dembski, I believe his email is on his website (not UD)

    As I told you if you are using replicating entities then you are starting with the specified complexity that needs to be explained- just read the quote Dembski provided of what Dawkins said:

    Thus, in The Blind Watchmaker Dawkins writes, “Complicated things have some quality, specifiable in advance, that is highly unlikely to have been acquired by random chance alone. In the case of living things, the quality that is specified in advance is…the ability to propagate genes in reproduction.”

    And if reproduction didn’t arise in some materialistic fashion then darwin doesn’t apply.

  31. Flint: By Jove, I think you’ve put your finger on the heart of the issue here. Would we rather be happily deluded, or would we rather be more nearly correct even when the truth hurts? I think most people opt for the former, whether they articulate it or simply “feel” that way. One need only intone “Am Not! I’m being scientific!” and POOF, the problems vanish. Poof theory wins again!

    :-)

    Hence the development of the bumptious creationist pseudoscience that then put on the cheap tuxedo and became ID.

  32. Ido: I think I see what you mean now. You’re thinking in terms of “ensembles”, where averaged over many “replicates” of the same stochastic process (such as our simple simulations), the distribution becomes smooth.

    I wasn’t paying attention and missed this.

    Yes, this gets at the general idea. But I should also point out that the distributions apply to features of the phenotype undergoing selection as well as to the “forces” in the selection processes themselves. We don’t normally model the latter; but we could.

    So these are overlapping “fuzzy” distributions; and that contributes to the smoothing as well. By not modeling the latter, we are essentially folding all the distribution into the phenotype.

    It doesn’t change the result very much.

  33. And by the way, isn’t this off the topic of this thread? Were you attempting to segue the thread onto your favorite topic?

    I don’t believe we’ve even broached the subject of my favorite topic in these forums, but if you can direct me to the Guild Wars 2 thread, I’ll be in there like a grizzly bear!!!!

  34. Mike E,

    I love science. And logic. And my world view doesn’t require any justification, because I believe whatever I wish. So big ol’ teddy bear science and magic get along just fine in my world :)

    You know what they say … any display of sufficiently advanced science is indistinguishable from magic anyway.

    And, ironically enough, justified is my favorite show! Go BOYD CROWDER!! Gotta love ‘im

  35. William J. Murray: And, ironically enough, justified is my favorite show! Go BOYD CROWDER!! Gotta love ‘im

    Well, he isn’t posting comments here. And you can type “Guild Wars 2” into a Google search and find some gamers to play with.

    I don’t happen to watch a lot of TV or get involved in video games. Life is plenty rich and full already.

  36. Anyone mind if I move the above into William’s thread? It seems more to do with that than with this OP!

  37. Hmm, seems like a comment from WJM just vanished; he asked whether Elizabeth had produced any CSI yet; Well, here is the answer anyway:

    Yes. See this comment, and many following updates on the product increasing further:

    Elizabeth:
    Well, my best product has now reached 8.0828e+58.

    But it clearly works. That number is almost certainly high enough for the proportion of sequences that high or higher to be small enough to give a chi >1

    So Dembski is falsified, unless someone can tell me how my calculation of chi is wrong. I calculated it as -log2[10^160*N/2^500] where N was my guesstimate of the number of sequences with products higher than my threshold. Actually N can be far bigger than my estimate, and it still makes it by miles.

  38. But I mistyped – it should have been 10^120, not 10^160.

    So, William, what do you think?
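The chi calculation quoted above, with the corrected 10^120 factor, can be worked through in logarithms so that nothing overflows. A Python sketch under stated assumptions: it implements the expression exactly as written in the comment, -log2[10^120*N/2^500]; the function names are invented; and the values of N tried below are purely hypothetical, since the thread gives no figure for N.

```python
from math import log2

def chi(n_sequences, n_tosses=500, log10_resources=120):
    """chi = -log2(10^log10_resources * N / 2^n_tosses), computed in log space."""
    return n_tosses - log10_resources * log2(10) - log2(n_sequences)

def largest_n_with_chi_above(threshold, n_tosses=500, log10_resources=120):
    """Largest N (as a float) for which chi still exceeds `threshold`."""
    return 2.0 ** (n_tosses - log10_resources * log2(10) - threshold)

print(chi(1))                          # about 101.4: the ceiling when only one sequence qualifies
print(largest_n_with_chi_above(1))     # about 1.6e30
print(chi(10**30) > 1, chi(10**31) > 1)  # True False (hypothetical N values)
```

Read this way, the claim that chi exceeds 1 amounts to saying that fewer than roughly 1.6 x 10^30 of the 2^500 possible series have a product above the chosen threshold.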

  39. I think I’ve already told you what I think about your efforts as per this thread, but that shouldn’t stop you from celebrating :) What’s important is that you believe you’ve proven Dembski wrong.

  40. William J. Murray:
    I think I’ve already told you what I think about your efforts as per this thread, but that shouldn’t stop you from celebrating :) What’s important is that you believe you’ve proven Dembski wrong.

    Yes, because for WJM, math and science are just belief-systems!

  41. …. but you could have saved yourself so much time and effort simply by adopting the worldview that you had proven DrDrD wrong!

    The funny thing is, that’s pretty much what happened.

  42. Actually data can often give you ideas about how to generate a hypothesis, and, indeed, how to collect it.
    -Elizabeth

    About thirty years ago there was much talk that geologists ought only to observe and not theorize; and I well remember someone saying that at this rate a man might as well go into a gravel-pit and count the pebbles and describe the colours. How odd it is that anyone should not see that all observation must be for or against some view if it is to be of any service!
    – guess who
