Evolution trounces Intelligent Design with real data.

Posted on September 4, 2013 by Richardthughes

Man has created many sophisticated modelling tools, each with different strengths and weaknesses. A good ‘solution’ or ‘description’ of a problem should not be overly complex (parsimony) and should also have high descriptive power. Let’s look at how some of these cutting-edge tools compare to Nutonian’s symbolic regression / evolutionary computation product Eureqa:

from: http://blog.nutonian.com/bid/330675/How-does-Eureqa-Compare-to-Other-Machine-Learning-Methods#!

Looks like we get highly efficient yet incredibly accurate models. Huzzah! Don’t believe me? Try it yourself. They “ran Eureqa on seven test-cases for which data is publically available, and compared performance to four standard machine learning methods. The implementations used were the WEKA codes, with settings optimized for best performance”. Weka is free, and so is the Eureqa trial version.

Enjoy the full post here

Unfortunately I think Hod’s interpretation of NFL isn’t quite there, though. 😉

40 thoughts on “Evolution trounces Intelligent Design with real data.”

Lizzie on September 4, 2013 at 10:17 am said:

Very interesting. Thanks. I need to check that out.
Alan Fox on September 4, 2013 at 10:29 am said:

I was waiting for someone else to comment to see if they spotted the spoof. I can’t tell from Lizzie’s comment whether that is subtle irony or…
Lizzie on September 4, 2013 at 3:23 pm said:

This is pretty awesome.

Using it on some data now.

Very cool.
Lizzie on September 4, 2013 at 3:24 pm said:

Alan Fox:
I was waiting for someone else to comment to see if they spotted the spoof. I can’t tell from Lizzie’s comment whether that is subtle irony or…

oh. What spoof?
petrushka on September 4, 2013 at 4:26 pm said:

I suppose it isn’t cool simply to say I don’t have a clue.
Richardthughes on September 4, 2013 at 4:51 pm said:

Okay – some background.

Imagine a complicated equation that describes something. It has a structure whereby the different parts interact in different ways to do different things. It can be long or short. Parts of it may add high or low value. This is very similar to genetics, no?

Eureqa evolves and breeds solutions to problems using genetic structures for equations and culls bad ones based on fitness – how well the equation explains known data.
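
For anyone who wants to see the breed-and-cull idea in miniature, here is a rough Python sketch. It is not Eureqa's actual algorithm, which is not described in this thread. The tree encoding, the function names, the parsimony weight and the population settings are all assumptions made up for illustration.

```python
import random

# Hypothetical building blocks available to the "genome".
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}

def random_expr(rng, depth=3):
    """Grow a random tree: 'x', a float constant, or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2, 2), 2)
    return (rng.choice(list(OPS)), random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr  # variable or numeric constant

def size(expr):
    """Node count, used here as a crude stand-in for complexity."""
    return 1 + size(expr[1]) + size(expr[2]) if isinstance(expr, tuple) else 1

def error(expr, data):
    """Mean squared error against the known (x, y) data."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)

def subtrees(expr, path=()):
    """Yield (path, subtree) for every node in the tree."""
    yield path, expr
    if isinstance(expr, tuple):
        yield from subtrees(expr[1], path + (1,))
        yield from subtrees(expr[2], path + (2,))

def replace(expr, path, new):
    """Return a copy of expr with the node at path swapped for new."""
    if not path:
        return new
    parts = list(expr)
    parts[path[0]] = replace(expr[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """'Breed': graft a random subtree of b onto a random point of a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(rng, expr):
    """Replace a random subtree with a freshly grown one."""
    path, _ = rng.choice(list(subtrees(expr)))
    return replace(expr, path, random_expr(rng, depth=2))

def evolve(data, generations=60, pop_size=200, seed=0, max_size=25, parsimony=0.01):
    rng = random.Random(seed)
    pop = [random_expr(rng) for _ in range(pop_size)]

    def fitness(e):  # lower is better: misfit plus a small penalty per node
        return error(e, data) + parsimony * size(e)

    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 4]  # cull the worst three quarters
        children = []
        while len(survivors) + len(children) < pop_size:
            child = crossover(rng, rng.choice(survivors), rng.choice(survivors))
            if rng.random() < 0.3:
                child = mutate(rng, child)
            if size(child) <= max_size:  # discard bloated offspring
                children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)

if __name__ == "__main__":
    data = [(x, x * x + x) for x in range(-3, 4)]  # toy target: y = x^2 + x
    best = evolve(data)
    print(best, error(best, data), size(best))
```

Because the survivors are carried over unchanged each generation, the best score can never get worse. Whether it finds the exact target depends on the seed and settings.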
Richardthughes on September 4, 2013 at 4:52 pm said:

Ping me offline if you want some help or ideas!
Patrick on September 4, 2013 at 5:43 pm said:

Is this similar to Koza’s GP approach?
petrushka on September 4, 2013 at 6:21 pm said:

I’ve heard of Eureqa, but haven’t the background to fully understand or appreciate it.

Finding equations to fit datasets sounds pretty cool. It’s a bit like an automatic scientific law discoverer.

It would underscore two facets of science that are seldom discussed. One is that science is imaginative and inventive — not simply mapping — and the second is that “laws” are limited in scope and provisional.
Richardthughes on September 4, 2013 at 6:24 pm said:

Patrick:
Is this similar to Koza’s GP approach?

Yessir: http://en.wikipedia.org/wiki/Genetic_programming
Richardthughes on September 4, 2013 at 6:29 pm said:

Bonus content: Enjoy the video!

http://phys.org/news179394947.html
Lizzie on September 4, 2013 at 6:37 pm said:

Richardthughes:
Ping me offline if you want some help or ideas!

Thanks, I will. We have been using SVMs but I’ve been pushing for this in the lab.
Alan Fox on September 4, 2013 at 7:19 pm said:

I’m deeply ashamed to admit I followed Rich’s link and, as all the jargon completely mesmerised me, I thought it was a Sokal-style spoof.

I shall now follow in the footsteps of Abu Hasan.
RBH on September 4, 2013 at 11:43 pm said:

Hmmmm. I think we’ll take a look at this as it compares (competes?) with our prop time series modeling technology (GA-evolvable + perceptron-ish memory + secret sauce).
Lizzie on September 4, 2013 at 11:48 pm said:

I was wondering if you’d come across this, RBH 🙂
Mung on September 5, 2013 at 1:24 am said:

Intelligently Designed computer programs. HUZZAH!

For those who haven’t, in their faux skepticism, actually bothered to read the post you linked to in your OP, there is no mention of evolution. Or Intelligent Design.

Nor is there any pretense to having solved the no free lunch problem. Here’s a direct quote: “There is no free lunch. ”

Nor are the data sets used biological (feel free to correct me).

But the method, now that’s interesting: Symbolic regression

What’s your point, really?
Mung on September 5, 2013 at 1:26 am said:

Oh, I get it, all those other machine learning algorithms were not using “evolution!”

Color me dense.

But then, they weren’t using “intelligent design,” either. What’s your point?
Richardthughes on September 5, 2013 at 1:43 am said:

So SVMs are naturally occurring?

The joys of watching a driveby in a cul-de-sac.
Richardthughes on September 5, 2013 at 2:16 am said:

This might help you out a bit, Mung: http://www.mafy.lut.fi/EcmiNL/older/ecmi35/node70.html
Mung on September 5, 2013 at 2:35 am said:

Explain again how symbolic regression excludes the alternatives?

The method of symbolic regression described here are of three kind. The first one, genetic programming, is the oldiest one and can be used only by genetic algorithms and Lisp programme language. The second one, grammar evolution is “unfolding” of a genetic programming so that instead of Lisp there can be used different computer programme languages. Hovewer, grammar evolution still use binary representation of individuals and crossover operations like genetic algorithms use. The third method, called analytic programming, is independent on programme language and can be used by any evolutionary algorithm, does not matter how new offspring are calculated.
It can be stated that all three algorithms can be used for symbolic regression tasks

Thanks for the link, but it hardly clarifies whatever point it is you’re attempting to make.

Symbolic regression is in fact based on existence of so called evolutionary algorithms.

A better EA than EA’s?

Term “symbolic regression” (SR) represents process during which are measured data fitted by suitable mathematical formula … This process is amongst mathematician quite well known and used when some data of unknown process are obtained.

Data of unknown process. So your point here is that SR can distinguish design from evolution?

Color me stupid.
Mung on September 5, 2013 at 4:33 am said:

Richardthughes: So SVMs are naturally occurring?

You’re pretending to respond to an actual statement made by someone in this thread?
Richardthughes on September 5, 2013 at 8:10 am said:

To help Mung and Chubs who is having trouble over on his own blog:

Man is quite a clever problem solver. We use pattern recognition, intelligence and exogenous information to design these classes of models. They do okay, but apparently (empirically) not as well as GAs, which without our intelligence (or hubris or preconceptions) find novel ways to build explanatory frameworks. Consider this: https://www.youtube.com/watch?v=MSo6eeDsFlE

Would you have got there through experimental data? (yes Joe, we know you would have, using choo-choo math).

Oh, and Joe, pick a point of ignorance and stick to it. If you think evolution is okay but was initially configured, don’t be upset with Lenski or Tiktaalik’s location. If you despise all things evolution (and you do, poster child for ID is simply upset with evolution) then also be upset with Eureqa’s outperformance.

Thanks guys.
Mung on September 5, 2013 at 8:16 am said:

Richardthughes:

Man is quite a clever problem solver. We use pattern recognition, intelligence and exogenous information to design these classes of models.

so?
Richardthughes on September 5, 2013 at 8:19 am said:

Mung,

So these solutions are designed by an intelligence.
Lizzie on September 5, 2013 at 9:10 am said:

Well, Mung, of the methods mentioned in the OP, the only one I haven’t come across is “regression trees”. Of the remainder, Eureqa is the most similar to evolution, although they all have some similarities. Even linear regression, if you use a Maximum Likelihood Estimation for finding the best fit, rather than Ordinary Least Squares, involves an iterative optimisation method that includes randomly chosen values. The others all involve iterating generations of solutions, with random variation, until the best fit is found.

So I think the point of the OP is that the more similar a fitting algorithm is to real-life evolution, the better it is at finding solutions that are both simple and accurate, the downside being that it takes far more time (because many many generations are needed).

I’ve just been running Eureqa on some brain imaging data, to find an equation that distinguishes between two groups of people, using the time courses of their brain oscillations. We’ve been using SVMs for this in the past, but this looks potentially much more promising, as it allows for non-linear equations, and if…then statements.

So whatever the implications for the smartness of evolution over intelligent design, it looks like a pretty neat tool.

The reason it more closely resembles real-life evolution than the others is that it is far higher-dimensioned – there are far more ways in which it can vary. It also uses sexual reproduction, which makes it more efficient (as it does in real life, allowing even slowly-reproducing organisms to evolve efficiently).

Of course the algorithm itself is “intelligently designed” – computer programs don’t write themselves. But that is a quite different question, and raises the question of whether an evolutionary system could come about naturally, as opposed to the question of whether evolutionary systems can find solutions to problems that an intelligent designer can’t (without using an evolutionary system).

Which itself raises another question: human designers need to use evolutionary algorithms to solve very complex problems. If they were to attempt to design a living thing, I suggest they would use an evolutionary algorithm. So any argument for ID that is based on analogy with human intelligence (and most are) needs to address the question: would the putative Designer have to use an evolutionary algorithm to do it? Because we have no evidence that human designers can tackle such a task without using evolutionary algorithms.

Iterative fitting, with random variation along a large number of dimensions, keeping the best and discarding the worst, is an extraordinarily creative process. Watching Eureqa find equations for my data is mesmerising.
JonF on September 5, 2013 at 12:33 pm said:

Color me stupid.

OK.
hotshoe on September 5, 2013 at 2:34 pm said:

Now I understand – I did not get the point of Eureqa from the OP.

Thanks!
Lizzie on September 5, 2013 at 5:52 pm said:

What is also fascinating is seeing punk eek at work. The algorithm toddles along in a nice little niche, not changing much over many thousands of generations, because most variants are worse than the best of the current variants, then suddenly – dramatic change, and a quite different critter starts to dominate the population.

And what is also neat is that often that new critter is quite clumsy, with a lot of parameters, but as parameters are penalised, it very rapidly slims down. So I might get a new successful equation with, say, 20 parameters. I look again in half an hour, and I have a similarly fit equation with only 14, or even twelve.
Richardthughes on September 5, 2013 at 8:14 pm said:

If you have a big problem, their cloud solution scales really well (basically a re-sell of the amazon cloud). It’s just a function of cores and time. Also consider limiting the functions that are ‘genomically available’.
Lizzie on September 5, 2013 at 9:07 pm said:

I’m not sure that I understand how it deals with time series (which are what I’ve got). I’m just playing right now, but I’d like it to be able to fit functions to time-series for a set of independent series.

I’ll ping you 🙂
Mung on September 9, 2013 at 1:06 am said:

Elizabeth:

So I think the point of the OP is that the more similar a fitting algorithm is to real-life evolution, the better it is at finding solutions that are both simple and accurate, the downside being that it takes far more time (because many many generations are needed).

How do we determine how similar a “fitting algorithm” is to “real-life evolution”?

…the better it is at finding solutions that are both simple and accurate…

You’re implying that evolution finds “solutions” that are both simple and accurate. How do you know this?

… the downside being that it takes far more time (because many many generations are needed).

And evolution doesn’t?
Richardthughes on September 9, 2013 at 1:44 am said:

Eureqa returns a Pareto frontier that trades off complexity for explanatory power. Complexity is a bit like genomic length (although some expressions count for more) and explanatory power is any of the standard accuracy measures (AIC does well).
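
To make "Pareto frontier" concrete, here is a small Python sketch. The candidate models and their numbers are invented for illustration, and this is not a claim about how Eureqa computes its frontier. A model stays on the frontier only if no other model is at least as simple and at least as accurate.

```python
def pareto_front(models):
    """Keep the models that are not dominated on (complexity, error).

    Each model is a (name, complexity, error) tuple; lower is better for both.
    """
    front = []
    # Walk from simplest to most complex; a model earns its place only if it
    # beats the error of every simpler model kept so far.
    for name, complexity, err in sorted(models, key=lambda m: (m[1], m[2])):
        if not front or err < front[-1][2]:
            front.append((name, complexity, err))
    return front

# Hypothetical candidates: (label, complexity, error).
candidates = [
    ("a", 1, 0.90),
    ("b", 3, 0.40),
    ("c", 3, 0.55),  # same complexity as b but worse fit, so dominated
    ("d", 5, 0.45),  # more complex than b and worse fit, so dominated
    ("e", 7, 0.10),
]

print(pareto_front(candidates))
```

Here only a, b and e survive. Picking one of them is the user's call on how much extra complexity a better fit is worth.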
Mung on September 9, 2013 at 3:12 am said:

Richardthughes:
Eureqa returns a Pareto frontier that trades off complexity for explanatory power.

GREAT!

How does it know what “complexity” is, and what “explanatory power” is and what algorithm does it use to calculate the “tradeoff”?

Keep in mind, of course, that this is a better ID than ID.
Richardthughes on September 9, 2013 at 6:34 am said:

Mung,

seriously?

“Complexity is a bit like genomic length (although some expressions count for more) and explanatory power is any of the standard accuracy measures (AIC does well)”

complexity = length (as far as you’re concerned) and accuracy = how well it fits existing data.
Richardthughes on September 9, 2013 at 7:38 am said:

More to bring Mung up to speed:

http://en.wikipedia.org/wiki/Akaike_information_criterion
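
In short, AIC is 2k minus twice the maximised log-likelihood, where k is the number of fitted parameters. For a least-squares fit with Gaussian errors that works out, up to an additive constant, to n ln(RSS/n) + 2k. A minimal Python sketch of that special case follows. The function name is made up, and nothing here says this is the exact form Eureqa uses.

```python
import math

def aic_least_squares(residuals, k):
    """AIC (up to an additive constant) for a least-squares fit.

    residuals: observed minus predicted values
    k: number of fitted parameters
    Assumes Gaussian errors and a non-zero residual sum of squares.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * k

# Same misfit, one extra parameter: the more complex model scores worse (higher).
residuals = [1.0, -1.0, 1.0, -1.0]
print(aic_least_squares(residuals, k=2), aic_least_squares(residuals, k=3))
```

Lower is better, so an extra parameter has to buy a real improvement in fit to pay for itself.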
OMagain on September 9, 2013 at 8:53 am said:

Mung: How does it know what “complexity” is, and what “explanatory power”

It looks at the output of ID and says “it’s not that”.
Lizzie on September 9, 2013 at 9:52 am said:

Mung:
Elizabeth:

How do we determine how similar a “fitting algorithm” is to “real-life evolution”?

Remember, Mung, that we are comparing a model of biological optimisation (namely the evolutionary hypothesis) with that model applied to an actual optimisation problem. We know our algorithm resembles the biological model because we can look and see that they are the same model.

The point is that the algorithm does optimise very efficiently. Therefore the charge that it couldn’t, in a biological context, doesn’t fly.

You’re implying that evolution finds “solutions” that are both simple and accurate. How do you know this?

I’m “implying” – actually reporting – that the evolutionary algorithm finds solutions that are both simple and accurate, because I can look at the solutions, and find that they are both simple and accurate – simple, in that they use a very small number of data and parameters, and accurate, because they result in very good predictions, as measured by any number of goodness-of-fit measures, e.g. an R^2.

And evolution doesn’t?

Not sure what you mean. Evolutionary algorithms require a large number of generations to optimise.

I think at least one of us is confused here.
Mung on September 16, 2013 at 12:59 am said:

Remember Elizabeth, we are working with data and models. I don’t remember seeing any biological data in the dataset. Maybe I missed it.
Richardthughes on September 23, 2013 at 6:46 pm said:

No-one claimed we were using “biological data”, we make claims about the efficacy of evolutionary methods.
petrushka on September 23, 2013 at 6:54 pm said:

More specifically we are commenting on Dembski’s mathematical model.

Remember those claims about “information”? About semiosis and such?

When you claim that the behavior of chemistry can be abstracted, you open the door to abstract models. Those models can be tested mathematically.

They do not prove that chemistry works that way, but they prove that to the extent evolution can be abstracted as information, it works.

The end of all ID claims involving probability.