Math Genome Fun

Posted on January 6, 2016 by Richardthughes

You have thirteen characters: the numbers 1-9 and the operators for Plus, Minus, Divide and Multiply. Arrange them in a string so they have the highest possible value, or write a program to do this for you. If a string cannot execute as a mathematical function, it scores zero.

Full list [1,2,3,4,5,6,7,8,9,+,-,*,/]

There are 13! possibilities. How many score more than 0 (are mathematically viable)? How many steps did your program take / how many candidates did it evaluate? How did you know when to stop?
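To make the scoring rule concrete, here is a minimal Python sketch. It assumes a strict reading of the rules in which + and - are binary operators only (the post does not say either way), and that the digits never include 0. The names `score` and `best_strings` are hypothetical. To stay fast, it is meant to be tried on a toy alphabet rather than the full thirteen characters.

```python
import re
from itertools import permutations

# Assumption: strict grammar, i.e. number (operator number)*, no unary signs.
STRICT = re.compile(r"\d+(?:[+\-*/]\d+)*")

def score(expr):
    """Return the value of expr, or 0 if it is not a viable expression."""
    if not STRICT.fullmatch(expr):
        return 0
    # Safe to eval: the regex admits only digits and the four operators.
    return eval(expr)

def best_strings(chars):
    """Brute force: try every arrangement of chars, keep the top scorers."""
    best, winners, viable = 0, [], 0
    for perm in permutations(chars):
        expr = "".join(perm)
        if not STRICT.fullmatch(expr):
            continue  # not viable, scores zero
        viable += 1
        value = score(expr)
        if value > best:
            best, winners = value, [expr]
        elif value == best:
            winners.append(expr)
    return best, winners, viable
```

For example, `best_strings("123+*")` checks all 120 arrangements of five characters, finds 12 viable ones, and reports a best value of 7 (from strings such as 3*2+1). The same function works on the full list in principle, but 13! arrangements take hours rather than seconds.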

207 thoughts on “Math Genome Fun”

keiths on January 7, 2016 at 7:30 pm said:

Pure transposition.

Yes.
dazz on January 7, 2016 at 8:53 pm said:

How frustrating. I have failed miserably at this threaded version of the brute force algo. I’m relatively new to Python and had no idea coding concurrent stuff was so tricky.

Oh well, anyway, I left the single-thread script running, so at least I can compare the results with yours.
dazz on January 7, 2016 at 10:08 pm said:

I’m not logging all valid sequences, just the ones that evaluate higher than the last best, but it certainly looks like there are islands of fitness, because it can take a while in between hits, and when it hits a new maximum, it’s a bunch of them in a row, or very close together. Makes sense.
DiEb on January 7, 2016 at 10:18 pm said:

keiths,

I tried the simplest algorithm EA(1+1,1): one parent, one mutated child, better one is kept:

1) using just a single transposition doesn’t work, as you get easily stuck with 9*87654 as the product part

2) allowing for up to four elements to permute worked surprisingly well – I reached the maximum in 10% of the simulations in fewer than 1,000 generations…
Richardthughes on January 7, 2016 at 10:34 pm said:

DiEb,

How big a population?
DiEb on January 7, 2016 at 10:36 pm said:

Richardthughes,

One parent, one child…
Richardthughes on January 7, 2016 at 10:37 pm said:

DiEb,

Fuuuuuuu……….
DiEb on January 7, 2016 at 11:06 pm said:

1) I stopped the evaluation after 1,000 unchanged generations
2) I performed the GA (written in R) 1,000 times
3) it took 2′ 37” – much faster than brute force
4) on average, the algorithm halted after 2478 generations – i.e., on average, the local maximum was reached within 1478 generations
5) in 169 out of 1,000 evaluations, the true maximum was found
6) in these 169 evaluations, the maximum of 843502.5 was reached after 526 generations on average (i.e., that algorithm halted after 1526 generations on average)
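The procedure described in this thread (one parent, one mutated child, keep the better one, stop after 1,000 unchanged generations) can be sketched roughly as follows. This is not DiEb's R code, just a Python illustration under stated assumptions: a strict grammar without unary signs, a mutation that reshuffles two to four randomly chosen positions, and a child that must be viable and strictly better to replace the parent. All function names are hypothetical.

```python
import random
import re

STRICT = re.compile(r"\d+(?:[+\-*/]\d+)*")  # assumption: no unary signs
CHARS = "123456789+-*/"

def score(expr):
    """Value of a viable expression, else 0 (the puzzle's scoring rule)."""
    return eval(expr) if STRICT.fullmatch(expr) else 0

def random_viable(rng):
    """Shuffle until the string parses; roughly 1 in 10 shuffles do."""
    chars = list(CHARS)
    while True:
        rng.shuffle(chars)
        if STRICT.fullmatch("".join(chars)):
            return "".join(chars)

def mutate(parent, rng, max_moved=4):
    """Randomly permute the characters at 2..max_moved chosen positions."""
    child = list(parent)
    positions = rng.sample(range(len(child)), rng.randint(2, max_moved))
    picked = [child[p] for p in positions]
    rng.shuffle(picked)
    for p, c in zip(positions, picked):
        child[p] = c
    return "".join(child)

def one_plus_one(rng, patience=1000):
    """One parent, one child per generation; halt after `patience` stale ones."""
    parent = random_viable(rng)
    parent_score = score(parent)
    generations = stale = 0
    while stale < patience:
        generations += 1
        child = mutate(parent, rng)
        # Only a viable, strictly better child replaces the parent.
        if STRICT.fullmatch(child) and score(child) > parent_score:
            parent, parent_score, stale = child, score(child), 0
        else:
            stale += 1
    return parent, parent_score, generations
```

Calling `one_plus_one(random.Random())` repeatedly and counting how often the returned score equals 843502.5 gives a rough analogue of the success rate reported above, though the exact figures will depend on the mutation details assumed here.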
Mung on January 7, 2016 at 11:31 pm said:

This makes it easy for brute force.

permutation method

I validated that it produced 6,227,020,800 results.

Running to find the “best” sequences now.
DiEb on January 7, 2016 at 11:36 pm said:

For the hexadecimal case it takes 3.268 mins to get to FCA864*EDB975+3-1/2 (again, one child, one parent, 1,000 evaluations…)
Richardthughes on January 7, 2016 at 11:44 pm said:

DiEb: 3) it took 2′ 37” – much faster than brute force

Each run, or total.
steveh on January 7, 2016 at 11:55 pm said:

My cheaty solution is to use reverse polish notation (RPN) for the expression
4 (4)
1 2 + ( 4 3)
3 ( 4 3 3)
– (4 0)
/ (infinity)
98765 * -> (lots)

My designy solution is to steal the + 3 – 1/2 part from everyone else and then to realise that we only need to consider 32 possible combinations to get the maximum product of the two numbers A and B formed from the remaining digits 9,8,7,6,5,4. We only have to consider 32 cases because each of the two numbers must have its digits sorted largest first, e.g. 952 but not 925.
So if you start with A=987654, one of the candidates would be
A=986 and the only possible max value for B would be 754.

To iterate through the possible pairs of values I generated a number, i, from 1 to 31 and wrote it in binary
00001,
00010,
00011,
00100, etc.

then, in each case starting with A=987654, move all digits from A to B where there is a 1 in the corresponding binary representation of i.

e.g., for iteration i=6 (binary 000110):
Start with ‘987654’. Keep 987__4 in A and move ___65_ to B to get 9874 x 65 = 641810

If you further reason that A must start with 9, and B must start with 8 there are only 16 possible candidates.

Edit:
Having read Mung’s post below, changed A to ‘9876543’
and values of i from 0 to 63, I got A=964 B=8753 AB=8437892
in 0.1 seconds in python.
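The bitmask enumeration described here can be sketched in Python along these lines. This is an illustration rather than steveh's actual script: the function name `best_split` is hypothetical, and for simplicity it tries every mask that leaves both numbers non-empty instead of first fixing 9 in A and 8 in B.

```python
def best_split(digits):
    """Try every way to split `digits` into two numbers A and B.

    Each bit of the mask, read left to right, says whether the matching
    digit stays in A (0) or moves to B (1). Since the digits are visited
    in the order given, a descending input keeps A and B sorted
    largest-first.
    """
    n = len(digits)
    best = (0, 0, 0)  # (product, A, B)
    for mask in range(1, 2 ** n - 1):  # skip masks that leave A or B empty
        bits = format(mask, f"0{n}b")
        a = int("".join(d for d, bit in zip(digits, bits) if bit == "0"))
        b = int("".join(d for d, bit in zip(digits, bits) if bit == "1"))
        best = max(best, (a * b, a, b))
    return best
```

With `"987654"` the best product is 964 x 875 = 843500, and with `"9876543"` it is 964 x 8753 = 8437892, matching the figures in the thread. Because the product is symmetric, the function may report the two factors in either order.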
keiths on January 8, 2016 at 12:01 am said:

DiEb:

For the hexadecimal case it takes 3.268 mins to get to FCA864*EDB975+3-1/2 (again, one child, one parent, 1,000 evaluations…)

You obviously smuggled that solution in via your fitness function. Obviously.

Right, IDers?
Mung on January 8, 2016 at 12:14 am said:

This one has been leading for some time now:

1/-2+8753*964
Richardthughes on January 8, 2016 at 1:02 am said:

-2 is an illegal operator (for our purposes).

how long did the exhaustive run take everyone?
keiths on January 8, 2016 at 1:20 am said:

Rich:

how long did the exhaustive run take everyone?

My run took 12 1/2 hours.
Mung on January 8, 2016 at 1:25 am said:

Richardthughes: -2 is an illegal operator (for our purposes).

Silly me. I was taking the OP too literally. 🙂

But in that case there are not in fact 13! possibilities. Agreed?
Richardthughes on January 8, 2016 at 1:48 am said:

There are 13! permutations, about 10% viable (math up above somewhere)
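The "about 10%" figure can be reproduced with a short counting argument, sketched here in Python. It assumes the strict rule that + and - are never unary, so a viable string is five numbers separated by the four operators: choose an order for the nine digits, choose 4 of the 8 gaps between them for the operators, and choose which operator goes in which gap.

```python
from math import comb, factorial

total = factorial(13)            # all arrangements of the 13 characters
digit_orders = factorial(9)      # order of the nine digits
cut_points = comb(8, 4)          # 4 of the 8 gaps between digits hold an operator
operator_orders = factorial(4)   # which operator goes in which gap
viable = digit_orders * cut_points * operator_orders

print(total)           # 6227020800
print(viable)          # 609638400
print(viable / total)  # about 0.0979
```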
Richardthughes on January 8, 2016 at 2:00 am said:

keiths,

Should we submit to bio-complexity?
Richardthughes on January 8, 2016 at 4:42 am said:

And at this point, I feel I should give credit to Mung for rolling his sleeves up and coding.
Joe Felsenstein on January 8, 2016 at 5:02 am said:

I don’t want to participate in the calculations here, but there is a topic that needs raising. There are a number of mathematical systems that have been suggested as analogues to evolution, ones which can be thoroughly investigated mathematically or computationally. An example is in Gregory Chaitin’s book Proving Darwin: Making Biology Mathematical, where he uses mutations in computer programs and has a fitness that comes from the Busy Beaver Problem.

Chaitin is rightly famous as one of the founders of Algorithmic Information Theory. But he left a distressing gap in his argument. He regards it as uncontroversial that this Busy Beaver problem is a good analogy to biology. And he didn’t raise this issue and discuss it. Maybe it is a good analogy, maybe it isn’t. Some problems will be good analogies, some not. If the problem were mutating a number, trying to find the numbers that open a combination lock, then mutation will be extremely ineffective, so will operations in any genetic algorithm.

The way fitnesses change in a network of genotypes that are connected by mutations has a large effect on how effective simulated evolution will be at finding more fit genotypes. Some of these analogies to biology may be much better than others.

Is there some assumption here that the way fitnesses are arranged in this Math Genome problem is a particularly good parallel to the way genotypes, phenotypes, and fitnesses work in biology?
DiEb on January 8, 2016 at 5:08 am said:

Richardthughes,

-2 is an illegal operator (for our purposes).

Why? You wrote “If a string cannot execute as a mathematical function, it scores zero.” Mung’s language of choice (Ruby, I think) executed the string “/-2” as a mathematical function….
Richardthughes on January 8, 2016 at 5:10 am said:

Hi Joe. Thanks for commenting! I didn’t have a concrete aim but some of my thoughts were:

The target isn’t explicit in the code
Lots of genomes would not be viable due to syntax
The problem space was large (within home computing capabilities, but painful)
The problem is non-trivial and non-obvious.
Some people learn by doing.

That being said, I don’t think it simulates life but perhaps does simulate some of life’s problem solving mechanisms?

I’d be delighted if others could improve this, take it in new directions, etc.
Richardthughes on January 8, 2016 at 5:13 am said:

DiEb:
Richardthughes,

Why? You wrote “If a string cannot execute as a mathematical function, it scores zero.” Mung’s language of choice (Ruby, I think) executed the string “/-2” as a mathematical function….

Yeah, I thought of that after the post (I’m not as smart as you guys!)

Richardthughes: Should we allow + and – to denote positive and negative? Should +9876*54/-321 be allowed? I’m thinking… not.

I’m fine with either version. It’s not like I have a master plan

[/Shiftyeyes]

Mwuahahahahaha!!!!!
Richardthughes on January 8, 2016 at 5:18 am said:

Joe Felsenstein: If the problem were mutating a number, trying to find the numbers that open a combination lock, then mutation will be extremely ineffective, so will operations in any genetic algorithm.

So if we had wanted the GA to resolve “as close to Pi as possible” it would have a hard time? Or does the combination lock give no sense of proximity and thus the GA fares no better than other methods?
Joe Felsenstein on January 8, 2016 at 5:57 am said:

Richardthughes: So if we had wanted the GA to resolve “as close to Pi as possible” it would have a hard time? Or does the combination lock give no sense of proximity and thus the GA fares no better than other methods?

A combination lock genetic algorithm that only rewarded actually opening the lock would be horribly bad. One which rewarded for closeness to the correct combination would do much better.

Another one that might do badly would be a genome of 0s and 1s that was evaluated as a binary number and then rewarded if it turned out to be a prime number.
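The contrast between the two combination-lock fitness functions can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the lock is a list of decimal dials, a mutation re-spins one dial, "closeness" is taken to mean the number of dials already correct, and the names `climb`, `opens_only` and `dials_correct` are hypothetical.

```python
import random

def climb(fitness, secret, rng, max_steps=20000):
    """(1+1) hill climber on dial settings; returns steps used, or None."""
    guess = [rng.randrange(10) for _ in secret]
    for step in range(max_steps):
        if guess == secret:
            return step
        child = list(guess)
        child[rng.randrange(len(child))] = rng.randrange(10)  # re-spin one dial
        # Keep the child unless it is strictly worse than the parent.
        if fitness(child, secret) >= fitness(guess, secret):
            guess = child
    return max_steps if guess == secret else None

def opens_only(guess, secret):
    """All-or-nothing: 1 if the lock opens, else 0. No sense of proximity."""
    return int(guess == secret)

def dials_correct(guess, secret):
    """Graded: how many dials already sit on the right digit."""
    return sum(g == s for g, s in zip(guess, secret))
```

With `dials_correct`, a six-dial lock is typically opened within a few hundred steps, because every correct dial is locked in. With `opens_only`, every wrong setting scores the same, so the climber is reduced to a random walk over a million combinations and will usually exhaust its step budget.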
dazz on January 8, 2016 at 7:38 am said:

Richardthughes: how long did the exhaustive run take everyone?

Some 9 hours to complete using eval, 45 minutes to hit the max
There are 16 expressions that evaluate to 8437891.5:

['1/-2+8753*964', '1/-2+964*8753', '8753*964+1/-2', '8753*964+-1/2', '8753*964-1/+2', '8753*964-+1/2', '8753*+964-1/2', '964*8753+1/-2', '964*8753+-1/2', '964*8753-1/+2', '964*8753-+1/2', '964*+8753-1/2', '+8753*964-1/2', '+964*8753-1/2', '-1/2+8753*964', '-1/2+964*8753']

1828915200 valid expressions (29.4% of the total)
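This count can be cross-checked by hand, as the Python sketch below tries to show. It assumes that eval accepts + and - as unary signs in front of any number (including two signs stacked on the same number), while * and / are always binary. The helper name `sign_placements` is hypothetical.

```python
from math import comb, factorial

def sign_placements(signs, numbers):
    """Ways to put `signs` distinct unary signs in front of `numbers` numbers.

    The first sign can go in front of any number; a later sign can also go
    before or after a sign already placed, which adds one choice each time.
    """
    ways = 1
    for k in range(signs):
        ways *= numbers + k
    return ways

patterns = 0
for unary in range(3):            # 0, 1 or 2 of the signs + and - act as unary
    binary = 4 - unary            # the remaining operators sit between numbers
    numbers = binary + 1
    which_unary = comb(2, unary)  # choose which of + and - are the unary ones
    splits = comb(8, binary)      # cut the nine digits into `numbers` groups
    patterns += (which_unary * factorial(binary) * splits
                 * sign_placements(unary, numbers))

valid = factorial(9) * patterns   # times every ordering of the nine digits
print(patterns, valid)            # 5040 1828915200
```

The three cases contribute 1680, 2688 and 672 operator patterns, so allowing unary signs gives exactly three times as many valid strings as the strict grammar does.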

Using the more restrictive grammar function it’s supposed to take about 6 hours. Just fired that up so will take a while.

Still want to get a threaded version working, to compare to the GA with larger populations
DiEb on January 8, 2016 at 10:58 am said:

I had to run the GA for Mung on my phone.
1) 100 trials took 2.33 min
2) 12 times the global maximum was found.
dazz on January 8, 2016 at 11:02 am said:

DiEb:
I had to run the GA for Mung on my phone.
1) 100 trials took 2.33 min
2) 12 times the global maximum was found.

Where did you start those runs?
DiEb on January 8, 2016 at 11:11 am said:

dazz,

Random viable string.
Richardthughes on January 8, 2016 at 5:13 pm said:

DiEb: Random viable string.

See, evolution cannot account for OOL!!!111111 😉
Richardthughes on January 8, 2016 at 5:20 pm said:

phoodoo,

Why don’t you have a go coding a solution, Phoodoo?
phoodoo on January 8, 2016 at 5:22 pm said:

Richardthughes,

Wrong. Try again.
Richardthughes on January 8, 2016 at 10:59 pm said:

So far, as we work through the basics, intelligent agents can reason a solution quickly, GAs / EAs with a very limited population are performing admirably, and we’re probably at the edge of brute force for our purposes.

Is there any interest (from team design especially) in taking this to the next level – a specific target with a longer genome and more operators?
Mung on January 9, 2016 at 1:58 am said:

Richardthughes: And at this point, I feel I should give credit to Mung for rolling his sleeves up and coding.

Well, my program eventually stopped due to running out of memory after more than 1.685B permutations. Guess maybe the built-in permutation function isn’t the way to go.

But now that I think about it, I was actually able to count the number of permutations in a previous run so maybe it was something else that hogged the memory. I’ll have to take another look.
Norm Olsen on January 9, 2016 at 2:01 am said:

The answer is 42.
Mung on January 9, 2016 at 2:05 am said:

Even at that I had not found a better string.

75,000,000: 1/-2+8753*964 = 8437891
.
.
1,685,000,000: 1/-2+8753*964 = 8437891
keiths on January 9, 2016 at 2:10 am said:

Mung,

Could you post your code?
Mung on January 9, 2016 at 2:14 am said:

How do we measure how much information had to be smuggled in to the GA?

😉
Mung on January 9, 2016 at 2:16 am said:

keiths: Could you post your code?

It’s short enough that I could easily post it here but when I tried the browser didn’t like it. If I can’t get it to go here I’ll try pastebin.
Norm Olsen on January 9, 2016 at 2:20 am said:

Mung,

Just take a screen shot of your code, doesn’t have to be complete, and upload the image here.
Mung on January 9, 2016 at 2:24 am said:

iterations = 0
highest_value = 0
winning_strings = []

# Walk every ordering of the 13 characters, starting from a random shuffle.
%w[1 2 3 4 5 6 7 8 9 * / + -].shuffle.permutation.each do |arr|
  iterations += 1
  current_string = arr.join

  begin
    # Note: eval uses Integer arithmetic here, so 1/-2 evaluates to -1, not -0.5.
    current_value = eval(current_string)
  rescue SyntaxError
    next # I don't try to keep track of the failures
  end

  if current_value > highest_value
    highest_value = current_value
    winning_strings.clear
    winning_strings << current_string
  elsif current_value == highest_value
    winning_strings << current_string
  end

  puts "#{iterations}:#{winning_strings.first}:#{highest_value}" if iterations % 5000000 == 0
end

puts iterations
puts highest_value
puts winning_strings
Mung on January 9, 2016 at 2:26 am said:

Just needed to do a little html for the greater than and double less than.
Mung on January 9, 2016 at 2:32 am said:

1685000000:1/-2+8753*964:8437891
perms.rb:23:in `eval’: failed to allocate memory (NoMemoryError)
from perms.rb:23:in `block in ‘
from perms.rb:17:in `permutation’
from perms.rb:17:in `each’
from perms.rb:17:in `’
Mung on January 9, 2016 at 2:36 am said:

Actually cycling through all the permutations only took 22 mins on my home PC, lol.

It’s everything else that’s taking all the time!
dazz on January 9, 2016 at 6:24 am said:

Down to 2 1/2 hours on 8 threads / 4 cores (brute-force) @ 4.5GHz Sandy Bridge
OMagain on January 9, 2016 at 10:46 am said:

So, the Mung really can code. Will surprises never end…
Mung on January 9, 2016 at 3:22 pm said:

OMagain: So, the Mung really can code. Will surprises never end…

Nah. I’m not stupid. I had someone else write it for me.
Mung on January 9, 2016 at 3:29 pm said:

Performing the actual permutations is well within the reach of a current PC.

But what did we learn about GA’s, if anything? That this is not the sort of problem that GA’s are good at solving?
dazz on January 9, 2016 at 3:49 pm said:

Mung: That this is not the sort of problem that GA’s are good at solving?

It takes almost 3 hours to complete, and 30 minutes to reach the maximum by brute forcing on an overclocked 4.5GHz CPU running 8 parallel workers.
DiEb’s GA reached a maximum in 2.3 minutes with a population of 1… and his CPU is probably not as fast as mine

So much for not being good at solving the problem! LOL