Math Genome Fun 2, UPB and beyond?

Posted on January 10, 2016 by Richardthughes

This is a follow-on from the original “Math Genome Fun” thread here

I think it’s time to test out GAs on a bigger problem, the old one was at the boundary of home computation for an exhaustive search (OMD I said “search!” get on that Mung) – but I’m going to propose a *much* bigger search / smaller target.

Friends, you target is Pi

I’ve chosen Pi because of its qualities, being irrational and transcendental you cannot describe it in an equation, only approximate it.

Our target is the first 150 digits of Pi.

Our vocabulary is the same numbers and operators as before: [1,2,3,4,5,6,7,8,9,+,-,*,/], but this time they can all be used as often as required and the genome can be anywhere from 1 to 1000 characters in length.

The design approximation is now harder due to the decimal place and the absence of zeros in our vocabulary (but there’s quite a few in Pi).

I’ve done no computation on this myself, can a GA find that proverbial needle in a haystack?

131 thoughts on “Math Genome Fun 2, UPB and beyond?”

stcordova on January 14, 2016 at 5:30 pm said:

Pi is non-computable in the sense that we are using the term here

Maybe only practically but not formally since it can be algorithmically described. It’s best not to use idiosyncratic definitions that no one else uses if one is trying to communicate.

In the literature, Pi is a computable number but not terminating.

https://en.wikipedia.org/wiki/Computable_number

The computable numbers include many of the specific real numbers which appear in practice, including all real algebraic numbers, as well as e, \pi, and many other transcendental numbers.

Sorry to disagree old friend.
Richardthughes on January 14, 2016 at 5:34 pm said:

dazz:
On Linux:

Best = [’16/113+3′]
Current PI= 3.14159292035 found at: 00:01:05
Mininum difference= 2.66764189405e-07
Total Iterations= 115224568 – 1.24926726949e-09 %
Valid Expressions = 49606038
Iterations/s( /core ) = 623945.540505 ( 89135.077215 )
Avg. Iterations/s ( /core ) = 677452.730241 ( 96778.9614629 )

Time elapsed = 00:02:50

And…

Best = [‘3+32/226’, ‘3+48/339′, ’16/113+3’, ‘3+96/678’]
Current PI= 3.14159292035 found at: 00:02:56
Mininum difference= 2.66764189405e-07
Total Iterations= 121577256 – 1.3181432508e-09 %
Valid Expressions = 52416654
Iterations/s( /core ) = 635266.073734 ( 90752.2962477 )
Avg. Iterations/s ( /core ) = 675110.124224 ( 96444.3034606 )

Time elapsed = 00:03:00

Nice work Dazz. What mutations are you using? Your genome isn’t getting longer…
keiths on January 14, 2016 at 7:55 pm said:

Rich,

I don’t think Dazz is using a GA. He/she wrote:

My brute force algo found this in 2 1/2 minutes (8 threads @ 4.8GHz):
keiths on January 14, 2016 at 8:30 pm said:

I have two versions of the GA, pi_hard and pi_easy.

The pi_hard version approximates pi (or any other constant) using the (harder) rules of the previous thread — genomes are exactly thirteen characters long, with one instance of each of [1,2,3,4,5,6,7,8,9,+,-,*,/].

Using pi_hard, the best genome I’ve found is 312/678*9+4-5 . The GA usually finds it in less than five minutes.

The pi_easy version approximates pi (or any other constant) using genomes of 1-1000 characters taken from [1,2,3,4,5,6,7,8,9,+,-,*,/].

It finds a solution in seconds, and it achieves the necessary precision by adding digits to the numerator and denominator of a fraction (which seems to be different each time). Example:

Winning genome: 8385577787772322/2669212311211131

value = 3.14159265358979311599796346854418516159057617187500000000000000000
target = 3.14159265358979311599796346854418516159057617187500000000000000000

Note that the target is pi, represented as a 64-bit double-precision floating-point number.
Richardthughes on January 14, 2016 at 8:46 pm said:

keiths,

Great work KeithS. There seem to be no mathematical operators in your winning genome?

What was your GA population?
keiths on January 14, 2016 at 9:03 pm said:

Rich,

There seem to be no mathematical operators in your winning genome?

There’s a “/”. In general, the GA produces an expression that involves a fraction, then fine-tunes (OMG – I said “fine tune”! Get on that, Mung) the numerator and denominator to achieve the needed precision.

Here are some winning genomes from successive runs:

2*2/72389288135998399*56854413951661833

582926767643923993/92775676531111219/2

176923666738774596/56316552222838257

5/1-64131322211214211*7/241561278987437498

8/23*75417246669186771/8349932244717811

What was your GA population?

Ten, with four survivors each generation. I didn’t try to fine-tune (OMG) it, though.
DiEb on January 14, 2016 at 9:43 pm said:

The continued fraction of π gives the best approximation as a fraction of integers:
[3,7,15,1,292,1,1,1,2,1,3,1,14,2,1,1,2,2,2,2,1,84,2,1,1,15,3,13] =
5,706,674,932,067,741/1,816,491,048,114,374
The difference to π is less than 2.5E-31:

Fraction ∼3.141 592 653 589 793 238 462 643 383 279 736…
π ∼3.141 592 653 589 793 238 462 643 383 279 502…
Richardthughes on January 14, 2016 at 10:05 pm said:

My aim was going to be to get to enough decimal places (specitivity) for the UPB – 150. It looks like the evaluation part of the procedure will make that impossible, although the GAs may also have a hard time. At some point I suspect the GA may destroy more precision than it incrementally adds, but I don’t know. Of course PI is selected as the only winning hand before the fact, which is a problem for ID.
keiths on January 14, 2016 at 11:00 pm said:

Rich,

My aim was going to be to get to enough decimal places (specitivity) for the UPB – 150. It looks like the evaluation part of the procedure will make that impossible…

It’s quite possible with the use of an arbitrary-precision math library, but there’s no point in going through the exercise. The same algorithm that takes you from…

21/7 = 3.0 to
219/71 = 3.085

…can also take you from…

1828/582 = 3.1409 to
18286/5821 = 3.1414

…and can also take you from…

1642172919/522719875 = 3.14159265324 to
16421729195/5227198751 = 3.14159265359

Same mutation type. Different precision.

…although the GAs may also have a hard time.

The GA wouldn’t have any trouble. It would just keep adding digits to the numerator and denominator until it reached the desired precision. Every step is an incremental improvement. It’s quite easy for a GA.

The limiting factor is the precision of the arithmetic, not the ability of the GA.
Mung on January 16, 2016 at 12:10 am said:

So how much information was smuggled in Richard?
keiths on January 16, 2016 at 12:18 am said:

Mung,

So how much information was smuggled in Richard?

You tell us, Mung. And show your work.
Mung on January 16, 2016 at 3:38 am said:

Ah yes, anti-skeptic keiths and his alternate reality.

Surely he can show where I claimed that information was being smuggled in.

Or not.
Alan Fox on January 16, 2016 at 2:26 pm said:

Moved some comments to guano.
OMagain on January 16, 2016 at 2:55 pm said:

Mung: Surely he can show where I claimed that information was being smuggled in.

Huh?

Mung: So how much information was smuggled in Richard?

That was easy!
Mung on January 19, 2016 at 3:01 am said:

From the OP:

I’ve done no computation on this myself, can a GA find that proverbial needle in a haystack?

Silly question. Of course a GA can find that proverbial needle in a haystack, and if you’d ever actually written a GA you’d know it. I’m probably wrong though, because I don’t know squat about GA’s.

Here’s the truth about GA’s:

1. If there is only one needle in the haystack it can’t be found by a GA.

2. GA’s are not search algorithms anyways, therefore a GA cannot find a needle in a haystack.
Richardthughes on January 19, 2016 at 3:25 am said:

Mung,

Insightful. Do they work? Can they create information?
cubist on January 19, 2016 at 6:52 am said:

OMagain:

Mung: Surely he can show where I claimed that information was being smuggled in.

Huh?

Mung: So how much information was smuggled in Richard?

That was easy!

Ah, but Mung did not make a direct, explicit claim to the effect that information had been smuggled in. Rather, Mung asked a question which was built on the implied, unstated presupposition that information had been smuggled in. Clearly not the same thing at all.

In much the same way that the question “So how much information was smuggled in Richard?” does not constitute a claim that information was smuggled in, so, too, does the question “Mung, have you stopped selling crack to high-school students?” not constitute a claim that Mung sells illicit drugs to teenagers.
dazz on January 19, 2016 at 10:55 am said:

Richardthughes: Nice work Dazz. What mutations are you using? Your genome isn’t getting longer…

keiths:
Rich,

I don’t think Dazz is using a GA.He/she wrote:

Keiths is correct, I just re-used the threaded script for the previous exercise to brute force this thing, for comparison, and to learn more about threading in Python.
Interestingly, using eval concurrently my threads dead-locked after a while, not sure why. I managed to fix it using a custom “eval” function I found at stackoverflow using the ast module.

Anyway, I may write my own GA at some point, but for the time being, all my results are brute-force, sequential evaluations of combinations of symbols.

The longest I’ve run it is 8 hours and a half and this is what I got after 14108060638 expressions checked: not much improvement after the first few minutes

Best = [’15/339*71′, ‘355/565*5’, ‘426/678*5’, ‘5/226*142’, ‘5/565*355′, ’71/113*5′, ’25/565*71’, ‘284/452*5’, ‘355/791*7’, ‘4/452*355’, ‘5/452*284’, ‘2/226*355′, ’35/791*71’, ‘3/339*355’, ‘497/791*5’, ‘5/339*213’, ‘5/678*426’, ‘5/113*71’, ‘142/226*5’, ‘1/113*355’, ‘213/339*5’, ‘5/791*497’, ‘6/678*355’, ‘7/791*355’]
Current PI = 3.141592920353982076875354 found at: 00:13:35
Mininum difference = 0.000000266764188960877391
Total Expressions = 14108060638 – 656.957768117 %
Valid Expressions = 756879
Expressions/s ( /core ) = 461456 ( 115364 )
Avg. Expressions/s ( /core ) = 461456 ( 115364 )

Last Expression = 143-2*6831
Dead Threads = []
keiths on January 21, 2016 at 1:43 am said:

I gave my GA a harder problem to work on, which is to approximate pi using only the digit ‘1’ plus the four operators +,-,*,/.

It’s been going for a couple of hours. The best genome so far is

1+1+11*11*11*11/1111111*11+1

…for an error of 0.0034 .
Richardthughes on January 21, 2016 at 1:57 am said:

keiths,

Awesome! I see you remembered to smuggle the CSI in.
keiths on January 21, 2016 at 2:47 am said:

ichRay,

ixNay on the ugglesmay. I think Mung is on to us.
keiths on January 21, 2016 at 6:25 am said:

Tried it using twos instead of ones.

Best genome after two hours:

2+2/2+2222222/2/22222222+2/22

Error = 0.0006

If I keep smuggling information into the GA, will it eventually fill up?
keiths on January 22, 2016 at 4:13 am said:

Best genome using threes, after nine and a half hours:

3/333+3+33*33/33333+333/3333

Error = 0.000003

Best genome using fours, after four hours:

4-4/4+4444444/444/444/444+4/44

Error 0.00009
Richardthughes on January 22, 2016 at 5:55 am said:

keiths,

Beds will be shat when you get to “6”.
keiths on January 22, 2016 at 6:21 am said:

Beds will be shat when you get to “6”.

I should probably restrict that one to using combinations of 666, for maximum evilutionist effect, but I can’t be arsed. Yet.
Mung on January 23, 2016 at 2:23 am said:

cubist: Ah, but Mung did not make a direct, explicit claim to the effect that information had been smuggled in.

Indeed I did not. Nor was I the one to raise the question. But it’s ok with me if you can’t be bothered with the facts. I never assumed otherwise.

ETA: But at least you managed to utter a true statement, even if it was an attempt at implying the opposite.
Mung on January 23, 2016 at 2:33 am said:

cubist: Rather, Mung asked a question which was built on the implied, unstated presupposition that information had been smuggled in.

This is false. I was not the one to raise the question of “smuggled in” information. I was merely following up on a claim made by someone else. Once you figure out who that was, you will understand.
Richardthughes on January 23, 2016 at 4:05 am said:

We gave up on you learning GAs just after you did, Mung.
keiths on January 24, 2016 at 7:01 pm said:

5-555555/5555/55-555/555/5/5

Error = 0.000046

6-66/6/6-6/6/6/6-6/6

Error = 0.0027

Interestingly, the error with the sixes is considerably higher than most of the others so far, perhaps due to Satanic influences.
keiths on January 25, 2016 at 3:22 am said:

7/7/7-7-7/77-77/777/77+777/77

Error = 0.00002

88/8-8+88/8/8/888/8*88+8/8/8

Error = 0.00044
keiths on January 25, 2016 at 7:34 pm said:

999/9/9-9+9/9999*999-999/99+9

Error = 0.000021