Math Genome Fun 2, UPB and beyond?

This is a follow-on from the original “Math Genome Fun” thread here

I think it’s time to test out GAs on a bigger problem, the old one was at the boundary of home computation for an exhaustive search (OMD I said “search!” get on that Mung) – but I’m going to propose a *much* bigger search / smaller target.

Friends, you target is Pi

I’ve chosen Pi because of its qualities, being irrational and transcendental you cannot describe it in an equation, only approximate it.

Our target is the first 150 digits of Pi.

Our vocabulary is the same numbers and operators as before: [1,2,3,4,5,6,7,8,9,+,-,*,/], but this time they can all be used as often as required and the genome can be anywhere from 1 to 1000 characters in length.

The design approximation is now harder due to the decimal place and the absence of zeros in our vocabulary (but there’s quite a few in Pi).

I’ve done no computation on this myself, can a GA find that proverbial needle in a haystack?

 

131 thoughts on “Math Genome Fun 2, UPB and beyond?

  1. Pi is non-computable in the sense that we are using the term here

    Maybe only practically but not formally since it can be algorithmically described. It’s best not to use idiosyncratic definitions that no one else uses if one is trying to communicate.

    In the literature, Pi is a computable number but not terminating.

    https://en.wikipedia.org/wiki/Computable_number

    The computable numbers include many of the specific real numbers which appear in practice, including all real algebraic numbers, as well as e, \pi, and many other transcendental numbers.

    Sorry to disagree old friend.

  2. dazz:
    On Linux:

    Best = [’16/113+3′]
    Current PI= 3.14159292035 found at: 00:01:05
    Mininum difference= 2.66764189405e-07
    Total Iterations= 115224568 – 1.24926726949e-09 %
    Valid Expressions = 49606038
    Iterations/s( /core ) = 623945.540505 ( 89135.077215 )
    Avg. Iterations/s ( /core ) = 677452.730241 ( 96778.9614629 )

    Time elapsed = 00:02:50

    And…

    Best = [‘3+32/226’, ‘3+48/339′, ’16/113+3’, ‘3+96/678’]
    Current PI= 3.14159292035 found at: 00:02:56
    Mininum difference= 2.66764189405e-07
    Total Iterations= 121577256 – 1.3181432508e-09 %
    Valid Expressions = 52416654
    Iterations/s( /core ) = 635266.073734 ( 90752.2962477 )
    Avg. Iterations/s ( /core ) = 675110.124224 ( 96444.3034606 )

    Time elapsed = 00:03:00

    Nice work Dazz. What mutations are you using? Your genome isn’t getting longer…

  3. Rich,

    I don’t think Dazz is using a GA. He/she wrote:

    My brute force algo found this in 2 1/2 minutes (8 threads @ 4.8GHz):

  4. I have two versions of the GA, pi_hard and pi_easy.

    The pi_hard version approximates pi (or any other constant) using the (harder) rules of the previous thread — genomes are exactly thirteen characters long, with one instance of each of [1,2,3,4,5,6,7,8,9,+,-,*,/].

    Using pi_hard, the best genome I’ve found is 312/678*9+4-5 . The GA usually finds it in less than five minutes.

    The pi_easy version approximates pi (or any other constant) using genomes of 1-1000 characters taken from [1,2,3,4,5,6,7,8,9,+,-,*,/].

    It finds a solution in seconds, and it achieves the necessary precision by adding digits to the numerator and denominator of a fraction (which seems to be different each time). Example:

    Winning genome: 8385577787772322/2669212311211131

    value = 3.14159265358979311599796346854418516159057617187500000000000000000
    target = 3.14159265358979311599796346854418516159057617187500000000000000000

    Note that the target is pi, represented as a 64-bit double-precision floating-point number.

  5. keiths,

    Great work KeithS. There seem to be no mathematical operators in your winning genome?

    What was your GA population?

  6. Rich,

    There seem to be no mathematical operators in your winning genome?

    There’s a “/”. In general, the GA produces an expression that involves a fraction, then fine-tunes (OMG – I said “fine tune”! Get on that, Mung) the numerator and denominator to achieve the needed precision.

    Here are some winning genomes from successive runs:

    2*2/72389288135998399*56854413951661833

    582926767643923993/92775676531111219/2

    176923666738774596/56316552222838257

    5/1-64131322211214211*7/241561278987437498

    8/23*75417246669186771/8349932244717811

    What was your GA population?

    Ten, with four survivors each generation. I didn’t try to fine-tune (OMG) it, though.

  7. The continued fraction of π gives the best approximation as a fraction of integers:
    [3,7,15,1,292,1,1,1,2,1,3,1,14,2,1,1,2,2,2,2,1,84,2,1,1,15,3,13] =
    5,706,674,932,067,741/1,816,491,048,114,374
    The difference to π is less than 2.5E-31:

    Fraction ∼3.141 592 653 589 793 238 462 643 383 279 736…
    π ∼3.141 592 653 589 793 238 462 643 383 279 502…

  8. My aim was going to be to get to enough decimal places (specitivity) for the UPB – 150. It looks like the evaluation part of the procedure will make that impossible, although the GAs may also have a hard time. At some point I suspect the GA may destroy more precision than it incrementally adds, but I don’t know. Of course PI is selected as the only winning hand before the fact, which is a problem for ID.

  9. Rich,

    My aim was going to be to get to enough decimal places (specitivity) for the UPB – 150. It looks like the evaluation part of the procedure will make that impossible…

    It’s quite possible with the use of an arbitrary-precision math library, but there’s no point in going through the exercise. The same algorithm that takes you from…

    21/7 = 3.0 to
    219/71 = 3.085

    …can also take you from…

    1828/582 = 3.1409 to
    18286/5821 = 3.1414

    …and can also take you from…

    1642172919/522719875 = 3.14159265324 to
    16421729195/5227198751 = 3.14159265359

    Same mutation type. Different precision.

    …although the GAs may also have a hard time.

    The GA wouldn’t have any trouble. It would just keep adding digits to the numerator and denominator until it reached the desired precision. Every step is an incremental improvement. It’s quite easy for a GA.

    The limiting factor is the precision of the arithmetic, not the ability of the GA.

  10. Mung,

    So how much information was smuggled in Richard?

    You tell us, Mung. And show your work.

  11. Ah yes, anti-skeptic keiths and his alternate reality.

    Surely he can show where I claimed that information was being smuggled in.

    Or not.

  12. Mung: Surely he can show where I claimed that information was being smuggled in.

    Huh?

    Mung: So how much information was smuggled in Richard?

    That was easy!

  13. From the OP:

    I’ve done no computation on this myself, can a GA find that proverbial needle in a haystack?

    Silly question. Of course a GA can find that proverbial needle in a haystack, and if you’d ever actually written a GA you’d know it. I’m probably wrong though, because I don’t know squat about GA’s.

    Here’s the truth about GA’s:

    1. If there is only one needle in the haystack it can’t be found by a GA.

    2. GA’s are not search algorithms anyways, therefore a GA cannot find a needle in a haystack.

  14. OMagain:

    Mung: Surely he can show where I claimed that information was being smuggled in.

    Huh?

    Mung: So how much information was smuggled in Richard?

    That was easy!

    Ah, but Mung did not make a direct, explicit claim to the effect that information had been smuggled in. Rather, Mung asked a question which was built on the implied, unstated presupposition that information had been smuggled in. Clearly not the same thing at all.

    In much the same way that the question “So how much information was smuggled in Richard?” does not constitute a claim that information was smuggled in, so, too, does the question “Mung, have you stopped selling crack to high-school students?” not constitute a claim that Mung sells illicit drugs to teenagers.

  15. Richardthughes: Nice work Dazz. What mutations are you using? Your genome isn’t getting longer…

    keiths:
    Rich,

    I don’t think Dazz is using a GA.He/she wrote:

    Keiths is correct, I just re-used the threaded script for the previous exercise to brute force this thing, for comparison, and to learn more about threading in Python.
    Interestingly, using eval concurrently my threads dead-locked after a while, not sure why. I managed to fix it using a custom “eval” function I found at stackoverflow using the ast module.

    Anyway, I may write my own GA at some point, but for the time being, all my results are brute-force, sequential evaluations of combinations of symbols.

    The longest I’ve run it is 8 hours and a half and this is what I got after 14108060638 expressions checked: not much improvement after the first few minutes

    Best = [’15/339*71′, ‘355/565*5’, ‘426/678*5’, ‘5/226*142’, ‘5/565*355′, ’71/113*5′, ’25/565*71’, ‘284/452*5’, ‘355/791*7’, ‘4/452*355’, ‘5/452*284’, ‘2/226*355′, ’35/791*71’, ‘3/339*355’, ‘497/791*5’, ‘5/339*213’, ‘5/678*426’, ‘5/113*71’, ‘142/226*5’, ‘1/113*355’, ‘213/339*5’, ‘5/791*497’, ‘6/678*355’, ‘7/791*355’]
    Current PI = 3.141592920353982076875354 found at: 00:13:35
    Mininum difference = 0.000000266764188960877391
    Total Expressions = 14108060638 – 656.957768117 %
    Valid Expressions = 756879
    Expressions/s ( /core ) = 461456 ( 115364 )
    Avg. Expressions/s ( /core ) = 461456 ( 115364 )

    Last Expression = 143-2*6831
    Dead Threads = []

  16. I gave my GA a harder problem to work on, which is to approximate pi using only the digit ‘1’ plus the four operators +,-,*,/.

    It’s been going for a couple of hours. The best genome so far is

    1+1+11*11*11*11/1111111*11+1

    …for an error of 0.0034 .

  17. Tried it using twos instead of ones.

    Best genome after two hours:

    2+2/2+2222222/2/22222222+2/22

    Error = 0.0006

    If I keep smuggling information into the GA, will it eventually fill up?

  18. Best genome using threes, after nine and a half hours:

    3/333+3+33*33/33333+333/3333

    Error = 0.000003

    Best genome using fours, after four hours:

    4-4/4+4444444/444/444/444+4/44

    Error 0.00009

  19. Beds will be shat when you get to “6”.

    I should probably restrict that one to using combinations of 666, for maximum evilutionist effect, but I can’t be arsed. Yet.

  20. cubist: Ah, but Mung did not make a direct, explicit claim to the effect that information had been smuggled in.

    Indeed I did not. Nor was I the one to raise the question. But it’s ok with me if you can’t be bothered with the facts. I never assumed otherwise.

    ETA: But at least you managed to utter a true statement, even if it was an attempt at implying the opposite.

  21. cubist: Rather, Mung asked a question which was built on the implied, unstated presupposition that information had been smuggled in.

    This is false. I was not the one to raise the question of “smuggled in” information. I was merely following up on a claim made by someone else. Once you figure out who that was, you will understand.

  22. 5-555555/5555/55-555/555/5/5

    Error = 0.000046

    6-66/6/6-6/6/6/6-6/6

    Error = 0.0027

    Interestingly, the error with the sixes is considerably higher than most of the others so far, perhaps due to Satanic influences.

  23. 7/7/7-7-7/77-77/777/77+777/77

    Error = 0.00002

    88/8-8+88/8/8/888/8*88+8/8/8

    Error = 0.00044

Leave a Reply