# Sandbox (4)

Sometimes very active discussions about peripheral issues overwhelm a thread, so this is a permanent home for those conversations.

I’ve opened a new “Sandbox” thread as a post because the new “ignore commenter” plug-in only works on threads started as posts.

## 1,839 thoughts on “Sandbox (4)”

1. But you make a good point that cryptos (and those of us supporting them) are partially responsible for those CO2 emissions and their impact on climate change.

2. keiths: As with dazz, your mistake is that you aren’t honoring the problem specification. The description says nothing about “winning and ending the game”. Instead, it defines two tasks and then poses a couple of challenges to the reader:

Welcome back, keiths.
I was explaining why dazz’s initial simulation gave the erroneous result that he reported “I run a quick simulation and I get 50-50”.
You might notice that there is no mention of average number of flips, so it is OBVIOUS that he is not “honoring the problem specification”. He’s got a whole different problem going on…
And yes, as you noted, the premature termination is outside of the spec. But you also wrote that it was not the “same coin/different coin” that caused his result.

Bruce:

If you started with thinking they were playing against each other with the same coin, then I feel I am in good company…

dazz:

Yep 😄

Not sure why you guys think it matters. As long as we are talking about fair coins flipped fairly, the identity of the coin(s) doesn’t influence the result, since the coins have no “memory”. The identity of the flipper(s) is also irrelevant.

This was wrong. You made this mistake because you failed to pick up on the fact that Bruce and dazz were modeling a competition.
For the problem, as specified, it matters not whether there is one coin or two: there is no “competition” nor “premature termination” and the average number of flips needed differs, 6 vs 4. See my intuitive explanation of the difference.
For dazz’s ‘competition’ problem, the number of coins DOES matter.
If you have a single coin, you get 50/50, which is what dazz did. If you have two coins you get a different wrong answer, as I explained:
32% HH wins, 54% HT wins, and 14% ties.
So for dazz to get the wrong answer that he did, he needs premature termination AND A SINGLE COIN.
Please slow down and read what people write, not what you think they have written.

3. Mung: It’s in the interview with Suzan Mazur that I linked at PS.

“But it doesn’t matter who believes what. In 50 years we’ll all be dead … “

And later:

“I don’t really care who believes what. Over time, as I said, in the long run we’re all dead. It doesn’t matter who believes what. The question is, who’s got the story right? I think the central story of the theory of evolution is wrong in a way that can’t be repaired, saving the properties of the evolutionary theory”

4. keiths, to Jock:

As with dazz, your mistake is that you aren’t honoring the problem specification. The description says nothing about “winning and ending the game”. Instead, it defines two tasks and then poses a couple of challenges to the reader:

DNA_Jock:

Welcome back, keiths.

No thanks to you. It was Mung who did the right thing by standing up to Alan and removing me from pre-moderation.

I was explaining why dazz’s initial simulation gave the erroneous result that he reported “I run a quick simulation and I get 50-50”.

Slow down, Jock. That result isn’t erroneous. Each task does in fact win 50% of the time if a single coin is used and the spec is honored. This also happens when a single coin is used and the spec is violated by prematurely terminating the other task as soon as one of the tasks reaches its target.

You might notice that there is no mention of average number of flips, so it is OBVIOUS that he is not “honoring the problem specification”.

No, it isn’t obvious, because a) if the spec is honored, the 50-50 result is expected, contrary to your belief; and b) the spec doesn’t forbid a competition, just as it doesn’t forbid the observation of the same coin for both tasks. What it does forbid is premature termination.

It didn’t become obvious that dazz was violating the spec until he wrote this:

In my simulation I stop every run after either of the two contestants win it, which is wrong.

Jock:

And yes, as you noted, the premature termination is outside of the spec. But you also wrote that it was not the “same coin/different coin” that caused his result.

Here’s what I wrote:

Then the problem is the premature termination, not the fact that they are observing the same coin.

My statement is correct. If you honor the spec and don’t terminate prematurely, then you get the correct average number of flips whether you use one coin or two.

Bruce, earlier:

If you started with thinking they were playing against each other with the same coin, then I feel I am in good company…

dazz, also earlier:

Yep 😄

keiths, also earlier:

Not sure why you guys think it matters. As long as we are talking about fair coins flipped fairly, the identity of the coin(s) doesn’t influence the result, since the coins have no “memory”. The identity of the flipper(s) is also irrelevant.

Jock, now:

This was wrong. You made this mistake because you failed to pick up on the fact that Bruce and dazz were modeling a competition.

It was obvious that they were modeling a competition. Bruce’s phrase — “playing against each other” — left no doubt. But the problem wasn’t that they were modeling a competition — it was that they were prematurely terminating the losing task.

For the problem, as specified, it matters not whether there is one coin or two:

Right. That’s why I said this:

Then the problem is the premature termination, not the fact that they are observing the same coin.

Jock:

If you have a single coin, you get 50/50, which is what dazz did. If you have two coins you get a different wrong answer, as I explained:
32% HH wins, 54% HT wins, and 14% ties.

Neither of those answers is wrong. They reflect the win/loss/tie stats you get when you implement the spec correctly, which are the same stats you get when you violate the spec by prematurely terminating the losing task.

So for dazz to get the wrong answer that he did, he needs premature termination AND A SINGLE COIN.

No. You get the 50/50 result with a single coin whether or not you implement premature termination.

Let me summarize:

1. If one coin is used, you get the 50/50 result regardless of whether you implement premature termination. If two coins are used, you get the 32/54/14 result regardless of whether you implement premature termination.

2. On the other hand, if you implement premature termination, you get the wrong average number of flips regardless of whether you use one coin or two. If you honor the spec and don’t terminate prematurely, you get the correct average number of flips (6 and 4 for tasks 1 and 2, respectively) regardless of whether you use one coin or two.

3. The mistake dazz and Bruce made was not that they modeled a competition, nor that they shared the same coin between the two tasks. Their mistake was to prematurely terminate the losing task when the winning task reached its target.
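
The three points above can be checked with a short simulation. What follows is a minimal Python sketch, not anyone's actual code from the thread; the function names (`flips_to_target`, `shared_coin_times`, `simulate`), the trial count, and the seed are all assumptions made for illustration. It honors the spec by letting each task run to completion, then records who finished first and how many flips each task needed.

```python
import random

TARGETS = ("HH", "HT")  # task 1 and task 2

def flips_to_target(rng, target):
    """Flip a fresh fair coin until `target` appears; return the flip count."""
    seq = ""
    while not seq.endswith(target):
        seq += rng.choice("HT")
    return len(seq)

def shared_coin_times(rng):
    """Flip ONE coin until both targets have appeared; return both finish times."""
    seq = ""
    while any(t not in seq for t in TARGETS):
        seq += rng.choice("HT")
    # str.find gives the 0-based start of the first match; +2 is the flip that completes it.
    return tuple(seq.find(t) + 2 for t in TARGETS)

def simulate(coins, trials=100_000, seed=0):
    """Run every task to completion (no premature termination) and summarize."""
    rng = random.Random(seed)
    wins = {"HH": 0, "HT": 0, "tie": 0}
    total_hh = total_ht = 0
    for _ in range(trials):
        if coins == 1:
            t_hh, t_ht = shared_coin_times(rng)
        else:
            t_hh = flips_to_target(rng, "HH")
            t_ht = flips_to_target(rng, "HT")
        total_hh += t_hh
        total_ht += t_ht
        if t_hh < t_ht:
            wins["HH"] += 1
        elif t_ht < t_hh:
            wins["HT"] += 1
        else:
            wins["tie"] += 1
    return {
        "avg_flips": {"HH": total_hh / trials, "HT": total_ht / trials},
        "win_share": {k: v / trials for k, v in wins.items()},
    }

if __name__ == "__main__":
    for coins in (1, 2):
        print(coins, simulate(coins))
```

With enough trials the averages should come out near 6 and 4 for either coin count, while the win shares should come out near 50/50 with one coin and near 32/54/14 with two.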

5. @ Keiths

The result is not deep, but it reminds us that the human intuition gets confused by conditional probability.

concludes the article. Is there a deep point we are missing?

6. Alan:

@ Keiths

The result is not deep, but it reminds us that the human intuition gets confused by conditional probability.

concludes the article. Is there a deep point we are missing?

The result isn’t deep, but it’s interesting because it is so counterintuitive.

7. Alan,

I’m just an old man unversed in the ways of mathematical probability but let me ask Keiths did he look at the blog post I linked to and, if so, does he disagree with it?

I don’t see anything to disagree with. The post confirms what I told Jock above: namely, that premature termination is not allowed, that the average number of flips is six for task 1 (H H) and four for task 2 (H T), and that each task wins half of the time if you use a single coin.
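
One way to see where the six and four come from is first-step analysis. This derivation is a sketch added for illustration rather than something taken from the linked post; the symbols $E$ (expected flips from a fresh start) and $E_H$ (expected further flips when the previous flip was H) are introduced here.

$$
\begin{aligned}
\text{H H:}\quad & E = 1 + \tfrac12 E_H + \tfrac12 E, \qquad E_H = 1 + \tfrac12 \cdot 0 + \tfrac12 E \;\Longrightarrow\; E_H = 4,\ E = 6 \\
\text{H T:}\quad & E = 1 + \tfrac12 E_H + \tfrac12 E, \qquad E_H = 1 + \tfrac12 E_H + \tfrac12 \cdot 0 \;\Longrightarrow\; E_H = 2,\ E = 4
\end{aligned}
$$

The only difference is what a “wrong” flip costs after an H: in the H H task a T sends you back to the start, while in the H T task an extra H leaves you still one flip away from the target.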

8. Alan:

QED

Right. My disagreement was with dazz and Bruce (and now DNA_Jock), not with the author of that blog post.

9. keiths: It was obvious that they were modeling a competition. Bruce’s phrase — “playing against each other” — left no doubt. But the problem wasn’t that they were modeling a competition — it was that they were prematurely terminating the losing task.

So you are saying that not only did I make a mistake, but then I made a mistake about what that mistake was.

Hah! No doubt you are right again. Good ol’ KeithS, Welcome back.

10. Bruce,

So you are saying that not only did I make a mistake, but then I made a mistake about what that mistake was.

Yep.

Hah! No doubt you are right again.

You can convince yourself of that by running a simulation that models a competition, but without prematurely terminating a task when the other task reaches its target. If you do that, you’ll find that you get the correct average number of flips to reach the target: 6 for the “H H” task and 4 for the “H T” task, regardless of whether you use one coin or two.

Using the same coin is not a problem, nor is modeling a competition. The mistake is to terminate a task prematurely when the other task reaches its target.

Good ol’ KeithS, Welcome back.

Thanks.

11. You can convince yourself of that by running a simulation t

No need. Your arguments are convincing on their own.

12. I’m reminded of observations made long ago about AI (artificial intelligence). AI systems are very brittle. Small failures can have devastating effects. By contrast, natural (biological) intelligence tends to be more resilient, more robust.

I posted these links in the BB thread. They speak to the above, so I will repost them here for convenience:
https://medium.com/intuitionmachine/why-hasnt-evolution-invented-the-wheel-4de68f31ffb6
https://medium.com/intuitionmachine/the-curious-case-of-agi-without-a-body-32d5752dc14

13. I don’t much care for that site, though.

I browsed through the first linked article.

When I tried the second one, it told me that I have to sign up for the site to read it. Okay, I cheated. I then opened in a private browsing window so that I would not know this was a second visit. And that worked.

14. I don’t much care for that site, though.

I browsed through the first linked article.

When I tried the second one, it told me that I have to sign up for the site to read it. Okay, I cheated. I then opened in a private browsing window so that I would not know this was a second visit. And that worked.

Right, I forgot about the paywall. I do pay for that site, but most of the content is garbage writing by people who seem to be more interested in learning how to write than in doing the work needed to have something interesting, original, and valid to say. However, now and then there are interesting articles like the linked ones.

The author of the article has many other ones that strike me as correct, eg in their approach to AI, intiontion versus rationality, adaption not optimization.

15. BruceS: , intiontion

“intuition”.
Sorry for the typos. WordPress seems to intermittently lock me out of editing, even if I sign off and on and/or exit all my browser instances.

Likely something to do with my various ad blockers or cookie settings.

16. keiths:
keiths, to Jock:

DNA_Jock:

No thanks to you. It was Mung who did the right thing by standing up to Alan and removing me from pre-moderation.

Slow down, Jock. That result isn’t erroneous. Each task does in fact win 50% of the time if a single coin is used and the spec is honored. This also happens when a single coin is used and the spec is violated by prematurely terminating the other task as soon as one of the tasks reaches its target.

No, it isn’t obvious, because a) if the spec is honored, the 50-50 result is expected, contrary to your belief; and b) the spec doesn’t forbid a competition, just as it doesn’t forbid the observation of the same coin for both tasks. What it does forbid is premature termination.

It didn’t become obvious that dazz was violating the spec until he wrote this:

Jock:

Here’s what I wrote:

My statement is correct. If you honor the spec and don’t terminate prematurely, then you get the correct average number of flips whether you use one coin or two.

Bruce, earlier:

dazz, also earlier:

keiths, also earlier:

Jock, now:

It was obvious that they were modeling a competition. Bruce’s phrase, “playing against each other”, left no doubt. But the problem wasn’t that they were modeling a competition; it was that they were prematurely terminating the losing task.

Right. That’s why I said this:

Jock:

Neither of those answers is wrong. They reflect the win/loss/tie stats you get when you implement the spec correctly, which are the same stats you get when you violate the spec by prematurely terminating the losing task.

No. You get the 50/50 result with a single coin whether or not you implement premature termination.

Let me summarize:

1. If one coin is used, you get the 50/50 result regardless of whether you implement premature termination. If two coins are used, you get the 32/54/14 result regardless of whether you implement premature termination.

2. On the other hand, if you implement premature termination, you get the wrong average number of flips regardless of whether you use one coin or two. If you honor the spec and don’t terminate prematurely, you get the correct average number of flips (6 and 4 for tasks 1 and 2, respectively) regardless of whether you use one coin or two.

3. The mistake dazz and Bruce made was not that they modeled a competition, nor that they shared the same coin between the two tasks. Their mistake was to prematurely terminate the losing task when the winning task reached its target.

Thanks for that summary.

17. walto: 2. On the other hand, if you implement premature termination, you get the wrong average number of flips regardless of whether you use one coin or two.

Interestingly, you get different ‘wrong’ answers for {task 2, average to HT} depending on the number of coins you use, one or two.

18. Jock:

Interestingly, you get different ‘wrong’ answers for {task 2, average to HT} depending on the number of coins you use, one or two.

You also get different wrong answers for {task 1, average to HH} depending on the number of coins, though the delta is smaller than it is for {task 2, average to HT}.

Both facts make sense if you consider the types of sequences that are contributing to the statistics in the two cases.

Let’s define a winning coin-flip sequence as one via which a task reaches its target before or at the same time as the other task.* When premature termination is implemented, it is only the winning sequences that contribute to the statistics.

What does a winning sequence look like in the one-coin case? It always has the form “zero or more consecutive T’s followed immediately by the target subsequence, which is HH or HT.” Why? Because as soon as an “H” appears, one or the other task is invariably going to win on the next flip. If the next flip is an “H”, then task 1 wins. If it’s a “T”, then task 2 wins. Either way, the sequence terminates.

Thus, the following are all winning sequences in the one-coin case…

1) HH
2) THT
3) TTTTTTTTHH
4) TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTHT

…because each consists of zero or more consecutive T’s followed by one of the two target subsequences.

Now consider the two-coin case. Winning sequences still end in either HH or HT, but it’s no longer necessary for them to begin with a sequence of zero or more consecutive T’s.

Each of the following is a potential winning sequence in the two-coin case:

a) HT
b) HHT
c) HHHHHHHH
d) HTHTHTHTHTHTHTHTHTHTHTHTHH

Note the word ‘potential’. Whether these sequences actually win depends on what the other coin is doing.

As you can see, the winning sequences are quite different in the one-coin vs two-coin cases, so it makes sense that their average lengths also differ.

*Note that by this definition, a tie counts as a win for both tasks.
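
To see how much the averages shift, here is a minimal Python sketch of the premature-termination variant. It is an illustration only: the names `play`, `winner_averages`, and `ONE_COIN_WINNER`, the trial count, and the seed are assumptions, and ties are credited to both tasks as in the definition above. Only winning sequences are counted, which is where the bias comes from.

```python
import random
import re

ONE_COIN_WINNER = re.compile(r"T*H[HT]")  # zero or more T's, then HH or HT

def play(rng, coins):
    """Flip until at least one task hits its target, then stop both tasks.

    Returns (winning targets, sequence seen by task 1, sequence seen by task 2).
    """
    seq_hh = seq_ht = ""
    while True:
        flip = rng.choice("HT")
        seq_hh += flip
        # With one coin both tasks watch the same flip; with two, task 2 gets its own.
        seq_ht += flip if coins == 1 else rng.choice("HT")
        winners = [t for t, s in (("HH", seq_hh), ("HT", seq_ht)) if s.endswith(t)]
        if winners:
            return winners, seq_hh, seq_ht

def winner_averages(coins, trials=100_000, seed=0):
    """Average length of WINNING sequences per task (a tie counts for both)."""
    rng = random.Random(seed)
    total = {"HH": 0, "HT": 0}
    count = {"HH": 0, "HT": 0}
    for _ in range(trials):
        winners, seq_hh, _ = play(rng, coins)
        for target in winners:
            total[target] += len(seq_hh)  # both sequences have the same length
            count[target] += 1
    return {t: total[t] / count[t] for t in total}

if __name__ == "__main__":
    for coins in (1, 2):
        print(coins, winner_averages(coins))
```

In the one-coin case every sequence returned by `play` should match `ONE_COIN_WINNER`, and both averages should sit near 3. In the two-coin case a rough calculation suggests values near 2.95 for HH and 3.33 for HT, so the shift is larger for HT than for HH.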

19. I just noticed a mistake in the preceding comment. There’s supposed to be a T at the end of sequence (c), like this:

HHHHHHHHT

20. https://www.wired.com/story/the-governments-new-weather-model-faces-a-storm-of-protest/

Apparently the new national forecasting algorithm has a “cold bias”.

Would you consider independently discovered evidence of this bias to be an artifact of personal choice? I.e., the choice of one model over others.

If so, how would you verify that your findings were not statistical noise from a small sample size?

peace

21. Seems to me you simply compare the performance of the several available models. My infallible method is to look out of the window.

22. Models can be biased for numerous reasons: incorrect algorithms, oversimplification, insufficient and/or inadequate input data, software bugs, poor parameter choice and so on. Since all models are man-made, such errors in the end tend to come down to human error and/or ignorance. You don’t need to do any analysis whatsoever to come to that conclusion.

On the other hand, there are statistical tests for deciding on sample size. It all depends on your desired confidence level and interval. Obviously, the smaller the sample size the higher the risk of drawing incorrect conclusions from it.
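
As one concrete instance of how confidence level and interval drive sample size, here is a small Python sketch for the simplest case, estimating a proportion. It uses the textbook normal approximation; the function name `sample_size_for_proportion` and its defaults are assumptions for illustration, and other kinds of data would need a different formula.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence=0.95, margin=0.05, p=0.5):
    """Smallest n so a proportion estimate has the given margin of error.

    Uses the normal approximation n = z^2 * p * (1 - p) / margin^2.
    p = 0.5 is the most conservative (largest-n) assumption.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

if __name__ == "__main__":
    print(sample_size_for_proportion())             # 95% confidence, +/- 5 points
    print(sample_size_for_proportion(margin=0.03))  # tighter interval, larger n
```

Tightening the interval or raising the confidence level increases the required n, which matches the point above about small samples.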

In my work I used to deal a lot with (complicated subsurface) models and modelers. I was involved in one development of a large gas field in a Far Eastern country which shall remain unnamed. In this field, new wells were continuously being drilled to sustain the production, so there were hundreds of wells by the time I got involved. I did some analysis on well results vs. pre-drill predictions and discovered that the predictions from the Operator were very much biased to the negative side, rather than showing a balance of underprediction vs. overprediction. This is not a good basis to take business decisions on and so I challenged it. This challenge was not well received by the Operator.

I chatted about this with one of the local colleagues whom I had befriended. He confided to me, strictly off the record, that they tended to deliberately under-predict the well results. Their management was much happier when wells came in better than predicted than the other way round, and happy managers make for happy workers…
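
A check for this kind of one-sided bias can be sketched with an exact sign test, which asks whether under- and over-predictions are about equally common. This is only an illustration in Python and not the analysis actually performed on those wells; the function name `sign_test_p_value` and the counts in the usage line are hypothetical.

```python
from math import comb

def sign_test_p_value(under, over):
    """Two-sided exact sign test for a balance of under- vs over-predictions.

    Null hypothesis: each prediction is equally likely to fall on either side.
    """
    n = under + over
    k = min(under, over)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n  # P(X <= k), X ~ Bin(n, 1/2)
    return min(1.0, 2 * tail)

if __name__ == "__main__":
    # Hypothetical counts: 15 wells under-predicted, 5 over-predicted.
    print(sign_test_p_value(15, 5))
```

A small p-value says the imbalance is unlikely to be chance, but it says nothing about whether the cause is deliberate or accidental.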

23. faded_Glory: Their management was much happier when wells came in better than predicted than the other way round, and happy managers make for happy workers…

local TV forecast is wetter than local newsprint forecast is wetter than weather.gov.

Weather.gov is unbiased (on the wet/dry axis).

It’s entirely rational: viewers are upset at unforecast rain. Unexpectedly sunny, not so much…
TV stations indulge in the most CYA.

24. faded_Glory: I did some analysis on well results vs. pre-drill predictions and discovered that the predictions from the Operator were very much biased to the negative side, rather than showing a balance of underprediction vs. overprediction.

That sounds like the sort of conclusion I often reach in my work. I’m fascinated with conclusively determining that these kinds of results are intentional versus being the result of some unknown problem with equipment.

I think this is the sort of design detection that we could all get behind.

It very seldom is. 😉

I can’t believe I never read the book. I’ve picked it up several times but never took it home.

DNA_Jock: It’s entirely rational:

Does he offer any proof for his conclusions other than
“it makes perfect sense”?

peace

25. What Nate Silver wrote about was the calibration of the predictions of the NWS, The Weather Channel (not print media, my memory did not serve…) and local TV stations.
The researchers measured “when the forecast was for a 10% chance of rain, how often did it in fact rain.”
He did this for P(rain) forecasts of 0%, 10%, 20%…100% and plotted the resulting data for each category of forecaster.
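
The calibration check described here can be sketched in a few lines of Python. This is an illustration of the idea, not the researchers’ code; the function name `calibration_table` and the sample records are hypothetical.

```python
from collections import defaultdict

def calibration_table(records):
    """Map each forecast probability to the observed frequency of rain.

    `records` is an iterable of (forecast_probability, rained) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # forecast -> [rainy days, total days]
    for forecast, rained in records:
        counts[forecast][0] += int(rained)
        counts[forecast][1] += 1
    return {f: rainy / total for f, (rainy, total) in sorted(counts.items())}

if __name__ == "__main__":
    # Hypothetical data: ten days forecast at 10%, two days forecast at 50%.
    records = [(0.1, False)] * 9 + [(0.1, True), (0.5, True), (0.5, False)]
    for forecast, observed in calibration_table(records).items():
        print(f"forecast {forecast:.0%}: rained {observed:.0%} of the time")
```

A well-calibrated forecaster’s observed frequencies track the forecast values; a “wet bias” shows up as observed frequencies that fall below the forecast.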

Given your weather forecasting research project I am a little shocked that you haven’t read his book and every weather-related work he cites.
Hum.

26. DNA_Jock: I am a little shocked that you haven’t read his book and every weather-related work he cites.

My project is only incidentally about weather forecasts.

I chose temperature forecasts versus actual recorded temps only because they form numeric sequences and don’t have the same baggage associated with them as things like actual versus predicted mutation rate. The hope is to come up with a method that is useful for more than just weather.

Definitely on the list now

peace

27. DNA_Jock: He did this for P(rain) forecasts of 0%, 10%, 20%…100% and plotted the resulting data for each category of forecaster.

It seems like he is just offering a statistical/probabilistic conclusion rather than definitive evidence of intent.

peace

28. I think this is the sort of design detection that we could all get behind.

This is not design detection. In this case I discovered that the ‘design’ (the prediction method) was flawed, not that it existed in the first place. I knew that already because it is clear that everything in this scenario is designed, except the subsurface that we are testing with the wells.

The question of human bias (intentional or accidental) in predicting natural states or events is quite fascinating. In oil exploration, when we take expensive business decisions on where to drill we want to eliminate human bias as much as possible. There are a number of methods widely in use, but I fail to see how they relate to the more fundamental question of deciding if something is designed or not.

29. I chose temperature forecasts versus actual recorded temps only because they form numeric sequences and don’t have the same baggage associated with them as things like actual versus predicted mutation rate. The hope is to come up with a method that is useful for more than just weather.

You are doing time series forecasting. There are well-established techniques for this that you may want to check; don’t reinvent the wheel.

30. faded_Glory: You are doing time series forecasting.

No, I’m looking at time series forecasts and recorded measurements for recognizable patterns that point to personal choice being involved.

peace

31. faded_Glory: I knew that already because it is clear that everything in this scenario is designed, except the subsurface that we are testing with the wells.

That is a good point. As a Calvinist I know that absolutely everything is designed, including the subsurface you are testing. 😉

What I’m looking for is the points where we can definitively say this is where the “bias” presents itself.

peace

32. faded_Glory: I fail to see how they relate to the more fundamental question of deciding if something is designed or not.

I’m not addressing the question of whether something is designed, I take it for granted that everything is designed.

It’s more about detecting the interaction points where design decisions meet empirical observation. That goes for human as well as non-human designers.

peace

33. Just hmmmmm….

Wow, that’s so cool. I’ve never had a cat that would do that, I don’t think.

34. walto: Problematic from the get-go.

Why exactly is that? I’m not trying to prove that everything is designed so there is no begging the question.

We all begin with prior philosophical beliefs in this matter.

A hard determinist believes that all conscious choice is an illusion and therefore nothing is designed in the way I understand the term.

I don’t think that would change the way the process I’m looking at proceeds in any way. The hard determinist would just categorize the phenomena in question as apparent rather than actual design.

peace

35. It seems obviously problematic to me. You could say, “as a result of my Calvinist background, I believe that…..”

But “as a Calvinist, I know” is cuckoo. It’s like, say, “as a Jew, I know that Moses had trouble with words, and matza is good with coffee.”

36. walto: It seems obviously problematic to me. You could say, “as a result of my Calvinist background, I believe that…..”

Do you think that it’s problematic for faded Glory to say

quote:
I knew that already because it is clear that everything in this scenario is designed, except the subsurface that we are testing with the wells.
end quote:

Rather than

“As a compatibilist or as holder of libertarian free will I believe everything in this scenario is designed, except the subsurface that we are testing with the wells.”

Because his was the comment I was responding to.

Because of his philosophical/religious commitments he draws the line for “knowledge” at a certain place.

I just draw the line at another place.

peace

37. fifthmonarchyman: Do you think that it’s problematic for faded Glory to say

quote:
I knew that already because it is clear that everything in this scenario is designed, except the subsurface that we are testing with the wells.
end quote:

Rather than

“As a compatibilist or as holder of libertarian free will I believe everything in this scenario is designed, except the subsurface that we are testing with the wells.”

Not really, no. I guess your remark would have been OK if you’d dropped the Calvinist biz and just started it as he does with “I knew X already because it’s clear that Y”. I mean, I might not have agreed with it, but it wouldn’t be off the wall in the way your remark seems to me to have been.

Because of his philosophical/religious commitments he draws the line for “knowledge” at a certain place.

I just draw the line at another place.

I mostly agree with that perspective, but it doesn’t really get to why “As a Calvinist I know that….” is an irritating remark. Anyhow, as a non-Calvinist I know you are wrong about everything you’ve ever written at TSZ.

(See what I mean?)

38. walto: Anyhow, as a non-Calvinist I know you are wrong about everything you’ve ever written at TSZ.

(See what I mean?)

Not really. Your above comment just invites the question

How do you know that??

Of course if you ask me how I know that absolutely everything is designed I will respond with ……..revelation

😉

peace

39. walto: But “as a Calvinist, I know” is cuckoo. It’s like, say, “as a Jew, I know that Moses had trouble with words, and matza is good with coffee.”

Hmm, no. That isn’t cuckoo.

The core of knowledge is in knowledge of social conventions. It’s the “justified true belief” idea that is cuckoo.

40. Neil Rickert: Hmm, no. That isn’t cuckoo.

The core of knowledge is in knowledge of social conventions. It’s the “justified true belief” idea that is cuckoo.

“The core of knowledge is in knowledge….” isn’t awfully helpful. (As a philosopher I know this.) 🤦‍♀️

41. walto: (As a philosopher I know this.)

That is interesting.

Do you mean that philosophers are epistemologically privileged in some way, or do you just mean that thinking deeply about stuff has given you the impression that Neil’s claim is not helpful?

I’m going to guess it’s the latter.

If so, do you know what question that invites? 😉

peace

