Sometimes very active discussions about peripheral issues overwhelm a thread, so this is a permanent home for those conversations.
I’ve opened a new “Sandbox” thread as a post, since the new “ignore commenter” plug-in only works on threads started as posts.
But you make a good point that cryptos (and those of us supporting them) are partially responsible for those CO2 emissions and their impact on climate change.
Welcome back, keiths.
And your mistake is a failure to read for comprehension.
I was explaining why dazz’s initial simulation gave the erroneous result that he reported “I run a quick simulation and I get 50-50”.
You might notice that there is no mention of average number of flips, so it is OBVIOUS that he is not “honoring the problem specification”. He’s got a whole different problem going on…
And yes, as you noted, the premature termination is outside of the spec. But you also wrote that it was not the “same coin/different coin” that caused his result.
This was wrong. You made this mistake because you failed to pick up on the fact that Bruce and dazz were modeling a competition.
For the problem, as specified, it matters not whether there is one coin or two: there is no “competition” nor “premature termination” and the average number of flips needed differs, 6 vs 4. See my intuitive explanation of the difference.
For dazz’s ‘competition’ problem, the number of coins DOES matter.
If you have a single coin, you get 50/50, which is what dazz did. If you have two coins you get a different wrong answer, as I explained:
32% HH wins, 54% HT wins, and 14% ties.
So for dazz to get the wrong answer that he did, he needs premature termination AND A SINGLE COIN.
Please slow down and read what people write, not what you think they have written.
Comments like this one:
“But it doesn’t matter who believes what. In 50 years we’ll all be dead … “
And later:
“I don’t really care who believes what. Over time, as I said, in the long run we’re all dead. It doesn’t matter who believes what. The question is, who’s got the story right? I think the central story of the theory of evolution is wrong in a way that can’t be repaired, saving the properties of the evolutionary theory”
keiths, to Jock:
DNA_Jock:
No thanks to you. It was Mung who did the right thing by standing up to Alan and removing me from pre-moderation.
Slow down, Jock. That result isn’t erroneous. Each task does in fact win 50% of the time if a single coin is used and the spec is honored. This also happens when a single coin is used and the spec is violated by prematurely terminating the other task as soon as one of the tasks reaches its target.
No, it isn’t obvious, because a) if the spec is honored, the 50-50 result is expected, contrary to your belief; and b) the spec doesn’t forbid a competition, just as it doesn’t forbid the observation of the same coin for both tasks. What it does forbid is premature termination.
It didn’t become obvious that dazz was violating the spec until he wrote this:
Jock:
Here’s what I wrote:
My statement is correct. If you honor the spec and don’t terminate prematurely, then you get the correct average number of flips whether you use one coin or two.
Bruce, earlier:
dazz, also earlier:
keiths, also earlier:
Jock, now:
It was obvious that they were modeling a competition. Bruce’s phrase — “playing against each other” — left no doubt. But the problem wasn’t that they were modeling a competition — it was that they were prematurely terminating the losing task.
Right. That’s why I said this:
Jock:
Neither of those answers is wrong. They reflect the win/loss/tie stats you get when you implement the spec correctly, which are the same stats you get when you violate the spec by prematurely terminating the losing task.
No. You get the 50/50 result with a single coin whether or not you implement premature termination.
Let me summarize:
1. If one coin is used, you get the 50/50 result regardless of whether you implement premature termination. If two coins are used, you get the 32/54/14 result regardless of whether you implement premature termination.
2. On the other hand, if you implement premature termination, you get the wrong average number of flips regardless of whether you use one coin or two. If you honor the spec and don’t terminate prematurely, you get the correct average number of flips (6 and 4 for tasks 1 and 2, respectively) regardless of whether you use one coin or two.
3. The mistake dazz and Bruce made was not that they modeled a competition, nor that they shared the same coin between the two tasks. Their mistake was to prematurely terminate the losing task when the winning task reached its target.
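That summary is easy to sanity-check with a quick simulation. Here’s a minimal Python sketch of the competition with premature termination (the function name and structure are mine, not dazz’s or Bruce’s code): it races the two tasks, stopping the race as soon as either task completes its target, with either a shared coin or two independent coins.

```python
import random

def compete(shared_coin, trials=20000, seed=1):
    """Race task 1 (target HH) against task 2 (target HT), terminating
    the race as soon as either task completes its target (the premature
    termination described above)."""
    rng = random.Random(seed)
    wins = {"HH": 0, "HT": 0, "tie": 0}
    for _ in range(trials):
        prev1 = prev2 = None          # previous flip seen by each task
        done1 = done2 = False
        while not (done1 or done2):
            f1 = rng.choice("HT")
            f2 = f1 if shared_coin else rng.choice("HT")
            done1 = (prev1, f1) == ("H", "H")
            done2 = (prev2, f2) == ("H", "T")
            prev1, prev2 = f1, f2
        if done1 and done2:
            wins["tie"] += 1
        elif done1:
            wins["HH"] += 1
        else:
            wins["HT"] += 1
    return {k: v / trials for k, v in wins.items()}
```

With `compete(True)` you get the 50/50 split (and, with a single coin, ties are impossible: the flip after the first H decides one way or the other); with `compete(False)` you get roughly the 32/54/14 split.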
I’m just an old man unversed in the ways of mathematical probability, but let me ask Keiths: did he look at the blog post I linked to and, if so, does he disagree with it?
@ Keiths
concludes the article. Is there a deep point we are missing?
Alan:
The result isn’t deep, but it’s interesting because it is so counterintuitive.
Alan,
I don’t see anything to disagree with. The post confirms what I told Jock above: namely, that premature termination is not allowed, that the average number of flips is six for task 1 (H H) and four for task 2 (H T), and that each task wins half of the time if you use a single coin.
QED
Alan:
Right. My disagreement was with dazz and Bruce (and now DNA_Jock), not with the author of that blog post.
So you are saying that not only did I make a mistake, but then I made a mistake about what that mistake was.
Hah! No doubt you are right again. Good ol’ KeithS, Welcome back.
Bruce,
Yep.
You can convince yourself of that by running a simulation that models a competition, but without prematurely terminating a task when the other task reaches its target. If you do that, you’ll find that you get the correct average number of flips to reach the target: 6 for the “H H” task and 4 for the “H T” task, regardless of whether you use one coin or two.
Using the same coin is not a problem, nor is modeling a competition. The mistake is to terminate a task prematurely when the other task reaches its target.
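For anyone who wants to try it, here’s a rough sketch of that no-termination simulation (my own naming; the 200-flip cap is an assumption that is effectively always sufficient): both tasks watch the same coin’s flips, and each runs until its own target first appears.

```python
import random

def avg_flips_no_termination(trials=20000, seed=2, max_len=200):
    """Both tasks watch the SAME coin's flips, and each runs until its
    own target first appears -- no premature termination.  Returns
    (average flips to complete HH, average flips to complete HT)."""
    rng = random.Random(seed)
    tot_hh = tot_ht = 0
    for _ in range(trials):
        s = "".join(rng.choice("HT") for _ in range(max_len))
        i, j = s.find("HH"), s.find("HT")
        assert i >= 0 and j >= 0   # 200 flips is effectively always enough
        # find() gives the 0-based start of the first occurrence; the pair
        # is completed by the next flip, i.e. flip number start+2 (1-based).
        tot_hh += i + 2
        tot_ht += j + 2
    return tot_hh / trials, tot_ht / trials
```

The averages come out close to 6 for HH and 4 for HT, despite the shared coin.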
Thanks.
No need. Your arguments are convincing on their own.
I posted these links in the BB thread. They speak to the above, so I will repost them here for convenience:
https://medium.com/intuitionmachine/why-hasnt-evolution-invented-the-wheel-4de68f31ffb6
https://medium.com/intuitionmachine/the-curious-case-of-agi-without-a-body-32d5752dc14
Useful links. Thanks.
I don’t much care for that site, though.
I browsed through the first linked article.
When I tried the second one, it told me that I have to sign up for the site to read it. Okay, I cheated: I opened it in a private browsing window so that the site would not know this was a second visit. And that worked.
Right, I forgot about the paywall. I do pay for that site, but most of the content is garbage writing by people who seem to be more interested in learning how to write than in doing the work needed to have something interesting, original, and valid to say. However, now and then there are interesting articles like the linked ones.
The author of the article has many other ones that strike me as correct, eg in their approach to AI, intiontion versus rationality, adaption not optimization.
“intuition”.
Sorry for the typos. WordPress seems to intermittently lock me out of editing, even if I sign off and on and/or exit all my browser instances.
Likely something to do with my various ad blockers or cookie settings.
Thanks for that summary.
Interestingly, you get different ‘wrong’ answers for {task 2, average to HT} depending on the number of coins you use, one or two.
Jock:
You also get different wrong answers for {task 1, average to HH} depending on the number of coins, though the delta is smaller than it is for {task 2, average to HT}.
Both facts make sense if you consider the types of sequences that are contributing to the statistics in the two cases.
Let’s define a winning coin-flip sequence as one via which a task reaches its target before or at the same time as the other task.* When premature termination is implemented, it is only the winning sequences that contribute to the statistics.
What does a winning sequence look like in the one-coin case? It always has the form “zero or more consecutive T’s followed immediately by the target subsequence, which is HH or HT.” Why? Because as soon as an “H” appears, one or the other task is invariably going to win on the next flip. If the next flip is an “H”, then task 1 wins. If it’s a “T”, then task 2 wins. Either way, the sequence terminates.
Thus, the following are all winning sequences in the one-coin case…
1) HH
2) THT
3) TTTTTTTTHH
4) TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTHT
…because each consists of zero or more consecutive T’s followed by one of the two target subsequences.
Now consider the two-coin case. Winning sequences still end in either HH or HT, but it’s no longer necessary for them to begin with a sequence of zero or more consecutive T’s.
Each of the following is a potential winning sequence in the two-coin case:
a) HT
b) HHT
c) HHHHHHHH
d) HTHTHTHTHTHTHTHTHTHTHTHTHH
Note the word ‘potential’. Whether these sequences actually win depends on what the other coin is doing.
As you can see, the winning sequences are quite different in the one-coin vs two-coin cases, so it makes sense that their average lengths also differ.
*Note that by this definition, a tie counts as a win for both tasks.
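The one-coin structure is easy to verify by brute force. Here’s a small sketch (my own naming) that generates one-coin race sequences and checks each against the “zero or more T’s, then H, then the deciding flip” form, returning the average sequence length. That average should be about 3: two flips on average to reach the first H, plus the deciding flip.

```python
import random
import re

def one_coin_race_lengths(trials=5000, seed=3):
    """Run one-coin races (stop at the first HH or HT), verify every
    winning sequence matches 'zero or more T's, then H, then the
    deciding flip', and return the average sequence length."""
    rng = random.Random(seed)
    pattern = re.compile(r"T*H[HT]")
    total = 0
    for _ in range(trials):
        s = ""
        while not (s.endswith("HH") or s.endswith("HT")):
            s += rng.choice("HT")
        assert pattern.fullmatch(s)   # the claimed one-coin form
        total += len(s)
    return total / trials
```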
I just noticed a mistake in the preceding comment. There’s supposed to be a T at the end of sequence (c), like this:
c) HHHHHHHHT
I’ve got a question about this.
https://www.wired.com/story/the-governments-new-weather-model-faces-a-storm-of-protest/
Apparently the new national forecasting algorithm has a “cold bias”.
Would you consider independently discovered evidence of this bias to be an artifact of personal choice? I.e., the choice of one model over others.
If so how would you verify that your findings were not statistical noise from a small sample size?
peace
fifthmonarchyman,
Seems to me you simply compare the performance of the several available models. My infallible method is to look out of the window.
fifthmonarchyman,
Models can be biased for numerous reasons: incorrect algorithms, oversimplification, insufficient and/or inadequate input data, software bugs, poor parameter choice and so on. Since all models are man-made, such errors in the end tend to come down to human error and/or ignorance. You don’t need to do any analysis whatsoever to come to that conclusion.
On the other hand, there are statistical tests for deciding on sample size. It all depends on your desired confidence level and interval. Obviously, the smaller the sample size the higher the risk of drawing incorrect conclusions from it.
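For what it’s worth, the textbook normal-approximation formula for sizing a sample to estimate a proportion can be sketched like this (a generic illustration, not tied to any particular model):

```python
import math

def proportion_sample_size(z=1.96, margin=0.05, p=0.5):
    """Sample size needed to estimate a proportion p to within +/- margin
    at the confidence level implied by z (1.96 for ~95%).  Uses p = 0.5
    as the worst case when the true proportion is unknown."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)
```

With the defaults (95% confidence, ±5%), it gives the familiar n = 385.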
In my work I used to deal a lot with (complicated subsurface) models and modelers. I was involved in one development of a large gas field in a Far Eastern country which shall remain unnamed. In this field, new wells were continuously being drilled to sustain the production, so there were hundreds of wells by the time I got involved. I did some analysis on well results vs. pre-drill predictions and discovered that the predictions from the Operator were very much biased to the negative side, rather than showing a balance of underprediction vs. overprediction. This is not a good basis to take business decisions on and so I challenged it. This challenge was not well received by the Operator.
I chatted about this with one of the local colleagues whom I had befriended. He confided to me, strictly off the record, that they tended to deliberately under-predict the well results. Their management was much happier when wells came in better than predicted than the other way round, and happy managers make for happy workers…
Nate Silver talks about this effect for weather forecasts in The Signal and the Noise:
local TV forecast is wetter than local newsprint forecast is wetter than weather.gov.
Weather.gov is unbiased (on the wet/dry axis).
It’s entirely rational: viewers are upset at unforecast rain. Unexpectedly sunny, not so much…
TV stations indulge in the most CYA.
That sounds like the sort of conclusion I often reach in my work. I’m fascinated with conclusively determining that these kinds of results are intentional versus being the result of some unknown problem with equipment.
I think this is the sort of design detection that we could all get behind.
It very seldom is. 😉
I can’t believe I never read the book. I’ve picked it up several times but never took it home.
Does he offer any proof for his conclusions other than
“it makes perfect sense”?
peace
fifthmonarchyman,
What Nate Silver wrote about was the calibration of the predictions of the NWS, The Weather Channel (not print media, my memory did not serve…) and local TV stations.
The researchers measured “when the forecast was for a 10% chance of rain, how often did it in fact rain.”
He did this for P(rain) forecasts of 0%, 10%, 20%…100% and plotted the resulting data for each category of forecaster.
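That measurement is straightforward to sketch in code (a toy version with hypothetical data, not Silver’s actual analysis): bucket the forecasts by stated probability, then compute how often it actually rained in each bucket.

```python
from collections import defaultdict

def calibration_table(forecast_probs, outcomes):
    """For each forecast-probability bucket (rounded to the nearest 10%),
    return the fraction of times the event (rain) actually occurred."""
    buckets = defaultdict(lambda: [0, 0])   # bucket -> [events, count]
    for p, rained in zip(forecast_probs, outcomes):
        b = round(p, 1)
        buckets[b][0] += rained
        buckets[b][1] += 1
    return {b: events / count
            for b, (events, count) in sorted(buckets.items())}
```

A well-calibrated forecaster is one whose observed frequency in each bucket is close to the bucket’s stated probability; a wet bias shows up as observed frequencies consistently below the forecast numbers.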
Read the book.
Given your weather forecasting research project I am a little shocked that you haven’t read his book and every weather-related work he cites.
Hum.
My project is only incidentally about weather forecasts.
I chose temperature forecasts versus actual recorded temps only because they form numeric sequences and don’t have the same baggage associated with them as things like actual versus predicted mutation rates. The hope is to come up with a method that is useful for more than just weather.
Definitely on the list now
peace
It seems like he is just offering a statistical/probabilistic conclusion rather than definitive evidence of intent.
peace
That models are designed?
This is not design detection. In this case I discovered that the ‘design’ (the prediction method) was flawed, not that it existed in the first place. I knew that already because it is clear that everything in this scenario is designed, except the subsurface that we are testing with the wells.
The question of human bias (intentional or accidental) in predicting natural states or events is quite fascinating. In oil exploration, when we take expensive business decisions on where to drill we want to eliminate human bias as much as possible. There are a number of methods widely in use, but I fail to see how they relate to the more fundamental question of deciding if something is designed or not.
You are doing time series forecasting. There are well-established techniques for this that you may want to check out; don’t reinvent the wheel.
No, that intentional actions leave observable effects.
peace
No, I’m looking at time series forecasts and recorded measurements for recognizable patterns that point to personal choice being involved.
peace
That is a good point. As a Calvinist I know that absolutely every thing is designed including the subsurface you are testing. 😉
What I’m looking for is the points where we can definitively say this is where the “bias” presents itself.
peace
I’m not addressing the question of whether something is designed, I take it for granted that everything is designed.
It’s more about detecting the interaction points where design decisions meet empirical observation. That goes for human as well as non-human designers.
peace
Just hmmmmm….
Wow, that’s so cool. I’ve never had a cat that would do that, I don’t think.
Problematic from the get-go.
Why exactly is that? I’m not trying to prove that everything is designed so there is no begging the question.
We all begin with prior philosophical beliefs in this matter.
A hard determinist believes that all conscious choice is an illusion and therefore nothing is designed in the way I understand the term.
I don’t think that would change the way the process I’m looking at proceeds in any way. The hard determinist would just categorize the phenomena in question as apparent rather than actual design.
peace
You would think this would be non-controversial.
It seems obviously problematic to me. You could say, “as a result of my Calvinist background, I believe that…..”
But “as a Calvinist, I know” is cuckoo. It’s like, say, “as a Jew, I know that Moses had trouble with words, and matza is good with coffee.”
Do you think that it’s problematic for faded Glory to say
quote:
I knew that already because it is clear that everything in this scenario is designed, except the subsurface that we are testing with the wells.
end quote:
Rather than
“As a compatibilist, or as a holder of libertarian free will, I believe everything in this scenario is designed, except the subsurface that we are testing with the wells.”
Because his was the comment I was responding to.
Because of his philosophical/religious commitments he draws the line for “knowledge” at a certain place.
I just draw the line at another place.
peace
Not really, no. I guess your remark would have been OK if you’d dropped the Calvinist biz and just started it as he does, with “I knew X already because it’s clear that Y.” I mean, I might not have agreed with it, but it wouldn’t be off the wall in the way your remark seems to me to have been.
I mostly agree with that perspective, but it doesn’t really get to why “As a Calvinist I know that….” is an irritating remark. Anyhow, as a non-Calvinist I know you are wrong about everything you’ve ever written at TSZ.
(See what I mean?)
Not really. Your above comment just invites the question
How do you know that??
Of course if you ask me how I know that absolutely everything is designed I will respond with ……..revelation
😉
peace
Hmm, no. That isn’t cuckoo.
The core of knowledge is in knowledge of social conventions. It’s the “justified true belief” idea that is cuckoo.
“The core of knowledge is in knowledge….” isn’t awfully helpful. (As a philosopher I know this.) 🤦♀️
Do you know the question that claim invites?
peace
I already said! I know it as a non-Calvinist!
That is interesting.
Do you mean that philosophers are epistemologically privileged in some way or do you just mean that thinking deeply about stuff has given you the impression that Neil’s claim is not helpful.
I’m going to guess it’s the latter.
If so, do you know what question that invites? 😉
peace