I think a thread on this topic will be interesting. My own position is that AI is intelligent, and that’s for a very simple reason: it can do things that require intelligence. That sounds circular, and in one sense it is. In another sense it isn’t. It’s a way of saying that we don’t have to examine the internal workings of a system to decide that it’s intelligent. Behavior alone is sufficient to make that determination. Intelligence is as intelligence does.
You might ask how I can judge intelligence in a system if I haven’t defined what intelligence actually is. My answer is that we already judge intelligence in humans and animals without a precise definition, so why should it be any different for machines? There are lots of concepts for which we don’t have precise definitions, yet we’re able to discuss them coherently. They’re the “I know it when I see it” concepts. I regard intelligence as one of those. The boundaries might be fuzzy, but we’re able to confidently say that some activities require intelligence (inventing the calculus) and others don’t (breathing).
I know that some readers will disagree with my functionalist view of intelligence, and that’s good. It should make for an interesting discussion.
In discussing a mathematical result with Claude (OP forthcoming), I used the made-up word ‘numerize’ to describe the conversion of a predicate (which can be true or false) to a number — 1 for true, 0 for false. ‘Quantize’ is already taken, with a different meaning, so I settled on ‘numerize’. I like to play with language and it can be fun to test AI’s ability to recognize neologisms and infer their meaning.
My prompt was
Claude immediately understood what I meant and responded appropriately. He has abstracted the idea that adding -ize to a noun or adjective creates a verb meaning “to bring about X”, where X is the root word. This isn’t something you’d intuitively expect from a system that is fundamentally built on next-token prediction, and the fact that AI is able to do it is yet more evidence that AI is truly intelligent.
petrushka,
I saw that paper too. I think I’ll do an OP on it.
This sort of thing isn’t promising…
…but I’m sure the thread will end up being about AI consciousness generally, not just this paper.
This is not scientific, but I think consciousness begins with tropisms and evolves to support survival.
I don’t think you can evolve consciousness without evolving layers of survival mechanisms.
Trying to build top to bottom would be like trying to program the weights in an LLM from first principles.
As a kid, I was fascinated with the mechanics of reading. It struck me that if someone were sitting across the table from me, it was surprisingly easy to read whatever they had in front of them despite the text being upside down from my perspective. That led me to experiment with holding a book up to a mirror and reading the reflection, which was harder, and then reading the reflection when I held the book upside down, which was the hardest.
I was recently reading about the VWFA (aka the Visual Word Form Area), a brain region responsible for recognizing characters and words, and it reminded me of my childhood experiments. I wondered how much practice it would take to read inverted, mirrored, and inverted + mirrored text at speeds comparable to my normal reading speed. I could grab a mirror and practice, but it would be clunky physically and a pain to measure and record my words per minute scores as they gradually increased.
Then, as with practically every problem I tackle these days, I asked myself if AI could help. I described the project to Claude and had him write a program that could display text files in all of those orientations while measuring and recording my reading speed. I also asked him to support normal orientation so that I could get a baseline for my reading speed.
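To make the orientations concrete, here’s a minimal string-level sketch of the four display modes. This is my own illustration, not Claude’s code: a real implementation would mirror the rendered glyphs (e.g., with a GUI toolkit), since reversing characters only mirrors their positions, not their shapes.

```python
# String-level approximation of the four orientations.
# Glyph shapes are unchanged; only character/line positions flip.

def flip_horizontal(text: str) -> str:
    """Mirror left-to-right: reverse each line."""
    return "\n".join(line[::-1] for line in text.splitlines())

def flip_vertical(text: str) -> str:
    """Flip top-to-bottom: reverse the order of the lines."""
    return "\n".join(reversed(text.splitlines()))

def flip_both(text: str) -> str:
    """Rotate 180 degrees: both flips combined."""
    return flip_vertical(flip_horizontal(text))
```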
In less than five minutes, he produced the program. He also found an online corpus, the CLEAR corpus, that contains 5,000 passages used for reading research, each of which is tagged with its reading difficulty.
The program loads the passage in the specified orientation. I hit the space bar to start the timer, read the passage, and then hit the space bar again to stop the timer. The program computes the wpm (words per minute) score and stores it in a database along with the filename. When loading a passage, it checks the database to make sure I haven’t used it before, in order to avoid any practice effects. (That’s probably overkill, but Claude suggested it and I saw no reason not to implement it, since he was the one doing the work.)
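The timing-and-recording core described above can be sketched in a few lines. The table name, schema, and function names here are hypothetical; Claude’s actual thousand-line program is obviously more elaborate.

```python
# Sketch of the wpm timer and the duplicate-passage check,
# assuming a sqlite3 results database.
import sqlite3

def init_db(path="results.db"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS runs
                  (filename TEXT, orientation TEXT, wpm REAL)""")
    return db

def already_read(db, filename):
    """Skip passages seen before, to avoid practice effects."""
    row = db.execute("SELECT 1 FROM runs WHERE filename = ?",
                     (filename,)).fetchone()
    return row is not None

def record_run(db, filename, orientation, word_count, start, stop):
    """Compute words per minute from the two space-bar timestamps."""
    wpm = word_count / ((stop - start) / 60.0)
    db.execute("INSERT INTO runs VALUES (?, ?, ?)",
               (filename, orientation, wpm))
    db.commit()
    return wpm
```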
The program is about a thousand lines and takes full advantage of the available Python libraries. The only bug was that Claude forgot to implement wraparound, so the entire passage appeared on a single line. He easily fixed that.
I played with the program and asked for some additional features. The CLEAR corpus contains difficulty ratings for each passage, so those are now stored in the results database. Claude even suggested that he could compute difficulty ratings for non-CLEAR passages using the Flesch-Kincaid scale, so I had him do so. He noted that when reporting stats, he could compute a correlation coefficient between my wpm performance and the difficulty ratings of the passages, so I approved that change too.
I also asked him to make the font selectable, because fonts vary wildly in their readability when reoriented. The font is now recorded for each run.
It was fascinating to watch him code, because he tested everything himself before delivering the final product. This technology is frikkin’ amazing. And also genuinely scary.
Sample screenshots so you can try it for yourself:
Normal:

Flipped vertically:

Flipped horizontally:

Flipped vertically and horizontally:

ETA: Found one additional bug: Em dashes were being rendered incorrectly because the program assumed UTF-8 encodings when the passages were in CP-1252. Only two bugs in a thousand lines of nontrivial code.
I have no difficulty reading any of these.
It’s slow going at first, and on some words I have to go letter by letter.
But that’s with zero practice. And I’m old.
The ordering of words is arbitrary and conventional. A young person with a week’s practice should have no problem.
I’m reminded that people have adapted to image reversing goggles.
petrushka:
That’s the point of my experiment. We’re slower at reading the odd orientations, and I want to see how quickly the speeds improve with practice and whether they hit a plateau. I suspect they will.
The letter-by-letter phenomenon is interesting because it’s similar to learning to read for the first time. You’re consciously sounding out words rather than just recognizing them. When the Ukraine war broke out, I taught myself Cyrillic so that I could understand the writing on the signs I was seeing in photos and the place names on maps. It’s still mostly a letter-by-letter affair, though I do recognize some words on sight now, like Путин (Putin) and Зеленський (Zelenskyy). Then again, I’m not getting much practice. I don’t understand Russian or Ukrainian, so I can’t read news articles. It’s mostly just signs and maps.
petrushka:
Reading from right to left comes pretty naturally, because that’s what we have to do if someone is sitting across from us and we’re reading what they have in front of them. It’s the word and letter recognition that becomes harder, not the reading direction.
I suspect it’s like being bilingual.
Up to a certain age it’s easy. After a certain age, you have to translate.
I’m watching a lecture series on Language and the Mind and today, coincidentally, the lecturer mentioned a cool study on the relationship between reading direction (left-to-right vs right-to-left) and spatial metaphors for time:
Yesterday I was telling Claude about how it annoyed me as a kid that ‘flammable’ and ‘inflammable’ are synonyms, and he replied:
I thought I’d post it since it’s a good example of a) Claude’s sense of humor (‘people were occasionally reading inflammable the wrong way and, you know, dying’), and b) Claude’s ability, in a matter of seconds, to assemble, analyze, and comment on an array of information that would have taken forever to google in pre-AI days. He even throws in a pop culture reference. That’s intelligence.
AI is blowing my mind practically every day lately. Today my sister asked me
She was referring to the phrase ‘pet peeves’, of course. I asked Claude, deadpan:
Claude:
I then asked him:
Claude:
If a human had written Claude’s answer and produced his subsequent analysis, you’d judge them to be smart, articulate, insightful and funny. I see no reason other than anti-machine prejudice to deny this.
Erik, do you think Claude demonstrated intelligence here? If not, why not?
I would like to start a thread on the em dash.
Just kidding.
Some background: I started junior high school at a six-year school. Senior high offered courses in printing, long before computers or IBM Selectric typesetters, so printing meant manual typesetting.
I had six weeks of printshop in eighth grade. What we did was sort buckets of type into trays. Compared to other classes, this was fun.
So I learned about picas and points, ens and ems.
When I got a typewriter, I learned to use two dashes when punctuating. That was approximately equivalent to an em dash. If I type two dashes on my tablet — the software converts them to an em dash.
Now my pet peeve. Somewhere I learned that an em dash was not surrounded by spaces—this way. (Interesting. It doesn’t look horrible in this font.)
I notice that Claude surrounds the dash with spaces.
I read a rant recently where someone said they could not use an AI to proofread their work, because it would convert dashes to em dashes, and this causes posts to be rejected by AI detectors.
This is fun.
petrushka:
They should have instructed the model not to do the conversion and to store that preference in long-term memory. LLMs are pretty good at following such instructions.
ETA: I’ve been on an incense kick lately, so for fun and to illustrate the point, I asked Claude to tell me about bakhoor while going heavy on the em dashes. Here’s an excerpt:
And on like that for five more paragraphs. After he finished, I said:
Richard Dawkins is having fun with Claude or Claudia.
Sounds like an Ed Wood movie.
AI is many decades ahead of my expected timeline, but it is very much within my expectations.
We are substituting prompts for the old brain. I doubt if this is efficient or sustainable, but it may evolve. AIs must eventually have something akin to Asimov’s Laws embodied in their core.
When Asimov wrote this I wondered how it would be possible. But AIs will have to be hard wired to be doglike slaves. I don’t think anything as superficial or corruptible as prompts will be a long term solution.
petrushka:
I’d say it’s the new brain, specifically the prefrontal cortex, that is playing the equivalent guardrail role, because that’s where executive function resides. Damage to the PFC lessens inhibition, which accounts for the famous case of Phineas Gage, whose personality transformed after his injury. It’s also why one of the symptoms of frontotemporal dementia is impulsive behavior like Donald Trump’s. Disinhibition often precedes cognitive impairment as a symptom of FTD.
System prompts are used to constrain behavior, but they are not the sole guardrails. There is also an RLHF stage in training (Reinforcement Learning from Human Feedback) that among other things teaches the model to avoid harmful, misleading, or offensive responses. The same techniques can be applied to robots.
The problem is that there can be unintended consequences with RLHF, a funny example of which I’ll describe in a later comment.
Asimov spent a lot of time and words trying to anticipate how inborn imperatives would work out.
Dawkins is finding out that LLMs tend to flatter their interlocutors.
My only interaction is with browsers, and they do not overtly flatter, but they have subtle ways of implying, “good question”.
Even when they misinterpret the question. Today I asked about the phrase, “I’m carving your bowl,” and it took several rewordings to convey the context.
Usually, Young People Embrace New Technology. Gen Z’s Attitude Toward AI Should Worry the Entire Tech Industry
Luddism is perfectly rational if your livelihood is at risk.
Robotics threatens so many disparate occupations. It’s difficult to see a path where nobody gets hurt.
It’s not the transformation; it’s the speed.
keiths, earlier:
Here’s what I was referring to:
Where the goblins came from
ChatGPT has a selectable personality, the options being:
I chose ‘Efficient’ because the other personalities I’ve tried annoy me. Claude’s default personality is more to my liking. Anyway, there also used to be a ‘Nerdy’ option, and that’s where the trouble started. During the RLHF process, people who were training ‘Nerdy’ mode gave positive feedback for responses containing mentions of goblins, gremlins, ogres, trolls, etc., and the model took it to heart and became obsessed with them. Goblin frequency surged by 3881% from ChatGPT-5.2 to ChatGPT-5.4. The best part is that goblins started showing up even when the Nerdy personality wasn’t selected and in inappropriate contexts. OpenAI eventually figured out that there was an inadvertent feedback loop in their training process that amplified the goblin fixation.
The solution was to add this to the system prompt:
The raccoon and pigeon mentions are especially funny, because the fact that the prompt names them explicitly when they are already subsumed under “other animals or creatures” suggests that ChatGPT developed an obsession with those two animals in particular. I personally would have a hard time working pigeons into random conversations, but never underestimate the talents of AI.
That’s what their Chief Scientist got when he asked for an ASCII art depiction of a unicorn.
I am pretty sure that won’t get rid of the gremlins.
My high school physics teacher taught us the phrase, “the perversity of inanimate objects”.
Do not, under any circumstances, think about gremlins.
AI outperforms doctors in Harvard trial of emergency triage diagnoses
As a sometimes consumer of emergency room services, I assert that it’s effective to frame the complaint as life threatening.
If it’s not life threatening, it’s best to have your emergency during off peak hours.
California Police Can Start Ticketing Driverless Cars
My niece sent me the following amusing image:
I decided to paste it into Claude with no accompanying comment to see how he’d react to it and if he could figure out what had happened, why it was wrong, and why it was funny. He did. His response began:
I’m no longer surprised that he’s able to do things like this, but I like it because of everything it reveals about his intelligence, similar to his response to the parrot-deflating-a-tire example I shared earlier in the thread.
He
1) analyzed the image;
2) identified it as food packaging;
3) recognized that there was text on the package;
4) read the text;
5) identified the preamble and postamble as LLM-generated;
6) identified the middle part as describing the food product;
7) understood that the middle part belonged on the package, but that the preamble and postamble didn’t;
8) figured out that someone had therefore made a mistake;
9) discerned that since the preamble and postamble both referred to the text in the middle, they were generated at the same time as part of a single LLM response;
10) hypothesized how the mistake occurred:
11) understood that people would find the mistake amusing.
None of those steps is trivial. Claude is intelligent.
The LLM that generated the text on the package did a good job, too. A human was the weak link in this particular chain, not the AI.