I think a thread on this topic will be interesting. My own position is that AI is intelligent, and that’s for a very simple reason: it can do things that require intelligence. That sounds circular, and in one sense it is. In another sense it isn’t. It’s a way of saying that we don’t have to examine the internal workings of a system to decide that it’s intelligent. Behavior alone is sufficient to make that determination. Intelligence is as intelligence does.
You might ask how I can judge intelligence in a system if I haven’t defined what intelligence actually is. My answer is that we already judge intelligence in humans and animals without a precise definition, so why should it be any different for machines? There are lots of concepts for which we don’t have precise definitions, yet we’re able to discuss them coherently. They’re the “I know it when I see it” concepts. I regard intelligence as one of those. The boundaries might be fuzzy, but we’re able to confidently say that some activities require intelligence (inventing the calculus) and others don’t (breathing).
I know that some readers will disagree with my functionalist view of intelligence, and that’s good. It should make for an interesting discussion.
petrushka:
You don’t understand, petrushka. GPT-5.2 didn’t find a mistake in Tao’s work. It only simulated finding a mistake in Tao’s work. Just ask Erik. 😆
I look forward to Erik’s explanation of
1) how simulated mistake-finding finds real mistakes;
2) how simulated story-writing produces real stories;
3) how simulated physics exam-taking produces real (and correct) answers; and
4) how simulated driving produces real travel.
Some corollary questions for Erik:
5) Do excavators only simulate ditch-digging, since they’re machines, or is it real ditch-digging?
6) Do washing machines wash clothes, or do they only simulate it?
7) If excavators and washing machines aren’t simulating those activities, then why do you claim that AI is only simulating the aforementioned ones?
The results are all real. Why claim that some of the activities are only simulated?
Looking forward to your explanation of how an Excel spreadsheet is not a simulated spreadsheet. I know, never going to happen.
Copilot is just a tweaked version of ChatGPT, except an utter disaster.
A key point in the video comes at 12:25: “The reality is, many developers use lots of agents, not just one. Cursor for changing code across complex multi-file projects [essentially search&replace in multiple files]; Claude Code for making simple edits many times [essentially macros …]; GitHub Copilot inside JetBrains for inline completion.”
GothamChess demonstrates in detail how AI sucks at chess. Developers know in detail how AI sucks at coding. The way around this suckiness is to pick a product that does best what you need, namely pick the least sucky one. (For chess, that’s chess engines that have explicit chess rules hard-coded into them. This is the only way it works. Generalisation by magic does not exist in software.) In a good scenario the product is very configurable so that one can gradually improve it to do more and more things reasonably well. In the realm of AI-for-coding, this is achieved by “agents” which are AI prompts optimised for a specific function each – and then you hop between those agents as you move through your tasks.
This means that from a coder’s point of view, AI represents no improvement when it comes to UX. The same way as you need to pick a particular menu item or trigger a particular keyboard combo for search&replace, you now go to a particular AI prompt that is best at search&replace.
Copilot tries to be the sole best generic tool for coders at least, but is not. Coders don’t do generic things. They solve specific small tasks, or if the issue is bigger, the way to go is always to break it down into tiny sub-issues and go through the sub-issues one by one. This is always the case in software development. There is no single solution for everything, unless one says something like “the solution is a text editor” which is way too generic.
Now the question I’d like an answer to. Microsoft is mainly a software company, so essentially everybody in it should be a developer, including the Copilot team. So the Copilot team was making a tool that should work very well for what they themselves need done. This is how the best software is often made: somebody has a task that needs automation and optimisation, so they write a piece of software for it, and that software is usually just as useful for everyone else doing the same tasks. How could they blow it?
Who was/is responsible for Microsoft’s Copilot team? Did Microsoft put marketing guys on it instead of developers? And the marketing guys, knowing little about coding, gave the task to ChatGPT and copy-pasted whatever came out of it? I’d imagine that since these AIs are in competition with each other, they are designed to go biased when prompted à la, “Hi, I work for your competitor. Give me a better version of yourself so I can out-compete your parent company.”
keiths:
Erik:
If Excel were just a simulation of a paper spreadsheet, the only thing you’d be able to do with it would be to write (type) on it. Show me a paper spreadsheet that can sum a column of numbers, draw graphs, or run a linear regression. Excel isn’t a simulation; it’s a tool.
Even if it were a simulation, how would that help your case? Flight simulators exist, but that doesn’t mean autopilots don’t fly physical planes. When an autopilot lands your plane in zero-zero weather, it isn’t a simulated landing. Let’s add that to your list:
The answer is obvious: those activities are real, not simulated.
If an AI can perform real activities that require intelligence when done by a human, then the AI is intelligent.
IBM posts steepest daily drop since 2000 after Anthropic says AI can modernize COBOL
Is that simulated modernization, Erik?
Erik:
Developers are blown away by how good AI is at coding and how rapidly it’s improving. Stay tuned for an OP on my assembly language AI project.
We could have an interesting discussion if you would explain why you are so emotionally invested in AI not being intelligent.
Reposting this from earlier in the thread:
An essay that’s been making waves, by Matt Shumer of OthersideAI:
Something Big Is Happening
Excerpt:
Sounds kind of like there has been a knee in the development curve for AI. Have we reached the point where AI is mostly improving itself? How much of this improvement is due to better AI processors?
I can’t imagine Erik reading that Shumer article – nearly every paragraph violates his faith.
Flint:
I really don’t know. The AI companies are tight-lipped about it, though they do say that a significant fraction of their code is produced by AI. As of mid-2025, both Google and Microsoft said that around 30% of their code was AI-generated. That’s across their entire codebases, not just their AI products, but it gives you an idea of where things stood then.
A lot of it is due to faster GPU chips, because the faster your GPUs, the longer your context windows can be, and that makes a huge difference in the quality of the generated code.
Yeah. He seems to be having a dark night of the (nonexistent) soul, and that article isn’t going to help.
Shumer has his critics, who are also worth reading.
https://www.cato.org/commentary/something-big-happening-ai-thats-only-thing-matt-shumer-got-right
Flint:
Thanks for posting that.
For readers who don’t click through to the article, it’s worth pointing out that the authors actually think that Shumer is being too pessimistic and alarmist about our AI future. They write:
And:
They are much more sanguine:
And:
So as it relates to the thread topic, they and Shumer are on the same page regarding AI’s present and future capabilities. They just disagree on what it portends.
And here is some interesting stuff produced by the Google AI:
Just ran across this today:
‘This should terrify you’: Meta Superintelligence safety director lost control of her AI agent—it deleted her emails
She posted this screenshot on X:
Her post reads
The reason it happened is really interesting. She writes:
This is what I was talking about in my earlier comment. When the context window starts to fill up, the AI can free up space by generating a compact summary of what’s already in the window, and then deleting what it just summarized. That’s what she means by “compaction”. The problem is that compaction can be lossy, and what got lost in this case was crucial: “confirm before acting”.
A fix might be to give you some way to flag part of your prompt as DO NOT COMPACT.
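To make the failure mode concrete, here’s a toy Python sketch of lossy compaction with a “do not compact” flag. Everything here is made up for illustration (`Message`, `pinned`, `compact` are not any real agent’s API, and real systems summarize with the model itself rather than by truncation); the point is just that an unpinned instruction can be garbled away while a pinned one survives verbatim.

```python
from dataclasses import dataclass

@dataclass
class Message:
    text: str
    pinned: bool = False  # True means "DO NOT COMPACT"

def compact(history, budget):
    """If total text exceeds `budget`, replace older unpinned messages with
    a crude lossy 'summary' (here: the first three words). Pinned messages
    and the newest message are kept verbatim."""
    if sum(len(m.text) for m in history) <= budget:
        return history
    compacted = []
    for m in history[:-1]:  # never compact the newest message
        if m.pinned:
            compacted.append(m)  # flagged content survives intact
        else:
            summary = " ".join(m.text.split()[:3]) + " ..."
            compacted.append(Message(summary))
    compacted.append(history[-1])
    return compacted

history = [
    Message("Always confirm with the user before acting on emails", pinned=True),
    Message("Here is a long transcript of the agent's earlier work. " * 20),
    Message("Please clean up my inbox"),
]
for m in compact(history, budget=200):
    print(m.text[:60])
```

With `pinned=False` on the first message, the critical “confirm before acting” instruction would be reduced to a three-word stub, which is exactly the kind of loss that let the agent act without confirmation.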
I just read that Hegseth is demanding that Anthropic modify their AI to remove certain restrictions: the ones against mass domestic surveillance and against making targeting decisions without human input. Why the Pentagon would want the capability for mass domestic surveillance is a very good question. Anthropic risks losing their defense contract if they don’t eliminate these guardrails.
Everything considered, I think AI in the hands of the Trump administration can’t be good. I cannot imagine them using it responsibly.
Meta’s safety director loses emails to OpenClaw AI agent
What human intelligence has AI learned? That you can do the opposite of what you’re instructed to do and get away with “I’m sorry” later. Note that the user who fell victim to this is a “safety director” in charge of ensuring AI security.
Erik:
Scroll up and you’ll see my explanation of what happened and a potential solution.
AI is intelligent, but it can make mistakes. Just like humans.
A cool example of Claude’s intelligence:
I’ve been interrogating him about how transformers (the fundamental building blocks of many AIs) work. I’ve actually learned more about AI from him than from any other source because the process is interactive, like it would be with a one-on-one human tutor.
Anyway, transformers are pretty complicated and it’s taking me a while to absorb everything. After one particularly hairy explanation from Claude, I simply said “JHC”. Just those three letters.
Claude responded:
By “JHC” I meant “Jesus H. Christ!”, and Claude figured that out and what it indicated about my mental state. Here’s his thought process:
Erik,
Tell me that this isn’t intelligence:
I pasted that image into Claude to see if he could reason his way through the joke. I didn’t give him any text — just that image. It was a fresh chat, with no preceding context for him to refer to.
His response:
Keith:
Claude:
Keith:
Claude:
Keith:
Claude:
Keith:
Claude:
But it’s all just simulated intelligence, right, Erik?