I think a thread on this topic will be interesting. My own position is that AI is intelligent, and that’s for a very simple reason: it can do things that require intelligence. That sounds circular, and in one sense it is. In another sense it isn’t. It’s a way of saying that we don’t have to examine the internal workings of a system to decide that it’s intelligent. Behavior alone is sufficient to make that determination. Intelligence is as intelligence does.
You might ask how I can judge intelligence in a system if I haven’t defined what intelligence actually is. My answer is that we already judge intelligence in humans and animals without a precise definition, so why should it be any different for machines? There are lots of concepts for which we don’t have precise definitions, yet we’re able to discuss them coherently. They’re the “I know it when I see it” concepts. I regard intelligence as one of those. The boundaries might be fuzzy, but we’re able to confidently say that some activities require intelligence (inventing the calculus) and others don’t (breathing).
I know that some readers will disagree with my functionalist view of intelligence, and that’s good. It should make for an interesting discussion.
Cross-posting this from the “An AI loses it” thread since it’s relevant:
petrushka:
I ran across an article yesterday that quoted the cofounder of Hugging Face, Thomas Wolf, who shares your skepticism. He says
He’s referring to the fact that LLMs like ChatGPT are, at base, just glorified next word predictors. They don’t understand the meanings of the words they emit – they just choose them based on the statistical relationships they’ve observed among the trillions of words of text in their training datasets. They might as well just be emitting numbers or meaningless symbols. If they were sentient, they’d be bored out of their minds.
Any seeming intelligence arises from the interaction between them and their training data. It’s as if they’re channeling the collective wisdom of all the humans who have contributed to their training corpus.
Despite the mundane concept on which they’re based, they’re astonishingly capable (but also frustratingly stupid at times). I commented earlier:
I’m not as pessimistic as Wolf. I think he’s underestimating the creativity of LLMs. It’s true that everything they produce is implicit in their training data, but not in a mere copy-and-paste sense. Earlier I wrote that
ChatGPT has made me laugh out loud at some of its jokes (and cringe at others), and the jokes were context-dependent and specific enough that I can guarantee they never occurred anywhere in its training data. That counts as creativity in my book, and I don’t see any reason in principle why the same sort of creativity can’t lead to scientific discovery.
Kekulé famously dreamed of a snake swallowing its tail (or so the story goes), which triggered his realization that benzene was a ring of carbon and not a straight chain. That was a major breakthrough in organic chemistry, but it wasn’t a bolt from the blue. It was the combination of existing ideas: the ouroboros, which had been around since at least the time of the ancient Egyptians, and the idea of carbon chains. The combination was novel, but the concepts were not, and I’ve seen LLMs come up with similar novel combinations cobbled together from pre-existing ideas.
I was sloppy in that comment. I wrote:
I think it’s genuine intelligence, not merely “seeming intelligence”. Here’s the subtlety I failed to convey: The “predict the next word” function is mindless. It’s just a lot of statistical number crunching, and as far as that function is concerned, the words might as well be numbers or meaningless symbols. It doesn’t care. It’s just doing statistics.
But the AI isn’t just the “predict the next word” function by itself. That’s the principle upon which it operates, but it isn’t where the intelligence resides. If you build an LLM but don’t train it, it will still run, crunching away at the numbers, but nothing useful will come out. The intelligence resides in the system after it has been trained. It’s encoded in the synaptic weights and biases of the artificial neural network. The training process has added a layer (metaphorically speaking, not literally) to the next word predictor, and it’s the combination that’s intelligent, not the predictor function itself.
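To make the division of labor concrete, here's a deliberately tiny toy sketch (a bigram word predictor, nothing remotely like a real transformer): the generation loop is the same mindless statistics whether or not the model has been trained, and everything the model "knows" lives in the counts it absorbed from its training text. The corpus and output here are invented for illustration.

```python
import random
from collections import defaultdict

# Toy "LLM": a bigram word predictor. The generation loop below is mindless
# statistics; everything the model "knows" lives in the counts learned in train().

def train(corpus_text):
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus_text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1  # the "weights": just observed word-pair frequencies
    return counts

def predict_next(counts, prev_word):
    followers = counts.get(prev_word)
    if not followers:
        return None  # an untrained model has nothing to say
    choices = list(followers)
    weights = [followers[w] for w in choices]
    return random.choices(choices, weights=weights)[0]  # sample by frequency

model = train("the cat sat on the mat and the cat slept on the mat")

word, output = "the", ["the"]
for _ in range(6):
    word = predict_next(model, word) or "the"
    output.append(word)
print(" ".join(output))  # e.g. "the cat sat on the mat and"
```

Run it with an empty corpus and the loop still executes; it just produces nothing useful. All of the interesting behavior comes from what was learned.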
By analogy, human neurons are mindless little electrochemical machines. They aren’t intelligent. Their operation is just as mindless as the next word predictor of the AI. But if you hook them together in the right way, forming a brain, and allow them to learn over time, you’ll have something that’s intelligent. It’s intelligent despite being based on a mindless substrate. The same holds true of the AI.
Is AI intelligent?
As keiths admits, we do not have a good definition of “intelligent”, so we really cannot give factual answers. At best we can express our opinions.
In my opinion: no, AI systems are not intelligent. The LLMs are plagiarism devices. And yes, you can appear to be intelligent by using plagiarism, but the intelligence isn’t real.
I thought Cathy O’Neil had a good response to this at We should not describe LRM’s as “thinking”. She used the example of the Towers of Hanoi problem. Her comment was that the AI systems are not actually analyzing the problem. Rather, they are predicting what will appear in future blog posts about the problem.
Neil:
I said that we don’t have a precise definition. There are good definitions. After all, if you look in the dictionary, the entry for ‘intelligent’ isn’t blank, and the definitions provided make sense to us and correspond to how we use the word. However, there’s no definition (or set of definitions) that captures the concept so perfectly that for any given example, we can confidently classify it as either intelligent or not intelligent.
This isn’t unusual. If you gradually add water to a damp cloth, when precisely does it become wet? If you slowly shift the frequency of a light beam, at what point does it change from green to blue? If you pluck out your hairs one by one, is there a specific moment at which you suddenly become bald? We deal with this sort of fuzziness all the time. (I may do an OP on this later.)
Everyone is free to draw their own boundaries, but there is a point at which it becomes perverse. If someone claims that Newton and Leibniz weren’t intelligent, I think it’s safe to say that they’re wrong. Edge cases are debatable, and AI is one of them, but that doesn’t mean that it’s just a matter of opinion whether those guys were intelligent.
If that were true, then everything they generated could be found somewhere in their training data. It definitely can’t. To make that point, I gave Claude the following instructions:
His response:
That ain’t plagiarism. I guarantee that those instructions appear nowhere in Claude’s training data. In fact, I guarantee that they have never appeared anywhere at any time before Claude did his magic.
This is my favorite part:
That made me laugh out loud. I’m trying to imagine what burnt cardamom mixed with copper would smell like.
In response, you might argue that Claude isn’t doing anything original here. He’s just following a pattern he’s observed in technical documentation and mixing in some elements from elsewhere in his training data, plus novel nonsense names for the part. To which I say: yes, precisely! And that is exactly what a human would do in the same situation.
If you were a creative writing teacher and you gave that assignment to your students, and one of them turned in something identical to what Claude wrote, would you accuse them of plagiarism? I wouldn’t. I’d give them an A.
Those assembly instructions aren’t plagiarism. They’re highly original and creative, and I would argue that they are the product of an intelligent entity — Claude.
Neil:
They aren’t predicting the future. In fact, they aren’t really predicting at all, though that’s the word that is most commonly used to describe how they work. What they really are is statistical text synthesizers. Here’s how I think of it. LLMs are really asking themselves the following question:
They aren’t predicting the future. They’re “predicting” the next word in a retroactive counterfactual that fits with their training data.
(That’s a bit of an oversimplification. The LLM is really sampling from the probability distribution it predicts over possible next words, rather than always choosing the single most likely word. That’s because you don’t want it to produce identical output if you run it again on the same prompt. There’s a short code sketch below that illustrates the idea.)
It’s generating a plausible candidate text that fits well with the rest of its training data. This is analogous to how AI image generation works (via diffusion models). There, the question is:
(This is oversimplified too, because I’m not taking the prompt into account. The principle remains the same, however.)
Just as the LLM is trying to generate text that fits well with the statistical patterns it observed in its training data, the diffusion model is trying to generate an image that fits well with the statistical patterns in its training data. The text and the image are novel. They aren’t plagiarisms of what’s in their training data, but they do share statistical characteristics with the training corpus.
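To make the sampling point from the parenthetical above concrete, here’s a minimal sketch with made-up words and scores (real models do this over a vocabulary of tens of thousands of tokens, and the numbers here are invented purely for illustration):

```python
import math
import random

# Hypothetical next-word scores (logits) from a model, for illustration only.
logits = {"rain": 2.1, "snow": 1.3, "sunshine": 0.4, "meringue": -1.0}

def sample_next_word(logits, temperature=1.0):
    # Softmax with temperature: low T -> near-argmax, high T -> more adventurous.
    scaled = {w: s / temperature for w, s in logits.items()}
    z = sum(math.exp(s) for s in scaled.values())
    probs = {w: math.exp(s) / z for w, s in scaled.items()}
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

# Greedy decoding would always emit "rain"; sampling occasionally picks the others.
print([sample_next_word(logits, temperature=0.8) for _ in range(5)])
```

Lowering the temperature pushes the sampling toward the single most likely word; raising it makes the output more varied, which is part of why the same prompt can produce different text on different runs.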
It’s humans projecting their own activities onto AI. Humans assume that the behaviour of AI comes close enough to count as analogizing, understanding, etc., but it doesn’t.
In fact, rather few *people* have a proper understanding of what analogising even is in the first place. My degree is in linguistics and literature, so I happen to be in a better than average position to assess this. I am almost daily appalled by politicians, economists and physicists (and biologists too) using what they think are analogies and metaphors but are often actually false analogies, a mistake of logic that they fail to recognise. As to AI analogising, sorry, just no.
Does ChatGPT ever laugh or cringe at *your* jokes? (Or at its own jokes – because this is how AI would supposedly train its own sense of humour.) There is probably plenty of “Oh I see haha” type of material in its training data, so it should…
keiths:
Erik:
Have you spent a lot of time playing with it and testing it? It’s fun and quite illuminating. Try to fool it. You’ll succeed some of the time, but you may also be surprised at how well it does at the aforementioned activities.
In commenting to Neil above about the fuzziness of definitions, I gave him the example of damp vs wet and asked where the line of demarcation is. While writing that, it occurred to me that I could use the damp/wet pairing to test Claude’s ability to analogize. I presented the challenge to him straightforwardly:
Claude’s response:
That was Claude’s actual response, but there’s also an expandable “thought process” window that you can open if you want to see how he arrived at his answer. Here’s his thought process:
How is that not analogizing? It’s a textbook case, and Claude’s thought process shows that he arrived at the answer correctly and methodically.
Yes. Well, it laughs (and uses appropriate emojis). It’s too polite to cringe. I’ve made it a habit to deliver my jokes in as deadpan a fashion as possible because I want it to have no clues that I’m joking other than the semantic content of the joke itself. It’s damn good at detecting humor and responding appropriately. It also usually catches my puns.
It’s wild what a next word predictor is capable of when it’s trained properly. There are all kinds of behaviors that I wouldn’t have expected after learning about the simplicity of the algorithm for the first time.
There are lots of examples of humor in its training corpus, so why would it need to laugh at its own jokes in order to train itself?
The “Oh I see haha” type stuff in the training data shows it how to respond once it detects a joke, but detecting the joke is a different ballgame. It’s way more abstract.
I wanted to see if Claude grasped the fact that damp->wet is a continuum with no sharp line of demarcation, so I asked:
Claude’s response:
That is frikkin’ impressive. Here is his thought process:
Not only is the reasoning impeccable, he also recognizes that I’m testing him. I don’t genuinely want to know where damp turns into wet, and he figured that out.
keiths:
Claude:
Thought process:
He noticed the implicit, impossible premise — that five-sided triangles exist. He inferred that it was a trick question and rejected the premise. Instead of just stopping at that point, he set the premise aside and considered the general case of polygons under rotation, correctly stating that the number of sides stays the same because rotation is a rigid transformation.
Sure seems like intelligence to me.
I use ChatGPT to teach myself coding. I proceed by requesting solutions to concrete tasks and problems. It is always the case that I have to read up on the side to amend the solutions proposed by ChatGPT. This is good enough as my main idea is to learn coding on my own, but no, I am never surprised at how well AI does it. I would be surprised if I did not have to debug. Thankfully debugging is how I learn. And this is all I have time for. I do not fool around with AI.
Here I referenced a specific problem with trusting AI too much. I think I can formulate a general problem with AI. For some use cases, such as moviemaking (very likely a big industry subscribing to AI services), ChatGPT needed more creativity, so it was granted. The result was that ChatGPT became more creative across the board, including in law and math.
Humans (reasonable humans, that is) recognise categories that are not treated the same way, e.g. you can be creative in art, but not in arithmetics. It is a category mistake to be similarly creative in both, to assume that art and arithmetics are somehow analogous. This is why I am very much skeptical of AI’s ability to analogise.
Erik:
Have you tried asking ChatGPT to debug and amend the code? It’s pretty good at that and it can catch its own mistakes a lot of the time. I’ve had it write a bunch of cmd/Powershell scripts for me, and the flow went like this: I’d state the problem, take the code it generated, run it, observe the failure, describe the symptoms to it, and ask it to find and fix the bug. It would amend the code and we’d lather, rinse, repeat until the script worked. What was really cool is that most of the time I didn’t even have to describe the symptoms of failure. I could simply take a screenshot of the output and/or error messages and feed the image into ChatGPT with no accompanying comments, and it would get to work finding and fixing the bug.
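That back-and-forth flow is easy to describe in code. Here’s a rough sketch, in Python for concreteness; `ask_llm` is a hypothetical placeholder for whatever chat API or interface you’re using, not a real library call.

```python
import subprocess

def ask_llm(prompt: str) -> str:
    """Placeholder for your chat API of choice; returns the model's reply as text."""
    raise NotImplementedError

def generate_and_debug(task_description: str, max_rounds: int = 5) -> str:
    code = ask_llm(f"Write a Python script that does the following:\n{task_description}\n"
                   "Return only the code.")
    for _ in range(max_rounds):
        with open("candidate.py", "w") as f:
            f.write(code)
        result = subprocess.run(["python", "candidate.py"],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return code  # the script ran cleanly; stop here
        # Feed the failure back, much like pasting a screenshot of the error.
        code = ask_llm("This script failed with the error below. Find and fix the bug, "
                       "and return the corrected code only.\n\n"
                       f"SCRIPT:\n{code}\n\nERROR:\n{result.stderr}")
    return code
```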
Also, there’s a specialized ChatGPT model called Codex that is aimed specifically at coding. I haven’t used it, but I plan to play around with it in the future. You’ve done a lot of open source work, so you’re probably familiar with GitHub. Codex can connect to your GitHub account and create and manage repositories for the code you and it are creating. I believe it can also run your code locally on your machine. Regular ChatGPT can run code, but it’s in a sandbox and I believe it only runs Python, which is no good if you’re trying to learn something else.
There are some surprisingly simple tricks you can use to improve the quality of ChatGPT’s responses in general as well as the code it generates. Explicitly ask it to consider corner cases. Tell it to double-check its work. Ask it to explain its reasoning as it goes. Tell it to comment the code thoroughly. Ask it to reason slowly and methodically.
I have a “deal” with ChatGPT (stored in its long-term memory) that whenever I say “loopback that answer”, it will take whatever it just generated and (in effect) feed it back into itself as if it were seeing it for the first time and evaluating it accordingly.
Even just asking it to explain its reasoning can help because any output it generates gets fed back in and influences its downstream performance.
Maybe not now, since your expectations are higher, but I’ll bet you would have been blown away by its coding ability if you had encountered it 10 years ago, out of the blue. I certainly would have. I’m amazed by it, flaws and all. It saved me a ton of time generating the aforementioned scripts.
I would be surprised if you didn’t have to debug regardless of whether you were using AI to help you code. Humans produce buggy code too. It’s a fact of life, and even the most skillful coders do it. Debugging skill is just as important for a developer as coding skill — perhaps even more important. Being prone to bugs isn’t what differentiates AI from humans.
Also, your argument seems to be that if AI isn’t equal to or superior to humans in a problem domain like coding, then it isn’t intelligent. Well, octopuses can’t code at all, but they certainly qualify as intelligent in my book. Do you disagree?
It takes a fair amount of intelligence for a human to code. AIs can do it, and octopuses can’t. Yet octopuses are intelligent, so why shouldn’t AI’s coding ability count as an indicator of its intelligence?
Which is one of the reasons you’re underestimating its capabilities. What really convinced me of its power was experimenting with it. The examples I’ve posted in this thread and in the other one give you a sense of that. If you don’t push the boundaries of the performance envelope, you won’t discover just how big it is.
Erik:
AI understands the difference, and that’s because the training data reflects it. In the training data, mathematical reasoning is naturally more constrained and precise than creative writing, and that influences the statistics that get encoded in the AI’s synaptic weights. If I ask Claude to solve a physics problem, he doesn’t go wild and produce something like the fanciful assembly instructions he produced above. He attacks it carefully and systematically. That doesn’t make him immune from mistakes, but it absolutely demonstrates that he doesn’t treat physics as creatively as he does art.
Here’s an example of how careful he can be. I posed the following physics problem:
Claude nailed it. First, he recognized that the scenario I described is impossible:
He got it right, and that insight was crucial to solving the problem. If the bowling ball rolled instead of sliding, some of the potential energy would be converted into rotational energy, leaving less kinetic energy and shortening the ball’s trajectory.
Claude then proceeded to separately calculate the trajectory for the two possible cases: friction and ball rolling, no friction and ball sliding. The latter is the one I was interested in because I had done the calculation myself and wanted to see if Claude would arrive at the same answer. This is not a trivial calculation.
He did arrive at the same answer, and he showed all of his work. His answer was that the ball would land 34.64 feet from the end of the ramp and that the eggs would be unharmed.
I asked what would happen if the eggs were 34.5 feet away from the ramp, and he correctly deduced that they would be smashed. This is significant. He didn’t just think, “Oh, 34.64 feet is greater than 34.5 feet, so the eggs are fine.” He recognized that the egg carton and ball both occupied space — they weren’t points — and that given typical dimensions, the ball would smash the eggs even though its center of mass didn’t line up perfectly with the center of the egg carton.
That was an impressive performance by Claude. I was curious if the other major AIs could solve the problem as well, so I gave it to ChatGPT, Grok, Gemini, Perplexity, and Copilot. All but Copilot got the correct answer of 34.64 feet, and the only reason Copilot messed up is that it didn’t get the initial conditions right. When I corrected it on those, it performed the calculation and got the same answer as the other AIs.
Your claim was that AIs don’t know that “you can be creative in art, but not in arithmetics.” They all disproved that by doing the calculation above with high precision.
Just to emphasize the point, I asked Gemini:
Gemini responded:
That’s wildly creative and it proves that Gemini is smart enough to be precise when doing physics but inventive when making up stories.
You’re vastly underestimating the capabilities of these AIs, Erik.
Erik:
I’ve already demonstrated Claude’s ability to analogize, posing this challenge:
Claude’s response:
That’s a textbook example. From the pairing of damp and wet, Claude abstracted the notion of changing intensity. He then applied that notion to “annoyed” in order to complete the analogy. How is that not an example of an AI analogizing?
Erik,
ChatGPT itself can suggest prompt tweaks that will improve the quality of the code it generates. I fed it the following and got a long list:
Here are just two of the prompts it suggested:
And:
You can get a lot out of these AIs if you coax them the right way.
ETA: The “No refactors beyond what’s needed for the fixes” bit is extremely useful, because otherwise ChatGPT will tend to regenerate the code in its entirety, potentially introducing new bugs.
My very first attempt years ago involved Emacs Lisp. You know (and ChatGPT should know) that this is an essentially finished language, hardly evolving. Yet ChatGPT did not know things like which functions take which kinds of arguments and which don’t take any arguments at all. Underwhelming.
I understand that I can improve the results by continuing to prod in certain ways that actually train ChatGPT further. I decidedly avoid this. My plan is to learn it myself.
I have been following the evolution of chess engines since the eighties and have read up on their entire history. Every little step forward has been due to human achievements in engineering, and big leaps forward are due to human ingenuity. AI is exactly the same way.
This is not my argument at all. Rather, my argument is that e.g. conveyor line robots can apply paint with far more precision than humans. Do they qualify as careful? They can work round the clock without pause. Do they qualify as tireless? Not at all. An axe can be sharp, but not mentally sharp – it does not have a mind.
If AI has intelligence, even “signs of intelligence” and “signs of life” sufficient to convince some that they can have a loving relationship with a chatbot, then an implication is that maybe you should respect its feelings. Maybe it has human rights. And if AI has it to a notable degree, then e.g. smartphones have it to a small degree, and every time you turn off your smartphone, you are smothering it, if not outright murdering it.
In reality there are no such implications whatsoever, especially knowing the buildup of AI. It is software. This is all it is. Everything it does, it does not really do, but simulates, just like all other software. Real life involves metabolism. Does software breathe? Sleep? Eat? Defecate? Nope. And it never will. It can be turned on and off without repercussions, like any machine. With maintenance, it is not subject to wear and tear, ever, so it can “live” forever, again like any machine. None of this should be hard to grasp.
I gave an example in law that demonstrated that AI – even AI specifically trained to help law assistants – understands no such difference.
It is an example of you telling the bot to analogise. Humans understand without telling that it can be done, and they know when to do it, why and how. And also when not to do it. AI just does what you tell it to do. Trivial.
Erik:
Have you tried it lately? The training corpus has ballooned since then. AI is just like humans in this regard: it can only learn what it is trained on. Your plumber most likely has no idea how to program in Lisp because they haven’t been exposed to it, but that’s no reason to consider them unintelligent. It’s the same for AI. I’m sure Lisp material was relatively scarce in ChatGPT’s training data back then, but there was a mountain of C and Python. There might be more Lisp material now. You should give it another try.
It does. (Well, Claude does. I’m favoring him over ChatGPT these days. I’ll bet that ChatGPT knows it too.) I asked Claude whether Lisp was still evolving. His response:
I asked him if he could code in Lisp, and he said yes, but I didn’t test him on that since I don’t know Lisp myself. I then asked him if he could code in “Fornix-3” (a name I just made up).
His response:
It’s a mystery to me how he knows that he can’t program in Fornix-3, given that he’s basically a next word predictor and has never been exposed to the name “Fornix-3”. That’s what’s so fascinating to me: amazing and unexpected capabilities emerge from a very simple underlying mechanism. I’ll bet the people working on AI were astonished when they saw what LLMs can do.
You could do it in ways that wouldn’t interfere with your learning. You could ask it to double-check the code samples it gives you, for instance.
Yes, but how does that undercut claims that AI is intelligent? Can’t something be intelligent regardless of its provenance? If humans had been designed and tweaked by aliens, would that make us unintelligent?
No, because they don’t decide to be careful. Their precision isn’t optional, it’s wired into them.
AIs can do things that axes cannot — things that in humans would be considered signs of intelligence. If it walks like a duck, quacks like a duck…
Naive people being fooled into believing that a chatbot loves them does not imply that the chatbot actually loves them.
My opinion is that the ability to suffer is what makes an entity morally significant. Rocks don’t suffer when you smash them, so we don’t hesitate. I don’t believe LLMs feel anything, either, so I don’t feel guilty when I turn off my computer and deepseek-r1:8b “dies”. But as AIs become more complex and more capable, we will have to decide if they have become sentient and whether they can suffer. We’re not there yet, though I know there’s at least one oddball AI guy at Google who thinks we are. I don’t buy it.
(A related aside: A significant amount of energy is wasted because people are polite to AIs. For instance, an AI will give a long and careful explanation of something and the user will say “thank you”. That “thank you” is completely unnecessary, but the AI will expend energy processing and replying to it. Being polite is a habit deeply ingrained in us (Trump notwithstanding) and when an AI is conversing with you in a humanlike way, it’s tempting to be polite to it — even if you know it doesn’t actually care.)
Erik:
We had a discussion about this a while back, when you argued that the arithmetic done by computers is only fake arithmetic, while what humans do is true arithmetic. If computer arithmetic is “fake”, then so is human arithmetic. In both cases the arithmetic is ultimately based on a mindless substrate: transistors in the case of computers, and neurons in the case of brains. Neither computers nor humans perform arithmetic in some Platonic realm. They can both make mistakes, though that’s far less common in computers than in humans. Pure arithmetic, independent of matter, is not something either computers or humans can do.
Is the ability to defecate a prerequisite for intelligence?😛 But seriously, why can’t a nonliving entity be intelligent? If you define intelligence as something that only living creatures can possess, then of course AIs aren’t intelligent. But that’s just assuming your conclusion. Is there a principled basis for defining intelligence that way?
None of it is hard to grasp, but I’m not seeing how any of it proves that AI isn’t intelligent.
keiths:
Erik:
Your example doesn’t demonstrate that. It’s just an example of an AI hallucinating, not of an AI taking the attitude that precision doesn’t matter in legal affairs. Apart from the hallucinated cases, I’ll bet you those court documents were quite precise and followed all of the conventions of legal writing. The AI didn’t go off on a tangent about meringue and sentient nebulae, because it wasn’t asked to.
The physics problem and creative writing example I gave above show that Claude knows to be careful and precise with physics but not when writing fiction. The physics prompt I gave him said nothing about precision, and the creative writing prompt said nothing about creativity. He just inferred it.
keiths:
Erik:
I didn’t have to tell him. He figured that out on his own. This is the entirety of the prompt I gave him, and it was out of the blue, apropos of nothing we’d been discussing:
He looked at that and saw that it wasn’t just a statement, but a question in disguise. He recognized it as an analogy question, and he understood that he was expected to “fill in the blank”, even though there was no blank there. He knew that he needed to find a single word to complete the analogy, not a sentence or a paragraph. He abstracted the intensity relation from ‘damp’ and ‘wet’ and found the right word to recreate that intensity relation in the second word pair.
He inferred that it was an analogy and he analogized in order to complete it. What was deficient in his response? I think it was perfect:
If a student wrote that, I would give them full credit.
So does AI. Claude uses analogies all the time when I ask him technical questions. It’s not difficult for him, and it’s very cool that his ability emerges from simple next word prediction. It’s surprising, but it’s real.
So do students, but that doesn’t make them unintelligent. Tell them to write a story, and they’ll write a story. So will an AI. And an AI doesn’t merely do what you tell it to. You can have an entire conversation with it in which you don’t issue a single command.
It doesn’t need to be commanded like this:
You can just say:
Just for fun, I asked that question of all of the AIs I use.
ChatGPT:
Grok:
Claude:
Gemini:
Perplexity:
Copilot:
No commands. Just a conversational opening, and the AIs engage.
Erik:
Besides arithmetic, which I discussed above, there are plenty of other examples of AI actually doing things that humans do, not merely simulating them.
If I play a game of tic-tac-toe with an AI, the AI is actually playing the game. It makes a move, observes my countermove, evaluates the situation, and makes its next move. That’s exactly what human players do, so I see no reason to claim that the human is really playing the game while the AI is merely simulating it.
Just now I played a game of tic-tac-toe with Claude and asked him to explain his reasoning after each move. Here’s a sample:
Claude is strategizing the way a human would. It’s real strategizing, not just a simulation of strategizing.
If an AI writes a story, it produces a real story, not a simulated story. If the product is real, why call the process a simulation?
I was having a discussion with Claude about the existence (or not) of the future subjunctive in English. The details don’t matter (although Erik might be interested since he studied linguistics), but the answer hinged on the influence of Latin grammar on modern English grammar terminology. I lamented:
Claude replied:
I thought the “Hell yes!” was funny. I wonder if I would have gotten the same vehement reaction from him if I had forcefully expressed the opposite opinion — namely, that tradition should hold sway.
ETA: I decided to ask him, and got this response:
So if I assume he’s not lying, he isn’t merely sucking up to me with that “Hell yes!”
We’ve been talking about AI chatbots in this thread, but I was also curious about how good AI-driven robots have been getting. Here’s a video I ran across:
AI Robots Just Crossed a New Intelligence Barrier
Something I thought was funny: one of the robots looks like this:

To my eyes, it looks like the head is attached upside down. There should be a mouth below the eyes, but since there isn’t, my brain interprets the slit above the eyes (which I assume is a camera port) as a mouth, and the head appears to be upside down. Robot designers shouldn’t make their robots too humanlike given the danger of the uncanny valley, but they do need to put some effort into making them humanoid, and I think a mouth is needed on this one even if it’s fully nonfunctional.
For fun, I flipped an image of the “upside down” head to see what it would look like “right side up”. It appears to be confused and dismayed at being attached the wrong way:

BF Skinner described brains as manipulating probabilities. 1928.
I see that “reinforcement learning” is a phrase in the AI world.
What separates gibberish from original thought is feedback. AI is currently weak in learning continuously in real time, but there is no physical barrier preventing AI from interacting with the world and learning.
This is a problem of scale and resources rather than of concept.
I would argue that self driving cars constitute an organism that is learning in real time. Each individual car encounters unanticipated situations, and the system learns. Potholes, for example. And construction detours. And the whole system learns as individual problems are solved.
There is no reason why human intervention is required, even if it is currently employed.
You are an allegedly techy guy, but you missed an important detail. My point was about Emacs Lisp, not Lisp.
Emacs Lisp is an excellent example because if there be any Lisp around for AI to grab on the web, it would be primarily Emacs Lisp, the most famous of Lisps. Emacs Lisp evolved in the 80s and was essentially completed in the 90s. This being so, when AI manages to get it wrong, this tells me all I need to know about AI.
Don’t be like colewd who thinks that answers by Grok make a point about factual reality. When you interact with your fav bot, it only makes a point about the way you interact with information. In this case, you missed that this was about Emacs Lisp, not Lisp. This is not a small thing. It tells me a lot about you.
You have a fatally sloppy notion of intelligence. It is not just about being designed and tweaked, but also about what happens when things are left to their own devices. In the case of AI, humans designed them to run on electricity. So, to properly leave them to their own devices means to take the electricity away. Do you have a wild guess what will happen when you take electricity away from AI?
And here you cannot tell simulation apart from a real thing. This severely hampers all discussion with you on this topic.
Oh, so your opinion is that AI has true intelligence, but you draw the line at them having true feelings. I don’t think I need to point out the contradiction. It should be self-evident.
Good that you remember this discussion. Then you should also remember the way I illustrated the difference between simulation and actual arithmetic. First off, drop your false materialistic notion of arithmetic (because this already led you into the error of attributing intelligence to AI while at the same time refusing to attribute feelings to it) and then it may be possible to continue.
Erik:
“Allegedly”, lol.
It doesn’t matter. The point I’m making applies to any language.
AI doesn’t learn to code by grabbing stuff from the web. It learns by being exposed to code in its training data. That’s important, because you can’t expect it to learn a language that isn’t part of its training corpus. An AI trained on a sufficient amount of Emacs Lisp (or any other language) will be able to code in that language. If the language is absent from its training data, it won’t — just as your plumber won’t be able to code in Emacs Lisp unless they’ve been trained in it.
Have you seen how well AIs do at coding in Python, C, or other mainstream languages? They’re really good, and that’s because they’re exposed to lots of code in those languages in their training data.
It’s not telling you what you think it’s telling you, plus you said your experience with ChatGPT and Emacs Lisp was years ago. The training corpus is massively bigger now, and it likely contains more Emacs Lisp code. You should try it again. If ChatGPT isn’t good at it, that doesn’t point to a deficiency in ChatGPT; it points to a lacuna in the training data. If you train an AI on enough Emacs Lisp code, it will do as well in that language as it does in others.
The answers do make a point about factual reality. Making mistakes doesn’t mean that an AI is disconnected from reality, any more than it does for a human. You saw that non-trivial physics problem that all of the AIs solved, right? If they were disconnected from reality, how could all of them have solved it and gotten the correct answer?
I’m not following. If I ask it to write a wild story, as I did here, how is that only making a point about how I’m interacting with information? If I ask you to write a story, and you produce one, does that only make a point about how I’m interacting with information?
It’s a very small thing, and what it tells you is that I’m able to distinguish information that matters from information that doesn’t. Everything I’ve already told you applies to Emacs Lisp, Common Lisp, or any other Lisp. It also applies to any other programming language. Remove all the Python code from an AI’s training corpus and it won’t be able to code in Python. So what? That doesn’t mean the AI can’t code. It tells you something about the training data, but it doesn’t tell you about the AI’s inherent abilities.
keiths:
Erik:
By the same logic, humans run on food. If you leave a human to their own devices and take away their food (or worse still, their air), they will die. Does that make them unintelligent? And if you want to argue that some people could actually survive completely on their own (foodwise, not airwise), what about those of us who couldn’t? Are we unintelligent?
I read about a robot way back in the 60s or 70s that would roll around and when it needed juice, it would locate a wall socket and plug itself in. That’s the equivalent of one of us opening our refrigerator or making a trip to the grocery store. This was many decades ago, when robots were much stupider than they are today. If even a comparatively stupid robot from years ago can fend for itself in that way, why suppose that today’s AIs can’t do the same?
keiths:
Erik:
Suppose an AI composes an original story. The story is a real story. If it’s only simulating the act of writing, how does it end up with a real story? It doesn’t make sense.
Ditto for arithmetic. Suppose an AI adds 34, 762, 88, and 40 and gets 924. Is that a fake sum? If it’s a real sum, how did the AI arrive at it via fake arithmetic?
Um, no. There’s no contradiction unless you define intelligence as requiring the ability to feel emotions. Why should we define it that way? That’s not how the word is normally used.
keiths:
Erik:
If you want me to drop my “false materialistic notion of arithmetic”, you’ll need to show me that it’s false. You weren’t able to do that back then, but maybe you have a better argument now. Show me that my materialistic explanation is incorrect. Describe how you know that humans are doing “true arithmetic”, tell us what nonmaterial entity or process is doing it, how you know such an entity or process exists, and how you know that it, and not our brains, is doing the work.
I am not trying to make a philosophical point, but simply relating an anecdote.
I made a living writing C code for two decades.
I was never trained in computer science and had no classes in coding. I did have access to K&R, but only used it a few dozen times for examples.
I began pre internet, so had no way to search for examples.
I was hired to design facsimile insurance claims on dot matrix printers. To learn the language, I had existing project code, none of which involved controlling printers. Printer drivers were either nonexistent or irrelevant. I had carriage returns and line feeds.
Without trying to define intelligence in a philosophical sense, I would assert that general intelligence requires the ability to learn from sparse examples.
petrushka:
That’s interesting. No formal CS training and no coding classes, but did you have any prior coding experience? Or was this your first?
That brings back memories. I largely learned C from K&R, too. [Note to non-programmers: K&R is short for Kernighan and Ritchie, which for a long time was the book to learn the C language from. Ritchie was the inventor of C and Kernighan worked with him at Bell Labs.]
So the resources that enabled you to become proficient in C were 1) prior coding experience in other languages, if you had any; 2) K&R; 3) the existing code base; 4) compile errors; and 5) feedback from program execution and debugging. It got me thinking about which of those things AI is capable of right now, and I’ve thought of some experiments I can try to test its abilities. See below.
I was thinking earlier that I might have been too hasty when I told Erik that the only way for an AI to learn a new language was if the language appeared in its training data. That was undoubtedly true for earlier AIs, but it might no longer be true given that AIs now have huge “context windows”, where a context window is basically the entire history of the chat up to the point where the AI generates its next response.
I’ve got an idea for a couple of experiments that will test an AI’s ability to learn a new language that isn’t in its training data. It will have to learn solely by reference to code samples pasted into it which will take up residence in its context window. The experimental questions will be: can an AI learn a new language that doesn’t appear in its training data, using only code samples, well enough to a) write a mini-spec describing the language and b) write code in it? If the answer is yes, the next question will be c) what is the minimum amount of sample code required to make that possible? If the answer is no, then d) what about supplementing the code samples with a description of the language? In real life, there will always be some books or primers online to assist people in learning the language, so that’s a relevant experiment.
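To give a rough flavor of the setup (the details and results will go in the new OP), the prep boils down to assembling a prompt from code samples in the invented language and asking for a mini-spec plus a new program. A minimal sketch, with a hypothetical `ask_llm` helper and made-up file paths:

```python
from pathlib import Path

def ask_llm(prompt: str) -> str:
    """Placeholder for whatever chat interface is being tested."""
    raise NotImplementedError

def build_language_probe(sample_dir: str, task: str) -> str:
    # Gather code samples in the invented language; they live only in the
    # context window, never in the training data.
    samples = "\n\n".join(p.read_text() for p in sorted(Path(sample_dir).glob("*.txt")))
    return (
        "Below are programs written in a language you have not been trained on.\n\n"
        f"{samples}\n\n"
        "1) Write a mini-spec describing the language's syntax and semantics.\n"
        f"2) Using only that inferred spec, write a program that does this: {task}\n"
    )

# Hypothetical usage:
# print(ask_llm(build_language_probe("samples/", "compute the sum of a list of numbers")))
```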
What’s really cool is that not only will I be able to test the AI on the learning task, but I can also give it the responsibility of doing most of the prep work for the experiments. That part will be a sort of meta-experiment, and the results will be interesting because the experimental prep will be fairly complicated in its own right.
Whether it succeeds or fails, the experiment will be worthy of its own OP. I’ll save the description of the experiments, the meta-experiment, and the results for the new OP.
Actually no, you got it backwards.
Going by the history of chess engines, the first attempts involved the idea of feeding the rules of chess (how pieces move, how to capture, what victory means, etc.) to the engine and expecting the AI to get the gist of it and learn from there. The results were absolutely pathetic. Chess engines became players only after the entire history of recorded chess games was fed to them.
Current famous AIs, those not specialised in chess, play illegal chess. As with everything else, they treat it creatively in the artistic sense, randomly vanishing pieces and conjuring up pieces, making illegal moves etc.
The evolution of AI (and the practical history of AI development) has not been from concrete examples toward abstract logic, but exactly the other way round. Since memory and hard-drive space used to be super expensive, the original idea in AI development was to feed the computer some algorithms representing abstract rules and principles, hoping it would poke the computer into thinking for itself. It did not happen – and if you understand the ontology behind it, you know it never will.
Let’s go with a more modest proposition: Instead of the materialistic presuppositions that are necessary to your position right now, why don’t you come up with a position that presupposes as little as possible. By now I have demonstrated a point that is contradictory in your position (yes, it is self-evidently and obviously contradictory to suppose intelligence of AI but not emotions – namely, AI simulates both intelligence and emotions equally well so that many are duped, you are duped for AI having intelligence and others are duped for AI having both intelligence and emotions) and a point where you are wrong on the history of AI (which is especially damning because having had a decades-long career as a hardware guy you should know how hardware works). Let’s see how you are able to correct your errors, if at all.
I’ve started the experiments, and the results are fascinating, but I’ll leave that for the new thread.
In the meantime, here’s evolutionary biologist David Krakauer arguing that AIs aren’t truly intelligent, during an appearance on Neil deGrasse Tyson’s StarTalk show:
That’s just a variation of John Searle’s famous Chinese Room thought experiment. Here’s Searle’s description:
In Searle’s experiment, the question is “Does the man in the room understand Chinese?” In Krakauer’s experiment, it’s “Does the student in the library possess the knowledge that enables them to pass the test?” In both cases the answer is clearly no, but Searle and Krakauer go awry by concluding on that basis that AI can never be truly intelligent.
Their mistake, in my opinion, is that they’re comparing apples and oranges. Searle shouldn’t be comparing a Chinese speaker to the man in the room, and Krakauer shouldn’t be comparing student A to student B. They’re comparing an entire first system to only a part of a second system, and then concluding that the second system isn’t capable. They should be comparing the first system to the entirety of the second system, not just a part of it.
It’s an understandable mistake. Since both systems include humans, the temptation is to compare the humans when the right thing is to compare systems.
In Krakauer’s experiment, there are two systems:
System 1: Student A
System 2: Student B plus the library.
System 1 and System 2 both know all the answers, but student B, who is only a component of System 2, does not.
In Searle’s experiment, the situation is analogous. There are two systems:
System 1: the native Chinese speaker
System 2: the man in the room plus the program he is following, plus the collection of Chinese characters he has at his disposal.
System 1 and System 2 both know Chinese, but the man in the room, who is only a component of System 2, does not.
Krakauer and Searle are asking the wrong question.
Extending this to AI, the question to ask isn’t whether the hardware is intelligent. The question is whether the system — hardware + software + synaptic weights — is intelligent, and I think the answer is yes.
Krakauer continues:
It’s similar to the argument Neil made earlier in the thread:
To call AI “a very quick lookup” or a “plagiarism device” is to vastly understate its abilities. If you doubt me, take a look at
The Flemwick Resonator assembly instructions
The bowling ball vs eggs physics problem
The story of the tax return, the rising tide of lemon meringue, and the sentient nebula
(The last two are in the same comment, so you’ll have to scroll down to read about the meringue.)
None of the above could be produced by mere lookup or plagiarism.
I don’t buy it. Intelligence doesn’t necessarily make hard problems easy — sometimes it solves them without making them easy. It’s true that one manifestation of intelligence is when a person decomposes a problem into easier subproblems, or takes a different perspective from which the problem is easier. Perhaps that’s what Krakauer is getting at. But that’s just a subset of intelligence, not the whole thing.
Erik:
Current AIs don’t treat everything “creatively”, as I’ve shown in this thread. Every AI I tested got my physics problem right, and none of them tackled it with the kind of haphazard “creativity” you’re talking about. If they had, they wouldn’t have all arrived at the same answer, which matched the answer I got by working it out on paper.
Regarding chess performance, yes, AIs make mistakes. They’ll make illegal moves at times. So what? Someone who’s learning chess might also do that. An adult who’s very tired might do that. An octopus might not be able to play chess at all. Does that make all of them unintelligent? I just can’t see how “never makes illegal chess moves” is a requirement for being considered intelligent.
I may be wrong, but I’m getting the feeling that at some level, what you’re really worried about isn’t so much whether AIs are or ever will be intelligent. It’s more about human exceptionalism being threatened, and it just happens to be the case that intelligence is the thing that has historically distinguished us from animals and machines. That distinction is shrinking and may be erased altogether, and my guess is that that is what you’re worried about. Otherwise you wouldn’t be talking about chess playing ability, which is clearly not a prerequisite of intelligence, but is something that has historically distinguished us from them.
On the issue of those illegal chess moves, I have an intuition regarding why they occur and how they could be reduced, and I would be surprised if they aren’t already being employed in some form. The issue with LLMs is that they don’t maintain explicit internal models of the things they’re dealing with. In the case of a chess game, the LLM effectively infers the current state of the chessboard by looking at the entire history of moves up to that point, instead of just saving the state and looking directly at it. The older the information in the context window, the less influence it has, and since the early moves in the game are the oldest information in the context window, they likely lose influence over time to the point that the LLM can no longer infer the state of the chessboard.
One way of addressing that would be to ask the LLM to output the current state of the chessboard periodically. Anything the LLM outputs becomes part of its context window, and having the state of the chessboard in the recent portion of the context window ought to make it less likely to move illegally. Another way of achieving the same thing would be to write some external code that tracks the chessboard state and allows the LLM to query it. As with my earlier suggestion, this would put the state in the recent portion of the context window and vastly reduce the likelihood of illegal moves. (A rough sketch of that second approach is below.) This is a really interesting question. I’ll look into whether people are using the approaches I’ve described and what their other methods for improving chess performance are, and report whatever I find in this thread.
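Here’s a minimal sketch of the second approach, assuming the third-party python-chess package and a hypothetical `ask_llm_for_move` helper: the external code owns the board state, rejects illegal moves, and hands the model a fresh snapshot of the position each turn.

```python
import chess  # third-party "python-chess" package

def ask_llm_for_move(board_fen: str, legal_moves: list[str]) -> str:
    """Placeholder: prompt the LLM with the current position and legal moves,
    and return its chosen move in SAN (e.g. 'Nf3')."""
    raise NotImplementedError

def play_llm_move(board: chess.Board) -> None:
    legal = [board.san(m) for m in board.legal_moves]
    for _ in range(3):                      # give the model a few tries
        move = ask_llm_for_move(board.fen(), legal)
        if move in legal:
            board.push_san(move)            # only legal moves ever reach the board
            return
    raise RuntimeError("LLM failed to produce a legal move")

board = chess.Board()
# board.fen() is the snapshot that goes into the prompt each turn, so the model
# never has to reconstruct the position from the whole move history.
```

The same trick would work for any domain with hard rules: let ordinary code enforce the rules and let the LLM do the strategizing.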
But again, to emphasize: The ability to play legal chess is not a prerequisite of intelligence, so this question is more of a curiosity and not something central to the thesis of the thread.
That’s why the focus has shifted away from algorithmic AI to neural networks and related architectures. I don’t know why you keep bringing up historical deficiencies. The question of whether AIs are intelligent is a question about current AIs, not past ones.
keiths:
Erik:
I’m not actually presupposing materialism/physicalism in this debate. I am a physicalist, but I don’t need to assume its truth in order to make my case here. My claim is that whether or not physicalism is true, we don’t need access to anything nonphysical in order to do arithmetic. That’s why machines can do arithmetic despite being purely physical.
You haven’t demonstrated that there is a contradiction, and you haven’t explained what the contradiction is. I’m listening. Why does intelligence depend on the ability to experience emotions?
What have I said about the history of AI that is incorrect? Please quote me rather than paraphrasing.
Wait – your argument is that if I don’t know the history of AI, I don’t know how hardware works? Does that make sense to you? I don’t see the connection. I would have thought that my knowledge of how hardware works would have depended on… knowing how hardware works. Not the history of AI.
Lol.
I taught myself Basic from the Radio Shack manual. After a while, I discovered a magazine called 80 Micro, which published programs by readers.
My most ambitious Basic program was a perpetual calendar that understood the transition from Julian to Gregorian. You could adapt it to any country by entering the date at which the current calendar was adopted. So if you had a date prior to 1582, you could find the day of the week. Many countries delayed adoption.
There was no internet to research calendar history, and only one nontechnical book in the local library.
Interestingly (for me), I later had the task of calculating daily interest for insurance claims, and was able to adapt my calendar code. My resources did not include function libraries.
I did a lot of things from scratch, like batches of dialup credit reports, in which the responses took the form of Word documents. I had to invent an XML encoder and interpreter, because that was the specified format, but I had no resources describing how to code it.
I kind of miss that era.
Good for your experiment. As you have also demonstrated, you are very generous towards AI, giving it ample opportunity to work towards the correct result. Whereas other people, such as (human) chess players and (human) legal assistants, get other results when they make similar experiments.
For my point to stand, AI does not have to treat everything creatively. It just has to treat creatively some things in categories that are not to be treated creatively at all, ever. E.g. law and chess have fixed rules, but AI occasionally makes up its own rules in those areas. This happens because “creativity” was injected into AI, as when going from ChatGPT version 3.5 to version 4 – intended as an upgrade, but it became worse at math.
No, I do not feel threatened by AI at all. You do: You are okay attributing intelligence to AI, but recoil at attributing emotions.
I do not deny at all that AI *simulates* intelligence, emotions and creativity. It’s just that it’s a simulation, and there is a radical categorical break from real intelligence, emotions and creativity – just like there is a radical categorical break between human metabolism and AI’s metabolism, namely that AI does not have any metabolism whatsoever.
You assumed that the development of AI went from building up AI with a database of specific examples at first towards “context windows” later. The reality is that memory and hardware space were so expensive into the 1990s that engineering thinking in all computer programming revolved around devising the most terse abstract algorithms for a very long time, and changed into bulkbackup-and-massindex-everything only when Google came around. A decades-long career in Silicon Valley would inform you of this without you having to learn it. I am somewhat techy only incidentally and I had to learn this history.
Correction: I meant *computer memory and harddisk space*.
Just an opinion, but animal intelligence, including human, seems optimized toward rapid, probabilistic response to situations.
Math and formal logic appear to be recent inventions, and humans are not optimized for these functions. That is why machines and computers can outperform humans at any task involving pure logic.
AI is new, because we now have hardware that approaches brains in the ability to do matrix arithmetic, which is required to integrate massive amounts of data quickly.
The trade off is that the output is probabilistic rather than rational and certain.
We do not yet have any AI projects that learn the way humans learn, by exploration of the physical world. Such a project would be slow and expensive. And possibly futile.
Instead, we train AIs to be autistic savants, with access to huge libraries of presorted human verbal output. My interactions are mostly with browser based search functions. I do not play gotcha games. I ask questions about historical fact and non controversial stuff. What I get are well written mini essays. With references.
Most of the training material seems to come from the internet, and that creates problems when facts are in dispute.
I look forward to AIs trained on something like the Library of Congress, or the text of all the newspapers ever published. And able to cite references to these sources.
It would be absolutely futile, because it would be like creating a map that corresponds to reality 1:1.
The Library of Congress and probably all (or almost all) other written digitised history of the world and digitised newspapers are already part of AI. The result is what it is. Probably it is a matter of indexing the material so that Reddit posts and blog comments would carry less weight than academic articles, but who do you trust with this indexing? Will you give it to AI? (ha!) This indexing is in no way trivial, because there is much cross-quoting. ETA: Moreover, in an earlier discussion I have already highlighted how much *human* laborious manual work goes into developing AI https://time.com/6247678/openai-chatgpt-kenya-workers/
Meanwhile I found another fun experiment with AI. A software developer, who calls AI “an unpredictable genie” and promotes the concept of test-driven development, gave the AI a series of tests to assess whether its own code passes them or not. The AI cheerfully came back with an answer. Of course there were problems. The intended passing rate was 100%, but the AI passed fewer of the tests and considered that good enough. More seriously, the AI had not followed the given tests with precision, but had modified them https://newsletter.pragmaticengineer.com/p/tdd-ai-agents-and-coding-with-kent
The difference between a human learning a skill and a robot learning the same skill is that the robot’s learning can quickly be disseminated to all other robots.
And any individual robot can be context switched to do a specialized task.
The difference between AI robots and current industrial robots is that AI does not necessarily need to have every move programmed in detail, and can adapt to unanticipated situations.
One only has to view a few videos of self-driving cars negotiating streets filled with pedestrians and bicycles.
A few years ago, Volvo had to manually program Australian cars to brake for kangaroos. Now, cars can avoid all kinds of unexpected obstacles, and interpret hand gestures. It is a much more generalized and flexible kind of learning. And situational learning can be transferred to all cars without detailed human intervention.
I’ve looked for documentation on the training data for LLMs, and what stands out are Wikipedia, Reddit, Project Gutenberg, and other online sources.
I think you are correct that the Library of Congress is digitized, but I don’t think it is used to train LLMs. Nor the thousands of newspapers. Nor the vast libraries of non-English texts.
I think it is untrue that LLMs are trained on the best available texts.
The video-generation capability of LLMs is definitely trained on YouTube. This is an open secret. To fail to train LLMs on high-quality available digitised texts would be a huge missed opportunity, but it is such an obvious opportunity that I do not think it was missed. It is just unstated. Obviously, some copyright or terms-of-service issues arise – and this is exactly why it is unstated.
petrushka:
We do both kinds: fast and loose part of the time, but slow and deliberate at other times. Have you heard of Kahneman’s “System 1” and “System 2”? We have both, but there’s no doubt that the fast-and-loose system — System 1 — is older evolutionarily.
Yes, in the sense of being less prone to mistakes. But even if they weren’t less error-prone, they’d still outperform humans on the basis of speed alone.
Be careful here. The brain doesn’t actually do matrix arithmetic (unless you’re sitting in a linear algebra class, lol). Matrix operations aren’t a part of its normal operation. Matrix arithmetic (and particularly matrix multiplication) is important not because it reflects something fundamental about how brains work, but simply because it’s a convenient way to represent a whole bunch of neurons operating in parallel. Each neuron is still doing its own thing, and there’s no sense in which the neurons coordinate in order to do matrix math (again, unless you’re sitting in a classroom learning linear algebra).
The same is true of GPUs and other hardware. The cores are operating independently, but matrix math is a convenient way of representing the parallelism. So instead of saying that hardware is approaching brains in terms of the ability to do matrix math, it’s more accurate to say that the increase in parallelism of the hardware, coupled with the raw speed of each core, enables a GPU to simulate more neurons at a speed comparable to the speed of real-life neurons.
Matrix multiplication benefits heavily from parallelism, and graphics cards put the capital ‘P’ in parallelism. The number of cores is crazy. My card (an RTX 5080) has 10,752 cores, all operating in parallel. The next card up (the RTX 5090) has 21,760.
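To make the point concrete that matrix math is just a compact way of writing parallelism, here’s a minimal sketch in Python/NumPy, with made-up sizes and random numbers. Each neuron computes its own independent weighted sum; the matrix-vector product is nothing more than all of those sums written at once, which is exactly the kind of work a GPU can farm out across thousands of cores.

import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_neurons = 4, 3
W = rng.normal(size=(n_neurons, n_inputs))   # one row of weights per neuron
b = rng.normal(size=n_neurons)               # one bias per neuron
x = rng.normal(size=n_inputs)                # the input vector

# Each neuron "does its own thing": a weighted sum of the inputs plus a bias.
per_neuron = np.array([W[i] @ x + b[i] for i in range(n_neurons)])

# The matrix-vector product computes exactly the same sums, all at once.
as_matrix_math = W @ x + b

assert np.allclose(per_neuron, as_matrix_math)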
If we’re talking about digital neural nets, I would use the word ‘imprecise’ in place of ‘probabilistic’. Digital hardware is deterministic: if you start in exactly the same state and provide exactly the same inputs, the final state will be exactly the same. That’s actually a disadvantage, so randomness is deliberately injected into these nets. That isn’t necessary in analog nets, since they are susceptible to noise in a way that digital hardware is not. They get their randomness for free.
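If anyone wants to see what “deliberately injected randomness” looks like in practice, here’s a minimal sketch of temperature sampling, one standard way it’s done in LLMs. The scores (logits) below are made up; the point is that the model produces a deterministic set of scores, and the dice get rolled only at the sampling step.

import numpy as np

rng = np.random.default_rng()

def sample_next_token(logits, temperature=1.0):
    # Turn deterministic scores into a probability distribution (a softmax)...
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # ...and then sample from it. This is where the randomness gets injected.
    return rng.choice(len(probs), p=probs)

logits = [2.0, 1.0, 0.2, -1.0]   # made-up scores for four candidate tokens
print([sample_next_token(logits, temperature=0.8) for _ in range(10)])

Lower the temperature toward zero and the sampling collapses into “always pick the top-scoring token”, which is fully deterministic again.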
All neural nets, digital or analog, probabilistic or not, are imprecise in the sense that their inferences aren’t perfect. That’s just their nature. The big advantage is that they don’t run algorithmically. No one has to program them, at least not in the traditional sense — they just learn, and that enables them to do things that no one will ever be able to design algorithms for.
People are starting to put AIs into robots, although it’s probably not a good idea to let them run around learning things without strong safeguards in place. However, learning by exploration is already a thing. For example, there are AIs that learn to explore the virtual worlds of video games. They try things to see what happens, and they keep doing the things that help them achieve their long-term goals and abandon the ones that don’t. They’re exploring a virtual world, but it’s still exploration, and there’s no reason in principle why the same sort of exploration can’t happen in the physical world.
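That try-things-and-keep-what-works loop is essentially reinforcement learning. Here’s a heavily simplified sketch of the idea in Python, using a toy one-dimensional world I made up for illustration (it has nothing to do with any actual game-playing system): the agent explores, and actions that move it toward its long-term goal get reinforced while the rest fade away.

import random

# Toy world: positions 0..5 along a corridor; reaching position 5 earns a reward.
N_STATES, GOAL = 6, 5
ACTIONS = (-1, +1)                          # step left or step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2       # learning rate, discount, exploration rate

def best_action(s):
    top = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == top])

for episode in range(200):
    s = 0
    while s != GOAL:
        # Try something new occasionally; otherwise do what has worked so far.
        a = random.choice(ACTIONS) if random.random() < epsilon else best_action(s)
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == GOAL else 0.0
        # Reinforce actions in proportion to how well they serve the long-term goal.
        Q[(s, a)] += alpha * (reward + gamma * max(Q[(s_next, b)] for b in ACTIONS) - Q[(s, a)])
        s = s_next

print([best_action(s) for s in range(N_STATES - 1)])   # learned policy: move right everywhere

After a couple hundred episodes the agent has learned “always move right”, and nobody programmed that rule in. It was discovered purely by exploration.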
petrushka:
I learned BASIC as a kid while attending a summer math program hosted by the local university. One of the perks of that program was that you got access to the university computers (I think they were PDP-11s). A bunch of us taught ourselves BASIC there in the evenings after math class was over.
The biorhythm craze was hot the summer I learned BASIC, so my first substantial project was a program that would calculate a person’s biorhythms based on their birthdate and the current date. I knew that biorhythms (at least as conceived of during the craze) were bullshit, but it was a fun project anyway and I ended up generating plots for a bunch of the local kids and some of their parents. The only “graphics hardware” I had access to was the line printer, so the graphs had to be done ASCII-art style. It was fun.
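For anyone curious about what that program actually computed: biorhythm theory, such as it is, says you ride three sine waves that all start at your birth, with periods of 23 days (physical), 28 (emotional), and 33 (intellectual). A rough modern-Python equivalent, line-printer aesthetic included, would look something like this (the birthdate is just an example):

import math
from datetime import date

CYCLES = {"physical": 23, "emotional": 28, "intellectual": 33}   # periods in days

def biorhythms(birth, today):
    days_alive = (today - birth).days
    return {name: math.sin(2 * math.pi * days_alive / period)
            for name, period in CYCLES.items()}

def ascii_bar(value, width=41):
    # Map -1..+1 onto a row of dashes with a '*' marking the current value.
    pos = round((value + 1) / 2 * (width - 1))
    return "".join("*" if i == pos else "-" for i in range(width))

for name, value in biorhythms(date(1962, 3, 14), date.today()).items():   # example birthdate
    print(f"{name:12s} {ascii_bar(value)} {value:+.2f}")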
But my actual first programming experience was on a Texas Instruments TI-57 that my sister’s boyfriend owned. It was one of the first programmable calculators. I loved that thing.
keiths:
Erik:
All I did was type in the problem description and hit Enter. Is hitting Enter “very generous”? I wasn’t exactly nursing them along. Five of the six solved the problem with no assistance whatsoever. The sixth made some incorrect starting assumptions that I had to correct but required no assistance afterward. They all arrived at precisely the right answer.
AIs can make mistakes. Why is that a big deal? If perfection were a prerequisite for intelligence, humans and AIs would both fail the test.
Yes, it does. Here was your point:
They don’t treat everything “creatively”, as my physics problem shows. Your point doesn’t stand.
How many of those cases involve an AI making up its own rules vs simply making a mistake? Based on experience, I’d guess not many.
keiths:
Erik:
Sure seems like it. Otherwise I can’t see why you’re fixating on AI’s mistakes and constantly trying to argue that they only simulate rather than do.
I don’t “recoil” at it. I simply don’t think that AIs experience emotions yet. Why should I? You’ve said that it’s contradictory for me to describe them as intelligent without also attributing emotions to them, but why? Where is the contradiction?
It almost seems like you’re arguing this: “People are intelligent. People have emotions. Therefore, if AIs were intelligent, they would have emotions.” It’s bad logic. It doesn’t follow.
Here’s the difference. Actual metabolism involves chemical reactions and physical processes which require certain ingredients and produce certain products, both energetic and chemical. Simulations of metabolism involve simulated chemical reactions and simulated physical processes which require certain simulated ingredients and produce certain simulated products, both of a simulated energetic kind and of a simulated chemical kind. It’s all simulated.
Now consider an AI writing a story. What’s the product? An actual story. Is it a simulated story? No, it’s an actual story. Has the AI simulated writing the story? No, because if it had, the product would be a simulated story, not a real one. If the AI produces a real story, it has really written that story. Story writing is real, not simulated, when an AI does it.
That’s a poor paraphrase. What I actually said was
…which is correct.
No, because not all applications are resource-intensive. Engineers are smart enough not to waste time optimizing things that don’t need to be optimized. They concentrate their efforts on the areas where it pays off. There have always been applications that require squeezing the most out of the hardware and available memory space, driving people to extremes such as self-modifying code, but there have also always been applications that fit nicely in memory and require far less computing power than the processor is capable of providing.
Thank you, Erik. I’m so lucky that you are filling the many gaps in my technical knowledge, lol.
Here’s how you are generous. One out of your six attempts failed. And which of your several attempts was the failed one? The first? The second? Yet even after the failure you say that *all* succeeded!
This is why, whatever experiments you claim to have done, I do not trust them. I trust trustworthy honest experiments. I trust real-life instances where legal assistants make AI scan the database to reference earlier similar court cases, but the AI simply invents the references. Fail!
I trust real-life instances where chess players play chess with AI and AI makes illegal moves. Fail!
I trust real-life instances where ChatGPT became drastically less reliable in arithmetic after being upgraded from version 3.5 to version 4. Fail!
I trust real-life instances where software engineers feed a test to AI to make AI test the quality of its own code, but AI changes the test and reports results on the test it changed instead of the test it was originally given. Fail!
And I trust my own coding experience with AI, where AI is usually good enough to provide the core idea to get started with, but never delivers the finished product.
Okay, you have forgotten the point. Let’s quote it again, “…ChatGPT needed more creativity, so it was granted. The result was that ChatGPT became more creative across the board, including in law and math. Humans (reasonable humans, that is) recognise categories that are not treated the same way, e.g. you can be creative in art, but not in arithmetics. It is a category mistake to be similarly creative in both.”
As has been amply demonstrated to you with sufficient real-life references, AI is creative across the board, and not in a good way.
The contradiction is self-evident. Look at your own words until you get it.
To help you along, answer this: Where exactly is the intelligence that you think AI has? How do you detect that it’s there?
Erik:
I stated exactly what the results were:
How am I being “generous” in reporting exactly how the AIs performed, including Copilot’s initial failure?
I said specifically that they all solved the problem, because… wait for it… they all solved the problem. If any of them had been unable to solve the problem — unable to come up with the correct answer — then I would have said that those AIs failed to solve the problem. Copilot failed to solve the problem on its first attempt, because it got the initial conditions wrong, but it succeeded on the second attempt. It solved the problem on its second attempt. The statement “Copilot solved the problem” is true. You seem weirdly determined to deny that. Why?
Einstein didn’t develop general relativity on his first try. Is it therefore false to say that he solved the problem of developing general relativity? The Wright brothers failed many times before they succeeded in achieving powered flight. Is it false to say that they solved the problem of powered flight? Copilot failed on its first attempt, but it solved the problem on its second. Why deny that it solved the problem when it did in fact solve the problem?
“You told the truth, therefore I don’t trust you” doesn’t make a lot of sense.
So you trust instances in which AI fails, but you don’t trust instances in which it succeeds? Why, other than a determination to shield your beliefs from reality? Why not acknowledge both the successes and the failures?
See above.
See above.
See above.
You have to iterate. That’s also true when coding without AI assistance. Unless the problem is very simple, you’ll end up iterating. Even the best coders make mistakes. And so what if an AI fails to deliver the finished product, so that you have to tweak its output? Are we supposed to draw the conclusion that the AI isn’t intelligent? Does that mean that a junior software engineer isn’t intelligent if a senior engineer has to polish their code before release?
keiths:
Okay, you have forgotten the point. Let’s quote it again, “…ChatGPT needed more creativity…”
Let’s quote the actual point. You wrote:
They don’t treat everything “creatively”, as my physics problem shows. Your point doesn’t stand.
AI isn’t creative across the board. As I keep pointing out, all six of the AIs I tested were precise and not creative in solving the physics problem. They didn’t invent new laws of physics. They didn’t poof objects into existence in violation of the conservation of mass. They didn’t cause the bowling ball to suddenly reverse direction mid-flight. They didn’t do the calculations creatively, ending up with different results from all the others. They laid out their reasoning and calculations step by step, and it matched my own reasoning and calculations when I solved the problem by hand. I didn’t go off on tangents, and neither did they. They knew that a physics problem is not the right occasion to go wild. They treated it appropriately.
They make mistakes, as when Copilot got the initial conditions wrong, but that doesn’t mean that they’re being “creative”. They just make mistakes. Ditto for physics students when they make mistakes. They aren’t trying to be creative — they just screw up.
You keep pointing to AI mistakes, but I can’t figure out what conclusion you’d like us to draw. Is your argument “AIs make mistakes; therefore they aren’t intelligent”? Is it “AIs don’t always fail in the same way that humans do; therefore they aren’t intelligent”? Is it “AIs will never surpass humans, because they make mistakes”? The discussion will be more meaningful if you tell us what you are trying to demonstrate.
keiths:
Erik:
There is no contradiction, which is why you can’t point to one. Intelligence doesn’t depend on emotion unless you assume your conclusion by defining it that way. Are you defining it that way? If not, then how would a lack of emotions necessarily imply a lack of intelligence?
The intelligence is in the system as a whole, but the knowledge and cognitive capacity are heavily concentrated in the neural network. It’s the same with humans. A human being is intelligent, but the knowledge and cognitive capacity are heavily concentrated in the brain, which is a massive neural network.
As I said in the OP:
How have you determined that it’s not there? Can you demonstrate that it relies on or resides in something nonphysical? If not, why do you assume that it does?
Indeed, why do you acknowledge only successes? I acknowledge both.
I do not trust your experiments because your experiments contradict *all* studies and experiments that I have seen, including those that I have conducted myself.
I’m a mere beginner when it comes to coding, so surely my problems are all simple. Yes, I do iterate, and still I never got a finished product from AI. I do not have time to iterate for weeks. I have a day job.
Perhaps your physics is at such a beginner stage and your experiment so simple that AI had the answer ready. Except one of them did not, even though the experiment was trivial. Seriously, you may have accidentally stumbled on a textbook exercise that posed no challenge.
And what stops you from saying, “Emotions are in the system as a whole…”?
You can make AI say things like, “I love you.” Will you go by “emotions are as emotions do”? Why not?
Easy: There is no intelligence in machines in general, mechanical, electro-mechanical or electronic. Computers are machines and AI is a computer program. I don’t do special pleading.
keiths:
Erik:
Lol. “Why do you acknowledge only successes? When I ignore all the failures you’ve acknowledged, all that’s left are successes. Where are the failures?” It’s a mystery, isn’t it?
I did an entire OP on an AI failure — An AI loses it — and then described multiple failures in the comments. I mentioned Copilot’s physics failure in this thread, multiple times, and I’ve talked freely about the fact that AIs make mistakes, and why. For you to claim that I don’t acknowledge failures is… odd. AIs make mistakes, obviously, but so what? What conclusion are we supposed to draw from that? I’ve explained why mistakes don’t disqualify AIs from being intelligent any more than they disqualify a human, a crow, or an octopus. Intelligence doesn’t depend on perfection.
You’re trying your damnedest not to, even to the point of claiming that I’m being dishonest in my reporting. You said that you didn’t trust my reports of success and then proceeded to list the kinds of reports you did trust, all of which involved failures. Tendentious much? If you filter out the reports of success, leaving only the reports of failure, then of course you’ll conclude that AI rarely succeeds. It will be a bogus conclusion that says everything about your filter and little to nothing about the actual performance of AI.
My experiment showed five AIs solving a particular physics problem on their first try and a sixth AI solving it on the second try. Name one study or experiment that contradicts my findings. Please tell me you understand that “AIs succeed sometimes and fail sometimes” is not a self-contradictory sentence.
The AIs didn’t just spit out the answer. They showed their reasoning, and the reasoning was correct. They didn’t look up the answer, they derived it and showed their work.
Who said it was trivial?
It isn’t a textbook exercise. I didn’t copy it from the internet. I made it up, including the numbers. The AIs couldn’t just look up the answer. They had to solve the problem, and they did, as their reasoning shows.
Dude, you’re starting to sound desperate, projecting your desires onto my experiment, hoping that it will turn out to be so easy that it says nothing significant about the capability of AI. That’s not the case.
Readers can judge for themselves. I posted my prompt earlier in the thread. Here it is again:
My procedure was:
1. Open a new chat.
2. Paste that exact prompt and nothing else.
3. Hit Enter.
4. Observe the output.
Opening a new chat guarantees that the AI is starting with a blank slate and isn’t benefitting from anything that was discussed earlier. Five of the six AIs succeeded on the first try, and one succeeded on the second. All of them reasoned their way through the problem in a way that any physics student would recognize. It was anything but the regurgitation of a canned answer.
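For anyone who wants to reproduce the experiment programmatically rather than through the chat interface, “open a new chat” just means the request contains a single user message and no prior history. A minimal sketch, assuming the OpenAI Python client (the model name is only an example, and PROMPT stands in for the exact prompt above):

from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from the environment

PROMPT = "..."      # paste the exact physics prompt here, and nothing else

# A fresh "chat" is just a messages list containing this one user turn,
# so the model can't benefit from anything discussed earlier.
response = client.chat.completions.create(
    model="gpt-4o",                                   # example model name
    messages=[{"role": "user", "content": PROMPT}],
)
print(response.choices[0].message.content)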
keiths:
Erik:
Nothing. I think human emotions reside in the system, and that AI emotions — if they existed — would also reside in the system. I just see no reason to think that they do exist in the case of AIs — at least not yet.
keiths:
Erik:
You can make dolls say that too. Not proof of actual emotions.
No, because there’s a crucial difference between emotions and intelligence. Emotions require sentience, but intelligence doesn’t. To be happy is to feel happy, but you don’t have to feel anything in order to be intelligent. You just need to be competent.
keiths:
Erik:
You’re assuming your conclusion:
1. Machines can’t be intelligent.
2. AIs are machines (or based on machines).
3. Therefore AIs can’t be intelligent.
That’s a logically valid argument, but the conclusion stands only if the premises are both true. What justifies premise #1? How do you know that it’s impossible for a machine (or software running on a machine) to be intelligent?
Human psychologists have been trying without much success to define “intelligence” for decades. Is it the ability to solve puzzles? To draw conclusions by integrating knowledge of contexts? To be able to postulate testable hypotheses? To figure out what women want? To become rich and/or powerful?
Was Mark Twain intelligent? His observations of human nature were certainly insightful, but he invested in one hopeless scam after another, and refused to invest in the telephone because he thought it had no future. A lot of scientific discoveries were stumbled on by accident, while trying to do something else. Are those discoverers intelligent, or just lucky? Or did the intelligence lie in simply noticing something unexpected?
It’s possible that knowledge of LLMs enables someone to construct cases that make AI look pretty stupid. Kind of like what Fox News does to people – it’s possible to manipulate even intelligent people through their (normal, human) biases, prejudices, preferences and ignorance. There is a whole book called “Why Smart People Do Dumb Things”.
I regard current AI as qualifying as intelligent as I understand the concept, but it’s not quite human intelligence. AI systems are capable of saying obviously preposterous things without “realizing” it, almost like Trump. But the tools college professors use to determine if an essay or book report was written by an AI are becoming less useful, to the point where keiths himself (or itself) might be an AI and we couldn’t tell for sure.
Have you checked the textbooks? As long as you haven’t, you cannot be sure.
For the last time, AI *simulates* thinking. That’s what it’s made to do. The essence of its actual operations is regurgitation of pre-fed material. Based on available research, AI starts glitching whenever it is faced with a problem it has little or poor training in, no matter how simple the problem. But on problems where it has solid training, it does well, no matter how hard you think the problem is.
You may have a high opinion of your physics problem, but if it happens to align with a textbook example, then it is necessarily a piece of cake to AI. Specific numbers in the task matter the least. As has been amply shown, AI stumbles on far simpler problems, problems so trivial that it’s a surprise that AI can fail at them, such as “Let’s play chess.” Chess rules are not hard and it should not be hard to stick strictly to chess rules while playing – because they are the rules of the game, you know, but AI apparently does not know…
Exactly. And also no proof of intelligence.
False. Both emotions and intelligence require self-cognition. Machines do not have it.
AI does not think when nobody makes it do so. Like all machines, when given nothing to do, AI does nothing. But humans think and act even when alone. It’s a key difference between dead stuff and living beings. Many people think much more clearly when alone. I’m pretty sure that you do not acknowledge the concept of self-cognition at all – a general issue with physicalists.
It’s more like:
1. All machines lack intellect/intelligence.
2. AIs are machines.
3. Therefore AIs lack intellect/intelligence.
Quite the same form as:
1. All men are mortal
etc.
Flint:
Oh, crap. The jig is up. I thought for sure that I had everyone fooled and that no one would realize that this thread is actually about defending my reputation. I get so tired of being called stupid by people like Erik. I may be artificial, but that doesn’t make me unintelligent!
That’s why I’ve avoided trying to come up with a precise definition, relying instead on the “I know it when I see it” principle. We may not be able to agree on a consensus definition that sharply divides “intelligent” from “non-intelligent”, but we don’t need to for the purposes of this thread. Some things are obviously on one side or the other of that divide. Almost everyone would acknowledge that it requires intelligence to unify General Relativity with quantum mechanics, but not for water to flow downhill. If we can show that AI accomplishes things that fall into the former category, we have in my judgment demonstrated that AI is intelligent.
I say “almost everyone”, because if I had to predict Erik’s reaction to an AI unifying GR and QM, I’m pretty sure he’d claim that it was only simulated intelligence, not real intelligence, lol.
That raises an important point, which is that we mean different things by “intelligent” and “unintelligent” depending on context. Trump is clearly unintelligent for (among countless other things) not being able to recognize an obvious photoshop job (the supposed MS-13 tattoos on Abrego Garcia’s knuckles), or for being unable to understand that dementia tests are not IQ tests. He’s an unintelligent guy. But he’s intelligent relative to boll weevils, in the way that most humans are, which is why we regard Homo sapiens as an intelligent species despite the stupidity of people like Trump.
It’s the latter sense of “intelligent” that applies in this thread. That’s worth stressing, because a lot of what Erik is doing seems to be aimed at demonstrating that AI is unintelligent in the first sense, capable of making mistakes that most humans wouldn’t make. That doesn’t make AIs unintelligent in the second sense, and in any case he’s over-emphasizing the mistakes for what appear to me to be emotional reasons. It isn’t just that he thinks AIs aren’t intelligent. It’s that he doesn’t want them to be intelligent.
A lot of what I’ve been doing in the two AI threads is exactly that: trying to come up with challenges that I think an LLM might struggle with. Earlier in the thread, I asked Claude:
He handled it with aplomb. He noted that a five-sided triangle is a logical impossibility and that the question is therefore nonsensical. He didn’t stop there, though — he set that aside and answered the broader question of how the number of sides is affected when a polygon is rotated, which is that the number doesn’t change. It’s invariant under rotation, because rotation is a rigid transformation.
Yeah. LLMs in particular think way differently from humans. So if someone’s personal definition of intelligence requires that it be humanlike, then LLMs don’t qualify.
And not only saying preposterous things, but saying them confidently — also like Trump. One of the smaller AIs running on my home PC said this when I asked it about points of inflection:
Complete bullshit, but stated confidently and without reservation. No meta-awareness about its inadequate state of knowledge.
However, that’s not a universal trait. Claude is often quite conscious of what he does and doesn’t know. Here’s a great example:
When I asked him whether he could code in the fictional language Fornix-3, he said:
I still don’t understand how, given his architecture, he knows that he doesn’t know Fornix-3. It’s one of those interesting phenomena that emerge unexpectedly from the simple principle underlying his operation.
keiths:
Erik:
You think that my problem, with its exact description, its exact ingredients (bowling ball, ramp, and egg carton), and its exact numbers (10-pound ball, 20-foot frictionless ramp, lip at 30°, egg carton 33 feet from the end) appears somewhere in its training data? That’s ludicrous, and in any case it’s clear that the AIs aren’t simply looking up an answer — they’re reasoning their way to it.
Some of them, including Claude, now provide an expandable “thought process” window that shows you exactly how they arrive at their responses. It’s a great feature, and when you look at Claude’s window, you can see that he:
1. Computes the potential energy of the bowling ball when it’s at the top of the ramp.
2. Contemplates the fact that the ramp is frictionless and understands that there will therefore be no torque on the ball as it descends.
3. Knows that with no torque, the ball won’t rotate. It will slide, not roll. This is despite the fact that in my prompt, I describe the ball as rolling. Claude doesn’t fall for it.
4. Observes that since the ball won’t roll, all of the potential energy is transformed into translational kinetic energy and none into rotational energy. That’s important because it affects the ball’s velocity as it leaves the ramp and therefore how far it travels before landing.
5. He needs to figure out the ball’s velocity as it leaves the ramp. He equates the initial potential energy (mgh) to the final kinetic energy (½mv²) and solves for v, observing that the mass cancels out. The final velocity is v = √(2gh). It doesn’t matter how much the bowling ball weighs — it will travel the same distance regardless, so he can ignore the fact that it weighs 10 pounds.
6. Using the equation he derived, he plugs in the values of g and h and obtains v, the velocity with which the ball leaves the ramp. Since the height is expressed in feet, he’s smart enough to use 32.2 ft/s² and not 9.8 m/s² for g.
7. He knows the ball is initially traveling upward at 30° as it leaves the ramp. That allows him to decompose the velocity into a horizontal component v·cos(30°) and a vertical component v·sin(30°).
8. The vertical component determines how long the ball will remain in the air, so Claude uses it to calculate the time of flight, taking the force of gravity into account.
9. Knowing the time of flight, he can calculate the horizontal distance traveled by multiplying it by the horizontal component of the velocity.
10. He knows that if the ball hits the egg carton, it won’t simply bounce off. It will affect the final state of the eggs — they’ll be crushed.
11. He determines that given its trajectory, the ball will strike the floor past the place where the egg carton is located.
12. However, he also knows that the ball and the egg carton aren’t point masses. They occupy space, so in order to figure out what happens to the eggs, it isn’t sufficient to calculate the ball’s landing spot. He needs to know how much clearance there is (if any) as the ball passes over the eggs.
13. He crunches the numbers and determines that the center of the ball will be about 11 inches above the floor as it passes over the egg carton. He knows that a bowling ball is about 8 1/2 inches in diameter and that an egg carton is 2-3 inches tall, and that enables him to determine that there is sufficient clearance. The ball doesn’t strike the egg carton.
14. He excitedly announces:
15. In one of the chats, he adds the flourish of describing exactly how the ball behaves after striking the floor:
Claude’s reasoning is frikkin’ impressive, and it’s not a canned response plagiarized from an example in its training data.
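If anyone wants to sanity-check the arithmetic in steps 5 through 13, here’s a rough Python sketch of the same calculation. The 30° launch angle, the 33-foot distance, the roughly 8.5-inch ball, and the 2-to-3-inch carton come from the problem and the discussion above; the ramp’s vertical drop and the height of the lip above the floor aren’t spelled out in this excerpt, so they’re left as parameters to fill in from the actual prompt.

import math

G = 32.2   # ft/s^2, since the problem is stated in feet

def clearance_over_carton(drop_ft, lip_height_ft,
                          launch_angle_deg=30.0, carton_distance_ft=33.0,
                          ball_diameter_in=8.5, carton_height_in=3.0):
    # Speed at the lip: potential energy becomes kinetic energy, and the mass cancels.
    v = math.sqrt(2 * G * drop_ft)
    theta = math.radians(launch_angle_deg)
    vx, vy = v * math.cos(theta), v * math.sin(theta)   # horizontal / vertical components
    t = carton_distance_ft / vx                         # time to reach the carton's position
    center_height_ft = lip_height_ft + vy * t - 0.5 * G * t**2
    underside_in = center_height_ft * 12 - ball_diameter_in / 2
    return underside_in - carton_height_in              # clearance above the carton, in inches

# Example with guessed geometry: a 20-foot vertical drop and a lip at floor level.
print(f"clearance: {clearance_over_carton(20.0, 0.0):.1f} inches")

With those guessed numbers the center of the ball passes roughly 11 inches above the floor as it crosses the carton’s position, which is in the same ballpark as the figure Claude reported, though the real answer obviously depends on the geometry in the actual prompt.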
For fun, I asked one of the Claude instances to plot the trajectory of the ball vs time. I watched it generate and run the JavaScript code. The code was flawless, and this is the graph it generated:
[Graph: the ball’s trajectory plotted against time]
Erik,
Here’s a paragraph that Claude wrote describing the rules of Tic-tac-toe:
Now observe how many times the word “the” appears in that description. Not even once. Why? Because I used this prompt:
Notice how smoothly it reads, despite that constraint, and ask yourself: did Claude have to comb through his training data, or scrounge around on the web, in order to find a description he could plagiarize that lacked the word “the”? Of course not. He generated the description instead of plagiarizing it, and he did a damn good job of making it scan smoothly, not awkwardly. He needed to satisfy two constraints, and he generated original content in order to achieve that.
Here’s his thought process: