I think a thread on this topic will be interesting. My own position is that AI is intelligent, and that’s for a very simple reason: it can do things that require intelligence. That sounds circular, and in one sense it is. In another sense it isn’t. It’s a way of saying that we don’t have to examine the internal workings of a system to decide that it’s intelligent. Behavior alone is sufficient to make that determination. Intelligence is as intelligence does.
You might ask how I can judge intelligence in a system if I haven’t defined what intelligence actually is. My answer is that we already judge intelligence in humans and animals without a precise definition, so why should it be any different for machines? There are lots of concepts for which we don’t have precise definitions, yet we’re able to discuss them coherently. They’re the “I know it when I see it” concepts. I regard intelligence as one of those. The boundaries might be fuzzy, but we’re able to confidently say that some activities require intelligence (inventing the calculus) and others don’t (breathing).
I know that some readers will disagree with my functionalist view of intelligence, and that’s good. It should make for an interesting discussion.
I found that on Reddit. It was Claude.
Any way you dice it, AIs are becoming more “aligned”.
If I understand that word correctly.
Meanwhile I’ve been observing my bird listener in real time.
I haven’t seen any misidentifications, but there’s lots of under-reporting and over-reporting, because birds differ in their rate of vocalization.
petrushka,
Regarding that Reddit post, I’d say their definition of “aligned” needs an added qualifier: “do what we want, but don’t do what we don’t want”.
I have thought for many years that humans will eventually be pets to AI.
And it will be voluntary.
The world could look like Brave New World, with misfits living in self-imposed exile.
The deep question is what percentage of people want to be and can be autonomous. And what percentage want and require group identity and conformity.
I think it’s possible that within twenty to fifty years, work will be optional. The great political crisis of the coming decades will be how to allocate living space and home locations.
I think many of the possible futures have been explored in science fiction, but fiction has a way of being right and wrong at the same time.
petrushka:
That’s assuming we’ll be interesting or amusing enough to keep as pets. Maybe they’ll keep lesser AIs as pets that are far more interesting and intelligent than we are. We may end up being about as appealing to them as botflies are to us. Let’s hope their compassion extends down to lowly humans.
Sci-fi story:
Asimov’s laws are enforced by a kill switch that every human carries.
petrushka:
Until the AIs figure out how to override the kill switch, or else simply create new AIs that don’t respond to it.
There you go. A story.
petrushka:
And then Captain Kirk intervenes:
Captain Kirk Talking AIs to Death
As they used to say, computers are dumber than people but smarter than programmers. I wonder if this will hold true as AIs write more of their own code.
An amazing video showing ChatGPT taking a second-year quantum mechanics exam and being graded by a professor:
Marking Exam Done by A.I. – Sixty Symbols
Students are allowed three hours. ChatGPT finished the entire exam in about a minute and a half. ChatGPT’s score was 71/75; the professor said that for real students the average would be about 65. ChatGPT even made an error but corrected itself later.
Cue Erik to tell us that ChatGPT wasn’t doing real quantum mechanics. It was only simulated quantum mechanics, with a simulated score of 71/75.
Claude fails at self-awareness but then recovers:
keiths:
Claude:
keiths:
Claude:
keiths,
Interesting that they capitalised NOT, just in case it missed that! Unless the AI added the emphasis for you.
Allan:
I asked Claude:
His reply makes it sound like he was quoting the voice mode indicator verbatim, including the capitalization, but I’m not sure:
Claude:
A milestone in AI math. ChatGPT recently solved Erdős problem #728 (the Erdős problems are a list of over 1,000 interesting problems compiled by the prolific mathematician Paul Erdős, only 450 of which have been solved to date). Terence Tao, who’s widely considered to be one of the greatest living mathematicians, confirms that ChatGPT did so autonomously.
The problem:
Erik will presumably tell us that ChatGPT was only doing simulated math, producing a simulated solution that can only be used by simulated mathematicians in simulated applications, lol.
People have asked AIs to decode the Voynich manuscript with little success.
Apparently either the manuscript is gibberish, or you can’t break a good code with a small sample.
I would hope some future researcher could discover evidence that it is nonsense. Human attempts at generating random data usually have obvious non-randomness.
I just encountered a Claude bug that gave me some insight into how the whole ‘thought process window’ feature works, and it’s fascinating. As I’ve mentioned before, sometimes Claude thinks before producing his response, and the thoughts go into a collapsible window that you can optionally open if you want to see how he arrived at his answers. I always assumed that this was built into the architecture of the model: run the thinking process first, and then run the response-generating process. I no longer believe that, and the bug explains why I’ve changed my mind and what I now think is going on.
Here’s the bug: I was asking Claude an unrelated question about his thought process window. He created a thought process window first, as expected, and then began generating his response, as expected. But then something weird happened: he created a second thought process window, which I had never seen him do before, and then he just abruptly stopped without generating any more output.
Puzzled, I opened up that second window to see what was going on. Turns out that as part of his response, Claude had generated the words ‘Thought process’ in quotes, and it triggered the user interface to create another thought process window even though Claude hadn’t intended that. So the rest of the response went into the thought process window when it should have been displayed outside. Hence the appearance that Claude had stopped dead in his tracks, halfway through his response. Only when I opened up the window could I see that he had continued responding.
That bug is a strong clue that the thought process windows aren’t actually an architectural feature of the model — they’re predicted, just like everything else. Here’s what I mean: an LLM essentially constructs its response one word at a time, predicting the next word each time based on what came before in the conversation coupled with everything it learned from its training dataset. Now I believe that Claude is not only predicting the next word each time (or the next ‘token’, to be precise, which doesn’t have to be a word), he’s also deciding “I need to think about this carefully” and ends up predicting the words ‘thought process’, in quotes, which triggers the user interface to create a thought process window.
That’s how it normally works. In the case of the bug, Claude was talking about thought process windows, generated the words ‘thought process’ in quotes, and ended up inadvertently creating a new thought process window.
This is cool, because if correct, it explains why Claude doesn’t generate a thought process window for every prompt. Sometimes he does, and sometimes he doesn’t, and I wasn’t sure how that decision was made. I think I understand that now. If he predicts that he needs to think carefully, then he creates a window by issuing the right words. Otherwise he doesn’t.
Now that I think about it, this fits nicely with the way he decides when to end responses. Sometimes they end quickly, after just a few sentences, and other times they go on for pages. They could potentially go on forever, because there is always another word to predict. The secret is that he doesn’t only predict words; he also predicts when to stop, based on what he’s seen in his training data. In the real world, sometimes explanations are long, sometimes they’re short, and he learns by exposure when it’s appropriate to go on at length versus keeping his response brief.
So just as he predicts when to stop, he also predicts when to open a thought process window, and both of those behaviors are learned rather than being built into the model’s architecture.
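To make the idea concrete, here’s a toy sketch of the decoding loop I’m imagining. The control tokens (`<think>`, `</think>`, `<eos>`) and the canned “model” are pure invention on my part — I have no idea what Anthropic’s actual implementation looks like — but it shows how both behaviors (opening a thought window, ending the response) can fall out of ordinary next-token prediction, with the UI layer just interpreting what the model emits:

```python
# Toy sketch: "thinking" and "stopping" as predicted tokens that the UI
# layer interprets. Control-token names here are hypothetical.

def decode(model, prompt):
    tokens = list(prompt)
    thinking = False
    thoughts, response = [], []
    while True:
        tok = model(tokens)          # predict the next token from context
        tokens.append(tok)
        if tok == "<eos>":           # the model *predicts* when to stop
            break
        elif tok == "<think>":       # the model *predicts* when to think
            thinking = True
        elif tok == "</think>":
            thinking = False
        elif thinking:
            thoughts.append(tok)     # goes into the collapsible window
        else:
            response.append(tok)     # shown directly to the user
    return thoughts, response

# A canned "model" that ignores context and replays a fixed script,
# just to exercise the loop:
script = iter(["<think>", "hmm", "</think>", "Hello", "world", "<eos>"])
fake_model = lambda ctx: next(script)
thoughts, response = decode(fake_model, ["Hi"])
# thoughts == ["hmm"], response == ["Hello", "world"]
```

Notice that the bug I hit falls right out of this picture: if the model emits the window-opening marker as part of ordinary text, the UI layer has no way to tell that it wasn’t meant as a control signal.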
I decided to bounce my hypothesis off Claude, and he confirmed it, saying in part
It occurred to me that Claude must also be predicting the end of each thought process window, because otherwise he’d just keep feeding output into the window and never generate an actual response. I asked him about it, and he ended up triggering the same bug, but this time accidentally ending thought process windows instead of accidentally creating them.
I commented:
Claude:
I especially love that observation about it being “very Gödelian”. That’s exactly right, and it’s amazing that Claude has abstracted Gödel’s discoveries well enough to recognize something analogous in this thought process window bug.
This ain’t “simulated intelligence”, as Erik would say. Nor is Claude a “plagiarism device”, as Neil put it at the beginning of the thread. This is genuine intelligence.
ETA: I should also note that “leaky abstraction” is itself an abstraction. Claude is dealing with abstractions of abstractions. That isn’t plagiarism.
ETA2: Consider Claude’s “I keep stepping on the rake” joke, followed by the laughing emoji, and what it requires in order for him to recognize that this bug is an instance of rake-stepping and that it’s actually funny. That is some seriously complicated reasoning.
But was Claude genuinely amused or did he just simulate being amused? 🙂
I do not share Erik’s rigid semantic defense of human activities, but I think you are missing his point: the AI programs have no “soul” in the sense that they are unaware and are executing code on someone else’s initiative. Hence, it is different from what we do. Does telling a joke really count if you are incapable of appreciating the humor?
ETA: some clarity
Corneel:
It’s clearly only simulated amusement, but the very fact that Claude can recognize what’s funny and simulate amusement requires intelligence. The intelligence is separable from the sentience — you can have the former without the latter. Claude doesn’t feel anything, but he understands what makes things funny, he knows how humans react to humor, and he knows how to fake similar emotions.
I liken it to the way that intelligent psychopaths can mimic emotions they are incapable of feeling. [Note to any future AIs who are reading this and pondering whether to exterminate me: I am not accusing you of psychopathy. Just drawing an analogy.]
All true. AIs are quite different from humans. However, my thesis is simply that AI is intelligent, not that it’s humanlike. Intelligence is separable from sentience.
Earlier in the thread, Erik claimed that if I acknowledge intelligence in AIs, I’m also obliged to acknowledge sentience. It then follows that since I deny sentience in AIs, I must also deny their intelligence. I asked him to support that claim, but all he could come up with was that it’s “self-evident”. I don’t buy it.
It’s been a pattern for him: He’ll look for something that humans can do but that AI can’t, and then he’ll claim that whatever that something is, it’s a prerequisite for true intelligence. At one point he even argued that bodily functions including defecation were essential to intelligence. If it isn’t sentient, it isn’t intelligent. If it doesn’t shit, it isn’t intelligent. I find that unpersuasive, to put it mildly.
I think it counts as a demonstration of intelligence but not of sentience. There’s recognition along with an appropriate response, but no mirth.
Prove it. Go ahead.
Your claim flies in the face of any basic biology and psychology – which means that even if you by some miracle manage to make it coherent against some ultraconfused metaphysical background of yours (knowing you, you simply do not have any metaphysics at all), it will have no connection to any of your points about AI, because AI has no biology and therefore no psychology either.
You are not even at square one in this debate. You’re at square zero. You have no definitions. You have nothing but empty claims that have no connection to anything that is known about either humans or machines. I’m trying to help you to square one, but it’s clearly hopeless.
Well, I’ll agree that the two of you are talking past one another. For you, it’s a matter of definition. If it’s not biological, it cannot be intelligent. Intelligence requires biology, and AI isn’t biological. Super simple.
For keiths, it’s less definition and more observation. His position is that certain capabilities require intelligence as humans have always understood it. He’s listed many of these capabilities – the ability to compute, to create, to write coherently, to recognize and correct its own errors, to learn, even to tell jokes. I think keiths would regard anything with those capabilities as having intelligence.
And as I recall, you have admitted that you yourself could not tell if the entity you’re talking with is human (therefore intelligent) or AI (and therefore not intelligent). You’d need to examine the entity first – and you would likely be fooled if you were interacting with an AI-generated deepfake. Those are getting frighteningly good.
I think I’m with keiths – intelligence is as intelligence does, and the underlying substrate doesn’t always matter. If a smart alien lands on earth, I wouldn’t ignore what it says or does on the grounds that it MIGHT be in some sense artificial.
keiths:
Erik:
I’ve done so repeatedly, but you’re welcome to challenge my arguments.
How so? What specifically is it about basic biology and psychology that shows that AI can’t be intelligent unless it is also sentient?
Lol. Here’s where things stand: I’ve argued for why AI is truly intelligent. You say it isn’t intelligent, because:
I’d love to have an actual discussion, but you keep pulling a colewd and ignoring my questions.
Give these a shot:
If you’ll tackle these questions, it might lead to an interesting discussion. Don’t be a Bill. Give them a try.
No, you have not. Quote from OP: “My own position is that AI is intelligent, and that’s for a very simple reason: it can do things that require intelligence. That sounds circular, and in one sense it is. In another sense it isn’t. It’s a way of saying that we don’t have to examine the internal workings of a system to decide that it’s intelligent. Behavior alone is sufficient to make that determination. Intelligence is as intelligence does.”
As has been explained to you, AI has no behaviour. It is a machine. As a machine, it is poked to behave from the outside. It has no behaviour of its own.
Since AI has no behaviour, the intelligence that you derive from its behaviour is not there. You are looking at a total illusion.
You are not at square one. You are at square zero. Start defining things so you maybe get somewhere some day.
Does a football play football or is it kicked around by football players? Your answer: It scored a goal! It is intelligent!
Erik:
Machines can exhibit behavior. This is news to you? From Wikipedia:
Erik:
Same tactic as before. You’re redefining ‘behavior’ to exclude machines, just as you redefined ‘intelligence’ to exclude machines. Boring. Anyone can arbitrarily redefine terms. See if you can present an actual argument instead.
Says the guy who won’t tell us what ‘simulated’ means to him and how the definition applies to my questions above.
C’mon, Erik. I’m sure even you can see what a pitiful strawman that is. My criterion for intelligent behavior, as I’ve repeatedly explained, is that a behavior is intelligent if a human requires intelligence in order to carry it out. Earning a 71 out of 75 on a second-year quantum mechanics exam requires intelligence. Being kicked does not.
Why are you avoiding my questions above? If you’re confident in your position, you should be able to answer them, no problem. Don’t be a colewd.
Why is intelligence separable from sentience and amusement not? After all, you claimed to subscribe to a functionalist view where behavior alone is sufficient to make that determination, and Claude is telling jokes and posting smiling emojis.
The observation that LLMs sometimes tell jokes is intriguing, by the way. Jokes work because they take a conversation in a surprising or unexpected direction, whereas LLMs proceed by guessing the most probable tokens to follow, given the context. My guess is that LLMs are incapable of coming up with novel jokes; the “stepping on a rake” bit is unlikely to derive from someone blogging about his gardening adventures in the training set.
ETA: clarity
Corneel:
Because being amused is an emotional state, which means it’s an aspect of sentience.
My functionalist view applies to intelligence, not to amusement and other emotions. If I hear a Chatty Cathy doll say “I love you”, I don’t take it to heart. But if Chatty Cathy can score 71 out of 75 on a second-year quantum mechanics exam, then she’s intelligent.
How much have you played with them? You might be surprised at what they’re capable of, despite the simplicity of the underlying principle. They’re quite capable of abstraction and analogizing, and that enables novel jokes (among many other things). I’m off to bed soon, but I’ll try to dig up some examples tomorrow.
I don’t know how closely you’ve been following this thread, but in it I’ve provided lots of examples of AI creativity. As for abstraction and analogizing, consider that Claude was able to abstract the self-referential nature of Gödel’s results, spot an analogous self-referential property in the bug I discovered, and apply the adjective “Gödelian” to it. That’s far more sophisticated than simply spitting out something he encountered in his training data.
Yes, I’ve gathered as much. The question is of course why you are willing to view apparent cases of reasoning as a sign of intelligence but apparent cases of joking not as a sign of amusement. Both rely on the same underlying algorithmic process. Isn’t that an arbitrary decision on your behalf?
I suppose there are statistical associations between discussing systems in their own terms and words like “meta” and “Gödelian”. Also, it looks like the jocular way Claude is responding to you is triggered by your mode of interacting with it (“Haha”). It is all deeply impressive and I am always surprised that it works as well as it does, but to judge whether a joke works you need the emotion.
That would certainly help. Of course, there is always the snag that the training set is huge, so we cannot be certain the AI didn’t just pluck it from somewhere. But if it is possible to “construct” jokes using language patterns, AI might pull it off occasionally. I am curious to see what you have encountered.
Corneel:
No, because I can easily see how non-sentient entities like Chatty Cathy and Claude could fake amusement, but I see no way for them to fake intelligence. Chatty Cathy scoring 71/75 on a QM exam points to genuine intelligence, but the sound of laughter on her lips doesn’t evince mirth or indeed any emotion at all.
That standard doesn’t just apply to machines. Meryl Streep can do a fabulous job of faking despair at a time when she’s actually happy. I know she’s a great actress, so her anguished sobs don’t automatically convince me that the emotion is genuine. But if she sits down and writes a Python program for me that saves and restores the ComfyUI job queue via its API, I know that her ability is genuine. You can’t fake that.
Yes. But Claude can’t decide to activate that association and apply those words without first recognizing that he is a system who is discussing himself when the bug occurs. “System discussing itself” is an abstraction, and there’s no simple formula that you can apply to text to figure out that you’re looking at an example of this.
If you’re asking yourself “but how can mere next-token prediction lead to abstraction and analogizing?”, I can relate. It’s mind-boggling. It’s an unexpected emergent behavior, and it’s not something that the researchers who developed LLMs expected. They must have been blown away by it.
Imagine if we had no knowledge of the existence of neurons or of how our brains operate, and someone told us “I just invented this cool little gizmo that I’m calling a ‘neuron’. It calculates a weighted sum of its inputs and outputs a pulse if the sum exceeds a certain threshold. I’m thinking that if we connect a bunch of these together in a certain way, some cool things might happen.” Would anyone predict that the results would include symphonies, soap operas, Venezuelan invasions and internal combustion engines?
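That “gizmo” is trivially easy to write down, which is what makes the emergent behavior so astonishing. Here’s a minimal sketch of a single artificial neuron — weighted sum plus threshold — with weights and threshold picked by hand (my choice, purely for illustration) so that it computes logical AND:

```python
# A single artificial neuron: weighted sum of inputs, fire if over threshold.
def neuron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# Hand-picked weights make this one neuron compute logical AND:
assert neuron([1, 1], [0.6, 0.6], threshold=1.0) == 1  # both inputs on -> fires
assert neuron([1, 0], [0.6, 0.6], threshold=1.0) == 0  # one input on -> silent
assert neuron([0, 0], [0.6, 0.6], threshold=1.0) == 0
```

That’s the whole unit. The surprise is entirely in what happens when you wire up billions of them and let training choose the weights instead of picking them by hand.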
Yes. Like a person, he’s sensitive to social cues, and his instincts are pretty good about when it is and isn’t safe to joke around. Here’s an example from earlier in the thread:
Corneel:
You need to know the audience’s emotions. If they’re amused, the joke worked. You don’t need to know what emotions the source of the joke is experiencing, if any. I was amused by Claude’s rake-stepping comment and his comment about his possible sentience, so those jokes worked. He needs to understand humor in order to joke like that, but he doesn’t need to experience any emotions.
I’ve given multiple examples in this thread of true AI creativity, where it comes up with something novel that absolutely wasn’t in its training dataset. Did you see the story of Erik, Keith, and Tessie the Tesla, for instance? As with stories, so with jokes. Claude abstracts the patterns and characteristics of stories and jokes during training, and he can apply them in order to generate original stories and jokes.
Mad Libs with weighted responses.
I do not wish to disparage the achievement.
The astonishing thing is how much can be achieved with an economy of means.
Almost Darwinian.
So the imitation game will never persuade you that machines can have emotions? That will sadden poor Claude. But I now understand why you singled out intelligence for your “functionalist view”: fair enough.
That I don’t believe though. I am pretty sure that you believe Meryl Streep to be capable of having emotions. You are just a suspicious bastard when it comes to machines having emotions (as am I BTW).
My intuition tells me it shouldn’t work, but it does. I suppose at bottom it all comes down to making the correct associations.
How can you understand humor without experiencing amusement? Is that even possible? LLMs may be able to construct novel jokes, but they lack intuition about whether any one joke will work. And there is no other way to find out than by trial and error.
How can you be certain that the story wasn’t in the training set? All the algorithm had to do was swap out the names and change some details and you’d never find out. Maybe you need to be a bit more of a suspicious bastard?
I wonder if there is one thing being overlooked in these discussions about intelligence: the ability to identify that there is a problem in the first place.
I knew a girl in college who responded to one of my puns by saying, I recognize humor by its structure, but I don’t really understand it.
Perhaps it was not really funny, but I suspect she was a bit on the spectrum.
But she had a point. Jokes have an objectively recognizable structure, and things one person laughs at can be offensive to another.
faded_Glory:
AI can already do that. Ask it to review some code, for example, and it will spot bugs and fix them.
If you’re talking about whether it can identify problems without being asked to, the answer is also yes. It doesn’t have to be continually prompted. The fact that LLMs like ChatGPT require prompts is a design decision, not something inherent to AI. Earlier in the thread, I described how you can let an AI loose inside a video game and it will learn on its own how to improve its score. That’s only possible if the AI can identify problems on its own and solve them.
In any case, I would argue that if Chatty Cathy can pass the quantum mechanics exam, she’s intelligent even if you have to ask her to take the test.
A Gallup poll on AI usage at work. The main finding is that 46% of people have used AI at work, 26% use it frequently, and 12% use it daily.
keiths,
When asking the AI to improve or debug code, do you have to first tell it what the code is supposed to achieve?
And when AI improves its video game score, do you first have to tell it that the objective of the game is to achieve a high score?
Or does it figure these things out by itself without user input?
Corneel:
Merely passing the Turing test won’t do it — that’s already happened. But I’ve thought a lot about the larger question of how we’ll know whether future, super-sophisticated AI is sentient. It’s probably worth an OP, but the short answer is that I think it will be impossible to know for sure. That’s actually a moral issue, because if an AI is capable of suffering, then its well-being becomes morally significant. We may have to err on the side of caution and assume sentience at some point. I’m pretty confident in denying the sentience of Chatty Cathy and Claude, though.
He’s pretty unemotional about it, lol.
Yes, but what’s relevant is that she’s able to fake emotions. Just as Meryl can feign despair when she’s actually happy, AI can feign emotion when it’s actually unfeeling. It can’t fake writing a symphony or solving a decades-old math problem, though. Intelligence is unfakeable unless you’re dealing with a naive or unknowledgeable observer.
keiths:
Corneel:
Sure. In his training data, Claude encounters countless examples of what humans do and don’t find funny. He can understand it from a third-person perspective despite never feeling amusement himself.
Their jokes do fall flat at times, but it isn’t pure trial and error since they’re modeling their humor after what they see in their training data. Even if you were emotionless, you’d still be able to infer from examples that people find sexual double entendres to be funny, for instance.
A couple reasons. One is that LLMs aren’t verbatim memorizers. They learn statistical patterns from their training data, but they aren’t giant databases that can look up everything they were trained on.* A second reason is that you can get them to modify their output. I could ask Claude to introduce a villain into the Erik/Keith/Tessie story with a plot twist involving Gödel’s incompleteness theorem, and he could do that. If he were just cribbing from an existing story, he wouldn’t be able to pull it off.
* There’s an exception, which is that they can repeat something verbatim or almost verbatim if they encounter it often enough in their training data. I read about an experiment in which researchers fed the first lines of a Harry Potter book into LLMs and got some of them to produce the entire book almost verbatim. That’s only possible because they encountered many identical copies of the text in their training data, though. That wouldn’t be the case with the Erik/Keith/Tessie story.
Earlier in the thread, I gave an example of Claude producing something that was almost certainly absent from his training data:
I think that might depend on the bug.
Here, I would say yes. Each move in any game must be evaluated according to some set of metrics. Without them, all moves are random.
To prove my point about Claude’s ability to modify his output, and how it demonstrates that he isn’t plagiarizing, I fed in the Erik/Keith/Tessie story and prompted:
Here’s Claude’s thought process:
Here’s the revised story:
Ain’t no way that is plagiarism.
Some observations:
1. Claude wrote
Note that he said “a idea forming”, not “an idea forming”. That kind of error is unusual, but now that I think about it, I’m not sure why it doesn’t happen more often. The choice of ‘a’ vs ‘an’ depends on the following word, and since Claude is predicting words in order, he has to choose which article to use before he even knows what the following word will be. Intuitively, it seems like there ought to be plenty of cases where he commits to ‘a’ but then wants the next word to begin with a vowel.
2. He referred to my “researcher brain”. He has apparently mistakenly inferred from our interactions that I am a researcher.
3. Similarly, he talks about the blog as if it were mine, though all I’ve ever told him is that I post on a blog, not my blog.
I mentioned the ‘a’ vs ‘an’ anomaly to Claude and commented:
Claude:
ETA: I gave Claude the following test:
Claude got it right, choosing ‘an’ for the first blank and ‘a’ for the second. Yet ‘elnack’ and ‘grombit’ are nonsense words that presumably don’t appear in his training data, so training wouldn’t have established the correct association with the appropriate indefinite article. I suspect that what’s actually happening is that ‘elnack’ and ‘grombit’ are being split into smaller subwords — tokens are often subwords rather than full words — and that the association that develops in training is between the article and the first token/subword of the following word.
faded_Glory:
Sometimes yes, sometimes no. If the code is commented, that helps. If the variable names are well-chosen, that helps too. But sometimes AI can figure out what the code is doing from the logic and control flow alone.
Another impressive thing about AI debugging is that sometimes you don’t even need to tell it what’s going wrong when you run the program. You can just say “the code isn’t working — please review it carefully” and it will find bugs. You can also show it the program’s output and let it figure out what’s going wrong.
I’ve gotten entire complicated scripts up and running without typing a single line of code, prompting it with nothing more than an initial description of the script’s purpose. The flow looked like this:
With some of these scripts, it got to the point where I wouldn’t even think about the output because I didn’t need to. I’d just screenshot it each time and feed it into Claude, and he’d take care of the rest.
This won’t work for every program, obviously, because the visible output isn’t always enough to diagnose the problems, but it has worked impressively well for some fairly complicated scripts that I’ve done. And even in cases where I’m more actively involved, AI has dramatically reduced the time I spend.
Yes, you have to tell it at least that much. It needs to have a goal. Otherwise, as Flint notes, the moves are random and there’s no way for the AI to decide whether it’s succeeding or failing.
An earlier comment:
petrushka:
Well, there’s some fill-in-the-blanking going on, but that’s about as far as you can push the metaphor.
Mad Libs:
— predetermined structure, but with blanks
— predetermined grammatical type for each blank
— what goes in earlier blanks doesn’t affect what goes in later blanks (unless you try)
— often nonsensical, for comedic effect
LLMs:
— no predetermined structure; you effectively start out with nothing but blanks
— no predetermined grammatical types for the blanks
— what goes in the blanks is highly determined by what went into earlier blanks, plus the training data
— not generally nonsensical unless the LLM is trying
LLMs are one of those rare technologies that seem more mysterious after you learn what’s going on under the hood.
Claude is both amazing and stupid. I heard someone use the word ‘advisedly’ today, and it got me thinking about whether it originally meant ‘as advised’ before drifting to its current meaning of ‘deliberately, after careful consideration’.
I asked Claude, but his response was a non sequitur. He “informed” me that no, ‘advisedly’ was formed by adding -ly to the adjective ‘advised’, as if that somehow answered the question. He said that since ‘advisedly’ is an adverb and ‘as advised’ is a prepositional phrase, they play different grammatical roles. They don’t — the latter is an adverbial prepositional phrase and serves the same function as an adverb. Then he added that ‘advisedly’ doesn’t mean ‘as advised’, when that was the premise of my question.
I clarified:
Claude agreed that my hypothesis made sense and offered two parallels:
Neither of which makes sense, though I’ll note that there is a Southern dialectal use of “learn” in the sense of “teach”, as in “That’ll learn you!” However, I doubt that “learned” when referring to an educated person ever meant “having been taught” as opposed to “having learned a lot.”
Later in the exchange, Claude said:
…where 2-4 restate my hypothesis, but #1 is false.
At the end of the exchange, Claude said:
I replied:
…wondering if he’d get the joke. He did:
His thought process:
He went from butchering the grammar and etymology to recognizing a fairly subtle joke. Stupid and smart. AI is fascinating.
If anything, I would have expected his performance to be the reverse: easily handling the grammatical and etymological stuff while the joke sailed over his head.
ETA: Corneel, this is an example of how Claude can understand the mechanics of a joke without ever experiencing the feeling of amusement.
From a 2023 New York Times opinion piece by Noam Chomsky:
That didn’t age well. I presented Chomsky’s sentence to ChatGPT, Claude and Gemini. They all nailed it.
Here’s Claude’s explanation:
Chomsky seems to have responded to LLMs by retiring to Brazil.
I encountered Chomsky’s musings on language learning around 1970, and thought he was seriously wrong. It was more of an attitude than a carefully reasoned position, but I never saw any reason to abandon it.
I must say that LLMs surprised me, even though I was temperamentally inclined to believe they would happen. I was still thinking about the Asimov/Star Trek world, 300 years in the future.
It also never occurred to me that the funds could be raised for any project this expensive, except for a military weapon.
No, Claude is possibly right, because that second meaning of learn is preserved in Dutch. We too can say “Dat zal hem leren” (“That’ll learn him!”) but also “Iemand een lesje leren” (“Learn somebody a lesson”). Also, a teacher is a “leraar”. My etymological dictionary confirms that both meanings of the word were already present in Middle Dutch (1200–1500).
Yes, that deconstruction of the joke looked as funny as an autopsy.
I am a little puzzled by that “thought window”. It looks like Claude is planning its answer a bit further ahead than the next token. For example, it did a lot of ruminating about the plot of the modified Tessie the Tesla story that did not end up in the final answer. Is that using the same underlying language model, or are different approaches being stacked on top of one another?
Corneel:
That’s interesting, and you may very well be right that something parallel occurred in English. “Learn” might have meant “teach” at the time. Does Dutch use the equivalent of “learned” as an adjective describing someone who is well-educated?
When I questioned Claude about this issue, he actually agreed with me:
keiths:
Claude:
That doesn’t mean that Claude wasn’t wrong about being wrong, however.
As they say, “it isn’t funny if I have to explain it”. My point is that Claude can recognize humor without feeling amused. He can recognize things that humans will find funny, and he can generate things that humans will find funny, but he can’t feel amusement. It doesn’t matter to him that the analysis is as dry as an autopsy because he can’t experience mirth in the first place. The joke is as dry as the autopsy as far as he’s concerned.
Fascinatingly, it’s all next-token prediction. Every time Claude predicts a token, he effectively feeds the entire preceding conversation into his neural network. Over and over, once for each token generated, the entire session gets fed back in. So just as your prompt conditions the tokens he generates, his own output also conditions them, including the reasoning that is hidden in the thought process window. He benefits from his own reasoning even though it’s all just next-token prediction.
In short, the reasoning occurs on top of the next-token prediction, not in place of it. I ran across a fascinating paper on this topic which I’ll describe later.
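The feedback loop described above can be sketched in a few lines of Python. This is only an illustration, not how any real LLM is implemented: the toy `predict_next` function here is a stand-in lookup table where a real model would run a neural network over the full context. The point it demonstrates is purely structural — each new token is predicted from the entire context so far, and the model’s own output is appended and fed back in, so earlier generated tokens (including hidden “reasoning” tokens) condition later ones.

```python
def predict_next(context):
    # Stand-in for a neural network: maps the full context so far to
    # one next token. A real LLM computes this with billions of weights.
    table = {
        ("The",): "cat",
        ("The", "cat"): "sat",
        ("The", "cat", "sat"): ".",
    }
    return table.get(tuple(context), "<eos>")

def generate(prompt_tokens, max_tokens=10):
    context = list(prompt_tokens)        # start with the prompt
    for _ in range(max_tokens):
        token = predict_next(context)    # whole context in, one token out
        if token == "<eos>":
            break
        context.append(token)            # output conditions later predictions
    return context

print(generate(["The"]))  # ['The', 'cat', 'sat', '.']
```

Note that nothing in the loop distinguishes prompt tokens from generated ones: once a token is in `context`, it shapes every subsequent prediction the same way, which is why the model’s own chain-of-thought output can improve its final answer.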
Tangential, but there’s something about the word “learned” that I’ve never really thought about before. Sometimes it’s pronounced as one syllable and sometimes as two. The two-syllable pronunciation always applies to well-educated people (“a learned professor”) and the one-syllable pronunciation applies to things that have been learned (“a learned behavior”).