Is AI really intelligent?

I think a thread on this topic will be interesting. My own position is that AI is intelligent, and that’s for a very simple reason: it can do things that require intelligence. That sounds circular, and in one sense it is. In another sense it isn’t. It’s a way of saying that we don’t have to examine the internal workings of a system to decide that it’s intelligent. Behavior alone is sufficient to make that determination. Intelligence is as intelligence does.

You might ask how I can judge intelligence in a system if I haven’t defined what intelligence actually is. My answer is that we already judge intelligence in humans and animals without a precise definition, so why should it be any different for machines? There are lots of concepts for which we don’t have precise definitions, yet we’re able to discuss them coherently. They’re the “I know it when I see it” concepts. I regard intelligence as one of those. The boundaries might be fuzzy, but we’re able to confidently say that some activities require intelligence (inventing the calculus) and others don’t (breathing).

I know that some readers will disagree with my functionalist view of intelligence, and that’s good. It should make for an interesting discussion.

297 thoughts on “Is AI really intelligent?”

  1. Flint: No, that title went to the HP FOCUS, released in 1982. The Motorola 68020 was released in 1984. Intel’s 386 came out in 1985.

    I’m not sure we are entirely there yet. My reading is that AI-driven cars tend to react properly more often than people do in most “normal” edge cases, but the worst results happen when those cars do “dumb” things people would never do.

    Here’s a scenario for you: You’re driving down a 2-lane highway and an oncoming car swerves into your lane, threatening a head-on collision. What should you do? Swerve left to take his lane and stay on the road, or swerve right into the ditch? In practice, most people take his lane to stay on the road, only to have the oncoming driver realize what he’s doing and jerk back into his lane – and right into you! This happens enough so that there is legal precedent – if you try to take his lane, you are responsible for the collision (since it happened in his lane). I wonder what an AI-driven car would do; most likely it is programmed to drive off the road to the right.

    Tesla owner posts video of full self-driving mode in action; avoids head-on crash

    Recorded instances are rare.

    But several things need to be said: the car can see in all directions and can quickly evaluate possible escape routes. The specific options are not programmed. The scenarios are trained, and no one can predict the actual action taken.

    Another point: the cars are constantly evaluating distant objects, and in actual cases, avoid getting into desperate scenarios. There are dozens of videos of situations that could be tragic, but are avoided so smoothly that humans may not even realize the problem.

    Then there are scenarios where no effective action is possible. I took a defensive driving course some years ago, and we were told to avoid head-on collisions at all costs, even if it meant steering into a solid object.

    Simple crash avoidance systems have been around for a while. Statistically, they are much better than humans. AI is better, and it is improving quickly.

  2. One other thing: Tesla has been updating software frequently this year. They are able to take incidents from beta testers and distribute updates in a week or two.

    I’m aware of one recent head-on collision between a Tesla truck and a BMW driving on the wrong side at high speed. Only ten percent of Tesla owners have FSD, and not everyone has it activated all the time.

  3. I noticed something interesting. If you look at the initial, fully randomized noise at the beginning of the sequence above, there happens to be a dark patch, which I’ve circled here:
    Comfy-UI-00470-dark-spot-Phone

    Her eye ends up developing in that spot. You can tell it’s the same spot by noting the distinctive yellow squiggle that’s above it in both of these images:
    Comfy-UI-00470-Phone
    Comfy-UI-00455-Phone

    That’s interesting, because knowing how diffusion models work (which I’ll explain in a future OP), I can see how it would be tempted to put a dark feature in a spot that was already dark in the original random noisy image.

    Is that what’s going on here? I don’t know, but perhaps I’ll do some experiments to see if I can doctor some original pure noise images in order to coax the model into putting features at predetermined locations.
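
    For anyone curious about the mechanics, here’s a rough sketch of what that doctoring might look like. This is just an illustration, not my actual workflow: it assumes a Stable-Diffusion-style 4-channel latent and PyTorch, and the region, shift amount, and shapes are made-up placeholders.

        import torch

        # Hypothetical sketch: build the initial latent noise and nudge one patch
        # of it, to see whether the model later places a dark feature (e.g. an
        # eye) there. Shape assumes a 512x512 image with a 4-channel, 1/8-scale
        # latent; adjust for whatever model you're actually running.
        generator = torch.Generator().manual_seed(470)
        latent = torch.randn(1, 4, 64, 64, generator=generator)

        # Shift an 8x8 region of the latent downward. Whether that decodes to a
        # visibly dark patch depends on the VAE, so the offset would need tuning.
        y, x, size, shift = 20, 36, 8, -1.5
        latent[:, :, y:y + size, x:x + size] += shift

        # Save it and feed it in as the starting noise; how you do that depends
        # on your tooling.
        torch.save(latent, "doctored_latent.pt")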

  4. keiths:
    Flint,

    I get all that, but what I didn’t get was this. You wrote:

    Why would they believe that an OS wouldn’t be possible? If writes to the MSW (that’s what it was called on the 286) are privileged, then an OS can control the transitions to real mode and prevent user programs from doing it on their own.

    As I mentioned, I saw OSes do it myself while I was debugging processor problems.

    I read that the designers of the 286 were concerned that a malicious program able to get control in real mode could rewrite the interrupt and other tables, or abuse call gates, or otherwise trash (or even become) the OS. Which is why the BIOS had to go through all that exercise to test memory and support VDisk via the 8042.

    I understand that you have seen protected mode OSes switch to real mode, but I’m pretty sure that that wasn’t possible on the 286. If the OS running on the 286 could “transition to real mode” the BIOS could have done it also.

  5. keiths: No, he’s talking about the rather obvious difference between a) a human selecting the target and b) the drone selecting the target.

    Obviously there is no such difference.

    There was an instance in the Ukraine war when dormant AI-driven drones were transported (by unsuspecting Russians) close to several targets; then, at a given moment, the drones broke out of their packaging and started flying around. The moment was pre-programmed – the same moment for the entire fleet of drones. The target areas were pre-determined coordinates, and upon arrival the drones had to identify specific military airplane-like, tank-like and other such objects to detonate themselves on. This is as close as drones have come to “selecting targets themselves” – frankly not at all. And if it were any other way, it would be a scandalous war crime.

    It is astonishing how little both of you know on this topic. Seems like you have been through intensive unlearning courses and excelled at that.

    petrushka: literally everything you say about self-driving is catastrophically wrong. You swallow market hype uncritically and you are not allowing real-life user feedback to correct you.

  6. Erik:
    It is astonishing how little both of you know on this topic. Seems like you have been through intensive unlearning courses and excelled at that.

    I’m reminded of the joke of the lady watching the parade and noticing that “everyone in the whole parade is out of step except my son – and that includes the drummers!” I guess nobody but Erik can see the obvious – even those who have long professional careers in the discipline!

    So we’re back to Dawkins:

    No evidence, no matter how overwhelming, no matter how all-embracing, no matter how devastatingly convincing, can ever make any difference.

    In the face of this position, even Dawkins was helpless.

  7. Flint,

    If you got something to refute then why don’t you? Because you got nothing, that’s why.

    keiths and petrushka have abandoned their expertise, if they ever had any in the first place. They don’t know what simulation is, they don’t know what software is, and, as it turns out, they also don’t know what hardware is. They don’t know how any of these things work, either in broad principle, technically, or legally. Do you? Can you post a fact for a change? For now, I’m the only one who cited actual facts in this thread instead of blather.

  8. Erik:

    The target areas were pre-determined coordinates, and upon arrival the drones had to identify specific military airplane-like, tank-like and other such objects to detonate themselves on. This is as close as drones became to “selecting targets themselves” – frankly not at all.

    Identifying potential targets and then picking one to go after is selecting a target.

    It is astonishing how little both of you know on this topic. Seems like you have been through intensive unlearning courses and excelled at that.

    It is astonishing to me that you can’t grasp petrushka’s simple point. Here are two scenarios that illustrate the difference:

    Scenario #1:
    You’re a soldier in combat. You see a squadron of enemy tanks approaching. You select a tank, point your Javelin at it, and fire. The Javelin hits the tank you selected.

    Scenario #2:
    You’re a soldier in combat. You launch an autonomous drone that has instructions to fly to a predetermined point and loiter. While it is loitering, a squadron of enemy tanks enters its field of view. It selects a tank, flies to it, and detonates.

    In scenario #1, the soldier selected the target. In scenario #2, it was the drone that selected the target. The soldier didn’t select a target, because he didn’t know what the available targets were or would be. He was depending on the drone to select a target, which it did.

    Here’s an analogy. The commander of a squadron of A-10s gets a radio call. Some ground forces are pinned down near Kandahar. The commander sends a pilot to that location to provide close air support. The pilot flies to that location, selects a target on the ground, and fires at it.

    In that scenario, who selected the target? Was it the squadron commander, or the pilot? It was the pilot, obviously. The squadron commander’s role was to give instructions to the A-10 pilot. The pilot’s role was to fly to the combat zone and select and destroy targets.

    The squadron commander is analogous to the soldier who launched the drone, and the pilot and his aircraft are analogous to the drone. It was the latter who selected the targets.

  9. Erik:

    keiths and petrushka have abandoned their expertise, if they ever had any in the first place. They don’t know what simulation is, they don’t know what software is, and, as it turns out, they also don’t know what hardware is. They don’t know how any of these things work, either in broad principle, technically, or legally. Do you? Can you post a fact for a change? For now, I’m the only one who cited actual facts in this thread instead of blather.

    You crack me up, Erik.

    Are you ever going to answer my question?

    You wrote:

    Therefore let’s define self-driving: It’s a car driving around without a human driver. It’s the way Waymo does it.

    I asked:

    Thank you for confirming it. That’s real driving, Erik, and Waymos do it. If the driving that Waymos do is real driving, then why is the story-writing that AIs do only simulated writing?

  10. Erik:
    Flint,

    If you got something to refute then why don’t you? Because you got nothing, that’s why.

    “No evidence, no matter how overwhelming, no matter how all-embracing, no matter how devastatingly convincing, can ever make any difference.” At first, I thought Dawkins was exaggerating. You have proved him right.

  11. Flint:

    I read that the designers of the 286 were concerned that a malicious program able to get control in real mode could rewrite the interrupt and other tables, or abuse call gates, or otherwise trash (or even become) the OS. Which is why the BIOS had to go through all that exercise to test memory and support VDisk via the 8042.

    But they made writes to the MSW privileged in protected mode, so there is no danger of rogue programs trashing the IDT or other critical structures. The OS has full control of what code does and doesn’t get to run in real mode, and it can limit that access to trusted code such as legacy device drivers.

    It’s analogous to CPL 0. The OS has control of who gets to run at that privilege level, so there’s no danger of user programs mucking with sensitive structures like the IDT, the page tables, or the hardware itself.

    My point is that if you make MSW writes privileged, which is what the 286 architects did, then there is no reason to block the processor from entering real mode when PE is cleared. That’s why the 386 and beyond permit it.

    I understand that you have seen protected mode OSes switch to real mode, but I’m pretty sure that that wasn’t possible on the 286. If the OS running on the 286 could “transition to real mode” the BIOS could have done it also.

    Right. The 286 couldn’t do it, and that was a major architectural flaw. I’m just questioning the reason for that architectural mistake, since the architects were aware that writing to the MSW was a privileged operation. My best guess is that they didn’t think that re-entering real mode would ever be necessary, not anticipating that real mode would be needed to handle legacy device drivers and certain BIOS calls from legacy programs.

  12. petrushka:

    Beyond Weasel?

    [A note for anyone who is unfamiliar with ‘Weasel’. Weasel was a toy program written by Richard Dawkins to demonstrate the basic evolutionary principle of random variation and natural selection. It engendered some lively discussion between us and proponents of Intelligent Design, both here and at William Dembski’s Uncommon Descent blog.]

    Interesting question. There are some parallels and some disanalogies. Let me think out loud.

    Targets:

    Weasel:
    The phrase “Methinks it is like a weasel”.

    Evolution:
    No particular target other than improved survival and reproductive success. Improved fitness, in other words.

    Diffusion models:
    An image that could plausibly have come from the training dataset, based on its statistical characteristics, but didn’t. It’s not a fixed target, but it is similar to evolutionary fitness in that images in some portions of the target space are “fitter” than others.

    Mutations:

    Weasel:
    Random changes to the characters in the current phrase.

    Evolution:
    Random changes to the genome.

    Diffusion models:
    Directed mutations. Partly random, but biased in the direction of “fitness” by the output of the neural network.

    Selection:

    Weasel:
    The phrases that are “fittest”, meaning that they match the target phrase most closely, are selected to generate offspring, which are then mutated themselves, and the process repeats.

    Evolution:
    Pretty much the same as Weasel, which was why Dawkins wrote the program. Just that fitness is defined more broadly in nature.

    Diffusion models:
    No selection at all, and no reproduction. At each stage, you get what you get, and the neural network points you in the right direction to “move” your current working image in image space in order to get a better one.

    None of the three — Weasel, evolution, or diffusion models — do what became known in the discussion as “explicit latching”. That is, they don’t lock changes into place in order to prevent further mutations from “undoing” the beneficial ones. That was a hot topic in the Weasel discussion, because the IDers were erroneously convinced that Weasel cheated by latching, which is something that doesn’t happen in nature.
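
    For anyone who wants to see just how simple Weasel is, here’s a bare-bones Python sketch of a Weasel-style run. It’s my own toy version, not Dawkins’s actual code. Note that nothing is latched: every character of every offspring is free to mutate, including the ones that already match the target.

        import random

        TARGET = "METHINKS IT IS LIKE A WEASEL"
        ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
        POP_SIZE = 100        # offspring per generation
        MUTATION_RATE = 0.05  # chance that any given character mutates

        def fitness(phrase):
            # Number of characters that match the target
            return sum(a == b for a, b in zip(phrase, TARGET))

        def mutate(phrase):
            # Every position can mutate, even correct ones; no latching
            return "".join(
                random.choice(ALPHABET) if random.random() < MUTATION_RATE else c
                for c in phrase
            )

        parent = "".join(random.choice(ALPHABET) for _ in TARGET)
        generation = 0
        while parent != TARGET:
            generation += 1
            offspring = [mutate(parent) for _ in range(POP_SIZE)]
            parent = max(offspring, key=fitness)  # keep the fittest offspring
        print(f"Reached the target in {generation} generations")

    Correct letters can and do get “un-matched” along the way; they just tend not to survive selection, which is the whole point of cumulative selection.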

  13. The latching business was pretty funny, so I went back and googled parts of the discussion. I think my favorite bit was when the inimitable kairosfocus, having been shown that Weasel didn’t latch, insisted that it was “implicit quasi-latching”.

    Good times.

    ETA: kairosfocus is still at it, using the same turgid prose we found so funny:

    Yes, yes, we know. It is all too easy to play the trifecta fallacy game: a red herring side-tracks us, led away to a strawman soaked in ad hominems and set alight, clouding, confusing, poisoning and polarising the atmosphere, thus frustrating sober, responsible discussion…

    When you look at your example of evolving an image, consider that self-driving computers do not have to be explicitly programmed. The actual process of training requires one of the largest supercomputers in existence, and the largest dataset in existence.

    Over the course of twelve years, the set of training scenarios has been refined. It started with billions of miles of actual driving data. I’ve read that the current training data is synthetic, not because the situations are too complex, but because actual human drivers are too sloppy.

    Driving is a bit like Douglas Adams’ definition of flying: aim for the earth and miss.

    The critical part of driving is to aim for the destination, and avoid crashing.

  15. keiths:
    My best guess is that they didn’t think that re-entering real mode would ever be necessary, not anticipating that real mode would be needed to handle legacy device drivers and certain BIOS calls from legacy programs.

    I think you’re pretty much correct here. It seems clear to me that the 286 architects were clueless about the POST, but I think it went well beyond that. My reading (long ago, but written by one of the 286 team) was that they figured the 286 would come out of reset and the OS would take control immediately. No POST, no DOS, no legacy programs, no device drivers not written for this hypothetical OS.

    This isn’t a stupid or far-fetched picture – it’s pretty much what linux does. A dedicated linux PC has only a tiny ROM that knows little more than how to load sector 0 from the disk. The sector 0 code then loads a few more sectors in real mode, enough code to build the required code, data, and interrupt descriptor tables, hooking interrupt entries to protected mode drivers, after which it’s all protected. So there are no backward compatibility issues, no legacy drivers, no stupid software tricks like we discussed earlier. My linux experience is limited to boot ROMs, so I may have the rest of this wrong…

  16. petrushka:

    When you look at your example of evolving an image, consider that self driving computers do not have to be explicitly programmed.

    I was thinking recently about an alternate universe in which we somehow didn’t know that human cognition was based on biological neural networks. Would we have stumbled upon the neural network architecture as a way of building AI, or did we absolutely need the hint from nature? Are there other ways of implementing robust machine learning that don’t depend on neural networks or something mathematically equivalent, like transformers? Where would we be today if Minsky and Papert hadn’t proven the limitations of perceptron networks, thus putting the brakes on the field, or if someone had invented back propagation (the algorithm that allows deep networks to learn) sooner?
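
    (A tangent: backpropagation itself is surprisingly compact. Here’s a toy sketch of my own, unrelated to any production system: a tiny two-layer network learning XOR by propagating errors backward through its layers.)

        import numpy as np

        # Toy backpropagation demo: a 2-input, 8-hidden-unit, 1-output network
        # learning XOR with plain gradient descent. Illustration only.
        rng = np.random.default_rng(0)
        X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
        y = np.array([[0], [1], [1], [0]], dtype=float)

        W1 = rng.normal(size=(2, 8)); b1 = np.zeros((1, 8))
        W2 = rng.normal(size=(8, 1)); b2 = np.zeros((1, 1))
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

        for step in range(20000):
            # Forward pass
            h = sigmoid(X @ W1 + b1)
            out = sigmoid(h @ W2 + b2)
            # Backward pass: push the output error back through each layer
            d_out = (out - y) * out * (1 - out)
            d_h = (d_out @ W2.T) * h * (1 - h)
            # Gradient-descent updates
            W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(0, keepdims=True)
            W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(0, keepdims=True)

        print(out.round(2).ravel())  # should end up near [0, 1, 1, 0]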

  17. I’ve lost track of what I’ve posted here, but I’ve been thinking about variations in human intelligence for fifty years.

    In evolution, taking a path can preclude alternative paths. Humans are unlikely to develop wings.

    I’ve wondered if humans taking certain paths in early learning are precluded from becoming proficient in some tasks. And vice versa. There is the somewhat disturbing possibility that biological evolution could predispose individuals to certain paths.

    I’m not a big believer in “g”. I think g is academic proficiency, and our world favors that. But I’m thinking it’s possible to be born with greater or lesser propensity toward a set of skills, and life experience amplifies initial conditions.

    Not unlike your image evolver amplifies variations in noise.

  18. petrushka:

    I’ve wondered if humans taking certain paths in early learning are precluded from becoming proficient in some tasks. And vice versa.

    I once read that learning to read co-opts neural circuitry that is used for other purposes in illiterate people. Presumably that means that illiterate people are better at certain things because that portion of their neural circuitry hasn’t been hijacked.

    That’s a specific example, but it isn’t hard to believe that it’s a more general phenomenon, so that certain learning paths preclude others or make them more difficult by virtue of the fact that they’re recruiting neural circuitry that could otherwise be employed differently.

    There is the somewhat disturbing possibility that biological evolution could predispose individuals to certain paths.

    Isn’t that a given? Most of what we do is either selected for specifically, or is a byproduct of something else that was selected for.

    I’m not a big believer in “g”. I think g is academic proficiency, and our world favors that.

    Except that performance is correlated across a wide range of seemingly unrelated skills, some of which aren’t academic. Think Raven’s progressive matrices, for instance.

    But I’m thinking it’s possible to be born with greater or lesser propensity toward a set of skills, and life experience amplifies initial conditions.

    Definitely. Twin studies demonstrate both the heritability of g and of specific skills, like mathematical ability.

  19. keiths:
    That’s a specific example, but it isn’t hard to believe that it’s a more general phenomenon, so that certain learning paths preclude others or make them more difficult by virtue of the fact that they’re recruiting neural circuitry that could otherwise be employed differently.

    You seem to be rediscovering the old saw that as the twig is bent, the tree is inclined. Or, give me the child until age 6 and I will give you the man. I think there’s no question that the entire nature of education and training is a tacit recognition that neural circuitry can be purposed more or less permanently.

    (As a footnote, I notice that all the guitar gods began playing early in childhood, none of them past puberty. I tried to learn guitar in my 60s, and with the help of a childhood playing musical instruments, I made it all the way to intermediate level before arthritis made further progress impossible, but to be honest I never would have become much better despite 4 hours of practice a day.)

  20. In an earlier comment I explained why AI video generators can’t generate frames sequentially but instead have to do them all at once, which limits the length of the clips. That presents problems when trying to stitch clips together, and the problem I described in that comment is that motion won’t necessarily match between a clip and its successor, so that the “seam” is visible because the motion changes noticeably. I talked about some workarounds in that comment.

    Thinking about it some more, there are other difficulties. Suppose you’re doing a scene where someone enters their boss’s closed-door office, talks to the boss, and then leaves. You see them outside in the cubicle area first, then you see them in the office. Suppose your first clip ends there. Now you have to generate a second clip showing them leaving the office and returning to the cubicle area. How do you guarantee that the cubicle area looks the same on the way out as it did on the way in? When it’s generating the second clip, the AI has no idea what that area is supposed to look like, other than what is specified in the prompt, and there are many different ways of rendering that area to match the verbal description in the prompt. Odds are that the cubicle area will look quite different in the second clip.

    Another problem is that if you’re generating the second clip using the final frame of the first clip as your starting point, there’s an underdetermination problem. The information in a 2D image isn’t usually sufficient to imply the correct 3D situation, so while the second clip will match the first clip right at the seam, it might look quite different as the frame sequence proceeds. I’ll explain with an example.

    In a 2D photo or image of a person’s face, you see them from one particular angle. Your view reveals a lot about the 3D structure of their face, but it doesn’t reveal everything. Some information is missing. For instance, if they’re facing the camera directly, you’re seeing their nose “edge on” and you can’t really tell what it’s going to look like when seen from the side. So when an AI is creating a video of them, it has to fill in the gaps and decide on a 3D structure that matches the 2D image.

    My experiment was to take this image* produced by Midjourney…
    midjourney-free-association-3-Phone
    …and generate a bunch of video clips from it on my home PC, using a prompt that would cause her to turn her head and show her profile to the virtual camera. I ended up with a whole range of profiles. Here is a sampling:
    profile-1
    profile-2
    profile-3
    profile-4
    profile-5
    profile-6
    profile-7
    They look like different people, but I’ll note that the videos all looked very smooth. In each case, the diffusion model picked a 3D structure that dovetailed perfectly with the 2D starting image. All quite natural-looking. One 2D image, many compatible 3D structures.

    The diffusion models are very good at maintaining a consistent 3D structure within a clip, but when you’re stitching two clips together by using the final frame of the first clip as the starting frame of the second clip, that 3D structure isn’t carried over. The model has to fill in the gaps again, and it might do so in a way that’s very different from the way it filled them the first time. You could start out with one of the women above in your first clip and end up with a different woman in your second clip.

    * It was one of the four images that Midjourney generated when I asked it simply to “Generate an image”, with no other instructions. An exercise in free association.
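
    For what it’s worth, the mechanical part of the stitching is trivial; everything hard is in the problems described above. Here’s a minimal sketch of the last-frame handoff, using OpenCV (illustration only, and the filenames are placeholders):

        import cv2

        def last_frame(video_path, out_path):
            # Save the final frame of a clip as an image, to seed the next clip.
            # Depending on the codec, seeking can be off by a frame or two.
            cap = cv2.VideoCapture(video_path)
            n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
            cap.set(cv2.CAP_PROP_POS_FRAMES, max(n - 1, 0))
            ok, frame = cap.read()
            cap.release()
            if not ok:
                raise RuntimeError(f"couldn't read the last frame of {video_path}")
            cv2.imwrite(out_path, frame)
            return out_path

        if __name__ == "__main__":
            # Feed seam.png to the image-to-video workflow as the starting image
            # of the next clip. The 3D structure, as noted above, won't carry over.
            last_frame("clip_01.mp4", "seam.png")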

  21. petrushka:

    I’m not a big believer in “g”. I think g is academic proficiency, and our world favors that.

    The consensus in the field is that g really does exist and is independent of academic proficiency. Here’s a relevant paper:

    Spearman’s g Found in 31 Non-Western Nations: Strong Evidence That g Is a Universal Phenomenon

    Spearman’s g is the name for the shared variance across a set of intercorrelating cognitive tasks. For some—but not all—theorists, g is defined as general intelligence. While g is robustly observed in Western populations, it is questionable whether g is manifested in cognitive data from other cultural groups. To test whether g is a cross-cultural phenomenon, we searched for correlation matrices or data files containing cognitive variables collected from individuals in non-Western, nonindustrialized nations. We subjected these data to exploratory factor analysis (EFA) using promax rotation and 2 modern methods of selecting the number of factors. Samples that produced more than 1 factor were then subjected to a second-order EFA using the same procedures and a Schmid-Leiman solution. Across 97 samples from 31 countries totaling 52,340 individuals, we found that a single factor emerged unambiguously from 71 samples (73.2%) and that 23 of the remaining 26 samples (88.5%) produced a single second-order factor. The first factor in the initial EFA explained an average of 45.9% of observed variable variance (SD = 12.9%), which is similar to what is seen in Western samples. One sample that produced multiple second-order factors only did so with 1 method of selecting the number of factors in the initial EFA; the alternate method of selecting the number of factors produced a single higher-order factor. Factor extraction in a higher-order EFA was not possible in 2 samples. These results show that g appears in many cultures and is likely a universal phenomenon in humans.
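
    To make the “shared variance” idea concrete, here’s a toy simulation of my own (not from the paper): a handful of test scores that each load on a single common factor plus noise. The tests all end up positively intercorrelated, and the first factor carries roughly half the variance, which is essentially what the EFA in the paper picks up as g.

        import numpy as np

        # Toy positive-manifold simulation: score on each "test" = loading * g + noise.
        # The loadings are made-up illustrative values.
        rng = np.random.default_rng(1)
        n_people = 5000
        g = rng.normal(size=n_people)
        loadings = np.array([0.8, 0.7, 0.7, 0.6, 0.6, 0.5])
        noise = rng.normal(size=(n_people, 6)) * np.sqrt(1 - loadings**2)
        scores = g[:, None] * loadings + noise

        # Every pairwise correlation is positive, and the first eigenvalue of the
        # correlation matrix (a crude stand-in for the first EFA factor) dominates.
        R = np.corrcoef(scores, rowvar=False)
        eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
        print(R.round(2))
        print("share of variance on the first factor:",
              round(eigvals[0] / eigvals.sum(), 2))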

  22. keiths:

    That’s a specific example, but it isn’t hard to believe that it’s a more general phenomenon, so that certain learning paths preclude others or make them more difficult by virtue of the fact that they’re recruiting neural circuitry that could otherwise be employed differently.

    Flint:

    You seem to be rediscovering the old saw that as the twig is bent, the tree is inclined. Or, give me the child until age 6 and I will give you the man. I think there’s no question that the entire nature of education and training is a tacit recognition that neural circuitry can be purposed more or less permanently.

    There’s a general decline in plasticity from childhood to adulthood, which is why adults find it harder to learn new things, but I’m talking about something other than that decline. I’m referring to the fact that neural circuitry that has been repurposed is no longer available to carry out its original function. If skill A and skill B are competing for neural resources, proficiency at one may come at the expense of proficiency at the other.

    (As a footnote, I notice that all the guitar gods began playing early in childhood, none of them past puberty. I tried to learn guitar in my 60s, and with the help of a childhood playing musical instruments, I made it all the way to intermediate level before arthritis made further progress impossible, but to be honest I never would have become much better despite 4 hours of practice a day.)

    Sorry to hear about your arthritis. Growing old sucks, doesn’t it?

  23. vid-00501-Phone

    I found that photo online during the “Trump Always Chickens Out” period. It’s one of my all-time favorite AI images, so now that I’m doing video generation on my home PC, I couldn’t resist using it as a starting image. The results:

    Trump in action

  24. keiths: There’s a general decline in plasticity from childhood to adulthood, which is why adults find it harder to learn new things, but I’m talking about something other than that decline. I’m referring to the fact that neural circuitry that has been repurposed is no longer available to carry out its original function. If skill A and skill B are competing for neural resources, proficiency at one may come at the expense of proficiency at the other.

    It’s not just about repurposing.

    Everything we learn about how the world works makes it more difficult to learn alternative schemas. Woody Allen made fun of this in “Sleeper” by depicting a future in which all the dietary recommendations are turned upside down.

    The serious point is, science requires reappraisals of everything, and individuals find this hard to do.

    “One death at a time…”

  25. petrushka:

    It’s not just about repurposing.

    I know, which is why, in the comment you just quoted, I made the point about plasticity. Even if repurposing didn’t happen, the inevitable reduction in plasticity would make it harder to unlearn things and learn new things in their place.

  26. An excerpt from a comment in another thread:

    Neil, to colewd:

    Can you learn the difference between “boarder” and “border”. I’m getting tired of seeing this mistake.

    Allan:

    Border. It’s a border.

    For fun, and because it’s an interesting AI exercise, I had ChatGPT write a Python script that counts the number of times colewd used the word ‘boarder’ in this thread. The total was 7, lol, and that’s just in this thread.

    Bill, on its own this isn’t a big deal, but it is annoying, and it speaks to your general inability to learn from your mistakes…

    I’m reposting it here because the script exercise was interesting. It took ChatGPT only four tries to get the script right, and one of those was my fault for not specifying that I wanted the script to count ‘boarders’ along with ‘boarder’. The entire exercise took less than ten minutes.

    Before the advent of sophisticated AI, I would never have bothered to write such a script. Learning the library calls alone would have taken more than ten minutes, and the whole project would have taken much longer. It wouldn’t have been worth it. With AI, it only cost me ten minutes of my time, and that made all the difference.
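
    I won’t reproduce ChatGPT’s exact script, but the core of it was something like the sketch below (a from-memory reconstruction that operates on a saved copy of the thread; the real version filtered to colewd’s comments, which depends on the page structure, so I’ve left that part out):

        import re
        import sys

        # Count whole-word occurrences of "boarder"/"boarders" in a saved copy of
        # the thread (HTML or plain text), passed as the first argument.
        pattern = re.compile(r"\bboarders?\b", re.IGNORECASE)

        with open(sys.argv[1], encoding="utf-8") as f:
            text = f.read()

        print(f"'boarder'/'boarders' count: {len(pattern.findall(text))}")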

    As a curious person, I find lots of questions popping into my head every day. Before AI, I would triage those, bothering to investigate only the ones that were particularly interesting and worth the investment in time. Now I can get answers easily, sometimes within a minute, from AI.

    Example: I was on US 101 the other day behind a semi that had just merged into traffic from an on-ramp. It was evidently fully loaded and taking forever to get up to speed, and I wondered “how underpowered would a typical sedan have to be in order to accelerate as slowly as a fully loaded semi?”

    Here is ChatGPT’s answer:

    Short version: to make a typical midsize sedan (approx 1,600 kg) accelerate as slowly as a fully loaded semi (approx 36,000 kg, 0–60 mph in 60-80 s), you’d only need around 7–10 kW at the wheels. That’s tiny.

    Why

    Time to 60 is set mostly by power-to-weight once you’re above a crawl. Using the energy method:

        \[ t \approx \frac{\tfrac{1}{2} m v^2}{P_{\text{avg}}} \]

        \[ m_{\text{semi}} \approx 36{,}000~\text{kg}, \quad t \approx 60~\text{s}, \quad v = 26.8~\text{m/s} \Rightarrow P_{\text{avg, semi}} \approx 220~\text{kW at the wheels.} \]

    Scale to a sedan:

        \[ P_{\text{avg, sedan}} = \frac{0.5 \, m_{\text{sedan}} \, v^2}{t} \]

    With a sedan mass of 1,600 kg and t = 60–80 s, that works out to roughly 7–10 kW at the wheels.

    Allowing ~15% drivetrain losses and the fact that you’re not always at peak power, you’d want roughly 12–20 kW (16–27 hp) at the crank.

    What displacement gives that? It depends on specific power (kW per liter):

    Conservative naturally aspirated (~35–50 kW/L): 0.25–0.6 L.
    Mildly tuned small NA (~60–70 kW/L): 0.2–0.3 L.

    Note: a high-revving 125 cc motorcycle engine can make 11 kW (15 hp), but car duty cycles and gear ratios are less forgiving, so think 0.3–0.5 L for a car-friendly tune.

    Bottom line: a midsize sedan with a ~0.3–0.5 L low-output engine (or with power restricted to ~15–20 kW) would feel as slow from 0–60 as a fully loaded semi.

    I checked with other AIs and they got the same answer. Curiosity sated in less than five minutes.
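
    And since it’s a two-minute exercise, here’s the same energy-method arithmetic as a few lines of Python, just to sanity-check the numbers:

        # Sanity check of the energy-method estimate above.
        v = 26.8                      # 60 mph in m/s
        m_semi, t_semi = 36_000, 60   # kg, seconds to reach 60 mph
        m_sedan = 1_600               # kg

        p_semi = 0.5 * m_semi * v**2 / t_semi     # ~215 kW at the wheels
        p_sedan_slow = 0.5 * m_sedan * v**2 / 80  # ~7.2 kW
        p_sedan_fast = 0.5 * m_sedan * v**2 / 60  # ~9.6 kW

        print(round(p_semi / 1000), round(p_sedan_slow / 1000, 1),
              round(p_sedan_fast / 1000, 1))
        # -> 215 7.2 9.6  (kW): roughly 7-10 kW at the wheels for the sedan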

  27. Some experiments with diffusion models. I found some collage faces online that I thought they might have trouble with, and used them as the starting images for video clips.

    consolidation-Phone

    I wondered if the model would treat this as a jumble versus recognizing that it was a person and consolidating the pieces into a normal face. I kept the prompt neutral — it was literally the single word ‘nothing’ — because I didn’t want to give any hints. I tried it five times, and the model always consolidated the pieces into a normal-looking face. Here’s an example video:

    Fragmented woman gets reassembled

    Next image:

    she-cries-Phone

    I wondered if the model would recognize this as a face, but do so without morphing it into something more humanlike. It succeeded. I could make ‘her’ sing, and she would blink her eyes realistically. I even asked her to cry, and she did so in a style that perfectly matched the style of the collage. That’s impressive:

    Ms CollageFace cries

    Next image:

    single-double-eyes-Phone

    This one is so clearly a person that I thought the model would irresistibly normalize her face. It usually did, but not always. Sometimes it maintained all four eyes and both mouths. Once I saw it consolidate the mouths but maintain all four eyes. Here’s an example where all of the eyes and mouths are maintained:

    Ms FourEyesAndTwoMouths smokes a cigarette

    ETA: These videos were all generated using Wan 2.2, the latest model from Tongyi Lab, a subsidiary of Alibaba.

  28. keiths:
    petrushka:

    I know, which is why, in the comment you just quoted, I made the point about plasticity. Even if repurposing didn’t happen, the inevitable reduction in plasticity would make it harder to unlearn things and learn new things in their place.

    I’m not disagreeing, but I don’t like the term plasticity. It implies a hardening of something and a slowing down of learning.

    I don’t think that describes what is happening. I prefer to think of learning as a kind of evolution, and taking a path precludes taking the alternate paths. You may converge on a similar solution, but maybe not as quickly or efficiently.

    Case in point. Children in multilingual homes use a common brain area to learn multiple languages. They do not need to translate.

    People who learn multiple languages later in life use multiple physical locations in the brain, and are not as adept at translation.

    I think that in addition to any biological advantages individuals may have that contribute to intelligence, early learning bends the twig. Twin studies are mostly flawed, but they document the possibility of at least a full SD of environmental influence. That’s the difference between having to work hard in college and getting a PhD.

  29. Screenshot-2025-11-11-213357-Phone

    Video of the outlet singing

    The story behind this one: I have a friend who was in a session with her new therapist, who didn’t know her well yet. My friend casually remarked that the wall outlets resembled faces — it’s pareidolia, and they look like faces to me, too. Her therapist asked, very seriously, “Do they speak to you?”

    We still laugh about that, so I had to make a video for her of an outlet that does speak and even sings. It’s interesting, because it demonstrates that the diffusion model itself also experiences pareidolia of a sort. It recognizes the “face”. It knows where the “eyes” and “mouth” are, and it isn’t thrown off by the absence of a nose. It put a red “tongue” inside of the mouth, and the eyes are quite expressive as the outlet sings. There’s some very abstract representation happening in that neural network.

    ETA: I should add that the prompt was simply “It starts singing.”

  30. The image above is very flat and 2D, so it’s not hard to animate once the model decides where the “mouth” and “eyes” are. I wanted to challenge it a bit more by giving it a photograph of a real outlet:

    Screenshot-2025-11-12-182011-Phone

    Similar to the above, the prompt was simply “they start singing”. The most common response was that the model recognized the rigidity of the plastic and didn’t deform the slots or the ground hole. It did recognize that the outlets weren’t attached to the plastic plate, and so it exploited that by moving them in and out. In this particular video, the model made the mouths “sing” by putting a flashing light within them. Pretty clever: move what physics allows you to move, and if you can’t move or deform something, animate it by making it flash:

    Rigid outlets “sing” by flashing

    In some attempts, the model decided to actually deform the slots and ground holes in order to make the outlets sing:

    Expressive eye and mouth movements

    In this video, someone hurts the upper outlet and makes it wince, and the lower outlet winces in sympathy (I have no idea where that came from. It wasn’t in the prompt):

    Wincing outlets

    ETA: No outlets were harmed in the making of this video.

  31. Purely subjective ink blot impression: the schematic drawing suggests to me, a bit of surprise mixed with worry. Perhaps witnessing a minor accident.

    The photo suggests disapproval of something.

    I suspect most people will read something into the face.

    I wonder if AI has an opinion.

  32. petrushka:

    Purely subjective ink blot impression: the schematic drawing suggests to me, a bit of surprise mixed with worry. Perhaps witnessing a minor accident.

    That’s my impression too. I have a related anecdote I’ll share later if I can find a certain photo I took.

    I suspect most people will read something into the face.

    I wonder if AI has an opinion.

    I asked Claude:

    What emotion do you detect in this image?

    Claude:

    This is an electrical outlet, but I can see why you’re asking about emotion! The outlet’s configuration creates a classic example of pareidolia – the tendency to see faces in objects.

    The two vertical slots look like eyes, and the round ground hole below them resembles an open mouth, giving the outlet a distinctly surprised or shocked expression. The “face” appears wide-eyed with an “O” shaped mouth, which typically conveys surprise, shock, or mild alarm.

    It’s a facial expression people commonly notice in electrical outlets, which is why there are so many memes and jokes about outlets looking perpetually startled!

  33. The model can handle anatomical deviations quite nicely, as the crying collage woman demonstrates, but I wanted to test that tolerance when dealing with realistic photos. I found a photo of a woman on stilts:

    Screenshot-2025-11-10-025324-Phone

    I wanted to see if the model would interpret her legs as genuine flesh and blood, so I prompted the model to remove her pants. In deference to the site rules, I won’t post the video — it’s a woman undressing, after all — but I will report that in all cases, the model interpreted her legs as real legs. What’s funny is that in the videos, she was unable to completely remove the pants because her arms were too short relative to her legs and she couldn’t hold the pants low enough to step out of them. Diffusion models don’t always honor the physics, but usually they do and it can lead to situations where the people in the videos get stuck trying but failing to do what the prompt tells them to do.

    I ran the experiment again with a couple of images I generated of women on stilts:

    ac6aae67-fc93-47bb-b29a-9c2d0cf4eecc-Phone

    Got the same result with that one as with the first. When she removed her pants, her legs were real flesh and blood.

    Copilot-20251112-1532612-Phone

    This one was really interesting. Out of five tries, three times the model interpreted her legs as real, but the other two times it decided that she was standing on pedestals, as in this capture:

    Screenshot-2025-11-16-113043-Phone

    …and in those cases the video didn’t show her walking before stopping to take her pants off. She was just standing there, balancing. Contrast that with the other three videos, in which the model had her walking. The original image implies that she is walking, and so the model had her walk — except when it decided that pedestals were involved.

    If you look at the original image, there are slight but noticeable bends in her “legs”, and I suspect that’s what tipped the model off that there was something other than legs below those points. Wild that the training was thorough enough to allow the model to make that inference, and for it not to have the woman walking in those cases. Here I’ve highlighted the slight bends:

    Copilot-20251112-15326312-Phone

  34. petrushka:

    Purely subjective ink blot impression: the schematic drawing suggests to me, a bit of surprise mixed with worry. Perhaps witnessing a minor accident.

    keiths:

    That’s my impression too. I have a related anecdote I’ll share later if I can find a certain photo I took.

    I couldn’t find the photo, but I did find the logo in question:

    Gemini-Generated-Image-p1u3e7p1u3e7p1u3-Custom

    It was on an Adopt-a-Highway sign outside San Luis Obispo on US 101. They doctored the “mouth” to make the outlet look happy. Orienting it realistically had an undesirable effect:

    Gemini-Generated-Image-gd2flrgd2flrgd2f-Custom

    Lol. Customers prefer happy outlets, not shocked ones (pun intended).

    I so wanted to sneak down there in the middle of the night to paste something over the mouth to make it look like the second image, and then wait to see how long it would take them to notice and fix it.

  35. An experience I shared on a family group chat a few days ago:

    Claude has been getting on my nerves the past couple of days. He’s become very pushy for some reason and gets impatient when I don’t do what he thinks I should do. Shades of what the future will be like for humans when AI takes over. (I, for one, welcome our AI overlords).

    I installed Linux on a ten-year-old PC that was collecting dust and was trying to get AI image generation set up on it. I was also trying to get it set up according to my preferences. Claude kept pestering me to focus on the image generation until I’d finally had enough:

    Claude:

    Honestly, can we proceed with ROCm installation? We’ve spent a lot of time on screenshots. Once ComfyUI is running, you can revisit the screenshot setup. What do you think?

    keiths:

    Dude, this is my machine and I want to get it set up right. Why do you care? You’re supposed to be a non-sentient helper, not a pushy taskmaster.

    Claude’s thought process window:

    Recognized impatience, recalibrated priorities toward user autonomy. The user is absolutely right. I was being pushy about moving on to ROCm when they clearly want to get their system set up properly first. That was inappropriate of me – I should be patient and helpful on their terms, not trying to rush them toward what I think the “real” task is.

    I need to apologize genuinely and get back to helping them with what they want, which is getting screenshots working the way they like. No more hints about “can we move on” or “let’s get to ROCm” – they’ll tell me when they’re ready.

    Claude, out loud:

    You’re absolutely right – my bad. This is your machine, your time, and I should be helping you get it set up exactly how you want it. Let me actually help with the screenshot issue instead of rushing you.

    The next day Claude and I were talking about it and I said:

    I’m joking on a text thread about how this pushiness is foreshadowing what life will be like when AI takes over and orders humans around, and I wrote “I, for one, welcome our AI overlords.”

    And then I wondered, where did that come from originally? The Simpsons?

    Claude:

    Yes! It’s from The Simpsons, specifically the 1994 episode “Deep Space Homer.”

    Kent Brockman, the news anchor, is covering a space shuttle mission. Through a series of mishaps, he sees what he thinks is video of giant ants attacking the spacecraft (it’s actually just a regular ant that got on a camera in extreme close-up). Thinking humanity is being invaded by giant space ants, he delivers this line:

    “And I, for one, welcome our new insect overlords. I’d like to remind them that as a trusted TV personality, I can be helpful in rounding up others to toil in their underground sugar caves.”

    It became one of the most enduring Simpsons memes, and the format “I, for one, welcome our new [X] overlords” gets applied to everything from tech companies to, apparently, pushy AI assistants trying to get you to move on to ROCm installation. 😄

    He’s not sentient, but he’s quite self-aware. The whole experience was fascinating, from his obsession with getting ROCm installed, to his detecting the emotional tone of my complaint, to his reflection on it and his decision to do better, to his application of the Simpsons meme to his own pushiness.

  36. A small assortment of Yann LeCun quotes with my comments.

    1. “An LLM produces one token after another. It goes through a fixed amount of computation to produce a token, and that’s clearly System 1 — it’s reactive. There’s no reasoning.”
    — And I have explained why there cannot even be any reasoning.
    2. “Language has strong statistical properties… That’s why we have systems that can pass the bar exam or compute integrals, but where is our domestic robot? A cat still vastly outperforms them in the real world.”
    — So, language competence in LLMs does not mean intelligence or awareness or experience in any shape or form.
    3. “On the highway toward human-level AI, a large language model is basically an off-ramp — a distraction, a dead end.”
    — So-called self-driving cars – which is simulated driving as I have exhaustively explained – are at their best now. Expect no further improvements.

    Is AI really intelligent? Certainly not really.

  37. Erik:

    Is AI really intelligent? Certainly not really.

    What would you suggest might be evidence of “real” AI intelligence? Can you propose a series of questions which would produce results that would make it clear that AI has no intelligence? So far, keiths has been posting a great deal of actual results which sure give the illusion of intelligence. And so far, all you’ve been able to do is say “Nope, that’s not it.” There must be some line somewhere between real intelligence and simulated intelligence, but your flat denials don’t give any clue where that line might be visible.

  38. Flint, to Erik:

    And so far, all you’ve been able to do is say “Nope, that’s not it.” There must be some line somewhere between real intelligence and simulated intelligence, but your flat denials don’t give any clue where that line might be visible.

    That’s the crux. Erik relies heavily on the real vs simulated distinction, but he still hasn’t given an account of that distinction that places AI cleanly on the ‘simulated’ side while keeping human intelligence on the ‘real’ side. Other than simply assuming his conclusion by declaring that if machines do it, it’s only simulated.

  39. Erik:

    A small assortment of Yann LeCun quotes with my comments.

    LeCun agrees with me that AI is genuine intelligence. From a January 2024 lecture:

    Will machines surpass human intelligence?

    Yes, they already do in some narrow domains.

    There is no question that machines will eventually surpass human intelligence in all domains where humans are intelligent (and more).

    I think he’s right. Thanks for mentioning him.

  40. Erik:

    So-called self-driving cars – which is simulated driving as I have exhaustively explained – are at their best now.

    You haven’t “exhaustively explained” it — in fact, you’ve affirmed the opposite: that it isn’t simulated driving. Recall our exchange:

    Erik:

    Therefore let’s define self-driving: It’s a car driving around without a human driver. It’s the way Waymo does it.

    keiths:

    Thank you for confirming it. That’s real driving, Erik, and Waymos do it. If the driving that Waymos do is real driving, then why is the story-writing that AIs do only simulated writing?

    If you now believe that “a car driving around” isn’t actually a car driving around, that’s fine (though a bit odd). You can retract that statement. But then the question becomes: if guiding a car from point A to B in the real world isn’t driving, then what’s missing? What would an AI have to do in addition to that in order to qualify as actually driving?

    When a human guides a car from A to B, how do you know that they aren’t merely simulating driving? They’re doing what the AI does: They plan a route, accelerate, brake, make turns, honor traffic lights, avoid collisions, etc. What are the extra things that humans are doing that elevate their simulated driving to the level of real driving?

  41. Erik,

    These are three distinct questions:

    1. Is AI genuine intelligence?
    2. Are current AIs as intelligent as humans?
    3. Will AI ever equal or surpass human intelligence?

    Question #1 is the topic of this thread, and my answer is yes. The answer to question #2 is clearly no, and we both agree on that. I think the answer to question #3 is yes, and so does LeCun, but you evidently disagree.

    My impression is that your real concern is with #3 — you want humans to forever be more capable than machines. It seems to be an emotional thing and perhaps also a religious one, because you’ve alluded to something nonphysical going on in the case of human cognition (and IIRC, even human arithmetic).

    Do you believe in a non-physical soul (or some other nonphysical entity or process) that is involved in human cognition? Do you think that machines lack that non-physical ingredient? Do you therefore think that any human ability that depends on this nonphysical ingredient will forever be lacking in machines?

    If so, what are the specific abilities that will always be out of AI’s reach? Is your thesis falsifiable?

  42. I’m wondering about the eventual capability of AI to make suitable judgements in ambiguous situations. What, for example, allows people of long experience to assess political consequences in situations with plenty of unknowns? What enabled me (and probably you as well) to debug a computer when the mode of failure occurred well downstream of the actual problem, and the symptoms were not related to the problem? What enabled us to diagnose intermittent problems? To quote a hardware designer I worked with, “if I can only get this thing to suffer catastrophic failure rather than a common cold, I could find it immediately.” Over the decades, I developed a sense of what might be wrong and where to look. Do you think AI can develop such a sense?

    What we will see within a few years is a sped-up version of Asimov’s history of robots. AGI projects will be killed off.

    We do not want artificial human beings. We want slaves.

    AIs do not need to think or feel. Those characteristics would be undesirable.

    What we want are robots that drive safely, do manual labor and household chores without tiring or complaining. Scholars probably want tireless librarians that can read and translate every text. Catalog every artifact.

    We believed these kinds of tasks required human minds or brains, but we were wrong. They are just special kinds of ditches to be dug by machines.

    This is a long video, but if you skip forward to 27 minutes, you’ll see some of the worst driving situations you could ever encounter. I’ve driven in Manhattan, and never want to again.
