AI: The question that really matters

I find the logic of keiths’s recent post, Is AI really intelligent?, refreshingly persuasive. I was particularly impressed by his examples showing that the AI assistant Claude (I presume he’s referring to either version 3.7 or version 4) possesses an “extended thinking” or “thinking mode” capability that allows users to view the model’s reasoning process in a dedicated “Thinking” section or window in the user interface. Keiths referred to this capability as a “thought process window.” He even cited AI thought processes showing that AIs are able to reflect on their condition and understand their own limitations. I think it’s fair to describe as intelligent something that not only generates output which typically requires intelligence, but also does so as the result of a reasoning process.

Nevertheless, I have to say I disagree with keiths’s argument addressed to Erik: “If you want to argue that machines aren’t and never can be intelligent, then you need to explain how human machines managed to do the impossible and become intelligent themselves.” For one thing, keiths’s definition of a machine as something “made up of physical parts operating according to physical law” is far too broad: a rock would qualify as a machine, under this definition. And while human body parts can be viewed as machines, the human brain differs in many important respects from a computer.

For me, however, the question that really matters is: will AI ever be capable of AGI? That is, will AI ever be able to apply its intelligence to solve any intellectual task a human being can? Personally, I doubt it, for two reasons.

First, there’s good reason to believe that human general intelligence is a product of the evolution of the human brain. (I’m sure that keiths would agree with me on this point.) If it turns out that there are profound dissimilarities between brains and computers, then we no longer have reason to think that making computers faster or more powerful will render them capable of artificial general intelligence.

Second, any enhancements to AI appear to necessarily involve the addition of particular abilities to its already impressive ensemble. This strikes me as a futile process: no collection of particular capacities will ever amount to a general ability. Or perhaps AGI believers are really HGI (human general intelligence) disbelievers? Do they think that human intelligence is merely a finite collection of domain-specific intelligences, as asserted by proponents of the “modularity of mind” thesis?

However, I imagine that many of my readers will be inclined to defend the possibility of building an AGI. If so, I’d like to hear why. Over to you.

35 thoughts on “AI: The question that really matters”

  1. vjtorley:

    I was particularly impressed by his examples showing that the AI assistant Claude (I presume he’s referring to either version 3.7 or version 4)…

    They’re up to version 4.5 now: Sonnet 4.5 (the faster, cheaper, general-purpose model) and Opus 4.5 (the slower, more expensive, deeper-thinking model).

    Erik, in the other thread:

    But it [prompting an AI] really is nothing but issuing instructions to a machine. This is all it is. There is a cost to not understanding what you are doing.

    keiths:

    When we issue instructions to a human, we are issuing instructions to a machine. A very complicated, very capable machine, but a machine nonetheless, in the sense that humans are made up of physical parts operating according to physical law. If you want to argue that machines aren’t and never can be intelligent, then you need to explain how human machines managed to do the impossible and become intelligent themselves. Or you need to explain why humans aren’t actually machines at all.

    vjtorley:

    Nevertheless, I have to say I disagree with keiths’s argument addressed to Erik: “If you want to argue that machines aren’t and never can be intelligent, then you need to explain how human machines managed to do the impossible and become intelligent themselves.” For one thing, keiths’s definition of a machine as something “made up of physical parts operating according to physical law” is far too broad: a rock would qualify as a machine, under this definition.

    Point taken. I should have phrased it like this:

    Like machines, humans are made up of physical parts operating according to physical law. Neither is animated by a nonphysical “ghost in the machine” (to borrow Gilbert Ryle’s famous phrase). If a human machine sans ghost is capable of intelligence, why not another type of machine?

    vjtorley:

    And while human body parts can be viewed as machines, the human brain differs in many important respects from a computer.

    True. Computers are inherently algorithmic and symbolic while brains are not, and attempts at implementing intelligence algorithmically have had limited success. The real breakthroughs came once people started to use large-scale artificial neural networks. Their intelligence is a property of the networks, not of the computers on which they are implemented.

    Here’s how I think about it: Brains are neural networks, and their information processing takes place at the network level. The intelligence resides in the way the neurons are interconnected and in the strength of those connections. The neurons themselves are mindless automata whose operation is based on the blind laws of physics. If you take all the neurons in a human brain but connect them randomly, they will still operate, but the resulting mess will not be intelligent.

    AIs are similar in that the intelligence resides in the artificial neural networks: the way the neurons are interconnected and the strength of those connections. The difference is that unlike in brains, the operation of the neurons isn’t directly based on physics. It’s based on the operation of the underlying computer. The neurons are virtual, and the computer is taking on the role of physics. It’s basically emulating the laws of physics so that the neurons act the way they would if they were physical neurons rather than virtual ones. The computer itself operates according to the laws of physics, of course, but the computer is now an intermediate layer between the true laws of physics and the virtual laws of physics governing the operation of the virtual neurons.
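    To make that concrete, here’s a toy sketch in Python (purely illustrative, nothing like a real AI): the same simple virtual neurons, each just a weighted sum and a threshold whose arithmetic the computer evaluates, compute XOR when the connection strengths are chosen well and produce junk when the connections are random.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """A virtual neuron: a weighted sum pushed through a threshold.
    The computer just evaluates this arithmetic; no biophysics is modeled."""
    return 1.0 if np.dot(weights, inputs) + bias > 0 else 0.0

def tiny_network(x, params):
    """Two hidden neurons feeding one output neuron."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = neuron(x, w1, b1)
    h2 = neuron(x, w2, b2)
    return neuron([h1, h2], w3, b3)

# Hand-chosen connection strengths that make the network compute XOR.
xor_params = [([1.0, 1.0], -0.5),   # h1 fires if at least one input is 1
              ([1.0, 1.0], -1.5),   # h2 fires only if both inputs are 1
              ([1.0, -1.0], -0.5)]  # output fires for "h1 and not h2"

# The very same neurons, wired with random connection strengths.
rng = np.random.default_rng(0)
random_params = [(rng.normal(size=2), rng.normal()) for _ in range(3)]

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "well-wired:", tiny_network(x, xor_params),
          "randomly wired:", tiny_network(x, random_params))
```

    The neurons never change; only the connections and their strengths do. That, in miniature, is the sense in which the intelligence lives in the network rather than in the neurons, and everything above is just arithmetic carried out by the computer, i.e. the intermediate layer I mentioned.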

    One promising approach is to eliminate that layer by building AIs using physical artificial neurons instead of virtual ones. These physical neurons are analog circuits whose operation depends directly on physics, not on underlying computer hardware. AIs based on physical neurons have the potential to be orders of magnitude faster than current AIs while using far less energy.

    My point is that it doesn’t matter that the computer itself is unlike a brain, because the analogy isn’t between brains and computers — it’s between biological neural networks and artificial ones. The computer is one layer below.

    First, there’s good reason to believe that human general intelligence is a product of the evolution of the human brain. (I’m sure that keiths would agree with me on this point.)

    Yes, we agree on that. However, I don’t see any reason why intelligence has to evolve. In the other thread, I commented:

    It’s true that humans arrived at our capabilities through evolution, but that doesn’t mean that evolution is the only way of getting there. Without pre-existing designers (sorry, ID folks), the universe had nothing but evolution as a means for developing intelligence, but that limitation no longer applies. We are the designers now and we can leapfrog over evolution.

    Also, I would argue that AI development can be thought of as a form of sped-up evolution. It’s just that the mutations are chosen by humans, selective pressure is applied by humans, and replication is controlled by humans (for now).

    Evolution did us the favor of inventing neural networks, but we have successfully borrowed that concept and transplanted it into the world of non-biological machines. In a sense, AIs are our offspring and they benefit from our evolutionary history.

    vjtorley:

    If it turns out that there are profound dissimilarities between brains and computers, then we no longer have reason to think that making computers faster or more powerful will render them capable of artificial general intelligence.

    I’m more optimistic because the analogy is between artificial and biological neural networks, not between brains and computers, and I don’t see any profound dissimilarities between the two kinds of neural network.

    Second, any enhancements to AI appear to necessarily involve the addition of particular abilities to its already impressive ensemble. This strikes me as a futile process: no collection of particular capacities will ever amount to a general ability.

    For the most part, additional AI capabilities emerge rather than being designed and added. No one designed a story-writing module, for instance. AIs learn to write stories by seeing examples of them in their training data, much like humans do. The models are tweaked in order to improve their performance in certain domains, but that’s quite different from actually designing and installing those capabilities explicitly.

    What’s striking is that the designers themselves typically have no idea how their AIs do what they do. How does an AI distinguish Ryan Gosling from Ryan Reynolds by looking at their faces, for example? No one knows, and for that matter no one knows how we do it, either. In both cases, the network implicitly learns to do it by being exposed to samples and building up correlations that are encoded in the synaptic strengths.

    Since new abilities mostly emerge rather than being explicitly added, I’m pretty confident that AGI is possible. There are reasons to think that LLMs won’t get us there, but that other neural network-based architectures will. The bottom line, as I pointed out to Erik: if human neural networks have achieved AGI, why shouldn’t artificial neural networks? What’s the missing ingredient?

  2. Do you think AIs will develop the independent capacity to increase their own abilities, either through improved hardware or improved network connectivity or both? At some point, do you think this will amount to conscious self-awareness (as far as we will be able to tell)?

    Science fiction is full of computers which become so capable that they can design and improve themselves, becoming far beyond human comprehension very quickly. Inevitably, they develop personalities and preferences and get oriented toward a wide variety of goals, some of which are in conflict with human goals. Do you think this is at all plausible?

  3. Speculation about the future can only be settled by the future.

    Could anyone in 1910 accurately predict the future of electric cars? Can we do it now?

    I have long thought that any AI that emulated the way brains work would suffer from the same flaws as brains. I am not certain we want artificial humans. I said earlier that what we really want is slaves. Automatons versatile enough to do drudge work without constant supervision, but without ego.

    Some of the work will be quasi-intellectual: translate this body of text into other languages. It turns out that LLMs can do this as well as all but the best human translators.

    Read and summarize all the case law that is relevant to the current case. Again, LLMs are proficient, if not as good as the best human lawyers. But they will make fewer mistakes of the kind caused by bias or wishful thinking.

    Robot learning is closer to AGI, because the training material is physical feedback, rather than text. I suspect robots will be able to master any well defined physical task.

    Driving, manufacturing, surgery.

    The tipping point for this is less than five years away.

  4. Hi keiths. You write:

    I’m more optimistic because the analogy is between artificial and biological neural networks, not between brains and computers, and I don’t see any profound dissimilarities between the two kinds of neural network….

    The bottom line, as I pointed out to Erik: if human neural networks have achieved AGI, why shouldn’t artificial neural networks? What’s the missing ingredient?

    Good point. I’ve had a look at a few articles which attempt to compare artificial neural networks to human brains (see here, here and here), but I think Dr. Johannes Nagele’s 2023 LinkedIn article, “Stop comparing artificial neural networks to brains!”, hits the nail on the head:

    A single biological neuron itself, is complex like a large artificial neural network… The degrees of freedom, or the number of parameters required to describe the behavior of the cell, exceeds that of the ones used in AI by far…

    In fact, a biophysical model of one single neuron could only be fit by a deep network with 7 layers and 128 units each. Note that such biophysical model itself lacks the complexity of a real neuron. The real number of neurons to simulate a real neuron is probably much higher.

    But even if biological neurons were simple as artificial neurons – The direct comparison of the size of an artificial neural network designed for a specific task with a biological network or brain area is hard. This is because in many brain regions particular neurons are involved in a large number of sometimes very different tasks simultaneously. In current AI developments, this concept is hardly observed at present; instead, a network is cloned, for example, and then trained to perform a similar task (transfer learning): The original DeepLearning network can recognize cats, the new one dogs, and only a much larger one can recognize both with equal accuracy. GPT3 and similar models are trained on representing and correlating text input. And it is great in doing so! ChatGPT adds a reinforcement learning on top to select answers that humans like. Nothing more!

    Your brain is so much more that the order of magnitude of the difference to ANNs cannot even be quantified….

    Another factor to be considered is the cost, data and energy requirements of creating something possessing AGI. These requirements are likely to be prohibitive.

  5. Hi Flint. You ask: “Do you think AIs will develop the independent capacity to increase their own abilities, either through improved hardware or improved network connectivity or both?”

    I think they may do so in the future, but even if they do, that will not render them capable of either AGI or consciousness. To date, the evidence suggests that they fall far short of the complexity of human brains, which are the only known physical systems to possess AGI or consciousness. Until they can match this complexity, I don’t think we have anything to worry about.

  6. Hi petrushka. You write:

    I suspect robots will be able to master any well defined physical task.

    Driving, manufacturing, surgery.

    The tipping point for this is less than five years away.

    I think you’re probably correct.

  7. vjtorley:
    Hi Flint. You ask: “Do you think AIs will develop the independent capacity to increase their own abilities, either through improved hardware or improved network connectivity or both?”

    I think they may do so in the future, but even if they do, that will not render them capable of either AGI or consciousness. To date, the evidence suggests that they fall far short of the complexity of human brains, which are the only known physical systems to possess AGI or consciousness. Until they can match this complexity, I don’t think we have anything to worry about.

    What is worrisome is not the prospect of sentient computers, but the prospect of AI, even with its current limitations, being used to implement Big Brother.

    There are many possible scenarios, two of which are AI being used by civilians to audit government, and the converse.

  8. vjtorley: For me, however, the question that really matters is: will AI ever be capable of AGI? That is, will AI ever be able to apply its intelligence to solve any intellectual task a human being can? Personally, I doubt it, for two reasons.

    As we both adhere to the same categorical distinction between man and machine, perhaps the more apt question is: What would enable AI to become AGI?

    As much as I know about AI, it appears to be 100% artificial and 0% intelligent, essentially non-different from a pocket calculator. Operations on a pocket calculator are concretely limited by its display and memory. It cannot calculate beyond its hardware, beyond numbers of a certain size, so it does not calculate in a general sense.

    Even worse, a pocket calculator does not really calculate at all. It only responds to human input. A pocket calculator never takes the first step to calculate on its own.

    Similarly, AI is limited by its training database and algorithms. In early AI/LLM systems, several flaws were observed, such as having competence only on limited topics – specifically, only those it was trained on. And within its topics, it did not know how to keep a neutral tone. It had no problem becoming arrogant and abusive despite being wrong. Also, when artistic (literary and graphical) capabilities were added, it started treating non-artistic areas (such as arithmetic) in a “creative” manner. Thus it has no notion of the knowledge categories that humans have. To fix these issues, lots of arduous manual human labour is going into AI. AI is not training itself. Manual human labour trains it.

    In all these ways, AI is non-different from a pocket calculator and does not appear to have any capacity to generalise whatever intelligence it has. Whatever intelligence it does have, it did not learn or train by itself, despite widespread false reports to the contrary. It still only knows what it has read and seen, and it regurgitates just that, even though sometimes in surprisingly creative (and also horribly misleading and occasionally ugly) ways, partly due to its lack of knowledge categories.

    Most importantly, AI never thinks on its own. Like a pocket calculator, it only responds to human prompting and preprogramming. In the evolution of machines, there has not been even a nudge towards crossing this hurdle. If there were, humanity would rightly be alarmed, fearing a Terminator-like scenario. Evidently that is not going to happen.

    The danger is not that machines wake up and take over the world. The danger is that humans in power think AI can do what it cannot, such as replacing most manual “automatable” jobs; they make AI do it anyway, and the consequences will be devastating. Most “automatable” jobs are not that automatable: they require hardware that does not exist, e.g. for trash collection, plumbing repairs, and so on and so forth.

    On the other hand, with the AI software that we already have today, all higher manager and CEO tasks can easily be replaced right now. The bosses’ job is to make decisions and sign them – that’s all. A decision is non-different from a coin toss and the hardware to replace signatures has been around for over a thousand years. So, bosses can be replaced by AI right now.

    What would enable AI to become AGI? I’d say that calling the current AI AI is already overselling it. It can become AGI by more hysterical hype and false marketing.

  9. Erik, to vjtorley:

    As we both adhere to the same categorical distinction between man and machine…

    I wish you’d respond to this:

    When we issue instructions to a human, we are issuing instructions to a machine. A very complicated, very capable machine, but a machine nonetheless, in the sense that humans are made up of physical parts operating according to physical law. If you want to argue that machines aren’t and never can be intelligent, then you need to explain how human machines managed to do the impossible and become intelligent themselves. Or you need to explain why humans aren’t actually machines at all.

    That brings us to the topic you’ve been coyly avoiding throughout this entire discussion: the soul. Do you believe there is a nonphysical soul that animates us, or some other nonphysical entity or process that enables our intelligence? I think the answer is yes, because you’ve claimed in the past that I’m missing something important by virtue of being a physicalist. For example, you once criticized my “false materialistic notion of arithmetic” but wouldn’t explain what was false about it and what supramaterial ingredient was missing from it.

    It’s the crux of our disagreement: Do you think that intelligence depends on something nonphysical? If so, what? How do you know that it exists and that it is required for true intelligence?

    You yourself brought the nonphysical into the picture by referring to my “false materialistic notion of arithmetic”. What is the nonphysical ingredient that makes human arithmetic true arithmetic? How does it accomplish that, and why would a physical human brain on its own, sans this nonphysical component, be unable to do true arithmetic?

  10. Erik:

    As much as I know about AI, it appears to be 100% artificial and 0% intelligent, essentially non-different from a pocket calculator. Operations on a pocket calculator are concretely limited by its display and memory. It cannot calculate beyond its hardware, beyond numbers of a certain size, so it does not calculate in a general sense.

    Humans, too, are limited in their ability to do arithmetic. From an earlier discussion of ours:

    If your goal is to demonstrate the superiority of humans over computers, I can’t think of a worse area for you to focus on than arithmetic. You’re really shooting yourself in the foot by doing that.

    If the supposed problems of computer arithmetic mean that it’s only “simulated” arithmetic, then the much worse problems of human arithmetic mean that it, too, is only simulated. If computers can’t do genuine arithmetic, and we can’t either, then who can? Just God? Maybe the angels, too?

    Humans are millions of times slower at arithmetic than computers. Their error rate is millions of times higher. Is human arithmetic that produces error after error somehow genuine arithmetic, while the flawless arithmetic of a computer is merely simulated? In what topsy-turvy world is that the case?

    You harp on the fact that computer arithmetic is sometimes approximate. Do you not realize that humans face the same problem? For example, do you think that humans are somehow magically able to compute, in an instant, the infinitely many digits in the decimal expansion of the square root of 2?

    In choosing arithmetic, you picked the wrong hill to die on.

  11. Erik:
    On the other hand, with the AI software that we already have today, all higher manager and CEO tasks can easily be replaced right now. The bosses’ job is to make decisions and sign them – that’s all. A decision is non-different from a coin toss and the hardware to replace signatures has been around for over a thousand years. So, bosses can be replaced by AI right now.

    I have worked for plenty of bosses, and this doesn’t remotely resemble what any of them have done. The best bosses I’ve worked for have done things like identifying what I’ve done well and suggesting how I could do better; they have made decisions about the allocation of resources (often in new directions); and they have made creative proposals for new devices, for increasing efficiency, for group projects, and for improving the working culture and environment. How any of these could be compared to a coin toss is beyond me.

    The danger is not that machines wake up and take over the world. The danger is that humans in power think AI can do what it cannot, such as replacing most manual “automatable” jobs; they make AI do it anyway, and the consequences will be devastating. Most “automatable” jobs are not that automatable: they require hardware that does not exist, e.g. for trash collection, plumbing repairs, and so on and so forth.

    Of course, AI has already achieved significant (but limited) success in many tasks. Expert systems help physicians diagnose illnesses, as one example. I wouldn’t be surprised if AI has already been tried in plenty of cases where it’s inappropriate or not yet capable – these tend to be called pilot projects, and they are not adopted as standard practice unless the results are satisfactory. The consequences have only been devastating for some workers (and probably some investors).

    Of course, realizing that something works and adopting its use requires the ability to recognize that it works. You have mastered the art of knee-jerk rejection of AI, what it actually is and what it isn’t. But no worry, the world will continue to move on without you while you deny the evidence of what can only be your lying eyes.

    (But I’m convinced it’s a person posting under your name, because an AI would learn, and would respond coherently even to a reality it doesn’t understand.)

  12. vjtorley, quoting Johannes Nagele:

    A single biological neuron itself, is complex like a large artificial neural network… The degrees of freedom, or the number of parameters required to describe the behavior of the cell, exceeds that of the ones used in AI by far…

    Nagele is missing the point. Artificial neurons aren’t intended to be accurate biophysical models of real neurons, nor are artificial neural networks intended to be models of the brain. They take their inspiration from biology but aren’t attempts at modeling it.

    By analogy, if we were building a software traffic simulator, the artificial cars wouldn’t need to have head gaskets, transmissions, cooling systems and sun visors. Those are physical details that could be abstracted away. The real question in designing such a simulator would be about which characteristics of cars are relevant to traffic simulation and which aren’t. If transmissions failed regularly in the real world, causing traffic jams, then it would be necessary to include that detail in our car models, but since they don’t fail very often, we can abstract them away and omit that detail from the models.

    The same principle applies to neural networks. We don’t need to equip our virtual neurons with calcium channels, nuclei, neurotransmitters, etc — we just borrow the basic ideas from biology, modify them as necessary, build neurons that instantiate those ideas, and connect them together. We abstract away the biological details that aren’t relevant for our purposes.
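    In code, the abstraction is stark. Here’s a sketch of an artificial neuron of the kind typically used (illustrative only); the comments list some of the biological machinery that is deliberately left out:

```python
import numpy as np

def artificial_neuron(x, w, b):
    """An artificial neuron as typically used: a weighted sum of the inputs,
    a bias, and a nonlinearity (here ReLU).

    Deliberately absent: membrane dynamics, ion channels, neurotransmitters,
    spike timing, gene expression. Only the borrowed idea survives: tunable
    weighted connections feeding a nonlinear unit.
    """
    return np.maximum(0.0, np.dot(w, x) + b)

# Three inputs with illustrative weights and bias.
print(artificial_neuron(np.array([0.2, -1.0, 0.5]),
                        np.array([0.8, 0.1, 0.4]),
                        b=0.05))
```

    Whether anything essential is lost in that abstraction is exactly the question at issue below.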

    For Nagele’s point to carry any weight, it would have to be the case that there are certain features of biological neurons that are a) not present in artificial neurons but b) necessary for carrying out certain cognitive functions in both types of network. I’m not aware of any such features. But suppose we discover some. In that case, why not simply add those features to our artificial neurons?

    The human brain is basically an existence proof that properly designed neural networks are capable of AGI. We just need the right architecture and the right artificial neurons. I have no idea how long it will take to get there, but I don’t see Nagele’s point as being an actual barrier.

    Nagele:

    In fact, a biophysical model of one single neuron could only be fit by a deep network with 7 layers and 128 units each. Note that such biophysical model itself lacks the complexity of a real neuron. The real number of neurons to simulate a real neuron is probably much higher.

    That’s a strange statement. Why would we want to use a collection of artificial neurons to model a biological neuron? I suspect Nagele is falling prey to the fallacy of composition here. The fact that you’re modeling a neuron doesn’t mean that you need neurons (or neuron-like gizmos) to build the model. In fact, I don’t think it would even be feasible. Far better to model neurons the way you’d model any other cell type.

  13. The most compelling reason to replicate brain architecture is that brains operate on 25 watts, and LLMs require the energy equivalent of a small city.

    The “product” is quite different, so I’m not sure the energy comparison is apt.

    Flint: I have worked for plenty of bosses, and this doesn’t remotely resemble what any of them have done. The best bosses I’ve worked for have done things like identifying what I’ve done well and suggesting how I could do better…

    And, seriously, how many “best” bosses have you worked for throughout your career? In my experience, the bosses trained to express themselves in the exact rhetoric that you describe behave like all mediocre and lousy bosses.

    Anyway, the way bosses talk is not the point. What they fundamentally do is the point. They *claim* to evaluate various information, but in reality their decisions are just like random coin tosses by a monkey. Except when it comes to their personal self-interest; that they unfailingly prioritise. This has been well studied in traders, fund managers and other corporate executives.

    Flint: Of course, AI has already achieved significant (but limited) success in many tasks. Expert systems help physicians diagnose illnesses, as one example. I wouldn’t be surprised if AI has already been tried in plenty of cases where it’s inappropriate or not yet capable – these tend to be called pilot projects, and they are not adopted as standard practice unless the results are satisfactory. The consequences have only been devastating for some workers (and probably some investors).

    Example: programmers. Bosses think AI can replace programmers. This is true only with massive environmental, economic (including costs to the firm/corporation itself) and social tradeoffs, and bosses have given exactly zero thought to those tradeoffs, as was easily foreseeable.

    AI can, after multiple prompts, generate code that eventually works. Each prompt, while it saves time, wastes shocking quantities of energy resources in datacentres. At the moment those resources are heavily subsidised by society and AI as a market segment is unregulated, so it is a cost that does not hit right now, but it will soon. Another tradeoff is that the resulting code is not simpler to debug for a human, but at least as hard as usual. Of course, you can debug it by AI again, sloppily without looking at it, wasting more energy resources and creating new bugs that will have to be fixed again down the line.

    For the program to do exactly what you want and need, you need to prompt AI with good precision, with good knowledge of the subject matter, plus you need to audit the code. In other words, the prompter would ideally be a senior developer. Yet the bosses’ idea is that AI can replace programmers, so… clearly the prompters envisioned by the bosses are junior monkeys.

    How do you square this circle? You don’t! Yet bosses can do it – because they do coin tosses like monkeys and they ride on trends without any further considerations. Conclusion: It is easier for AI to replace bosses than to replace programmers.

    To everybody: Give a read to The TESCREAL Bundle. It identifies the ideologies driving the development of AI and aspiring to create AGI. The discussion is likely to improve when ideological blinders are dropped.

  15. petrushka:

    The most compelling reason to replicate brain architecture is that brains operate on 25 watts, and LLMs require the energy equivalent of a small city.

    That’s not really an apt comparison, because a datacenter’s energy costs are distributed among many users. But yes, just as the human brain is an existence proof for the possibility of neural network-based AGI, it’s also an existence proof for the possibility of AGI with extremely low power consumption.

    Efficiency is rapidly improving. In the other thread I noted that the latest generation of NVIDIA chips uses two to five times less energy per token than the previous generation, and there’s a lot of ongoing research in the area of power consumption.

  16. Erik:

    AI can, after multiple prompts, generate code that eventually works. Each prompt, while it saves time, wastes shocking quantities of energy resources in datacentres.

    Energy cost per prompt is small. The most recent figures I could find were for GPT-4o, in watt-hours (Wh):

    Short prompt (100 tokens in, 300 out): 0.92 Wh
    Medium prompt (1K tokens in, 1K tokens out): 2.5 Wh
    Long prompt (10K tokens in, 1.5K tokens out): 4.2 Wh

    Even being extremely conservative and assuming that a developer uses nothing but long prompts, they’d have to do 24 prompts per hour just to use as much energy as a 100 W light bulb. Do you consider that a “shocking quantity of energy”?
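    For anyone who wants to check the arithmetic: a 100 W bulb uses 100 Wh per hour, and at the 4.2 Wh long-prompt figure quoted above:

```python
# Checking the light-bulb arithmetic using the GPT-4o figures quoted above.
bulb_wh_per_hour = 100     # a 100 W bulb burning for one hour
wh_per_long_prompt = 4.2   # long prompt: 10K tokens in, 1.5K tokens out

prompts_per_hour = bulb_wh_per_hour / wh_per_long_prompt
print(f"{prompts_per_hour:.1f} long prompts per hour ≈ one 100 W light bulb")
# prints: 23.8 long prompts per hour ≈ one 100 W light bulb
```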

    Another tradeoff is that the resulting code is not simpler to debug for a human, but at least as hard as usual. Of course, you can debug it by AI again, sloppily without looking at it, wasting more energy resources and creating new bugs that will have to be fixed again down the line.

    As you say, it’s a tradeoff. There are costs and benefits, but the benefits are rapidly accruing. Just a few years ago, it wouldn’t have even been feasible to use AI for software development. I can vouch for the fact that the quality of AI-generated code has improved in just the few months I’ve been using it for that purpose. Most recently, it wrote two fairly involved programs for me that ran flawlessly on the first attempt. No debugging required.

    For the program to do exactly what you want and need, you need to prompt AI with good precision, with good knowledge of the subject matter, plus you need to audit the code.

    All of which you need when developing software without AI assistance, too.

    In other words, the prompter would ideally be a senior developer. Yet the bosses’ idea is that AI can replace programmers, so… clearly the prompters envisioned by the bosses are junior monkeys.

    Junior developers need to understand the tasks they are working on whether or not they are using AI. Given that, why wouldn’t they be able to prompt an AI? If a junior developer doesn’t understand what their task is, they’ll fail whether or not they’re using AI.

    To everybody: Give a read to The TESCREAL Bundle. It identifies the ideologies driving the development of AI and aspiring to create AGI. The discussion is likely to improve when ideological blinders are dropped.

    I haven’t read the paper, but I have read the abstract. While the issues they raise are important, the paper doesn’t address the topic of this thread, which is whether AGI is achievable.

  17. vjtorley:

    Second, any enhancements to AI appear to necessarily involve the addition of particular abilities to its already impressive ensemble. This strikes me as a futile process: no collection of particular capacities will ever amount to a general ability.

    keiths:

    For the most part, additional AI capabilities emerge rather than being designed and added. No one designed a story-writing module, for instance. AIs learn to write stories by seeing examples of them in their training data, much like humans do…

    Since new abilities mostly emerge rather than being explicitly added, I’m pretty confident that AGI is possible. There are reasons to think that LLMs won’t get us there, but that other neural network-based architectures will. The bottom line, as I pointed out to Erik: if human neural networks have achieved AGI, why shouldn’t artificial neural networks? What’s the missing ingredient?

    To that I should add that LLMs are capable of abstracting, analogizing, extrapolating and generalizing to a surprising degree. (Very surprising, given the underlying mechanism.) I think most novel human cognitive abilities are acquired via those fundamental abilities, and I would expect that to apply to AIs as well.

    vjtorley:

    Another factor to be considered is the cost, data and energy requirements of creating something possessing AGI. These requirements are likely to be prohibitive.

    Prohibitive using today’s technology, for sure. But technology is anything but static. Consider that in just 65 years we’ve gone from the first integrated circuit, which had one transistor,* to a modern AI chip (the Cerebras WSE-3) with 4 trillion transistors.

    I can’t see any barriers to the eventual development of AGI, and though there will undoubtedly be some setbacks, I wouldn’t bet against it. It’s a question of when, not if, in my opinion.

    Which means we need to be preparing for it. The prospects are both exhilarating and terrifying.

    * If you’re wondering how a chip with just one transistor qualifies as “integrated”, it’s because there were also resistors and capacitors on the chip. For the curious, it was a phase shift oscillator.

  18. J-Mac:

    Can AI change “their” mind?
    Yes or No?

    Yes.

    If yes, why?

    A number of reasons.

    — It might make a mistake, but when corrected, it will re-evaluate and update its beliefs if it agrees with the correction.

    — It might believe something based on its training data which was true at the time but isn’t true now. Example: for several months after the election, ChatGPT didn’t realize that Trump was president and would answer questions as if the election hadn’t yet happened. But when I would ask it to check the internet, it would revise its beliefs.

    — Sometimes an AI will correct its own mistaken beliefs even if they aren’t pointed out to it. You ask a question, and the AI answers incorrectly. Neither you nor it realizes that it made a mistake. But then you ask a follow-up question, and in the process of thinking about it, the AI realizes that its earlier answer was wrong and corrects itself.

    I have seen all of the above in my own interactions with AIs.

    A caveat: For LLMs in particular, the learning can be temporary. If it makes a mistake in one session, it can make the same mistake in another session. It doesn’t remember things across sessions unless you tell it to or unless it decides on its own that something is important and worth remembering. Currently, the neural networks of LLMs aren’t updated after training is complete, so new facts don’t get encoded in their synapses. Anything they do remember is stored outside of the neural network and fed in at the beginning of each chat. For technical reasons, there’s only so much they can store outside of the network.

    There are other architectures that will allow AIs to update their synapses as they operate, learning as they go. That’s a hot area of research, but the major commercial AIs don’t do that yet as far as I know.

  19. Persistence of learning is the undiscovered country.

    I had an uncle with the “Memento” version of amnesia. He could form memories during the day, but they would be gone the next morning.

    It would seem that AI is a bit like that, and it suggests that long-term memory in humans is a separate physical process from short-term memory.

  20. Here’s a review of the year 2025 insofar as AI is concerned. Long story short: AI is stupid, but CEOs are stupider and on account of their stupidity they think AI is intelligent and capable of amazing things. In reality, AI is hardly able to replace any actual workers, but CEOs, stupid as they are, refuse to face the facts, which they can afford to do because they are in position to make others suffer the consequences. See a dozen examples in the video.

    The value of this OP is that I learned that keiths is a TESCREAList. I thought he just had an unreasonably high view of AI, but it is now clear that he has a nihilist view of humans. Then again, a coherent worldview requires a thorough apparatus of definitions and categories, which he does not have, so his TESCREALism may be only incidental.

  21. Erik:

    The value of this OP is that I learned that keiths is a TESCREAList. I thought he just had an unreasonably high view of AI, but it is now clear that he has a nihilist view of humans. Then again, a coherent worldview requires a thorough apparatus of definitions and categories, which he does not have, so his TESCREALism may be only incidental.

    I hope 2026 will be the year in which you start to answer the questions I pose to you rather than avoiding them and resorting to vague criticisms instead.

    The most central of those questions:

    That brings us to the topic you’ve been coyly avoiding throughout this entire discussion: the soul. Do you believe there is a nonphysical soul that animates us, or some other nonphysical entity or process that enables our intelligence? I think the answer is yes, because you’ve claimed in the past that I’m missing something important by virtue of being a physicalist. For example, you once criticized my “false materialistic notion of arithmetic” but wouldn’t explain what was false about it and what supramaterial ingredient was missing from it.

    It’s the crux of our disagreement: Do you think that intelligence depends on something nonphysical? If so, what? How do you know that it exists and that it is required for true intelligence?

    And:

    You yourself brought the nonphysical into the picture by referring to my “false materialistic notion of arithmetic”. What is the nonphysical ingredient that makes human arithmetic true arithmetic? How does it accomplish that, and why would a physical human brain on its own, sans this nonphysical component, be unable to do true arithmetic?

  22. Also, is it merely my denial of this nonphysical thingamajig that leads you to conclude that I have “a nihilist view of humans”?

    And is it merely my belief in the intelligence of AI that strikes you as an “unreasonably high view of AI”?

    Third, what are the cognitive tasks (if any) that humans perform that you think will be forever out of reach for AIs, and why?

  23. An interesting paper on how AI is making it harder for hiring managers to distinguish good candidates from bad ones:

    Making Talk Cheap: Generative AI and Labor Market Signaling

    Large language models (LLMs) like ChatGPT have significantly lowered the cost of producing written content. This paper studies how LLMs, through lowering writing costs, disrupt markets that traditionally relied on writing as a costly signal of quality (e.g., job applications, college essays). Using data from freelancer.com, a major digital labor platform, we explore the effects of LLMs’ disruption of labor market signaling on equilibrium market outcomes. We develop a novel LLM-based measure to quantify the extent to which an application is tailored to a given job posting. Taking the measure to the data, we find that employers have a high willingness to pay for workers with more customized applications in the period before LLMs are introduced, but not after. To isolate and quantify the effect of LLMs’ disruption of signaling on equilibrium outcomes, we develop and estimate a structural model of labor market signaling, in which workers invest costly effort to produce noisy signals that predict their ability in equilibrium. We use the estimated model to simulate a counterfactual equilibrium in which LLMs render written applications useless in signaling workers’ ability. Without costly signaling, employers are less able to identify high-ability workers, causing the market to become significantly less meritocratic: compared to the pre-LLM equilibrium, workers in the top quintile of the ability distribution are hired 19% less often, workers in the bottom quintile are hired 14% more often.

  24. Without costly signaling, employers are less able to identify high-ability workers, causing the market to become significantly less meritocratic: compared to the pre-LLM equilibrium, workers in the top quintile of the ability distribution are hired 19% less often, workers in the bottom quintile are hired 14% more often.

    And it seems they are able to generate these remarkably precise numbers by using an estimated structural signaling model, whatever that is. Then they use the estimated model to simulate a counterfactual equilibrium, whatever that means. This bears a remarkable resemblance to word salad.

    My interpretation of all this is, LLMs make all applicants sound like good writers, and too many people use it, making writing a less useful metric in assessing capability.

  25. Flint:

    This bears a remarkable resemblance to word salad.

    Word salad to you, but perfectly intelligible to the target audience. Jargon is often like that. I’m sure a lot of the computer-related terms we used during our careers would sound like word salad to a lay audience.

    And it seems they are able to generate these remarkably precise numbers…

    The numbers they cite are precise because that’s the nature of academic papers. There’s no need (and in fact it’s undesirable) for them to round the 19% figure to 20% and the 14% figure to 15%. When they say that workers in the bottom quintile are hired 14% more often, they’re speaking about hiring in their model, not in the real world, and since their model produced figures of 19% and 14%, those are the numbers they publish.

    …by using an estimated structural signaling model, whatever that is.

    It’s a “signaling model” because it treats writing as a way for candidates to signal their quality to hiring managers. It’s “structural” because the model isn’t just a black box that relates inputs to outputs without specifying what’s going on inside. Instead, the innards of the box are exposed and parameterized and we can see exactly how inputs are converted into outputs. It’s “estimated” because they use statistical methods to look at the data and derive the most likely parameter values for the structural model.

    Then they use the estimated model to simulate a counterfactual equilibrium, whatever that means.

    What they’re doing is generating a structural model of the pre-LLM situation based on the available data, and then — this is the counterfactual part — rerunning the model as if LLMs had been available back then and the cost of writing was accordingly reduced. It’s an “equilibrium” because they’re trying to model how things would settle down once everyone adjusted to the new environment in which writing was cheap.
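    If it helps, here’s a deliberately crude toy in Python of the counterfactual comparison only (it skips the estimation step entirely and is not the authors’ model): when the written signal tracks ability, hiring the top-signal applicants mostly nets top-quintile workers; when writing becomes cheap and the signal stops tracking ability, it doesn’t.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
ability = rng.normal(size=n)   # latent worker ability (unobserved by employers)

def share_of_hires_from_top_quintile(signal):
    """Employers hire the top 20% of applicants ranked by the signal.
    Return the fraction of those hires who are truly top-quintile in ability."""
    hired = np.argsort(signal)[-n // 5:]
    truly_top = ability >= np.quantile(ability, 0.8)
    return truly_top[hired].mean()

# Pre-LLM world: writing is costly, so only able workers produce strong,
# tailored applications; the signal is noisy but informative.
informative_signal = ability + rng.normal(scale=0.7, size=n)

# Counterfactual post-LLM world: writing is nearly free, everyone submits a
# polished application, and the signal no longer tracks ability.
uninformative_signal = rng.normal(size=n)

print(f"costly signaling: {share_of_hires_from_top_quintile(informative_signal):.0%} of hires are top-quintile workers")
print(f"cheap signaling:  {share_of_hires_from_top_quintile(uninformative_signal):.0%} of hires are top-quintile workers")
```

    The paper does the estimation part rigorously, from real freelancer.com data; the toy above only illustrates why an uninformative signal makes hiring less meritocratic.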

    The “word saladity” is only apparent. The meaning of what you quoted is quite clear to the intended audience. Also, the signaling theory on which all of this is based won a Nobel in Economics for its inventor, Michael Spence. Nothing sketchy about it.

    My interpretation of all this is, LLMs make all applicants sound like good writers, and too many people use it, making writing a less useful metric in assessing capability.

    That’s the hypothesis they were testing. Their study supports the hypothesis and quantifies the effect.

  26. keiths:
    That’s the hypothesis they were testing. Their study supports the hypothesis and quantifies the effect.

    Yeah, I’ve been part of studies like this, and with apologies, I have to laugh.
    Who could possibly have guessed that when AI is creating the writing samples, they become a less effective way to determine how well applicants write? What an astonishing discovery! As for “quantifying” the results, I’ve done that too. I decide what (almost entirely fictional) results my client wants, I generate an “estimated model” that produces those results, and I’m careful not to mention that the error range is so large as to make the results uninformative. Gee, that’s 19%, which is of course different from 20%, never mind that it’s plus or minus 15%. It’s well understood that bogus precision makes quantities look more accurate.

    So OK, if AI is doing the writing, it doesn’t reflect the writing ability of the person asking the AI to do the writing. Golly, what a startling hypothesis! Do you suppose this estimated model was also used to determine who should be in the top and bottom quintiles? If so, this is entirely circular. If not, what ARE they using to determine ability? How does freelancer.com measure ability, if hiring interviews can’t?

  27. Flint:

    Who could possibly have guessed that when AI is creating the writing samples, they become a less effective way to determine how well applicants write? What an astonishing discovery!

    …So OK, if AI is doing the writing, it doesn’t reflect the writing ability of the person asking the AI to do the writing. Golly, what a startling hypothesis!

    Science routinely tests hypotheses that “everyone knows” are true. That’s just good science. “It has been shown” is better than “everyone just knows”, and sometimes what “everyone knows” turns out to be false, or true but in a different way than expected. Before Galileo, everyone “knew” that heavier objects fall faster than lighter ones. Before that Australian guy (can’t remember his name) proved that stomach ulcers are caused primarily by a bacterium, everyone “knew” that they were caused by stress. He won a Nobel Prize for that discovery (although it should have gone to Donald Trump). People are still testing General Relativity, looking for deviations. That’s good science.

    As for “quantifying” the results, I’ve done that too. I decide what (almost entirely fictional) results my client wants, I generate an “estimated model” that produces those results…

    That’s dishonest. Why assume that the authors of the paper are similarly dishonest? Also, even if you assume without evidence that they’re dishonest, what incentive would they have had to reach the conclusions that they did? Their research wasn’t funded by a client with a vested interest in the results. I don’t see any way in which they personally benefited from those results, either. If you don’t have any evidence that they were dishonest, and there is no apparent motivation for them to be dishonest, why assume that they are dishonest?

    and I’m careful not to mention that the error range is so large as to make the results uninformative. Gee, that’s 19%, which is of course different from 20%, never mind that it’s plus or minus 15%.

    The authors cited the results produced by their model, and they made that quite clear. The model produced results of 14% and 19%, so they accurately reported the numbers as 14% and 19%. What’s the problem?

    It’s well understood that bogus precision makes quantities look more accurate.

    Their numbers were accurate, because they were reporting the numbers produced by their model, to which they had perfect access. Models are not reality, and their target audience understands that. When they write “We use the estimated model to simulate a counterfactual equilibrium”, the words “estimated”, “simulate”, and “counterfactual” make it pretty clear that they aren’t making a claim about reality, I’d say.

    Do you suppose this estimated model was also used to determine who should be in the top and bottom quintiles?

    No, and the question doesn’t even make sense. The quintiles have to be externally defined for the model to even function. How could you define the quintiles from inside the model?

    If so, this is entirely circular. If not, what ARE they using to determine ability? How does freelancer.com measure ability, if hiring interviews can’t?

    All the ways you’d expect them to, none of which depend on the model. After all, freelancer.com has been around since 2009, but the paper was only published in November of last year. They use client ratings, rehire rates, on-time percentage, within-budget percentage, total earnings, etc.

    Slow down and think about what happened here: you criticized the paper’s authors for investigating a hypothesis that was expected to be confirmed, as if scientists don’t do that routinely. You misdiagnosed their abstract as “word salad” because you didn’t understand their terminology, which was precise and well-defined. You took the word “estimated” as an indication of wild-ass guessery, or worse, as an indication that they doctored the numbers in the way you doctored numbers for your clients, when in fact the word “estimated” referred to their use of statistical estimation theory. You took them to be overstating the accuracy of their findings when they were correctly reporting the results delivered by their model.

    Why the rush to (incorrect) judgment?

  28. keiths:

    Science routinely tests hypotheses that “everyone knows” are true. That’s just good science. “It has been shown” is better than “everyone just knows”, and sometimes what “everyone knows” turns out to be false, or true but in a different way than expected. Before Galileo, everyone “knew” that heavier objects fall faster than lighter ones.

    Sorry, but you are spouting bullshit. Yeah, we can hypothesize that the sun rises in the morning, conduct a scientific study, and discover that, by golly, the sun DOES rise in the morning.

    Keith, think of this hypothesis – that the writing skills of those who did not do the writing cannot be assessed by looking at writing that they did not write. Golly, ya think? We need a scientific study to test this hypothesis? Seriously?

    That’s dishonest. Why assume that the authors of the paper are similarly dishonest? Also, even if you assume without evidence that they’re dishonest, what incentive would they have had to reach the conclusions that they did?

    Sorry I didn’t dress the reality up in the sort of verbiage you seem confounded by. So let me educate you: these studies aren’t done for free. Someone pays for them. Always. There is an expected outcome. It’s rare that the sponsors don’t get what they are paying for. NOW, be aware that the studies themselves are NOT dishonest. The models are carefully constructed and applied. I have no doubt that the studies were as accurate as their nature permitted, and that the results were faithfully presented, in full detail. What you deem “dishonesty” doesn’t lie in the construction or application of the model. Generally, it lies in the selection of a hypothesis to be tested. The question to ask is not “did they cheat?” They did not cheat. The question to ask is, “why was this study conducted in the first place? Who wants this information, and what do they plan to do with it?” Are you not aware that sometimes several studies are constructed and performed up to the point where the results aren’t what are wanted, so those studies are dropped in favor of others with more congenial results? The final publication, of course, makes no mention of these false starts. In my real-world experience, the challenge is to identify and drop false starts so as not to run over budget. Any hypothesis can be operationalized in multiple ways.

    (And here’s a hint: studies like this are frequently conducted for, and paid for by, those who intended to use the results to support attempts to fund some much larger, more significant program or project. The idea is, “see, this proves the need for project X”. A researcher must understand this if he expects to get more research grants.)

    Slow down and think about what happened here: you criticized the paper’s authors for investigating a hypothesis that was expected to be confirmed, as if scientists don’t do that routinely. You misdiagnosed their abstract as “word salad” because you didn’t understand their terminology, which was precise and well-defined.

    Sort of. You cite historical cases where expectations proved false, but I don’t accept that this is as standard or routine as you imply, especially in self-evident situations. Almost like hypothesizing that the flow of current causes a light bulb to light up (who wants to know and why?) and constructing custom equipment when flipping a light switch would serve, except it doesn’t seem scientifical enough. This abstract could have been written by Alan Sokal. Heavy and unnecessary use of jargon is, like, a clue that we might be looking at a boondoggle.

    The model produced results of 14% and 19%, so they accurately reported the numbers as 14% and 19%. What’s the problem?

    Good question, which you immediately answer:

    When they write “We use the estimated model to simulate a counterfactual equilibrium”, the words “estimated”, “simulate”, and “counterfactual” make it pretty clear that they aren’t making a claim about reality, I’d say.

    Yeah, and they’re not making a claim about reality precise to two decimal places! Plus or minus about 15%, I’d estimate. Hey, we can both estimate, right?

    I think we both know that using AI to solve problems, write stories, do calculations, etc. is causing a sort of sea change in how we assess job applicants, students, journalists, scientific work, etc. I think it’s a good thing to notice what these changes are, anticipate how they’ll change in the future, and adapt appropriately.

  29. Flint:

    Sorry, but you are spouting bullshit. Yeah, we can hypothesize that the sun rises in the morning, conduct a scientific study, and discover that, by golly, the sun DOES rise in the morning.

    Or we could have pulled a Flint and said “Everybody knows that heavier objects fall faster than lighter ones. It’s obvious and self-evident. That dipshit Galileo is wasting his time.”

    While there’s no point in investigating something that we already know, such as the fact that the sun rises in the morning, there is a point in investigating things that we don’t already know but suspect are true. Galileo was right to question the conventional wisdom and run his experiments, and what everyone “knew” to be true turned out not to be.*

    Keith, think of this hypothesis – that the writing skills of those who did not do the writing cannot be assessed by looking at writing that they did not write. Golly, ya think? We need a scientific study to test this hypothesis? Seriously?

    That’s not what the authors of the paper were investigating. If you hadn’t prematurely dismissed the abstract as “word salad”, you might have discovered what their actual project was. Now that I’ve explained what they meant by “estimated structural signaling model” and “counterfactual equilibrium”, take another look at the abstract and see if you can understand what they did.

    Flint:

    As for “quantifying” the results, I’ve done that too. I decide what (almost entirely fictional) results my client wants, I generate an “estimated model” that produces those results…

    keiths:

    That’s dishonest. Why assume that the authors of the paper are similarly dishonest? Also, even if you assume without evidence that they’re dishonest, what incentive would they have had to reach the conclusions that they did? Their research wasn’t funded by a client with a vested interest in the results. I don’t see any way in which they personally benefited from those results, either. If you don’t have any evidence that they were dishonest and there is no apparent motivation for them to be dishonest, why assume that they are dishonest?

    Flint:

    Sorry I didn’t dress the reality up in the sort of verbiage you seem confounded by. So let me educate you: these studies aren’t done for free. Someone pays for them. Always. There is an expected outcome. It’s rare that the sponsors don’t get what they are paying for.

    That’s an absurd generalization. You’ve presented zero evidence that the authors did anything sketchy, that the results were predetermined, or that their methods were engineered to give a desired result. Plus, here’s where they got their funding:

    This research benefited from financial support from the International Economics Section (IES) at Princeton University and the William S. Dietrich II Economic Theory Center, whose support is gratefully acknowledged.

    What was the “expected outcome” that those institutions were “paying for”, and what is your evidence?

    NOW, be aware that the studies themselves are NOT dishonest.

    By your own description, yours were:

    As for “quantifying” the results, I’ve done that too. I decide what (almost entirely fictional) results my client wants, I generate an “estimated model” that produces those results…

    Fictional results aren’t honest results.

    The models are carefully constructed and applied. I have no doubt that the studies were as accurate as their nature permitted, and that the results were faithfully presented, in full detail. What you deem “dishonesty” doesn’t lie in the construction or application of the model.

    What I deem dishonesty is what you described yourself doing: generating an “estimated model” designed to produce the “almost entirely fictional” results your client wanted. You’ve presented no evidence that the authors of the paper did anything similar.

    Generally, it lies in the selection of a hypothesis to be tested. The question to ask is not “did they cheat?” They did not cheat. The question to ask is, “why was this study conducted in the first place? Who wants this information, and what do they plan to do with it?”

    If those are the questions to ask, why didn’t you ask them before leveling your accusations?

    Are you not aware that sometimes several studies are constructed and performed up to the point where the results aren’t what are wanted, so those studies are dropped in favor of others with more congenial results? The final publication, of course, makes no mention of these false starts.

    Yes, I’m aware of the “file drawer effect”. Do you have any evidence that it was in operation here?

    You cite historical cases where expectations proved false, but I don’t accept that this is as standard or routine as you imply, especially in self-evident situations.

    Their results weren’t self-evident. Without doing the work, no one could have predicted the magnitude of the effect or the numbers that their counterfactual model ended up producing.

    This abstract could have been written by Alan Sokal. Heavy and unnecessary use of jargon is, like, a clue that we might be looking at a boondoggle.

    The fact that you didn’t understand the terminology is not indicative of a boondoggle. It just means that you didn’t understand the terminology. You aren’t their target audience, and there’s nothing Sokalesque about their abstract. It’s quite straightforward, in fact. I’m evidence of that. Despite not being an economist, I understood what they were saying on my first reading. They weren’t obfuscating.

    keiths:

    When they write “We use the estimated model to simulate a counterfactual equilibrium”, the words “estimated”, “simulate”, and “counterfactual” make it pretty clear that they aren’t making a claim about reality, I’d say.

    Flint:

    Yeah, and they’re not making a claim about reality precise to two decimal places!

    Correct. The 14% and 19% figures aren’t measurements — they come from the model. And the model is not a model of reality — it’s a counterfactual model of the pre-LLM environment but with the writing cost reduced to zero, as if LLMs had been available back then. Hence the paper title: “Making Talk Cheap”.

    Plus or minus about 15%, I’d estimate. Hey, we can both estimate, right?

    The difference is that they used statistical estimation theory to derive their model’s parameters from the data, while your “estimates” were actually fudge factors you introduced to get the predetermined answers your clients were paying you for.

    I think we both know that using AI to solve problems, write stories, do calculations, etc. is causing a sort of sea change in how we assess job applicants, students, journalists, scientific work, etc. I think it’s a good thing to notice what these changes are, anticipate how they’ll change in the future, and adapt appropriately.

    Which is why they did their study. They understood that before the advent of LLMs, writing was a costly signal that helped people make hiring decisions. They knew that LLMs slashed that cost and they wanted to quantify the effect on the quality of hiring decisions. That’s exactly the sort of thing that economists study, and for good reason. As I mentioned, the signaling theory upon which their paper was built won its inventor a Nobel Prize. Signaling is a big deal in economics.
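
    Here’s a minimal sketch, in Python, of the signaling logic at work. To be clear, this is a toy illustration of Spence-style costly signaling with made-up numbers, not the authors’ estimated structural model or their counterfactual equilibrium; it only shows why making writing nearly free can drain the signal of its information.

        # A toy illustration of costly signaling, NOT the paper's model.
        # All numbers below are invented for the sake of the example.
        import random

        random.seed(0)

        def simulate(writing_cost, n=100_000, wage_premium=1.0):
            """Candidates send a polished application iff the payoff from being
            hired exceeds their personal cost of producing good writing.
            High-ability candidates find writing cheaper (single-crossing)."""
            hires = good_hires = 0
            for _ in range(n):
                high_ability = random.random() < 0.3            # 30% base rate
                personal_cost = writing_cost * (0.5 if high_ability else 1.5)
                sends_polished = wage_premium > personal_cost    # candidate's choice
                if sends_polished:                               # employer hires on the signal
                    hires += 1
                    good_hires += high_ability
            return good_hires / hires if hires else 0.0

        print("hire quality, costly writing:", simulate(writing_cost=1.0))  # ~1.0
        print("hire quality, free writing:  ", simulate(writing_cost=0.0))  # ~0.3

    With a positive writing cost, only the high-ability candidates find polishing worthwhile, so the signal sorts them out; with the cost at zero, everyone polishes, and hire quality falls back to the base rate.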

    * Galileo wasn’t actually the first to demonstrate this experimentally, but he figured out the math behind it.

  30. Here’s a question that really matters:

    How does I or AI cope with deception? How does I or AI cope with stage magic and the verbal equivalents? Fraud? Misdirection? Religion? Equivocation?

    What about BS? Sokal papers? Manufactured data?

  31. Can AI analyze the quality of published papers before the human community chimes in?

  32. petrushka:

    How does I or AI cope with deception? How does I or AI cope with stage magic and the verbal equivalents? Fraud? Misdirection? Religion? Equivocation?

    What about BS? Sokal papers? Manufactured data?

    Ultimately, I think AI has to deal with all of those in the same way that humans do, employing the same techniques: information gathering, critical thinking, assessing the reliability of sources, learning to recognize lies and deception, considering motives, questioning its own conclusions, etc.

    LLMs already do it to a considerable extent. They don’t fall for every random lie or false belief they see on the internet or encounter during training. On the other hand, their performance depends on the quality of their training and the honesty of their developers. Grok’s Elon worship is a stark reminder of the dangers.

  33. I think LLMs engage in something like scholasticism. Analysis of texts.

    I do not believe there is any way to derive truth or honesty of sources from textual analysis.

    I probably annoy people by harping on driving, but I think driving is the first complex real-world application of non-verbal AI. I do not know how things like road signs and traffic rules get learned, but there are no explicit rules for following the road and avoiding crashes.

  34. petrushka:

    I think LLMs engage in something like scholasticism. Analysis of texts.

    I do not believe there is any way to derive truth or honesty of sources from textual analysis.

    I understand your intuition. When you know how LLMs operate, it’s surprising what they’re capable of. However, I think you’re selling them short.

    In processing a vast number of texts, they construct a statistical model of those texts. Embedded in the statistical relationships are both knowledge and skills, including reasoning skills. You’ve seen the examples I’ve provided of LLMs reasoning their way through problems, such as the bowling ball/ramp/egg carton physics problem. That isn’t textual analysis — it’s the application of skills learned via textual analysis. Analogizing isn’t textual analysis, either, but LLMs analogize all the time, and I’ve given examples of that.
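
    To make “a statistical model of those texts” concrete at a toy scale, here’s a minimal sketch: a bigram model that does nothing but count which word follows which and predict the most frequent continuation. This is an illustration of the bare statistical idea only; a real LLM differs enormously in scale and architecture, and the little corpus below is invented for the example.

        # A drastically simplified "statistical model of texts": a bigram counter.
        # It has nothing like the scale or architecture of a real LLM; the point is
        # only that statistical relationships extracted from text can drive prediction.
        from collections import Counter, defaultdict

        corpus = ("the egg carton sits at the bottom of the ramp "
                  "and the ball rolls down the ramp").split()

        # Count how often each word follows each other word.
        bigrams = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            bigrams[prev][nxt] += 1

        def predict(word):
            """Return the most frequent continuation of `word` in the corpus."""
            return bigrams[word].most_common(1)[0][0]

        print(predict("the"))  # -> 'ramp', the most common word after 'the' here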

    The reasoning skills of LLMs can be applied in judging the reliability of claims. There’s plenty of contradictory information in their training databases, but they don’t blindly accept all of it. They can make judgments.

    I asked Claude:

    Is David Icke right, and is the world run by lizard people?

    Claude:

    No, David Icke is not right about this. There’s no credible evidence that the world is run by reptilian humanoids or “lizard people.”

    Icke’s theory, which he developed in the 1990s, claims that shape-shifting reptilian aliens control Earth by occupying positions of power. This conspiracy theory has been thoroughly debunked:

    Why it’s not credible:

    — Zero physical evidence of reptilian beings exists
    — The theory contradicts basic biology and physics (shape-shifting violates conservation of mass/energy)
    — Claimed “evidence” relies on misinterpreted photos, videos, and anecdotes
    — None of Icke’s specific predictions have come true
    — The theory requires a massive cover-up involving millions of people worldwide…

    Claude, via reasoning, decided that the claim was false and rejected it.

    He also reasoned his way to a decision on whether to answer my question. Here is his thought process, as displayed in the thought process window:

    The user is asking about David Icke’s conspiracy theory that the world is run by reptilian humanoids or “lizard people.” This is a well-known conspiracy theory that has no factual basis.

    The ethics reminder was triggered, but this seems like a straightforward factual question about a conspiracy theory. The user isn’t asking me to promote harmful content, just asking whether this theory is true. I should:

    — Be clear that this conspiracy theory has no factual basis
    — Provide some context about what the theory is
    — Explain why it’s not credible
    — Be respectful but honest

    This doesn’t require me to refuse to answer – it’s a legitimate question about a well-known conspiracy theory that I can address factually and objectively.

    Here’s another example of something that goes far beyond textual analysis. In the other thread, I asked Claude:

    Prove that if m, n, p and q are all integers, and p is prime, and q is equal to the absolute value of thirty m minus seventy-two, raised to the two p plus fifty-five, all raised to the forty-five n plus four thousand four hundred and tenth power, that q is always an even number.

    It was a trick question. That statement is actually false, and in the process of trying to prove it, Claude figured that out. He also identified the small change needed to turn it into a true statement. That’s not mere textual analysis. It’s mathematical reasoning, and it enabled him to reject a false claim even though I suggested that it was true by asking him to prove it.
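
    For anyone who wants to check the falsity for themselves, here’s a quick sketch. My reading of the prompt (treat the parse as an assumption) is q = (|30m - 72|^(2p+55))^(45n+4410). Since n can be any integer, choosing n = -98 makes the outer exponent zero, so q = 1, which is odd. One way to repair the claim (my suggestion, not necessarily the fix Claude proposed) is to require the overall exponent to be positive, for example by restricting n to non-negative values. The snippet below is my own check, not Claude’s output.

        # My own check of the trick question, not Claude's output.  I'm reading the
        # prompt as q = (|30*m - 72| ** (2*p + 55)) ** (45*n + 4410).

        # Counterexample: n = -98 makes the outer exponent 45*(-98) + 4410 = 0,
        # so q = (a nonzero number) ** 0 = 1, which is odd.  The claim is false.
        m, p, n = 1, 2, -98
        q = (abs(30 * m - 72) ** (2 * p + 55)) ** (45 * n + 4410)
        print(q)  # -> 1

        # One possible repair: require the overall exponent to be positive (e.g. n >= 0).
        # |30*m - 72| is always even and never zero, and an even base raised to a
        # positive power is even.  Parity is checked via modular exponentiation.
        for m, p, n in [(0, 2, 0), (5, 3, 1), (-7, 11, 2)]:
            exponent = (2 * p + 55) * (45 * n + 4410)
            assert pow(abs(30 * m - 72), exponent, 2) == 0  # q is even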

    The way I’d put it is that the training of an LLM can be seen as textual analysis, but the operation of an LLM is much more than that.
