A recent study has found that AI goes back to stupid.
The most notable results came from research into GPT-4’s ability to solve math problems. Over the course of the study, researchers found that in March GPT-4 was able to correctly identify that the number 17077 is a prime number 97.6% of the time it was asked. But just three months later, its accuracy plummeted to a lowly 2.4%.
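For the record, 17077 really is prime; a throwaway trial-division sketch in Python (my own illustration, not anything from the study’s methodology) confirms it:

```python
# Quick trial-division check that 17077 is prime (illustration only,
# not how the study graded GPT-4's answers).
def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:          # only need divisors up to sqrt(n) ~ 130.7
        if n % d == 0:
            return False
        d += 2
    return True

print(is_prime(17077))  # True
```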
The vastly different results from March to June and between the two models reflect not so much the model’s accuracy in performing specific tasks as the unpredictable effects that changes in one part of the model have on others.
“When we are tuning a large language model to improve its performance on certain tasks, that can actually have a lot of unintended consequences, which might actually hurt this model’s performance on other tasks,” Zou said in an interview with Fortune. “There’s all sorts of interesting interdependencies in how the model answers things which can lead to some of the worsening behaviors that we observed.”
Oh yes, “unintended consequences” when ostensibly trying to improve the algorithm. All bow down to the algorithm.
Being much better at some things than others doesn’t equate to “stupid”. It seems that GPT-4 has run into what people experience all the time – you get better at what you practice, at the expense of getting worse at what you stop practicing.
I know from decades of experience that not only is finding bugs in computer programs a fine art that takes a lot of intuition, but that fixing those bugs without causing other problems somewhere else is even harder. I have spent weeks tracking down serious errors caused by someone else fixing an error in some other part of the code. But it’s not all algorithm – human actions have been having unintended consequences since forever. How many of us have thought “if only I had known what that would lead to…”
Flint: Being much better at some things than others doesn’t equate to “stupid”.
Of course I said it with tongue in cheek. From my perspective, computers are not smart in the first place, so they also cannot get stupid.
But from your perspective, since AI is intelligent, it should be possible for it to get stupid. And yes, concretely stupid, because the AI has a certain job to do and fails at it. For you, this should be an unambiguously stupid AI. For me, it’s a machine that doesn’t work.
Erik: Of course I said it with tongue in cheek. From my perspective, computers are not smart in the first place, so they also cannot get stupid.
As we understand the term, computers are not smart. They are machines. Whether software can be smart is another story.
Erik: But from your perspective, since AI is intelligent, it should be possible for it to get stupid. And yes, concretely stupid, because AI has a certain job to do and it fails at it. For you, this should be an unambiguously stupid AI. For me, it’s a machine that doesn’t work.
No, the computer works just fine. As for “has a certain job to do and fails at it”, this claim needs a LOT of context. If the task is to defeat the world chess champion consistently, a computer succeeded completely. If the task is to defeat the world go champion (a much more complex game than chess), well, a computer did that too.
Historically we have the Turing test, where the task is to imitate a human so convincingly that a real person cannot tell whether he’s talking to another real person or to a computer. For most subject areas, the current GPT models do a damn good job of this – most of the time, college professors are no longer able to tell whether reports and essays were written by AI or by the student. Increasingly, professors are requiring their students to write their material right in class.
Now, an AI specialist can tell the difference most of the time, by sticking to a specialized field the computer was not trained on. Another way to identify the AI is to ask for information many humans get wrong. If whatever you’re talking to agrees, for example, that vaccines don’t work, you know you are talking to a person. Computers are smarter than that. People can be preposterously stupid in ways AI cannot yet imitate. Just try getting sense out of someone attending a Trump rally.
I spent less than half an hour chatting with GPT-4, but I can say it is fairly easy to identify if you attempt an iterative discussion.
What’s remarkable about an LLM is not its omniscience, but its ability to chat at all.
For example, I asked what Douglas Adams meant by “if you can’t see me, then I can’t see you.”
The reply correctly identified the source of the quote and that it was humor based on absurdist logic. But it was unable to elaborate on why it is funny. In GPT’s defense, most humans don’t get it either. I’ve tried it with humans.
Despite the quantity of material used for LLM training, it seems not to have included much fiction. Nor did it include the source being parodied by Adams.
Nevertheless, the LLM seems to have mastered grammar and syntax, despite having no explicit rules hard-coded. I’m told by a heavy user that it generates text purely by statistical means, producing the most probable word based on context. It’s remarkable that grammar and syntax seem to self-assemble out of probabilities.
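To make that “most probable next word” idea concrete, here is a toy bigram sketch in Python. It is only an illustration of the statistical principle; real LLMs predict over subword tokens with a neural network and a far longer context, and the tiny corpus here is made up.

```python
# Toy illustration of "produce the most probable next word given context".
# Real LLMs predict over subword tokens with a neural network and a long
# context window; this bigram sketch over a made-up corpus only shows the idea.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat sat on the rug".split()

# Count how often each word follows each other word in the corpus.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def most_probable_next(word: str) -> str:
    """Return the word that most often followed `word` in the corpus."""
    return next_counts[word].most_common(1)[0][0]

# Generate greedily: always take the single most likely successor.
word, output = "the", ["the"]
for _ in range(5):
    word = most_probable_next(word)
    output.append(word)

print(" ".join(output))  # "the cat sat on the cat": locally fluent, no understanding
```

Even at this toy scale the output looks locally fluent while meaning nothing, which is roughly the flavor of the rambling replies mentioned below.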
When I tried follow-up questions, I got rambling replies.
I thought, I’ve seen this before. This is how I answer test questions when I haven’t read the book.
I am certain that GPT is not ready to replace humans in most areas, but if you need some plausible-sounding text on nearly any subject, done quickly and without grammatical errors, GPT’s your guy. The more thoroughly the subject is covered by Wikipedia, the more accurate the response will be.
It’s still a huge and nontrivial leap in parsing human language.
The author says this video took two years to make. I believe him.