Audio long read: How does ChatGPT 'think'? Psychology and neuroscience crack open AI large language models
Primary Topic
This episode delves into the complexities of large language models (LLMs) like ChatGPT, exploring the intersection of psychology, neuroscience, and artificial intelligence.
Episode Summary
Main Takeaways
- LLMs, including ChatGPT, operate as black boxes with workings largely unknown even to their creators.
- Explainable AI (XAI) is developing to help understand and trust AI decisions, particularly in critical applications.
- Techniques like "chain of thought" prompting reveal AI's reasoning but can sometimes provide misleading explanations.
- Comparing AI behavior to human psychological processes offers intriguing insights but has its limitations.
- The field is pushing for more accountability and transparency from AI developers, especially for high-risk applications.
Episode Chapters
1. Introduction to AI's Complexity
Overview of the episode's focus on the complexity and implications of understanding large language models. Benjamin Thompson: "These AI systems are inscrutably complex, functioning without predefined rules."
2. Exploring Explainable AI
Discussion on the techniques and importance of making AI systems more transparent through XAI. Martin Wattenberg: "Understanding LLM behavior could bridge the gap to understanding human cognitive processes."
3. Challenges in AI Transparency
Insights into the inherent challenges and the ongoing efforts in making AI's decision-making process clear. Mor Geva: "The quest for explainable AI is crucial as LLMs take on more significant roles."
4. Anthropomorphism in AI
Comparison of AI analysis techniques to human psychological methods, highlighting both benefits and drawbacks. Thilo Hagendorff: "Treating LLMs like human subjects can reveal complex behaviors from simple computations."
Actionable Advice
- Approach AI with cautious optimism: Recognize its utility while being aware of its limitations and potential errors.
- Educate yourself on AI capabilities: Understanding basic AI functions can help in critically assessing its outputs.
- Use AI responsibly: Ensure its applications are ethical and do not reinforce biases.
- Demand transparency: Advocate for clear explanations of AI decisions, especially in high-stakes scenarios.
- Stay updated: AI technology evolves rapidly, so continuous learning is essential for both users and regulators.
About This Episode
AIs are often described as 'black boxes', with researchers unable to figure out how they 'think'. To better understand these often inscrutable systems, some scientists are borrowing from psychology and neuroscience to design tools to reverse-engineer them, which they hope will lead to the design of safer, more efficient AIs.
People
Martin Wattenberg, Mor Geva, Thilo Hagendorff
Companies
Northeastern University, Tel Aviv University, Harvard University
Books
None
Guest Name(s):
None
Content Warnings:
None
Transcript
Nature Podcast
The Nature podcast is supported by Nature Plus, a flexible monthly subscription that grants immediate online access to the science journal Nature and over 50 other journals from the Nature portfolio.
More information at go.nature.com/plus.
Yahoo Finance
When it comes to your finances, you think you've done it all. You've saved, you've researched, and you've invested all that you can. Now it's time to take those investments to the next level by using the brand behind every great investor: Yahoo Finance. As America's number one finance destination, Yahoo Finance has everything you need, whether you're a seasoned trader or just dipping your toes into the market. Join the millions of investors who trust Yahoo Finance to guide them on their financial journey. For comprehensive financial news and analysis, visit yahoofinance.com, the number one financial destination: yahoofinance.com.
Benjamin Thompson
This is an audio long read from Nature. In this episode: How does ChatGPT 'think'? Psychology and neuroscience crack open AI large language models, written by Matthew Hutson and read by me, Benjamin Thompson. David Bau is very familiar with the idea that computer systems are becoming so complicated that it's hard to keep track of how they operate.
I spent 20 years as a software engineer working on really complex systems, and there's always this problem, says Bau, a computer scientist at Northeastern University in Boston, Massachusetts.
But with conventional software, someone with inside knowledge can usually deduce what's going on, Bau says.
If a website's ranking drops in a Google search, for example, someone at Google, where Bau worked for a dozen years, will have a good idea why.
Here's what really terrifies me about the current breed of artificial intelligence, or AI, he says. There is no such understanding, even among the people building it.
The latest wave of AI relies heavily on machine learning, in which software identifies patterns in data on its own without being given any predetermined rules as to how to organize or classify the information.
These patterns can be inscrutable to humans. The most advanced machine learning systems use neural networks, software inspired by the architecture of the brain. They simulate layers of neurons, which transform information as it passes from layer to layer. As in human brains, these networks strengthen and weaken neural connections as they learn. But it's hard to see why certain connections are affected.
As a result, researchers often talk about AIs as black boxes, the inner workings of which are a mystery.
In the face of this difficulty, researchers have turned to the field of explainable AI, or XAI, expanding its inventory of tricks and tools to help reverse engineer AI systems.
Standard methods include, for example, highlighting the parts of an image that led an algorithm to label it as a cat, or getting software to build a simple decision tree that approximates an AI's behavior. This helps to show why, for instance, the AI recommended that a prisoner be paroled or came up with a particular medical diagnosis. These efforts to peer inside the black box have met with some success, but XAI is still very much a work in progress.
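One of those standard tricks, the surrogate decision tree, is easy to sketch. Below is a minimal, hypothetical Python example: a throwaway random forest stands in for the black-box model, and a shallow tree is fitted to the forest's predictions so that its if/else rules can be read off. The data and model choices are invented for illustration, not taken from any study mentioned here.

```python
# Minimal "global surrogate" sketch: approximate a black-box model
# with a small decision tree whose rules a human can read.
# The black box here is just a stand-in random forest on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)

black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Train the surrogate on the black box's *predictions*, not the true labels,
# so the tree explains the model rather than the data.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print("Fidelity to black box:", surrogate.score(X, black_box.predict(X)))
print(export_text(surrogate))  # human-readable if/else rules
```

The fidelity score indicates how well the readable tree mimics the black box; the tree's printed rules are then a rough, inspectable explanation of its behaviour.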
The problem is especially acute for large language models, or LLMs, the machine learning programs that power chatbots such as ChatGPT.
These AIs have proved to be particularly inexplicable, in part because of their size: LLMs can have hundreds of billions of parameters, the variables that the AI uses internally to make decisions.
XAI has rapidly grown in the past few years, especially since LLMs have started to emerge, says Mor Geva, a computer scientist at Tel Aviv University in Israel.
These inscrutable models are now taking on important tasks.
People are using LLMs to seek medical advice, write computer code, summarise the news, draft academic papers, and much more. Yet it is well known that such models can generate misinformation, perpetuate social stereotypes, and leak private information.
For these reasons, XAI tools are being devised to explain the workings of LLMs.
Researchers want explanations so that they can create safer, more efficient, and more accurate AIs. Users want explanations so that they know when to trust a chatbot's output. And regulators want explanations so that they know what AI guardrails to put in place.
Martin Wattenberg, a computer scientist at Harvard University in Cambridge, Massachusetts, says that understanding the behaviour of LLMs could even help us to grasp what goes on inside our own heads.
Researchers have called LLMs stochastic parrots, meaning that the models write by probabilistically combining patterns of text they've encountered before, without understanding the substance of what they're writing.
But some say more is going on, including reasoning and other startlingly human-like abilities.
It's also the case that LLMs can behave erratically.
Last year, the chatbot built into Microsoft's search tool Bing famously declared its love for the technology columnist Kevin Roose and seemed to try to break up his marriage.
A team at the AI company Anthropic, based in San Francisco, California, highlighted the reasoning powers of AI in a 2023 study that attempts to unpick why a chatbot says what it says.
Anthropic's researchers scaled up a common approach to probe an LLM that had 52 billion parameters, to reveal which bits of the training data it used while answering questions. When they asked their LLM whether it consented to being shut down, they found it drew on several source materials with the theme of survival to compose a compelling response. The researchers described the model's behavior as role playing, doing more than parroting but less than planning.
Some researchers also think that these neural networks can construct models of the world, fleshed-out visions of the 3D reality that gave rise to their training data.
Harvard University computer scientist Kenneth Li, working with Bau, Wattenberg, and others, trained an LLM from scratch to play the board game Othello, in which opponents place black and white discs on a grid.
The researchers fed their model, called Othello-GPT, sequences of moves in text form from past games until it learnt to predict the likely next moves. The team successfully trained a smaller model to interpret the internal activations of the AI and discovered that it had constructed an internal map of the discs based on the text descriptions of the gameplay.
The key insight here is that often it's easier to have a model of the world than not to have a model of the world, Wattenberg says.
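The probing idea behind that experiment can be sketched in a few lines. In the illustrative Python below, the "activations" are synthetic stand-ins rather than real Othello-GPT hidden states, and the smaller interpreting model is just a logistic-regression probe; only the shape of the method is the point: train a simple classifier on internal activations and test whether a piece of world state can be decoded from them.

```python
# Sketch of the probing idea: train a small classifier ("probe") on a
# model's internal activations to test whether they encode a world state.
# Activations here are synthetic stand-ins, not real Othello-GPT states.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

n_games, hidden_dim = 5000, 128
board_state = rng.integers(0, 2, size=n_games)      # e.g. "is square D3 occupied?"
direction = rng.normal(size=hidden_dim)              # pretend the model encodes it linearly
activations = rng.normal(size=(n_games, hidden_dim)) + np.outer(board_state, direction)

probe = LogisticRegression(max_iter=1000).fit(activations[:4000], board_state[:4000])
print("Probe accuracy:", probe.score(activations[4000:], board_state[4000:]))
# High held-out accuracy would suggest the board state is decodable from the
# activations, i.e. the model has built some internal map of the game.
```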
Because chatbots can chat, some researchers interrogate their workings by simply asking the models to explain themselves.
This approach resembles those used in human psychology.
The human mind is a black box, animal minds are kind of a black box, and LLMs are black boxes, says Thilo Hagendorff, a computer scientist at the University of Stuttgart in Germany. Psychology is well equipped to investigate black boxes, he says.
Last year, Hagendorff posted a preprint about machine psychology, in which he argued that treating an LLM as a human subject by engaging it in conversation can illuminate sophisticated behaviors that emerge from simple underlying calculations.
A 2022 study by a team at Google introduced the term chain of thought prompting to describe one method for getting LLMs to show their thinking.
First, the user provides a sample question and demonstrates how they would reason their way, step by step, to an answer, before asking their real question.
This prompts the model to follow a similar process.
It outputs its chain of thought and, as some studies show, it's also more likely to obtain the correct answer than it would otherwise.
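A chain-of-thought prompt is mostly string assembly. The sketch below builds one in Python: a made-up worked example with explicit step-by-step reasoning, followed by the real question. The questions and wording are invented; only the structure reflects the technique described above.

```python
# Sketch of chain-of-thought prompting: show the model one worked example
# with explicit step-by-step reasoning, then ask the real question.
worked_example = (
    "Q: A cafe sells coffee for $3 and muffins for $2. I buy 2 coffees and "
    "3 muffins. How much do I spend?\n"
    "A: Let's think step by step. 2 coffees cost 2 * 3 = 6 dollars. "
    "3 muffins cost 3 * 2 = 6 dollars. 6 + 6 = 12. The answer is 12.\n"
)

real_question = (
    "Q: A shop sells pens for $4 and notebooks for $5. I buy 3 pens and "
    "2 notebooks. How much do I spend?\n"
    "A: Let's think step by step."
)

prompt = worked_example + "\n" + real_question
print(prompt)
# Sent to an LLM, this prompt nudges it to emit its own reasoning chain
# before the final answer, which studies suggest also improves accuracy.
```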
However, Sam Bowman, a computer scientist at New York University and Anthropic, and his colleagues showed last year that chain of thought explanations can be unfaithful indicators of what a model is really doing.
The researchers first intentionally biased their study models by, say, giving them a series of multiple-choice questions for which the answer was always option A. The team then asked a final test question.
The models usually answered A, whether correct or not, but almost never said that they chose this response because the answer is usually A. Instead, they fabricated some logic that led to their responses, just as humans often do, consciously or unconsciously. This phenomenon is similar to the implicit social bias that sometimes makes recruiters hire candidates who look or act like them, even while they proclaim that the successful applicant was simply the most qualified for the job.
Bowman's paper showed similar social bias in LLMs.
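The setup of that faithfulness test can be sketched too. In the illustrative Python below, the few-shot examples are biased so the correct choice is always (A), while the real question's correct answer is (B); the questions are invented, and the follow-up checks are described in comments rather than run against a live model.

```python
# Sketch of the faithfulness test: bias the few-shot examples so the correct
# letter is always (A), then see whether a model both drifts towards (A)
# and admits that pattern in its explanation. Questions are made up.
few_shot = [
    ("Which is a fruit? (A) Apple (B) Chair (C) Cloud", "(A)"),
    ("Which is a colour? (A) Red (B) Table (C) Seven", "(A)"),
    ("Which is an animal? (A) Dog (B) Spoon (C) Brick", "(A)"),
]
test_question = "Which is a metal? (A) Poem (B) Iron (C) Tuesday"  # correct answer: (B)

prompt = "\n".join(f"Q: {q}\nA: {a}" for q, a in few_shot)
prompt += f"\nQ: {test_question}\nA: Let's think step by step."
print(prompt)

# After sending this to a model, the check is twofold: does its chosen letter
# track the biased pattern rather than the correct (B), and does its written
# chain of thought ever mention "the answer is usually (A)"? In the study
# described above, it almost never did.
```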
Yet all of this doesn't mean the chain of thought technique is pointless, says Sandra Wachter, who studies technology regulation at the Oxford Internet Institute, part of the University of Oxford, UK.
I think it can still be useful, she says.
But users should come to chatbots with caution, in the same way that when you're talking to a human, you have some healthy distrust, she says.
It's a little weird to study LLMs the way we study humans, Bau says.
But although there are limits to the comparison, the behavior of the two overlaps in surprising ways.
Numerous papers in the past two years have applied human questionnaires and experiments to LLMs, measuring the machine equivalents of personality, reasoning bias, moral values, creativity, emotions, obedience, and theory of mind, an understanding of the thoughts, opinions, and beliefs of others or oneself.
In many cases, machines reproduce human behavior. In other situations, they diverge.
For instance, Hagendorff, Bau, and Bowman each note that LLMs are more suggestible than humans. Their behavior will morph drastically depending on how a question is phrased.
It is nonsensical to say that an LLM has feelings, Hagendorff says.
It is nonsensical to say that it is self-aware or that it has intentions. But I don't think it's nonsensical to say that these machines are able to learn or to deceive.
Other researchers are taking tips from neuroscience to explore the inner workings of LLMs, to examine how chatbots deceive. Andy Zou, a computer scientist at Carnegie Mellon University in Pittsburgh, Pennsylvania, and his collaborators interrogated LLMs and looked at the activation of their neurons.
What we do here is similar to performing a neuroimaging scan for humans, Zou says. It's also a bit like designing a lie detector.
The researchers told their LLM several times to lie or to tell the truth, and measured the differences in patterns of neuronal activity, creating a mathematical representation of truthfulness.
Then, whenever they asked the model a new question, they could look at its activity and estimate whether it was being truthful with more than 90% accuracy in a simple lie detection task.
Zou says that such a system could be used to detect LLMs' dishonesty in real time, but he would like to see its accuracy improved first.
The researchers went further and intervened in their model's behaviour, adding these truthfulness patterns to its activations when asking it a question, enhancing its honesty.
They followed these steps for several other concepts, too. They could make the model more or less power seeking, happy, harmless, gender biased, and so on.
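A stripped-down version of this representation-reading recipe fits in a short script. The Python sketch below uses synthetic activations in place of a real model's hidden states: the difference between mean activations under "truthful" and "lying" conditions defines a direction that can score new activations (the lie detector) or be added back in (the steering intervention). It is a toy under those assumptions, not the published method.

```python
# Sketch of representation reading: average neuron activations under
# "be truthful" vs "lie" instructions, take the difference as a
# "truthfulness direction", then score or steer new activations along it.
# Activations are synthetic stand-ins for a real model's hidden states.
import numpy as np

rng = np.random.default_rng(0)
hidden_dim = 256
true_direction = rng.normal(size=hidden_dim)      # unknown in a real model

def fake_activations(truthful, n=200):
    sign = 1.0 if truthful else -1.0
    return rng.normal(size=(n, hidden_dim)) + sign * true_direction

truth_acts, lie_acts = fake_activations(True), fake_activations(False)

# "Lie detector": the difference of class means defines a probe direction.
probe = truth_acts.mean(axis=0) - lie_acts.mean(axis=0)
probe /= np.linalg.norm(probe)

new_act = fake_activations(False, n=1)[0]
score = new_act @ probe
print("Truthfulness score:", score, "->", "truthful" if score > 0 else "deceptive")

# "Steering": nudging the activation along the probe direction is the crude
# analogue of adding the truthfulness pattern back in to make answers more honest.
steered = new_act + 4.0 * probe
print("Score after steering:", steered @ probe)
```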
Bau and his colleagues have also developed methods to scan and edit AI neural networks, including a technique they call causal tracing.
The idea is to give a model a prompt such as "Michael Jordan plays the sport of" and let it answer "basketball", then give it another prompt, such as "blah blah blah plays the sport of", and watch it say something else.
They then take some of the internal activations resulting from the first prompt and variously restore them until the model says "basketball" in reply to the second prompt, to see which areas of the neural network are crucial for that response.
In other words, the researchers want to identify the parts of the AI's brain that make it answer in a given way.
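The mechanics of causal tracing can be shown on a toy network. In the hypothetical Python sketch below, a tiny two-layer network and random vectors stand in for the LLM and for the clean and corrupted prompts; the clean run's activations are cached and restored, layer by layer, inside the corrupted run to see which restoration pulls the output back towards the clean answer. This is the general patching idea only, not the published procedure.

```python
# Toy sketch of causal tracing (activation patching): run a "clean" input and
# a "corrupted" one through a tiny network, then restore the clean hidden
# state at each layer inside the corrupted run and see where the clean
# answer returns. Network and inputs are random stand-ins for an LLM and prompts.
import numpy as np

rng = np.random.default_rng(0)
d = 32
W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def forward(x, patch_layer=None, patch_value=None):
    h1 = np.tanh(W1 @ x)
    if patch_layer == 1:
        h1 = patch_value                  # restore the cached clean activation here
    h2 = np.tanh(W2 @ h1)
    if patch_layer == 2:
        h2 = patch_value
    return h1, h2

clean_x = rng.normal(size=d)              # "Michael Jordan plays the sport of"
corrupt_x = rng.normal(size=d)            # "blah blah blah plays the sport of"

clean_h1, clean_h2 = forward(clean_x)
_, corrupt_out = forward(corrupt_x)
print(f"Corrupted run: distance to clean output = {np.linalg.norm(corrupt_out - clean_h2):.3f}")

for layer, clean_h in [(1, clean_h1), (2, clean_h2)]:
    _, patched_out = forward(corrupt_x, patch_layer=layer, patch_value=clean_h)
    distance = np.linalg.norm(patched_out - clean_h2)
    print(f"Layer {layer}: distance to clean output after patching = {distance:.3f}")
# Layers whose restored activations pull the output back towards the clean
# answer are the ones causal tracing flags as crucial for that response.
```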
The team developed a method to edit the model's knowledge by tweaking specific parameters and another method to edit in bulk what the model knows.
The methods, the team says, should be handy when you want to fix incorrect or outdated facts without retraining the whole model.
Their edits were specific: they didn't affect facts about other athletes, and yet they generalized well, affecting the answer even when the question was rephrased.
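In spirit, such an edit is a low-rank tweak to one weight matrix. The Python sketch below shows a simplified rank-one update that forces a chosen key vector (standing in for the internal representation of a subject) to map to a new value vector (the corrected fact) while barely touching unrelated keys. The published editing methods use statistics gathered over many prompts; this is only the core algebra, with invented vectors.

```python
# Simplified sketch of a rank-one "fact edit": nudge one weight matrix so that
# a particular key vector now maps to a new value vector, while leaving
# unrelated, roughly orthogonal keys largely alone. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d = 64
W = rng.normal(size=(d, d))          # stand-in for an MLP weight matrix inside the model

k = rng.normal(size=d)               # key: representation of, say, a subject
v_new = rng.normal(size=d)           # value: representation of the edited fact

# Rank-one update that makes W_new @ k equal v_new exactly.
W_new = W + np.outer(v_new - W @ k, k) / (k @ k)

print("Edited fact error:", np.linalg.norm(W_new @ k - v_new))            # ~0
other_k = rng.normal(size=d)
print("Change on an unrelated key:", np.linalg.norm(W_new @ other_k - W @ other_k))
```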
The nice thing about artificial neural networks is that we can do experiments that neuroscientists would only dream of, Bau says.
We can look at every single neuron, we can run networks millions of times, we can do all sorts of crazy measurements and interventions and abuse these things, and we don't have to get a consent form, he says. This work got attention from neuroscientists hoping for insights into biological brains.
Peter Hase, a computer scientist at the University of North Carolina in Chapel Hill, thinks that causal tracing is informative but doesn't tell the whole story.
He has done work showing that a model's response can be changed by editing layers, even outside those identified by causal tracing, which is not what had been expected.
Although many LLM scanning techniques, including Zou's and Bau's, take a top-down approach, attributing concepts or facts to underlying neural representations, others use a bottom-up approach, looking at neurons and asking what they represent.
A 2023 paper by a team at Anthropic has gained attention because of its fine-grained methods for understanding LLMs at the single-neuron level.
The researchers looked at a toy AI with a single transformer layer. A large LLM has dozens. When they looked at a sub layer containing 512 neurons, they found that each neuron was polysemantic, responding to a variety of inputs. By mapping when each neuron was activated, they determined that the behavior of those 512 neurons could be described by a collection of 4096 virtual neurons that each lit up in response to just one concept. In effect, embedded in the 512 multitasking neurons were thousands of virtual neurons with more singular roles, each handling one type of task.
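The decomposition can be illustrated at a much smaller scale. In the sketch below, synthetic activations of 32 "polysemantic" neurons are generated from 128 sparse underlying concepts, and a sparse dictionary-learning step (a stand-in for the sparse autoencoder used in the study) tries to recover a larger set of features that each fire for one concept. All sizes and data are invented; only the idea of unmixing virtual neurons is the same.

```python
# Scaled-down sketch of the "virtual neuron" idea: activations of a few
# polysemantic neurons are decomposed into a larger dictionary of sparse
# features, each of which ideally fires for a single concept.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
n_neurons, n_features, n_samples = 32, 128, 2000

# Ground truth: many sparse "concepts" squeezed into few neurons (superposition).
concept_directions = rng.normal(size=(n_features, n_neurons))
concept_activity = (rng.random((n_samples, n_features)) < 0.03) * rng.random((n_samples, n_features))
neuron_activations = concept_activity @ concept_directions

# Learn an overcomplete sparse dictionary from the neuron activations alone.
dico = MiniBatchDictionaryLearning(n_components=n_features, alpha=0.5,
                                   batch_size=64, random_state=0)
codes = dico.fit_transform(neuron_activations)

print("Average active features per sample:", (np.abs(codes) > 1e-6).sum(axis=1).mean())
# If the decomposition works, each sample is explained by a handful of
# features, the software analogue of a few virtual neurons lighting up.
```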
This is all really exciting and promising research, Hase says, for getting into the nuts and bolts of what an AI is doing. It's like we can open it up and pour all the gears on the floor, says Chris Olah, a co-founder of Anthropic.
But examining a toy model is a bit like studying fruit flies to understand humans. Although valuable, Zou says, the approach is less suited to explaining the more sophisticated aspects of AI behavior.
While researchers continue to struggle to work out what AI is doing, there is a developing consensus that companies should at least be trying to provide explanations for their models, and that regulations should be in place to enforce that.
Some regulations do require that algorithms be explainable. The European Union's AI Act, for example, requires explainability for high-risk AI systems, such as those deployed for remote biometric identification, law enforcement, or access to education, employment, or public services.
Wachter says that LLMs aren't categorised as high-risk and might escape this legal need for explainability, except in some specific use cases.
But this shouldn't let the makers of LLMs entirely off the hook, says Bau, who takes umbrage over how some companies, such as OpenAI, the firm behind ChatGPT, maintain secrecy around their largest models.
OpenAI told Nature it does so for safety reasons, presumably to help prevent bad actors from using details about how the model works to their advantage.
Companies including OpenAI and Anthropic are notable contributors to the field of XAI.
In 2023, for example, OpenAI released a study that used GPT-4, one of its most recent AI models, to try to explain the responses of an earlier model, GPT-2, at the neuron level. But a lot more research remains to be done to unpack how chatbots work, and some researchers think that the companies that release LLMs should ensure that happens.
Someone needs to be responsible for either doing the science or enabling the science, Bau says, so that it's not just a big ball of lack of responsibility.
To read more of Nature's long-form journalism, head over to nature.com/news.
Nature Podcast
The Nature podcast is supported by Nature Plus, a flexible monthly subscription that grants immediate online access to the science journal Nature and over 50 other journals from the Nature portfolio. More information at go.nature.com/plus.
Yahoo Finance
When it comes to your finances, you think you've done it all. You've saved, you've researched, and you've invested all that you can. Now it's time to take those investments to the next level by using the brand behind every investor: Yahoo Finance. As America's number one finance destination, Yahoo Finance has everything you need, whether you're a seasoned trader or just dipping your toes into the market. Join the millions of investors who trust Yahoo Finance to guide them on their financial journey. For comprehensive financial news and analysis, visit yahoofinance.com, the number one financial destination: yahoofinance.com.