Primary Topic
This episode explores the integration of artificial intelligence in healthcare, focusing on its potential to enhance patient care and assist medical professionals.
Episode Summary
Main Takeaways
- AI can significantly enhance cancer detection rates, as demonstrated by its use in breast cancer screenings.
- AI applications in healthcare are broad, ranging from diagnostics to writing patient notes and drug discovery.
- There is a global shortage of healthcare workers, which AI could help mitigate by automating certain tasks.
- The integration of AI in healthcare faces significant challenges, including ethical issues and the need for robust regulation.
- The episode advocates for a cautious yet optimistic approach to AI in healthcare, suggesting that it holds transformative potential if developed responsibly.
Episode Chapters
1. Introduction
Overview of AI's role in enhancing healthcare efficiency and effectiveness. Gerald Lip: "AI gives us that little edge in screening."
2. AI in Practice
Discussion on practical applications and benefits of AI in current medical procedures. Alok Jha: "How soon will these technologies really benefit patients?"
3. The Future of AI in Healthcare
Exploration of future advancements in AI, including potential impacts on medical research and disease treatment. Natasha Loder: "Generative AI creating quite a buzz around what could be possible in medicine."
4. Ethical and Regulatory Considerations
Concerns and challenges related to implementing AI in healthcare, including the need for rigorous regulation. Hugh Harvey: "Not a single large language model has regulatory approval for medical use."
Actionable Advice
- Stay informed about AI developments in healthcare as they evolve rapidly.
- Advocate for transparent and ethical use of AI in medical settings.
- Encourage healthcare providers to consider AI tools to improve diagnostic accuracy and efficiency.
- Support regulatory frameworks that ensure the safe deployment of AI technologies.
- Educate the public on the benefits and limitations of AI in healthcare to set realistic expectations.
About This Episode
Artificial intelligence is already making its mark in health care—but new, bigger models promise to improve how patients access services, help doctors spot diseases faster and transform how medical research is done. In the first of two episodes on the potential of AI in health care, we ask: how will patients benefit from the technology behind ChatGPT?
People
Gerald Lip, Peter Kecskemethy, Pranav Rajpurkar, Hugh Harvey, Alok Jha, Natasha Loder
Companies
Kheiron Medical
Content Warnings:
None
Transcript
Alok Jha
Hello. This is a free episode of Babbage. To listen to the show every week, you'll need to be an Economist subscriber. For more information, search online for Economist Podcasts+, or click the link in the show notes.
The Economist.
Gerald Lip
The process of breast screening is where two human readers look at the mammograms, and then they make a decision on that. And if they disagree, there's a third human reader.
Alok Jha
Gerald Lip is a radiologist in charge of screening for breast cancer in the north-east of Scotland. Gerald and his team recently wanted to find out how an artificial-intelligence tool could help their service.
Gerald Lip
We had 10,000 women go through the system where two humans said there was nothing seen on the mammograms.
This is when the AI kicked in: the AI would give us a reading and say, there's something on this mammogram, and there'll be a circle on the mammogram and it'll say, would you like to look at this area a bit more closely?
One of the very first few patients we did was a lady named Barbara, and Barbara was a lady who was reaching her seventies, so this would have been her last time coming to the screening program. We wouldn't have seen her again. And the AI did highlight a very subtle area in the right breast, which two human readers had already said was okay, and they weren't worried about it. The expert readers, ourselves, with the help of the AI, augmented by it, looked at this area again and said, let's give her a chance. Let's call her back and take a look at this.
I think I was the one, actually who did her ultrasound and her biopsy, found a tiny little thing, probably about the size of 5 mm, in the breast. And when the results came back, lo and behold, Barbara was one of eleven women that we managed to pick up using the AI as an extra reader or a failsafe where the humans had said this was normal and in her case it was a significant cancer, because otherwise it would just have been growing in the background without her knowing.
Alok Jha
You might have heard stories like Barbara's before. In the past five years or so, computer-vision tools have been shown to spot everything from fractures on X-rays to the early signs of disease on retinal scans. They have even been used to help with surgeries. That list of uses of AI is set to explode in the coming years.
Gerald Lip
I know I'm a very good radiologist already, because the breast screening program shows that we are good, but the AI gives us that little edge in screening. We look at a lot of normal mammograms. You're looking at six to eight in 1,000 mammograms to find a cancer; that's 900-odd mammograms that are normal. So if I can automate away some of those 900 mammograms in 1,000, that actually gives me more time to be patient-facing.
Alok Jha
New, bigger AI models promise to make access to healthcare easier, help doctors spot disease faster, and transform how medical research is done.
But promise is one thing, practice is another. How soon will these technologies really benefit patients? This podcast is the first of two episodes on how the world of healthcare is grappling with the potential of AI. First, we'll examine where and when patients could benefit from AI models in healthcare. And next week, we'll gaze into the future.
How will medical researchers and doctors use the latest AI models to understand and treat disease in completely new ways? I'm Alok Jha, and this is Babbage from The Economist. Today: part one, how AI can improve the experiences of patients.
To talk about the emerging role of artificial intelligence in healthcare is The Economist's health editor, Natasha Loder. She recently put together a special report on the topic. Thanks for joining me, Natasha. Thank you for having me. Now, Natasha, artificial intelligence and healthcare is a topic that, you know, we've covered quite a lot in the past.
There have been examples of useful places where it's been applied, and people have talked about its potential for a while. Just right now, what is the specific focus that makes it interesting for you? Well, I suppose it's generative AI, which in the last year has been creating quite a buzz around what could be possible in medicine, using both the AI tools of the past and the ones that we're going to invent. And the possibilities go beyond diagnostics: tools for giving personalised health advice, the ability to triage lots of patients very quickly, to write patient notes and summarise meetings with patients, and even to discover new insights from medical data and do drug discovery.
Natasha Loder
So there's a really wide range of possible uses of AI. Okay, so tell me then, in your opinion, if you were looking at this in the broadest brush strokes, what's the biggest problem that you see healthcare workers being able to tackle with AI in the future? I mean, fundamentally, what AI is able to do is replicate quite a sophisticated form of human knowledge. And so if you think about what the problems are, writ large, in healthcare, it is that there just aren't enough doctors or nurses or psychiatrists. You name a medical profession, there aren't enough of them to service the needs of the seven or so billion people on the planet.
Alok Jha
So it's that access problem, isn't it? That seems to be a really interesting and intriguing way that this might go. If you've got generative AIs that can replace some of the functions of clinical staff, essentially you just have more people able to access those services. That's what you're saying? Yeah.
Natasha Loder
We're in this really difficult situation at the moment, where globally our population is getting older, healthcare costs are rising, there's a real burden of chronic diseases, and there's this massive shortage of healthcare workers. And so even in the short to medium term, if we can allow these technologies to be harnessed, there's actually a lot of potential for improving productivity and meeting human need, even if we don't, in the big picture of things, solve healthcare for the next ten to 20 years. The question is how much we can wring out of this technology, and how quickly. Okay, well, that's what we'll explore next in the show.
Alok Jha
Natasha, we'll come back to you a bit later on.
One of the tools that's using computer vision to help diagnose patients is called Mia. It's an AI model built by the company Kheiron Medical, and it's a system that Gerald Lip, who we heard from at the start of the show, has been using to detect cancers from mammograms in Scotland. Many people don't necessarily know that usually two radiologists are required to look at every single mammogram in breast cancer screening. Peter Kecskemethy is the chief executive of Kheiron.
Peter Kecskemethy
You can imagine it's a large volume, and that large volume has to be looked at by two doctors. Whenever they disagree, a third doctor is brought in. But even with two doctors, it's estimated and expected that more than one in five cancers will be missed. So obviously that means a lot of improvement can be made.
So what can the AI do? Well, it can increase the cancer detection. We have results that show we can increase cancer detection by up to 13%. With that, we can expect that we can actually halve the number of cancers missed in breast screening. Okay, well, talk to me about the technology then.
Alok Jha
What kind of AI model is behind the tool that you're describing? How does it work? Yeah, so Mia is based on a large number of varied AI components. This means convolutional neural networks, primarily. Okay, so the input is the scan, or mammogram, and the data is passed through layers of the neural network.
Each one looks for specific things in the image that might make up parts of a lesion or some other problem. And once that data has passed through all the layers of the neural network, the algorithm can determine whether or not that patient could have cancer. And Mia passes images through several different networks like this before it comes to a decision. So, think about a large group of experts, like, I don't know, hundreds of radiologists sitting in a room looking at every single mammogram, and then making a decision collectively in order to have the best possible results. And if you think about it, the AI can find information on its own in a data set, for instance among the breast cancer screening images, that humans may not be able to process.
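To make the panel-of-readers idea concrete, here is a minimal sketch, assuming PyTorch, of several networks each scoring a mammogram and their outputs being averaged into a single suspicion score. The architecture, sizes and threshold are hypothetical stand-ins for illustration, not Kheiron's actual design.

```python
# A minimal sketch of an ensemble of CNN "readers", assuming PyTorch.
# The networks, input size and threshold are hypothetical stand-ins;
# this illustrates the general technique, not Kheiron's actual model.
import torch
import torchvision.models as models

def build_ensemble(n_models: int = 3) -> list[torch.nn.Module]:
    """Several independent CNNs; here untrained ResNets stand in for
    networks that would each be trained on labelled mammograms."""
    ensemble = []
    for _ in range(n_models):
        net = models.resnet18(weights=None)
        net.fc = torch.nn.Linear(net.fc.in_features, 1)  # one logit: "suspicious?"
        net.eval()
        ensemble.append(net)
    return ensemble

@torch.no_grad()
def score_mammogram(ensemble, image: torch.Tensor) -> float:
    """Average the per-network probabilities -- a panel deciding collectively."""
    probs = [torch.sigmoid(net(image.unsqueeze(0))).item() for net in ensemble]
    return sum(probs) / len(probs)

ensemble = build_ensemble()
image = torch.randn(3, 512, 512)              # stand-in for a preprocessed mammogram
if score_mammogram(ensemble, image) > 0.5:    # hypothetical operating point
    print("Flag this area for a closer human look")
```

Averaging several independently trained readers tends to cancel out individual blind spots, which is the intuition behind the collective decision described above.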
Peter Kecskemethy
So maybe it's different motifs, maybe it's finer tuning of grayscale, et cetera. And from a practical point of view, this means the AI can have very different capabilities from the doctors, and that means we need to use the AI and optimize it to be the best complementary capability to the doctors, rather than trying to compete with the doctors. Tell me what the model is trained on. How do you make the neural network able to recognize what a breast cancer looks like on a mammogram? So we used the longitudinal data on the whole patient.
So for negatives, we wanted to make sure that for multiple years there were no positive signs for the patients. And for positives, we wanted to make sure we account for all the different aspects of a positive cancer. And it is important that we are training the algorithm on as much of the truth as we have access to, like biopsies and other follow-ups, rather than what the doctors say, because otherwise we would just be approximating what the doctors can do, and wouldn't be able to develop complementary capability. So we're talking about images from mammograms, but also data from biopsies that are linked to those images, perhaps. And all of this is labeled, so that the model knows which ones contain potential tumors and which ones don't.
Yes. And the labeling process is one of the key elements of machine-learning engineering: really understanding, in the domain, what constitutes a positive, what's a negative, and understanding when something is certain or not. And typically, how much data is going into it? How many scans, how many biopsies, what sort of order of magnitude are we talking about? The order of magnitude is a couple of million.
It's important for AI that it's not necessarily the number of cases you look at; you need to optimize for the maximum amount of information that it is trained on. So, for instance, if I have an algorithm that is trained on one particular country in Europe, getting more of those cases is much less valuable than getting different ethnicities from across the globe, from Asia, from South America, Africa. I see. So you want as diverse a set as possible around the world. Diverse in a specific way, one that gives you the highest information, so that you can reduce confounders.
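As a rough illustration of the labelling principle described above — anchoring training labels to outcomes such as biopsies and multi-year follow-up, rather than to what a doctor said — here is a hypothetical sketch; the field names and thresholds are invented, not Kheiron's schema.

```python
# Hypothetical sketch of outcome-based labelling: a case is positive only
# with biopsy confirmation, and negative only after years of clear
# follow-up; anything uncertain is left out of training.
from dataclasses import dataclass

@dataclass
class Case:
    biopsy_positive: bool          # ground truth from pathology
    clear_followup_years: float    # years of screening with no positive signs

def training_label(case: Case, min_clear_years: float = 3.0):
    """Return 1 (cancer), 0 (confidently normal) or None (exclude)."""
    if case.biopsy_positive:
        return 1
    if case.clear_followup_years >= min_clear_years:
        return 0
    return None  # not enough truth either way -- don't train on it

labelled = [(c, training_label(c)) for c in [
    Case(biopsy_positive=True,  clear_followup_years=0.0),
    Case(biopsy_positive=False, clear_followup_years=5.0),
    Case(biopsy_positive=False, clear_followup_years=1.0),  # excluded
]]
```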
Alok Jha
Okay, so where is Mia currently in clinical use, then? It is in clinical use in Hungary and in the UK. And just talk me through how a radiologist would be using it. So, you know, a patient comes in, they have their scan, there's a mammogram available. Does that go into the model sort of in parallel with the radiologists looking at the same thing?
Peter Kecskemethy
Yeah. So once the patient is scanned, the AI will make a determination in a couple of seconds, even before the patient leaves the room. But those results may be held back in order to support an unbiased process. It's a very good idea for the doctors to look at images without the AI's input first. So they don't get biased by it?
Exactly. So that they don't get biased, so that the maximum human capability is used. And the AI then determines whether an adjustment is to be made, for instance, or not. And then it could flag up to the lead radiologist: hey, maybe a cancer has been missed, or your radiologists are potentially over-diagnosing, and things like that. It's always a human radiologist who makes the final decision.
They have the authority. The AI is a tool to flag when the AI is certain, and so the doctors have a high chance of changing the patient outcome for the better. Peter, well, that's been fascinating. Thank you very much for your time.
Thank you so much.
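The safety-net workflow Peter outlines — score immediately, hold the result back until the blinded human reads are in, then surface only confident disagreements — could be sketched like this. All names and thresholds here are hypothetical, and, as he stresses, the final decision stays with a human.

```python
# Hypothetical sketch of the "held-back" safety-net workflow described
# above: humans read first, blinded; the AI's stored score is revealed
# afterwards, and only confident disagreements are flagged for review.
def review_case(human_reads: list[str], ai_score: float,
                flag_threshold: float = 0.9) -> str:
    """human_reads: e.g. ["normal", "normal"]; ai_score: suspicion in [0, 1]."""
    humans_say_normal = all(r == "normal" for r in human_reads)
    if humans_say_normal and ai_score >= flag_threshold:
        return "recall: AI confidently disagrees -- lead radiologist decides"
    if not humans_say_normal and ai_score <= 1 - flag_threshold:
        return "possible over-diagnosis flagged -- lead radiologist decides"
    return "follow the human decision"

print(review_case(["normal", "normal"], ai_score=0.95))
```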
Alok Jha
There's huge potential for computer-vision systems in radiology to detect a specific disease from a medical image. But AI's ability to make life easier for patients goes far beyond that, as Jason Hoskin, one of our producers, has been finding out.
Jason Hoskin
As you walk down the streets of Boston, it's hard not to be reminded of the city's role as a hub of innovation. Central to scores of bright ideas in biotechnology are two world-leading institutions, the Massachusetts Institute of Technology and Harvard University. My interest is in how we build AI doctors. That's Pranav Rajpurkar. He's an assistant professor at Harvard Medical School.
Pranav Rajpurkar
Five years back, if you were someone working in this space, you would typically have a modality specialty and a disease specialty. So, for example, I was interested, back in the day, in working on chest X-rays, and particularly on using chest X-rays to identify signs of pneumonia. But as the field has evolved over time, what's been increasingly clear is that you can start to think about what it takes to solve not one problem, but the full range of problems that a radiologist works on. So now my group works on interpreting chest X-rays and/or CT scans and/or MRI scans. And we don't think of it from a disease-by-disease perspective, but really think about it from what we call a generalist medical AI perspective, where you're trying to build AI models that can do a lot more than just one disease detection.
Jason Hoskin
And so why has the field changed? What's the kind of technology that's enabled that? Well, there have been three key techniques that have really created advancements. One of them is self-supervised learning. The idea of self-supervised learning is to be able to learn without requiring lots of annotations.
Pranav Rajpurkar
And we have developed some innovations on that front, along with several other groups in the world. So by annotations, you mean that images are labeled in the training set. So in the breast cancer example, experts would go through and tell the model what a positive case for cancer looks like and what a negative case looks like, and it learns from that. But self-supervised learning is making that process less onerous. So that's the first thing that's changed.
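Self-supervised learning comes in several flavours. One common contrastive recipe — pull two augmented views of the same unlabelled image together in embedding space, push other images apart — can be written compactly in PyTorch. This is a generic illustration, not necessarily the specific method Pranav's group uses.

```python
# A minimal contrastive self-supervised loss (SimCLR-style), assuming
# PyTorch. No human annotations are used: the "label" for each image is
# simply its own second augmented view.
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature: float = 0.1):
    """z1, z2: embeddings of two augmented views of the same n images.
    Each view should be most similar to its counterpart."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2n, d)
    sim = z @ z.t() / temperature                 # pairwise similarities
    sim.fill_diagonal_(float("-inf"))             # ignore self-matches
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# e.g. embeddings from an encoder applied to two augmentations of a
# batch of unlabelled chest X-rays:
loss = contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```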
Jason Hoskin
What's the next thing? Then the second set of techniques that's been really taking off is multimodal learning. And in multimodal learning, we're interested in not just learning about a single modality from a single modality. For example, I'm not just interested in learning from chest X-rays. I might be interested in learning from chest X-rays and the associated clinical reports, in order to understand, for example, which chest X-ray matches which clinical report.
Pranav Rajpurkar
Or I might have some associated clinical information, like what the patient history is. And how we feed in both the image information and that clinical-context information is something we've gotten much better at modeling over the last few years. Okay, and what's the final thing? The third bucket is generative AI, to generate language to describe what we see in the image. And this is very close to how radiologists do it. They will write a block of text that goes comprehensively through every organ, every process, commenting on it.
And this is a technology that's really been unlocked in the last couple of years and allowed us to make these much more interesting models that can start to look like what we expect from radiologists. And your system incorporates all three of those things? That's correct. So how well does it work? That's a great question.
So, for instance, we published a study on an algorithm called CheXzero, where we were able to achieve the same performance as radiologists on detecting diseases from chest X-rays, but this time using almost no annotations that were explicitly provided by radiologists. One year later, we're at an even better milestone, where recently our group has shown that we can directly output these clinical reports. And when we give those reports to radiologists, where some of the time they come from the model and some of the time they come from another radiologist, we ask: how often do you prefer the report that was generated by the model? And we're finding it's about comparable at this point.
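The image-report matching Pranav describes is often trained with a CLIP-style objective: embed X-rays and reports into one shared space and teach each X-ray to pick out its own report from the batch (CheXzero builds on this family of ideas). A rough sketch follows, with the encoders faked for brevity; none of this is the published code.

```python
# Hypothetical sketch of multimodal image-report alignment, assuming
# PyTorch. Real systems would use a vision network and a text
# transformer; simple linear layers stand in here.
import torch
import torch.nn.functional as F

image_encoder = torch.nn.Linear(1024, 128)  # fake: image features -> embedding
text_encoder = torch.nn.Linear(768, 128)    # fake: report features -> embedding

def matching_loss(xray_feats, report_feats, temperature: float = 0.07):
    """The i-th X-ray should match the i-th report and nothing else."""
    img = F.normalize(image_encoder(xray_feats), dim=1)
    txt = F.normalize(text_encoder(report_feats), dim=1)
    logits = img @ txt.t() / temperature        # (batch, batch) similarities
    labels = torch.arange(len(xray_feats))      # the diagonal pairs are correct
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

loss = matching_loss(torch.randn(8, 1024), torch.randn(8, 768))
```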
Jason Hoskin
So some of the stuff that you've been saying so far: you're trying to build AI doctors, and the algorithms are getting as good as radiologists. This will cause a bit of fear in people, right? Are the AIs replacing the doctors?
Pranav Rajpurkar
Well, I think there is a conventional approach to thinking about this, which is that these are not going to replace radiologists. And I think that's, for the major part, just wrong.
Right now, you just have a big supply shortage. So in India, the volume has gotten so high that chest X-rays are not even reported by radiologists. That means you have someone who is, frankly, not comfortable reading chest X-rays a lot of the time, and they're being pressured to make these decisions, which have big consequences downstream for patients. So now think about bringing AI into the picture. If you had autonomous radiologists and you deployed those to provide either preliminary reads or service in areas where there weren't experts to do the interpretation, just imagine how much positive benefit that has for society.
We're just not in a situation where there's an alternative. Training of radiologists globally is at a standstill. What you do have an exponential curve in is the rise in imaging demand from an aging population. This is a 5% compound annual growth rate.
And so you think about this problem just getting worse and worse and worse. And the solution that we have on the horizon is going to be AI models that can do these reads. I can see that in areas where it's difficult to access care, an autonomous radiologist would be really useful. But people would definitely be freaked out by the idea of that, particularly when you're thinking about things like trust. You know, I often ask this in seminars I give.
I ask people, how many of you trust your doctor? And it's surprising that not a lot of hands actually go up. And it's interesting to ask, why do we trust doctors? Well, one is they have a certification that says, I'm qualified to be able to be a radiologist, and they have to renew it every few years. And so that is a system we have set up in a community that understands how we think about the licensing of human doctors.
So, similarly, I think we need principles for licensing these AI medical doctors. Okay, now, all of this futuristic technology that you've talked about relies on generative AI. And I think most people's interactions with generative AI will have been playing around with the likes of ChatGPT. You know, maybe people have used them to check out symptoms that they might have, but ChatGPT obviously isn't licensed as a medical device. Is that sort of general technology useful, though?
So I was recently trying to find out what I could do for some back pain that I had, and when I looked at Google versus what I was able to understand and collect from ChatGPT, there is a vast difference in terms of the depth of the information that it can provide you. Now, of course, that comes with the question: well, how will I validate that what it's telling me is true? Well, I can always look up on independent sources what those sources have to say about what ChatGPT thinks about my medical concern. But this has actually not been the exception as much as I thought it would have been. Whether we like it or not, we have been, at least on the patient side, using these systems.
The caveat here is also that these big companies typically don't want to be making medical recommendations, and so they tune their models to refrain from making medical recommendations. And that's probably a good thing. From a regulatory standpoint, then, they don't have to worry about being classified as a medical device. But I think it does come with a trade-off. An early version of ChatGPT was actually evaluated in the New England Journal of Medicine for its ability to answer medical questions, and it turns out it actually did pretty well. That sort of gives you an insight into that.
These systems are powerful and frankly, there are a lot of places where having such tools is going to make an incredible difference to the quality of healthcare that can be received. That's fascinating. Pranav, thank you so much for talking to me. Thanks for having me.
Alok Jha
You can see why large language models, such as the one that powers the likes of ChatGPT, could help someone understand an illness better. After all, they've ingested pretty much all of the data available on the internet and try to make sense of it. That includes factual websites provided by hospitals or healthcare systems, articles in medical journals, and even first-hand accounts posted online by doctors or patients. That ginormous training set might also include inaccuracies or misinformation, though. What's more, the way that generative AI produces its results makes it hard to fully trust.
But can those challenges be overcome? Could the conversational powers of ChatGPT be harnessed to create reliable medical bots? That's coming up.
Today on Babbage, we're asking how medical AI tools can improve the experiences of patients. Natasha Loder, The Economist's health editor, who we heard from earlier, has been looking into the use of chatbots in healthcare. Natasha, we just heard that ChatGPT can in fact be a useful tool for medical advice. You're sitting here in the studio with me, so let's test ChatGPT out on some medical problems that, you know, you might be having. I have to say I'm skeptical, but you've got ChatGPT on your phone there, and the latest update means that it speaks now.
So why don't you give it some symptoms and see if it can work out what it thinks is wrong with you. Okay. All right. Hi, I'm having issues right now, could not understand what you said. Hi, can you help me with something?
ChatGPT
Of course. What do you need help with? Well, I've got a medical problem that I'm worried about. I've been quite short of breath today and I have a sharp pain in my chest, just slightly to the left. It sounds like you're experiencing symptoms that could be serious.
It's really important to seek immediate medical attention for symptoms like shortness of breath and sharp chest pain. Can you call a healthcare provider or go to an emergency room? Yeah, I could do that. That sounds like a good plan. It's important to get checked out by a professional as soon as possible with symptoms like those.
Alok Jha
Ask ChatGPT what it actually thinks is wrong with you. What do you actually think is wrong with me, though? I kind of need to know. I'm not a doctor, so I can't diagnose what's wrong.
ChatGPT
But symptoms like shortness of breath and sharp chest pain could be caused by a variety of things, from something less serious, like muscle strain, to more urgent issues like a heart problem or a lung issue. What kind of heart problem could it be? Chest pain and shortness of breath can be associated with several heart related issues. Here are a few possibilities. One, heart attack.
This occurs when blood flow to a part of the heart is blocked for long enough that part of the heart dies. Two, angina. This is a symptom of coronary artery disease and occurs when the heart… jaw or back. Three, pericarditis. This is an inflammation of the lining around the heart… to determine the exact cause and appropriate treatment.
Alok Jha
Now, that's interesting, because there are so many guardrails with that version of ChatGPT that you're using that, clearly, it doesn't want to tell you exactly what it's thinking, and it's saying it's not a doctor, and it's saying you should get medical advice, which is probably sensible. Right? That's exactly the right thing to do. But it took several minutes there before it suggested that you might be having a heart attack, which, I mean, it's not funny. You'd be lying on the floor at that point, incapable of moving.
How useful is that, really, as medical advice? Well, it's clearly not intended to be useful, not that version. No. But, I mean, we're a long way, technologically, from being in a situation where we could recommend that anyone having these symptoms could just go and safely have a conversation with a chatbot, and that would be the best course of action. I mean, it is good advice, as far as I can tell.
Correct. I'm not a doctor, but if you're going to dispense diagnoses, there are rules for how you do this, and there are tests that you have to do, and, you know, there are regulations you have to follow, and that's not what this particular bit of technology has been through. I think it's unfair to just expect the publicly available GPT model to be giving us medical advice. I think it's not terrible. I mean, look, I think the reason this is a good test is that we know that people have been googling their symptoms, Doctor Google, as they call it, for years, and we now know that people are not only googling but also, if that's a way of putting it, "ChatGPTing" their symptoms.
Natasha Loder
We know that doctors are doing it, and we also know that people are solving medical mysteries using these tools. But despite the promise, can you explain to me why a generalized model of the type that powers ChatGPT isn't ever going to be a medical device? The missing bit of information we don't have is the number of people who are using these tools and getting the wrong answer, or getting false reassurance that they're fine when they're not. And that's the kind of problem that we haven't even begun to solve with these tools. And that is why we can't rely on them.
Also, because they're generative, they come up with a different answer each time you ask them. So, you know, who knows, after the tenth time of asking that question, whether it would give me the same accurate advice? And of course, you know, we've talked about hallucinations before. Generative AIs hallucinate. They make facts up.
Alok Jha
It's hard to understand how they come to their conclusions sometimes. So these are all problems for trust. And of course, trust is probably the most important thing when it comes to medicine, because you need to understand how whoever's treating you, or giving you advice, comes to their conclusions, and what happens if things go wrong as well. So I suppose the next question is, could you make a more specialized healthcare chatbot? And are people trying to do these things?
And what benefits would they offer? Well, look, you can get medical chatbots. There's one called Ada Health, which is a text-based chatbot. So you type in your symptoms. It's a symptom checker.
Natasha Loder
It doesn't promise diagnoses, but it's been crafted over many years and has had all the data that it needs built into it, so that it can essentially assess what you type in and ask questions, and then kind of weigh up what the symptoms could possibly be, and then give you a range of likely answers. And that's a regulated medical device. So we have that. The problem, of course, with these kinds of chatbots is that they're a bit clunky. You have to go through a question-and-answer tree.
It can be quite a lengthy process. You may have a cold, and you may have to sit down answering lots of very different questions, some of which seem quite strange. So the question is, can we bridge the two systems? Can we bridge this accurate, text-based chatbot with this amazing conversational tool that is ChatGPT? Can we have an interface that interrogates this much more accurate medical chatbot?
And that is very much what people are thinking. And so the next step in the evolution could be where you have a conversation with something that sounds like ChatGPT, but, kind of under the hood, you have a much more solid, robust, explainable and provable AI chatbot. I mean, there are ways of making sure that the outputs of a large language model are more accurate and that they're not inventing information. I mean, this is a problem across all sectors.
Alok Jha
And the solutions I've heard of are things like only allowing the large language model to access data from a particular database. So you've picked papers, or a database of medical information. That is correct. And you ask the large language model a question, but say to it: only find your answer from that particular database. You know, it's called retrieval-augmented generation.
And then another way of improving the responses from a large language model is to take a generalized large language model and fine-tune it with specific labeled information that is very highly accurate, let's say medical information, which means that when you ask it a question, it's not a generalized LLM anymore. It's got specific talents, knowledge about certain conditions, so it's less likely to give you bad information. You know, ultimately, this is all going to have to go through regulators. They're going to have to have some oversight over whatever people come up with. But I think there's a long way to go on the technological front before we can say, actually, no, this is too difficult and it can't be done.
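To make retrieval-augmented generation concrete: embed the question, pull the closest passages from a vetted medical corpus, and instruct the model to answer only from them. A toy sketch follows; `embed` and `llm` are hypothetical placeholders, not a real product's API.

```python
# Toy sketch of retrieval-augmented generation: the model may only draw
# on the k passages most similar to the question, taken from a vetted
# corpus. embed() and llm() are placeholders for real models.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer_with_rag(question: str, corpus: list[str], embed, llm, k: int = 3) -> str:
    q_vec = embed(question)
    ranked = sorted(corpus, key=lambda doc: cosine(q_vec, embed(doc)), reverse=True)
    context = "\n".join(ranked[:k])
    prompt = (
        "Answer using ONLY the sources below. If they don't contain the "
        f"answer, say so.\n\nSources:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)

# With stub models, the pipeline runs end to end:
fake_embed = lambda text: np.ones(4) * len(text)
fake_llm = lambda prompt: "(model answer grounded in the sources)"
print(answer_with_rag("What causes angina?", ["doc A", "doc B"], fake_embed, fake_llm))
```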
So, the AI of today does seem to have a lot of promise. But what's going on in clinics right now? I said at the start of the show that we'd look at the experiences of patients in the age of AI, but I highly doubt that what we've discussed so far is your experience when you visit your doctor or go to a hospital. It certainly isn't mine. Producer Jason Hoskin has been investigating the reason why.
Hugh Harvey
If you develop an AI algorithm that can triage something or provide an indicative diagnosis, then the minimum thing that you have to do is run it by the regulators and get a regulatory clearance to allow you to put that device on the market.
Jason Hoskin
Hugh Harvey used to be the clinical director at Kheiron Medical, the company that we heard from earlier that uses AI to detect breast cancer.
Hugh Harvey
Kheiron was the first AI system ever to get regulatory approval, so I did that. It was the first European clearance for a deep-learning computer-vision system.
Jason Hoskin
Hugh now runs a consultancy called Hardian Health. They help other AI companies comply with medical-device regulations. It's a step that is slightly onerous for first-timers in industry, because you have to develop a quality management system. You have to show that any software used for a medical purpose is safe, effective, and cyber-secure, which I'm sure patients would agree are the minimum things they'd want to see from such devices. And how do they do that?
How do they show that they're safe for all these things? So you start off with your hypothesis that your device can do x functionality. You then do a literature review to determine how well that is currently done under standard of care within the hospital system. You then develop your device in a documented process to show that you've identified risks and you've mitigated against those, and to show that you have met the user requirements as you've identified. And then you run some form of clinical investigation to show that your device can do what you say it does, and you present that data to a regulator.
So the tools that are being deployed in clinical settings right now are quite narrow and rigid in terms of what they can do. But the new models are more generalist. How much more of a challenge are generative AI models that, by definition, give a unique output every time? Are they more of a bottleneck?
Hugh Harvey
Just to lay the ground: the background context here is that not a single large language model has received regulatory approval for use for a medical purpose. And, as you say, this can be due to the fact that these systems have a propensity to hallucinate, which, despite all our best efforts, we can't fully control. And they're non-deterministic. What that means is that if you give a large language model a certain prompt or a certain input, you will not get the same output every single time. And that is something that the regulators so far have not been happy to give regulatory clearance for.
It's accuracy as well. It's very, very difficult to measure the accuracy of a general system if you have something that has potentially infinite inputs and infinite outputs. We don't know how often it's going to get it wrong. And it's not just how often it gets it wrong, it's how badly it gets something wrong.
And until we can measure that, these things aren't safe to be on the market as an intended medical device. It is quite hard to conceptualize an academic framework by which you would evaluate that across its entire generalizability, and I've not seen anyone do that comprehensively yet. Do you know of large language models and generative AIs that want to get regulatory approval? There are many that want to get regulatory approval, absolutely. I had three inbound emails this morning from people asking, how can we get regulatory approval?
I was actually at a conference last week with a bunch of regulators, and it was almost unanimously decided that current regulatory frameworks and the current evidence being generated by vendors is not sufficient to allow any of these to be used, especially in high risk situations. You know, if they have a generalist model, you are claiming it can solve pretty much any medical problem. Well, that includes children with cancer or neonatal sepsis, which we all agree is a seriously high risk situation, and we should be focusing on more low risk use cases, perhaps things in the administrative realm, rather than trying to use it directly into the diagnostic realm. Do you think this is something that needs to be overhauled, though? Because clearly they have great potential in medical settings?
No, I would put it the other way. I would actually say the developers of large language models need to actually improve their evidence generation. If you want to make great claims, you need to provide exceptional evidence. And if we are going to, you know, unlock this apparent revolution in healthcare, surely then someone needs to demonstrate these things can be safe, effective and cyber secure, rather than say to the regulators, well, could you just lower the barrier? Another point I'd like to make is that, you know, if someone invented a drug tomorrow, it would take them ten years to get it to market.
And I don't see any reason why we should be rushing large language models into hospitals, ignoring the evidence and the typical timeframes it takes to generate it. That's quite a useful comparison. Something else that's been put to us on the show is that perhaps the regulatory and policy framework for AI should be something equivalent to licensing doctors. Do you think that is maybe a good way of thinking about it? I don't, actually.
I would disagree with that. I think comparing large language models to fully functioning human doctors is anthropomorphising the technology, which I think we should really stay away from. I think it's rather dangerous to assume this technology is as capable as a human when it's not. Yes, large language models are broader in capability than some of the more linear models that came before them, but they're by no means at the intellectual level of a human. And so to say we should use the same kind of regulatory structure that we evaluate humans on is, I think, a dangerous thing to do. Large language models don't have moral accountability, and they don't have, you know, decades of experience like doctors do.
Then again, if we were going to regulate it like we do doctors: well, it takes five to six years for a doctor to go through medical school, then it takes them, I don't know, seven to ten years of training to become a consultant. So, okay, if you want to do that for large language models, let's wait ten to 13 years and see your evidence at the end of it. And, look, I'm coming across as quite pessimistic, and I realize that, but I think we'd all agree that what we actually want to do is take our time and get the safest, most effective products to market, and that is the reason why we have these regulations in the first place. Hugh, thank you very much. Thank you.
Alok Jha
Despite the challenges around implementing generative AI systems such as large language models into healthcare, more of the narrower, more specific tools, like Mia, the cancer-detection model we explored earlier, are getting into clinics. So what does all that mean for patients in the future? To round off this episode, I'm once again joined by The Economist's Natasha Loder. Natasha, we've talked about diagnosis and advice and things. Where else could chatbots and these sorts of artificial-intelligence models be useful in looking after patients, then, as they become more integrated with a health system?
Natasha Loder
So really, it depends on your time scale as to what the answer is. Ultimately, you know, they could be a sort of first-line service, 24/7, giving advice, triage, even diagnostics. But really we're going to start with low-risk activities such as advice and symptom-checking. Where you will see more advanced, more ambitious AIs appear earlier is in places where you have no care: rural communities, developing countries particularly, where you're supporting patients that have chronic health needs in countries with very few doctors. And in those sorts of settings, the risk of having some sort of AI system supporting patients has to be set against the fact that otherwise they're getting no care at all. That's how healthcare often views any kind of innovation.
It's what are the risks, what are the benefits? And so in countries that have no healthcare or little healthcare available for certain populations, it may be that AI tools are really going to be a sort of quick answer. And then in the longer term for you and me, perhaps we'll be using these as personal doctors. They'll have our own personal medical information and be able to give us quite sophisticated answers to questions that we might have. Yeah, so they'll be able to take your many years of medical information, blood tests, anything else, any drugs you might have had, and give you quite specific advice in a way that you just can't do without a human doctor right now.
Alok Jha
And even then, it's limited to whatever that doctor specifically has available in terms of information or memory, in terms of how they treated you. I mean, this is a classic problem. You can go and see your doctor and they may have a few moments to scan your recent medical history. They're not going to go back through years of notes. And yet the information in there could be highly salient to the problem you're having today.
Natasha Loder
And so AIs have the potential to make medicine much more sophisticated, to reduce the amount of error, the misdiagnoses. It's extraordinary to me how we could really start to meet the needs of so many more people with this technology. Natasha, we've covered a lot of ground in this episode and looked at different ways that different types of AI models could be used in healthcare. You've done a lot of reporting on this just recently, and I'm curious where you land on it. You know, we don't want to get too excited about things just because new technology arrives.
Alok Jha
But then there's so much potential for, as you say, more access to healthcare, better healthcare. Give me a scale of how excited I should be about all of this. Well, it's all about time scale. Looking forward over the next ten to 20 years, it's transformative. It's going to revolutionize diagnostics, treatment, patient care, almost every aspect, I would think, of healthcare, including foundation models, which are going to essentially absorb biology and medicine and make it more understandable and explainable. But in the short to medium term, it's less clear to me how rapidly these technologies will disseminate.
Natasha Loder
And that's because there are a whole bunch of kind of ethical, regulatory and technical challenges that need to be overcome to ensure that it's deployed safely. And that's going to require regulators to actually spend some more money. They need to move more quickly, they need to put more effort into oversight of these technologies because if we're going to introduce them safely, that's what it's going to take. But yeah, I'm very excited. I can't wait to see what's coming down the road.
Alok Jha
I think you've managed to make that answer a very sensible one, because isn't it the case that AI people always talk about how these sorts of technologies are overestimated in the short term and underestimated in the long term? And in case anyone doesn't know what a foundation model is, we'll be talking about that in the next episode of this two-part series on AI in healthcare, one of the most important and interesting uses of AI. And I look forward to speaking to you again next week, to look at how all of this is going to change the medical-research side of things.
Natasha Loder
Sounds good.
Alok Jha
The story of AI in healthcare doesn't end there. To catch the second part of this series of Babbage, and much, much more, you'll need to become a subscriber to Economist Podcasts+. Just follow the link in the show notes. Thanks to Gerald Lip, Peter Kecskemethy, Pranav Rajpurkar, Hugh Harvey and The Economist's Natasha Loder. You can read Natasha's full ten-page report on the promise of AI in healthcare on The Economist's app. It's really worth it, to get into the details of her reporting.
There's a link in the show notes. Thank you for listening. Babbage this week was produced by Jason Hoskin and Lawrence Knight. Mixing and sound design is by Nico Rofast. The executive producer is Hannah Mourinho.
I'm Alok Jha, and in London, this is The Economist.