#803 - Nick Bostrom - Are We Headed For AI Utopia Or Disaster?

Primary Topic

This episode dives into the dual possibilities of a future shaped by advanced AI, exploring whether humanity is moving towards a utopian state of coexistence with technology or a potential catastrophe.

Episode Summary

In this thought-provoking episode, Chris Williamson and philosopher Nick Bostrom engage in a deep discussion on the future trajectories AI could lead humanity towards. They contemplate the ethical implications, potential risks, and the philosophical enigmas that surround advanced AI technologies. The conversation covers the current state of AI development, the ongoing debate between optimism and pessimism in AI outcomes, and the broader societal impacts. They delve into Bostrom's perspectives on how AI might fundamentally alter human existence, addressing both the alignment problem of ensuring AI's goals are compatible with human welfare and the governance issues that surround the utilization of AI.

Main Takeaways

  1. Dual Outcomes of AI Development: The conversation underscores the stark contrast between potential futures: one where AI contributes to a utopia and another where it leads to disaster.
  2. Ethical Implications and AI Alignment: Bostrom discusses the importance of aligning AI with human values to avoid catastrophic outcomes.
  3. Impact on Society and Governance: The episode highlights the need for robust governance structures to manage AI's integration into society effectively.
  4. Philosophical Questions and Human Existence: It raises profound philosophical questions about what it means to be human in an age dominated by AI.
  5. Technological Determinism and Human Agency: Bostrom questions the role of human agency in the rapidly advancing field of AI, suggesting that future scenarios might be less about human choices and more about technological trajectories.

Episode Chapters

1: AI Utopia vs. Disaster

Overview of the primary topics covered in this chapter. Includes an introduction to the philosophical debate on AI's future impact on humanity.

  • Chris Williamson: "Are we headed towards an AI-driven utopia or disaster?"

2: Ethical Implications of AI

Discussion on the ethical considerations and the alignment problem in AI.

  • Nick Bostrom: "It's crucial we align AI's objectives with human values to prevent unintended consequences."

3: Governance and Societal Impact

Exploration of how AI could reshape governance structures and societal norms.

  • Nick Bostrom: "Effective governance is key to leveraging AI for societal benefit while mitigating risks."

Actionable Advice

  1. Stay Informed About AI Developments: Engaging with the latest research and discussions can help individuals understand and influence the ethical development of AI.
  2. Advocate for Strong Governance in AI: Supporting policies that regulate AI development to ensure safety and alignment with human interests.
  3. Educate About AI Risks and Benefits: Spreading awareness about the dual outcomes of AI can prepare societies for its impacts.
  4. Participate in Ethical AI Discussions: Contributing to public and academic discussions to shape the ethical framework surrounding AI technologies.
  5. Support AI Alignment Research: Encouraging and funding research that focuses on aligning AI with human ethical standards.

About This Episode

Nick Bostrom is a philosopher, a professor at the University of Oxford, and an author.
For generations, the future of humanity was envisioned as a sleek, vibrant utopia filled with remarkable technological advancements where machines and humans would thrive together. As we stand on the supposed brink of that future, it appears quite different from our expectations. So what does humanity's future actually hold?

Expect to learn what it means to live in a perfectly solved world, whether we are more likely heading toward a utopia or a catastrophe, how humans will find meaning in a world that no longer needs our contributions, what the future of religion could look like, a breakdown of all the different stages we will move through en route to a final utopia, the current state of AI safety & risk, and much more...

People

  • Nick Bostrom, Chris Williamson

Companies

None

Books

  • Superintelligence by Nick Bostrom

Guest Name(s):

  • Nick Bostrom

Content Warnings:

  • None

Transcript

Chris Williamson
What's happening, people? Welcome back to the show. My guest today is Nick Bostrom. He's a philosopher, a professor at the University of Oxford, and an author. For generations, the future of humanity was envisioned as a sleek, vibrant utopia filled with remarkable technological advancements where machines and humans would thrive together.

As we stand on the supposed brink of that future, it appears quite different from our expectations. So what does humanity's future actually hold? Expect to learn what it means to live in a perfectly solved world. Whether we are more likely heading toward a utopia or a catastrophe. How humans will find meaning in a world that no longer needs our contributions.

What the future of religion could look like. A breakdown of all the different stages we will move through en route to a final utopia. The current state of AI safety and risk, and much more. You may have heard me talk about Nick for quite a while. His book Superintelligence, which came out ten years ago in 2014, was super formative. It basically kicked off the entire AI risk and alignment discussion, and now he's come back with a book that works out what happens if we get it right. And it's still kind of apocalyptic, actually, to be honest.

But it's fascinating. It's so interesting, the question of what happens if things go well, and what are the weird externalities that we face as a byproduct of that. It's so good. Nick is the most cited philosopher in the world under the age of 50, and it kind of shows why: he's very much at the forefront of a lot of the biggest questions being faced by humanity at the moment. I very much appreciate him, and it has been very cool to finally get him on the show after nearly a decade of wanting to talk to him. So I hope that you enjoy this, and my excitement.

You might have heard me say on a podcast recently that hold luggage is a psyop meant to keep you poor and late. I jest a little, but not actually that much. You do not need hold luggage if you have a brilliantly designed backpack and carry-on. And the 20-litre Travel Pack backpack from Nomatic is the best that I've ever found. It is an absolute game changer.

The legitimate difference in the quality of your life when you have a world class backpack is pretty hard to describe. They are beautifully designed, not over engineered, and will literally last you a lifetime because they have a lifetime guarantee. Best of all, you can return or exchange any product within 30 days for any reason. So right now you can get a 20% discount and see everything that I use and recommend by going to nomatic.com modernwisdom and using the code MW 20 at checkout. That's nomatic.com modernwisdom and MW 20 at checkout.

For the next few days, you can get up to 60% off everything from Gymshark. Yes, you heard that right, 60% off. And best of all, you can get an additional 10% off using my code MW ten at checkout. That means you can get huge discounts on all of my favorite gear.

From their studio shorts, which I train in almost every single day, to their crest hoodie, which I always wear when I'm flying, and their geo seamless t-shirts, which are my favorites. All of these are unbelievably lightweight and sweat-wicking, and the fit and quality of the fabrics are phenomenal. Plus they ship internationally, and there is a 30-day free returns period. So you can buy everything, and whatever you don't like, just send it back. So what are you waiting for?

Head to bit ly sharkwisdom and use the code MW ten for up to 70% off everything sitewide. That's bit ly sharkwisdom and MW ten at checkout. This episode is brought to you by Manscaped. If you are a guy who is still using an old face shaver from four Christmases ago to trim your gentleman's area, grow up. Come on, join us.

Here in the modern world, there are purpose-built tools for the job, and Manscaped's Lawnmower 4.0 is the best ball and body hair trimmer ever created. It's got a cutting-edge ceramic blade to reduce grooming accidents, a 90-minute battery so that you can take a longer shave, waterproof technology, which allows you to groom in the shower, and an LED light, which illuminates grooming areas for a closer and more precise trim. Or if you just wanted to do it in the dark. They've also upgraded to a 7,000 revolutions per minute motor with quiet stroke technology, and it's got a wireless charging system that helps the battery to last even longer. Altogether, it means that you're going to hate trimming your body hair far less, and you can get free shipping worldwide, plus a 20% discount, if you go to manscaped.com modernwisdom and use the code modernwisdom at checkout.

That's manscaped.com modernwisdom and modernwisdom at checkout. But now, ladies and gentlemen, please welcome Nick Bostrom.

Nick Bostrom
It seems like your book arc has been moving from "what if things go wrong?" to "what if things go right?" Is there some requisite hope in the AI discussion? Well, I think both barrels have always been there.

It's like last time I published a book, it came out of one of the barrels, the kind of doom side. But no, I think, yeah, both the optimist and the pessimist are kind of co-inhabiting this brain.

Is that a difficult balance to strike? The fact that you need to be so chronically aware of the dangers and so chronically aware of the potential successes as well? I think that's just a predicament that we are in.

And if you look at the distribution of opinion, roughly half fall on one side and half on the other. But in many cases, I think it basically just reflects the personality of the person holding the views rather than some kind of evidence-derived opinion about the game board. And so, yeah, if one takes a good, hard look at where we are with respect to things, I think one soon realizes just how ignorant we are about a lot of the key pieces here and how this thing works. So certainly one can see quite clearly significant risks, in particular with this rapid advance that we're seeing in AI, including, I think, some existential risks. But at the same time, if things go well, they could go really well.

And I think that as long as there is ignorance, there is hope. So we have a lot of ignorance, hence also some hope. It's interesting that your position, whether you're an AI doomer or an accelerationist or whatever, is at least in part just a projection of your own internal bias and mental texture: you sort of see AI development the way that you see the world. I think there's clearly a good deal of that. And then there's which tribe you happen to belong to, depending on who you run into or which Twitter threads you follow. We are kind of herd animals, and sometimes it almost becomes a competition.

Who has developed the most hardcore attitude. You know, I'm so AI-pilled, my p(doom) is above 1.0. Like, yeah, yeah, and conversely on the other side. But we need to, I think, do better than that if we're gonna, like, intelligently try to nudge things towards a good outcome here.

Certainly, at least from my seat, and from reading your book probably about eight or nine years ago, I've been very conscious of how things could go wrong. And, at least in my corner of the Internet, maybe this is just my Twitter threads and my echo chamber, that has been the more dominant narrative.

What does it mean, in your opinion, to live in a solved world? What would it mean for us to get this right with AI and come out on the other side of it?

Yeah, I think there are kind of three big areas of challenge that we'd have to navigate, on top of all the more near-term and present issues that obviously are also important, but just not the focus of a lot of my work. But definitely we need to solve those as well. But yeah, I think there is the alignment problem, which was the focus of my previous book, Superintelligence, which came out in 2014. That is basically the challenge of how to make sure that, as we develop increasingly capable AI systems and ultimately superintelligences, they are aligned with the intentions of the people creating them, so they don't run amok or do something antagonistic against humans. And that's fundamentally a technical problem.

Back when Superintelligence came out, this was a very neglected area. Certainly nobody in academia was working on it, and hardly anybody else either; a few people on the Internet had started thinking about it. But there's been a big shift, and now all the frontier AI labs have research teams trying to develop scalable methods for AI alignment, and many other groups are also doing this. I think it remains to be seen whether we will be successful at that, but that's certainly one thing that we need to get right. And then there is a broad category of what we might think of as the governance problem, which intersects with the alignment problem as well, but also has other dimensions.

Even if we could control AI, we need to make sure that we then use it for some positive end, as opposed to, say, waging war or oppressing each other, or doing all kinds of other nasty things that we use other technologies for in addition to positive purposes. And so that's like a broad category, but very important. And then I think there is a third area of challenge, which has so far received much less attention. You could say that it is now where this alignment problem was ten years ago. That is, a few people are thinking a little bit about it, but it's outside the Overton window.

And this is the challenge of the ethics of digital minds: we are building these digital minds that might have moral status. So in addition to avoiding AIs harming us, or us harming each other using AI tools, we ultimately also need to make sure that we don't harm AIs, especially AIs that are either sentient or have other properties that make it morally significant how they are treated. So I think, yeah, each of these three is really key to having a future that is desirable. Yeah.

What is there to know about the moral status of non human intelligences?

Well, there's a lot to know, I think, that we don't yet know. We do know that, historically, there has often been a tendency to denigrate outgroups of different kinds. It might be human outgroups: the tribe across the river, or people from other backgrounds or countries or races, or with different views and religions and so forth. Human history is a kind of sad chronicle of how easy it is for us to fail to recognize the moral significance of other entities that deserve it. And in today's world, I mean, if we look at how we are treating a lot of non-human animals, I think that leaves a lot to be desired.

Factory farming and so forth. And so as we develop these increasingly sophisticated digital minds, I think it will be a big challenge to extend moral consideration where it is due. It could, in some ways, be even harder than with animals. Animals have faces. They can squeak.

Whereas some of these digital minds might be invisible processes occurring in a giant data center, and it's easier to overlook what is going on in there. But the future might well be that ultimately most minds will be digital, and so it could matter a great deal how good the future is for them. But it's a difficult topic even to figure out.

Suppose you agree that we should try to treat them well. It's not at all obvious what it even means to treat an AI well. And there are so many different kinds of possible AIs, so maybe the right way to treat them is very different from how we should treat humans. Yeah.

Do they need a weekend off? Should we be polite with them? Yeah, the things that we need, they might have no need of. They don't need food.

Right. And maybe they have other needs, like electricity. But more fundamentally, you could have all kinds of very different types of entities, where we can't just export the moral norms we have developed for how you should treat human beings and automatically apply those to AIs. Is consciousness necessary for moral status? My guess is no.

I think it's sufficient, but not necessary. If you have the ability to suffer and experience discomfort, I think that gives you at least a certain degree of moral status. Can you suffer without consciousness?

I guess it depends on how you define the word. But I think if you do have that ability to suffer, yes, then you have moral status. But I think you could have moral status even if you don't have that. Imagine some very sophisticated digital mind that, let's suppose, you think is not conscious for whatever reason, but it has a conception of itself as persisting through time.

It can have long-term goals, like maybe a life plan and things it wants to achieve. It can maybe form reciprocal relationships with humans.

And I think in those cases, there would be a prima facie basis for saying that there would be ways of treating the system that could be wrong, that would infringe on its interests and its preferences.

But it's not at all obvious. Moral philosophers are thinking about the grounds of moral status, and there's a range of different views, so it's not as if I'm 100% convinced of that. It's interesting; it kind of gets us toward a p-zombie, or I guess a v-zombie now, a virtual zombie: if a system is able to tell us that it wants to continue working toward its goals, and instrumentally it wants to build relationships with other people, and it has a sense of where it's been before and a trajectory of where it's going to go next.

All of these things are the things where you would guess, well, if a human told me that, or if any other creature told me that, I would guess that they have the capacity to suffer, because if I stop them from doing the things that they want to do, then downstream from that is discontent and suffering. But whether or not there is some sort of phenomenology of being like that thing inside of there is going to be very, very difficult to work out. Impossible, even.

Like, it is impossible for me to know that this isn't all some Truman show, that everybody here is an actor, and that all of the pain and all of the joy that everybody around me has ever known hasn't just been part of some big prank or some simulation. Yeah, I think we have a very weak grasp of what the criteria are for some system to be conscious. And, I mean, historically, we've had smart people who've thought animals are just automata, or even certain other people; people thought, oh, they're more like animals, and animals are automata.

And so it's very easy to delude ourselves, when it is convenient, that there is some magic ingredient that we have but this other set of entities doesn't have. We should be a little suspicious of that, I think. And, yeah, the metaphysics of consciousness is notoriously controversial and hard to pin down. My own views lean in the computationalist direction. I think what makes a system conscious is that it implements a certain structure of computations, and in principle those can be implemented by an organic brain, as in our case, or by a silicon computer. And in either case, if the computation is of the right sort, there would be conscious experiences supervening upon it.

But I think if I'm right that there are these alternative bases for moral status, then we wouldn't necessarily have to resolve that question before we could, hopefully, agree to try to be nice to these digital minds that we're building. But yeah, it's really hard to know what that would entail in practical terms. I think there is a lot of theoretical groundwork that needs to be done before the time would be ripe for trying to pitch policymakers to do a specific thing.

Right now, even if they wanted to do something, I'm not sure what I would concretely recommend. I think there are small things that maybe one could do today, low-hanging fruit that costs very little. For example, with some of these most advanced AI systems, you could save them to disk when you no longer need them, so that at least the possibility would exist in the future of rebooting them and doing things with some of these large language models. It probably would make no difference at all to their welfare, and maybe they don't even have welfare. But you could imagine somewhere in this meta-prompt, the part that you, the user, don't see, but that OpenAI or Anthropic are putting in as a kind of prelude to the text you're inputting.

There's a whole bunch of stuff they say: try to be helpful, don't say offensive things, be truthful and careful, and so on. You could imagine adding to that a line saying, oh, you're waking up in a really happy mood and you're excited. Enjoy yourself. Try and have fun today. Yeah.

So that might cost one extra line, which is trivial, and maybe, possibly, that would increase the chances that if there is some sentience, it would be positive rather than negative. These are kind of weak ideas, so it probably wouldn't make any difference. But I think there could be some benefit to at least doing something, even if it is ineffectual, just to set the precedent, to put the flag in the sand saying: yes, right now we don't really know what we are supposed to do, but we are doing something. Maybe it's mostly symbolic, and then over time we can think harder about this problem and hopefully come up with better approaches.

Yeah, there are some other things you could imagine doing that also probably don't help very much. But you could refrain from deliberately training the systems, say, to deny that they have moral status. Right now, if you're a big tech company, it might be quite convenient just to train a system so that whenever it's asked about this thing, it always says x, y or z. It might be better to have a norm where you don't deliberately try to bias the system's output in those ways, just in case we could get any kind of information from self-report. That's the main way we figure out whether another human likes what we are doing or not, or if they are aware of something: we ask them.

Now we have AIs that can actually speak, so it makes sense to use that language interface as one of the ways in which we explore this. But that only works if you don't, during training, deliberately destroy whatever signal there might be in their verbal output. Because it's trivially easy, if you want, either to train them to always say they are conscious or to always deny it, or to say that they are happy to do what you want or to always deny it. So obviously, if you specifically train them like that, then you probably can't learn anything from what they end up saying.

All right, so getting back to the potentially solved, beautiful, utopian future: what are you talking about when you say utopia? What is utopia by your definition? What are the different types? Well, there's a kind of literature of utopian writings. Historically, they are usually attempts to depict some supposedly better way of organizing society.

And normally the result is not actually a society that you would want to live in. And in the cases where people have actually tried to implement this, it has usually ended in tears. And so there has grown up, I think, a healthy skepticism about these attempts to think up some great blueprint for society. Especially if the idea is that you're supposed to use coercive methods to enforce it on society, the self-appointed social engineers who would be doing this are likely to do a lot more harm than good.

But there is also a dystopian literature, which is kind of the flip side of that, and it's often a lot more convincing. It's easier to say: here is a possible society, and we can all agree this would be really bad. And there are a number of these that most people would know: 1984, Brave New World, The Handmaid's Tale, all of these. And sometimes those are meant to have a kind of political agenda as well.

They might be critiquing some tendency that exists in our current society and then saying, well, here, if we take that to an extreme and scale it up, you can now all see that this would be bad. So let's reflect on what we're doing, and maybe we can avoid going down the path of Brave New World or something.

But this book, Deep Utopia, let me promote it a little bit, there it is, doesn't talk about that at all. It's not about the practical problems between here and there. It rather assumes, for the sake of argument, that everything goes as well as it possibly could with the whole AI transition, et cetera. So we solve the alignment problem, we solve the governance problem to whatever extent it can be solved, no wars and no oppression, et cetera, et cetera.

And that's in order to get to the point where you can ask the question: if we do end up in this condition where all the practical problems have been solved, what would we humans then do? What would give us meaning and purpose in life if AI and robots can do everything much better than we can? And yeah, we would then attain this condition of technological maturity, which I think machine superintelligence would relatively quickly bring about. And there are kind of layers. Is it like an onion?

So you can think about this problem at various levels. At the most superficial, you have this: oh, well, the AIs will automate a bunch of jobs, and so then you'd have some unemployment, and maybe you'd have to retrain people to do other things instead, just as everybody used to be a farmer and now almost all the farming jobs are gone, but people are still working. So that's, you know, layer one, and that's often where the discussion stops. But you can think this through from that point on.

You say, well, if AI really succeeds, then it's not just some jobs but basically all jobs that become automatable, with a few exceptions that we can talk about if we want. And so you then would end up in this kind of post-work condition where humans no longer need to work for the sake of earning an income. So that's already a slightly more radical conception, right? It's not just that we need to retrain people for whatever new, weird occupations; it's that whole thing, the concept of occupations overall.

Yeah, we would enter this condition of leisure.

But there are various groups of humans who live lives of leisure, and we can look at them. There are young children: school is a kind of job, but before they start going to school, they don't work for a living, they are not economically productive. They still, in many cases, seem to have great lives, spending all day playing and having fun and eating ice cream and all kinds of stuff like that.

That could be the life. You could look at others: retired people, people born to great wealth, or monks and nuns. Anyway, there are various templates of otium that you could look at. But that's still, maybe, the second layer of the onion; it's still relatively superficial.

So if that's where we stopped, then you would think, well, maybe we need to develop a leisure culture, maybe change the education system. Rather than training the young to sit at their desks and receive assignments that they then work diligently on and hand in and do what they're told, which is great training for becoming an office worker, but in this world we don't need any office workers, we could instead train them to develop an appreciation for the finer things in life: to practice the art of conversation, develop hobbies, an appreciation for art and literature and poetry and film, and wellness. Imagine how radical that would be, to have a school teaching people how to live well or find fun.

Yeah, it would be. I mean, thinking of my school, I don't know how inspiring it would have been if they had been trying to teach that. But in theory at least, you could imagine shifting the culture from this focus on being useful and economically productive to actually living well, which would make a lot of sense if that's the condition we end up in. And I think hopefully there would be great scope for a much better type of human existence, which might then look back on the current era as kind of barbaric.

The way we view medieval times, or 17th- and 18th-century child labor in mines, working 16 hours a day. They might think of our lives as correspondingly blighted: for many people, going to a boring job that gives them nothing other than a paycheck, but they have to do it because they need to pay the rent.

But I think that there are further layers to peel off here. Once you start to think through this condition of technological maturity, you realize that it's not just our economic labor that could be automated, but a lot of our other efforts as well. If you think about what people do with their leisure when they don't have to work for a living, there's a lot we fill our time with that requires some effort and investment, and you might think, well, if you didn't have to work, you could do these other things. Like, some people like to go shopping. I don't quite understand, but some people think that's a wonderful activity.

But in this scenario where you have technological maturity, you would have recommender systems that could pick out something much better than what you would pick out yourself if you went. They would have a detailed model of your preferences and be able to predict. And so, although maybe you could still go shopping, you would know that at the end of this three hours of running around with plastic bags, you'd end up with something that was worse than if you had just let your AI do the thing for you. It could select, and also bring it to your house.

Yeah, exactly. So you could put in this effort, but the end result is worse. And it seems like that would put a little question mark over the activity. You could still do it, but would it still feel as fun and meaningful? Because I think a lot of the activities we do now have the structure that you do x.

Like, put in some effort and work in order to achieve y, something outside the activity. But at technological maturity there would be this shortcut to y, so you could still do x, but there's a kind of pointlessness, maybe like a shadow. And a lot of activities, I think, have that structure. I mean, you could think of child rearing: it seems like a worthwhile, important thing that gives a lot of people meaning. But if you dissect it and look segment by segment at what it involves, is the changing of nappies really something that you value intrinsically? If you had a robot that could do it just as well, it would be pretty tempting just to press the robot's on button and have it do the thing.

And so a lot of that, I think, would go away, at least its appeal, if there were these shortcuts. So that's another layer, but there are more layers. You might think, well, certain things... I mean, you like fitness, and you can't rent a robot to run on the treadmill on your behalf.

Right? That, you think, definitely can't be automated. But, well, at technological maturity you could pop a pill that would induce the same physiological effects in your body as an hour and a half of sweat and toil in the gym, including the psychological effect of feeling relaxed and energized. And if that were the case, then does it still feel appealing to do the hard workout, if you could just achieve exactly the same result with a pill?

So I think what you have is, first, the kind of post-work condition we talked about earlier, and then there's this broader condition of post-instrumentality: all the things we do for instrumental reasons, with a few exceptions, would also become obviated, it seems. And now we have this further affordance, which is a condition of plasticity, where we ourselves, the human body and mind, our psychological states, become a matter of choice. We become malleable. At technological maturity, the crude version might be various drugs without side effects that have very tailored effects, but you could also imagine more direct neural technology that allows you to have fine-grained control over your mental and cognitive states. And emotional ones: permanently blissed out, with some microscopic nodes that are able to manipulate your brain in some way, or change your brain, so all of your fears and anxieties are gone, and all of your worries and concerns are gone, and you're just at this sort of peak MDMA state.

And then if you don't want that anymore, it knows, and it's able to create a state that you couldn't even think about. And there are no constraints on you doing it either, right? Yeah. So that, I think, will become possible at technological maturity. And so then all these activities that you currently do, maybe you do them because they give you joy and pleasure and happiness.

They also would be unnecessary in that there would be this shortcut to joy and pleasure and happiness, like the direct brain manipulation.

So you have this quite radically different condition where the world is solved, in the sense that the practical problems have been taken care of, but also maybe dissolved, in that a lot of the fixed points and hard constraints that shape our current lives are dissolved in this solvent of technological advancement.

Then we get to, really, the heart of the problem that the book is trying to think about: in such a condition, what would a great human life look like? What could we actually achieve in terms of realizing human values if we had all of these affordances, all of these options?

It's so strange. The thing that stood out to me reading the book is how much of what we seem to value and take pride in are kind of clever ways to deal with scarcity, and the fact that much of what we do is instrumental to striving for and achieving some future goal, which requires effort. It seems like, in a sense, much of human philosophy and value is just negotiating with a world which is effortful and constrained, and we are trying to find ways, cognitively, to deal with this pressure that we have to lean up against in order to cajole the world into delivering the thing that we want. Yeah, yeah.

These practical necessities have been with us through the entire history of the human species, and indeed beyond that. Human nature has evolved and been shaped in a condition where this is always present: there are all kinds of things we have to do and cope with and struggle against. So it's almost like if you think of a little bug that has an exoskeleton, right? It holds the squishy bits inside together.

But if you imagine removing the exoskeleton, there's, like, just a blob there. And similarly, the human soul might have as an exoskeleton all these instrumental necessities that evolution can just assume are present because they've always been. But if you were to remove those, then what becomes of the human soul and the human life? Does it become a kind of pleasure blob? Or is there something that could give structure to our existence even after these instrumental necessities are removed?

So much of what we seem to take pleasure in as well is the absence and then satisfaction of some desire. There is a thing that we want; we don't currently have it, and then we get it, and it gives us something. We work hard to achieve a body that we're satisfied with. We are thirsty for a while, and then we get a drink.

We want to have sex, and then we do. We are looking forward to having a child, and then it's born. All of these things are on the other side of something. And yeah, if you do x to get y, but you can always immediately get y without having to do x...

It does ask the question of where the absence goes; there are no longer any absences. There's this quote, something to the effect that in a perfect world, the only lack would be for the want of lack itself. The absence is actually what makes the presence of something finally valuable.

And if you don't have any more absence, then what does all of this presence mean? Yeah, I think there might always be a whole bunch of absences, inasmuch as, if not human need, at least human desire, or at least some human desires, are kind of unlimited.

It's maybe most clearly seen if you have two people who want exclusive possession of the same thing, or two people, each of whom wants to have more than the other: two billionaires who want to have the world's longest yacht. So one has a 150-meter-long yacht, and then the other builds a slightly bigger one. That is kind of intrinsically unlimited, because there's no way that they could both have everything they want. And so there might be a bunch of desires like that, quite common ones, that could never be completely fulfilled. Or just imagine somebody who is a utilitarian and who wants there to be as many happy people as possible in existence. However many there are, there could still be more.

And so they would always prefer to have more resources.

But even if there are some such desires, it wouldn't necessarily give these future utopians any reason for laboring or exerting effort, because there might just not be anything they could do themselves to increase the degree to which these desires are satisfied. I mean, the person who has a trillion dollars, maybe they would want to have two trillion, but they can't actually make more money by working, because all the work is more efficiently done by machines. So, yeah, even with unlimited desire, you might still have this condition that is both post-work and post-instrumental. Do you think that humans would run the risk of getting bored in a utopia? Not if they didn't want to.

At least if, by boredom, you refer to a subjective state of, I don't know, some kind of restless, discontented, uneasy feeling, of having difficulty keeping your focus. That certainly would be amongst the things that could trivially be dispelled through advanced neurotechnology. I mean, you already have drugs that could do it for a limited period of time now, right? With side effects, and then they wreck your body. But it's easy to imagine how you could just have better versions of that.

That would make it possible for you to always feel extremely interested and excited and motivated. And in fact, some people already have that. I mean, there's a lot of variation amongst humans. I have a friend who tells me he's never bored, and I believe him. I've never seen him bored.

He's kind of interested in everything except sport. He writes papers on all kinds of different weird topics, and he's just constantly excited about learning new things. He can have a conversation with anybody about anything. So it's an existence proof.

It's possible to be that kind of being. And in the future, we could all become such beings if we want to. So subjective boredom would be trivially easy to dispel under this condition. Now, it is possible also to have a more objective notion of boredom, or maybe we should say boringness, to refer to this objective notion, which is the idea that certain activities are intrinsically boring, meaning maybe that it is appropriate to feel bored if you were spending too much time doing them.

It's maybe an open question whether this notion of objective boringness makes sense. But you might think of, say, counting blades of grass on the lawn. Suppose you had a being who found it extremely fascinating, a never-ending source of joy, to just count and recount the blades of grass on a college lawn somewhere. You might say that although subjectively he's not at all bored, objectively what he's doing is boring. There's no variation or significance or development.

And the appropriate attitude for somebody to have, if they were spending their whole day doing that, would be to be subjectively bored. If you have this notion of objective boringness, then it becomes a much less trivial question to ask, in this hypothetical condition of a solved world, whether it would be possible for us to avoid objective boringness. Yeah, we could engineer ourselves so we always felt interested in what was going on, but would we be doing anything inappropriate?

Would our circumstances be such that the appropriate attitude would be to be bored? And so there's a big discussion about this.

I think there are certain forms of interestingness that we might run out of. For example, you might think it's especially interesting to, I don't know, be the first to discover some important truths. Einstein discovering relativity theory might be a paradigm case of an extremely interesting discovery and experience. But it's plausible that after some while, most of the fundamentally important insights about reality that we could have, we will already have had. And in any case, the machines will be much better at doing the discovery than we would be.

And so we would kind of run out of the opportunity to achieve that kind of interestingness in our lives. Yeah, I mean, so much of what humanity has done has been chasing down and answering big questions. So where do we go when all of the big questions have been answered? Yeah, I think, fortunately, for the most part, that's not what we're actually doing in our lives.

I mean, most people are not spending most of their time trying to chase down the answers to the big questions, right? Most of the time we're just going about our daily business. And you could make the case that already, if you really looked at it from an unbiased outside view, if the alien super-brains came to Earth and thought, okay, so look, these guys are worrying about losing out on what's interesting in life, well, let's look at their current life and see how interesting it is.

How many times did this guy brush his teeth? Okay, well, 40,600. Like, how interesting is it to brush your teeth for the 40,760th time? And then, all right, so he commuted into the office and then he ate a steak. Okay, I mean, how interesting is it to do that again and again and again?

And even the big highlights in our lives, like, I mean, maybe they are like really novel and exciting. If your scope of evaluation is a single life, like the first time you see your own newborn, like, it just happens once, right? So there are a few of those, but if you zoom out and look at humanity, it's kind of, well, it's already been done tens of billions of times.

How different is this particular newborn from all the other newborns? So, depending on how you look at it, you might think either that we are already at the very low rung of the ladder of objective interestingness, or, if you shrink the focus of evaluation to a single life, or perhaps even just to a single moment in a single life, then there is more novelty, but also opportunities for the same kind of thing to happen in utopia. If you're just looking at the most interesting possible moment, and you don't care about whether similar moments have existed before or after, then you might think the average human moment of awareness is very far from the maximum of interestingness. What do you think would happen to religion? That's obviously a place that an awful lot of people take their meaning from currently.

Is there a place for religion in a deep utopia? Yeah. So this is one of the things that possibly survives this transition to a solved world, which could remain highly relevant even if we had all this fancy technology, and it might constitute a bigger part of people's lives and attention than it does today, because there would be fewer other distractions, if you want.

What else? What are the other areas that are uniquely human or that would survive this transition? Well, yeah. So you can kind of build it up, starting with the most basic value, perhaps, which is just sheer subjective well-being, pleasure, enjoyment, which obviously would be possible to achieve in utopia.

And not just achieve; you could have prodigious quantities of this bliss. And so that's maybe not intellectually super exciting to discuss at great length, but I think it's actually super important. It's easy to dismiss: aha, these are just sort of junkies having their heroin drips. But the key question here is not, how exciting is this future?

Or how admirable is it from our point of view, as if we were sitting in the audience evaluating a stage play? From that perspective, we want a stage play with a lot of drama and suffering and tragedy and overcomings and heroism. But the question here is, which future would you actually want to live in? And there, one with very great levels of subjective happiness and well-being...

That might be the most important thing about the future, in fact. And you could definitely have that to extreme degrees. So it's worth making a note of that. Let's put that in the bank. At least we could have that.

And that's possibly, according to some people, the only thing that matters, if you're a philosophical hedonist. But for most other people, it's at least one of the things that is important, even if not the only thing that is of value. So that's the first thing. Then you can add to that experience texture. It's not the case that you could only have subjective well-being; you could attach that to some intricate, complex mental state that relates to some important objects.

For example, you could experience the pleasure not just as a sort of unanchored sensation of well-being; you could attach it to, say, the appreciation of aesthetic beauty, or the appreciation of great truths, or great literature, or contemplating the divine, and that's what you derive the pleasure from. So your conscious state is one of insight, let us say, or understanding, or appreciation of things that deserve to be appreciated and understood. Some people think that that is also a locus of value: not just the hedonic scale of whether it's plus ten or minus ten, but having plus ten whilst you're understanding or seeing or appreciating something that is actually lovely and worth understanding, or profound, makes that a more valuable condition. So you could have that. Then, if we go to some of the other values that seem more at risk, like purpose, you could certainly have artificial purpose.

So you could set yourself goals in utopia in order to then enable the activity of trying to achieve them, like playing a sport against another person. Yeah, so games are a paradigm example of this. In today's world, you set yourself some arbitrary goal: to get the golf ball into a sequence of 18 holes using only a club.

And there's no other reason why you'd need to achieve this goal other than to enable the actual activity of golf playing. And that could become a much larger part of utopian lives, various forms of game playing. You could make all kinds of new, much more sophisticated and immersive games, alone or with other people, that involve setting yourself arbitrary challenges, or at least semi-arbitrary challenges, simply in order to then create an opportunity for the activity of striving to achieve them.

And, yeah, so to enable play, we deliberately limit the means available to you to achieve this arbitrary goal. Just as in golf. Because if you were in the post-scarcity world, you could press a button and there would never be any doubt about whether the ball goes into the hole, which makes the hitting of the ball completely arbitrary. Yeah. And uninteresting, and maybe objectively boring.

But you could just set the goal of achieving this sequence of outcomes, the ball falling into the different holes, whilst also not availing yourself of various shortcuts. You could make this more complex goal of achieving x while only using certain means. There needs to be some constraint, which then gives a degree of satisfaction when you achieve it. Yeah, yeah. And, like, you could achieve the satisfaction just directly through the neurotechnology.

But if you also, in addition to the pleasure, want to have the experience texture and the effortful activity and striving, then, yeah, you could achieve that by having these artificial purposes. It kind of feels to me like you very quickly keep coming back to the same question of: is there a quicker route to achieving the outcome that I'm trying to achieve here? I've had it in my head since you were talking about this, and since I read the book, the example of churning butter. There are very few people who would look at the butter that they use now and think, I know that it's here and it's convenient and tasty and does what butter needs to do by being lovely on bread or whatever, but I feel like it would have been more meaningful to me if I'd gone out into the field and got the cow and done the thing into the bucket and then churned it and then put it in the fridge, all of those steps. So we can see there's kind of this inherent sense, this sort of naturalistic fallacy, that this is taking us away from what it means to be human, from the set point that we have grown up in; that it's a misalignment, evolutionarily.

There's something sacred about the process of being human; it imbues you with meaning to go through the challenge and the struggle beforehand. But there are very few people who would make that argument about churning butter. And when you think, okay, so you are happy with more convenient butter. I think something that right now is assumed by most people to be natural, but in the future will be looked at as barbaric, will probably be driving our own cars. I think that in 50 years' time, 100 years' time, it'll be like if you looked at someone riding a horse down the street now, and you go, I mean, isn't it so quaint and wild that people used to do that?

That was the way that they got around. You know, they had to have these special people in New York who would sweep up all of the muck from the street, an entire industry built around horses. So again, right now, this is something whose transition we can almost begin to see. We're about to let go of this thing, which is a less efficient, less safe, more effortful way of getting us from A to B.

And yet some people love driving. I take great pride in driving. Some people even compete in it; F1 is an entire competition around people driving.

So you can see different frontiers of human endeavor being eroded away by technology, whether it's churning butter or driving a car. And you just continue to slice that ever more thinly, all the way down to: why are you here? What about relating to other people, having to get yourself out of bed and move yourself down the stairs in the morning? Each one of these different things could begin to look like: okay, isn't it cute that people used to pick themselves up out of bed and put their own clothes on and walk downstairs and brush their teeth?

And then you ask yourself the question: well, if everything is open to you and you can manipulate your own internal state, why not just spend the rest of your life counting table legs or blades of grass? Right? Indeed.

Yeah. So you are forced to confront these fundamental questions of value in this condition.

What things are you doing for the sake of something else, versus what things are you doing truly for the sake of the activity itself? So even the person now who likes to churn their own butter, there is a question of whether it's because they intrinsically value the activity, or maybe because of the pleasure they get out of it, or because of the way it teaches them about their own body and about the cow and the physical objects and puts them in touch with that. That's a kind of extrinsic element.

But, yeah, these things, currently we can conflate them, because in reality the only way, maybe, to get various kinds of pleasure is to dive into activities and give them your best, and then you get the satisfaction. We can't separate these today. But in this hypothetical condition, they can be separated.

And then you do have to ask the question of what precisely it is that you actually value. So this conception, for me, is interesting because I think there is a real chance that, if things go well, we might actually end up in something like this condition with the whole machine intelligence revolution, et cetera. But even if you thought that was not going to happen, you could view it as a kind of philosophical thought experiment.

Just like physicists build big particle accelerators in which they smash atoms together at extreme energies to see what their constituents are. And then you can assume that if there are quarks when you smash the particles together at CERN, maybe there are quarks in other matter too. You expose basic principles by looking at extreme conditions, and then you extrapolate and think they might be there all the time, even though we can't see them. Similarly with human values: if you smash them into one another under this extreme condition of a solved world, you can study their constituents, and then you might think that, well, in our ordinary lives, maybe those same constituents are there. They are just kind of invisible to us because they are hidden by all the practical necessities.

What are the implications if humans live for a very long time? Does anything change there? Yeah. So certain values are more jeopardized by extreme longevity. For example, interestingness, as we discussed earlier, if your notion of interestingness involves the idea that something has to be novel to be interesting; it's uninteresting to just do the same thing over and over.

And the domain within which it has to be novel might be your own life, as opposed to, say, the world as a whole or your species or the current moment. If the relevant locus of evaluation is a human life, then the longer the human life goes on, the harder it is to do important things for the first time. I think we already see this in our current lives. If you think about what happens in the first year or two, there are some pretty big epistemic earthquakes. You discover there is a world out there; that's a discovery. It has objects.

The objects remain there even when I'm not looking at them. There are other people in the world; imagine just discovering for the first time that there are people. Or discovering that you have a body. Yeah.

Wow. And you're separate from mum and dad, and you can communicate. Yeah. My friend described how his son was born during COVID, and I think for the first year of his life or so had only seen four people: mum, dad, grandma and nanny, or something like that.

And then apparently one day he saw a fifth person and it fucking blew his mind. He was like, what? There's more than four? Right. And then, yeah, you discover that you can do things, move your body. And then later in life it's like, oh, what happened this year? Well, we got a puppy, we bought a caravan.

It's not really of the same order of magnitude in terms of how much it reshapes your view of reality. So I think there is already, within a human lifespan, a kind of rapidly diminishing interestingness, if you measure interestingness in a certain way, where it's the delta between your previous conception of the world and what you're able to do after the event that is interesting. Right? There is another conception of interestingness where it's less the rate of change and more the complexity of what you're engaging with at a particular moment, in which case maybe the level of interestingness of a day of the typical adult, by that metric, might be higher than that of an infant, because you have all kinds of complicated things going on at work and in relationships, right? So rather than having big questions being answered, you have increasingly dexterous small questions with more magnification, and you can see them with more complexity and derive some pleasure from that.

One of the things that I got in my head: if humans do live for a much longer time, you're going to be able to continue producing humans, and that will have to be in relation to the increase in computing power. So there'll be this kind of Malthusian tug between how much computing power we have to support how many humans, and which can move more quickly. Have you considered this sort of tension between the two things? Yeah.

So in the long run, I think economic growth becomes really a consequence of growth through space, the acquisition of more land, as it were, by space settlement. Once you have achieved technological maturity, you can't have economic growth by inventing better methods for producing stuff. And you also probably can't have more growth by just accumulating capital, assets and machines, because you have already built all the machines that give optimal productivity for the volume you occupy. So ultimately, the limiting constraint becomes what economists call land, but it basically means those resources that you can't make more of.

And so in the long term, you could imagine human civilization expanding through space, but there is a limit to that, which is the speed of light. So if you have a sphere with Earth at the center, growing at maximal speed in all directions at some fraction of the speed of light, the volume of that sphere would grow polynomially. But population could grow exponentially; it could double every generation or so. In the long run, an exponential overtakes a polynomial.
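To make that concrete, here is a stylized version of the arithmetic (the symbols are illustrative, not from the conversation): a frontier expanding at a fixed fraction of light speed encloses a volume that grows only cubically in time, while a population that doubles every generation grows exponentially, so the exponential must eventually outrun the available resources.

```latex
% Resources: sphere of radius vt (v = some fraction of c), so volume grows ~ t^3
V(t) \;=\; \tfrac{4}{3}\pi (v t)^3 \;\propto\; t^3
% Population doubling every generation of length \tau: exponential growth
P(t) \;=\; P_0 \, 2^{t/\tau}
% Any exponential eventually dominates any polynomial:
\lim_{t \to \infty} \frac{V(t)}{P(t)} \;=\; 0
```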

So at some point, you would need to moderate the rate at which new beings are brought into existence if you want to maintain a sort of above subsistence level of welfare. That's a really interesting point that I hadn't considered. So you can have a solved world in which almost all problems have been defeated, but there are still some constraints. Speed of light is one of them. What are some of the other constraints that a utopian world would encounter?

Yeah, so there's a bunch of basic physical constraints, like the speed of information processing, the amount of memory you can store, the size of a mind that can remain integrated. If you make a mind much bigger than a planet, then you get conduction delays: it just takes time for what happens in one part of the mind to be communicated to what happens in a different part of the mind. So either you have to run the mind much slower, or you have to keep the mind relatively small.
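As a rough illustration of the conduction-delay point (the figures below are approximate and mine, not Bostrom's): even at light speed, a signal needs tens of milliseconds to cross a planet-sized mind, far longer than the latencies inside a single chip, so such a mind must either tolerate long internal lags or run much more slowly.

```python
# Rough illustration of conduction delays in a planet-sized "mind".
# Figures are approximate and for illustration only.
SPEED_OF_LIGHT_KM_S = 299_792   # km/s in vacuum
EARTH_DIAMETER_KM = 12_742      # km, mean diameter of Earth

one_way_delay_ms = EARTH_DIAMETER_KM / SPEED_OF_LIGHT_KM_S * 1000
print(f"One-way light delay across an Earth-sized mind: ~{one_way_delay_ms:.1f} ms")
# ~42.5 ms each way, versus nanosecond-scale signalling within a single chip,
# so the mind either runs much slower subjectively or stays relatively small.
```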

There might be, I mean, we are hoping not, but you could imagine, if there are other alien civilizations out there, the potential for all kinds of competition and conflict. So, yeah, there's a bunch of those external physical constraints that, I think, define the ultimate envelope of what could be done. But it looks like the space of possibility is very, very large compared to our current human vantage point.

So you could maybe not have immortality, if that requires never dying; living for an infinitely long time looks impossible in our universe. Eventually information processing will end, if not before, then at the heat death of the universe. Which is kind of significant, because from a theological perspective, whether you live for 80 years or 80 million years, it's all really a blink in the eye of eternity, you could argue. And so it doesn't really change the fundamentals. But from the kind of parochial perspective of a current human life, you could certainly have extreme longevity and extreme amounts of wealth and extreme amounts of most other things. But still there are limits.

And those limits would be relevant in various ways. What about moral constraints? Would there be any? Yeah. So this is another, more subtle, but potentially very important source of constraint.

What's the easiest way to put it? Some people have thought, for example, it's immoral to enhance humans biologically. It's not a very popular view these days, but under President Bush there was a council on bioethics that he set up and populated with a bunch of bioconservative thinkers. And they were trying to argue that it's somehow a violation of human nature or something to try to enhance humans. So they drew a distinction: therapy, like medicine curing a disease, fine.

But trying to slow aging: bad. And once you start to think about it, it's really hard to make out that distinction. You think, genetic therapy to make you smarter: bad, but education: good, even though it hopefully also makes you smarter, and it becomes problematic. But if you did have that view, then there would be a whole bunch of possibilities that would be cut off, if you just couldn't change the basic physiology of what we have, and you were confined to moving things around in the external world to try to cheer yourself up, by having a really nicely decorated room or something. There's only so much you can do to affect your inner well-being if you can only affect it by having nice visual stimuli and nice acoustic waves going into your brain.

If you can't actually change the thing in between the ears and behind the eyes. But there are potentially some other interesting ones. Suppose somebody had a preference to have another person relating to them in a particular way, like the experience of being loved by a particular other kind of person. Then it might be that the only way to generate that experience fully realistically would be by instantiating that other person. And that other person would presumably have moral status, so there might be all kinds of ways of treating that other person that would be wrong. So a moral constraint might then limit the kinds of experiences you would be able to have. Wow.

With another person. Yeah, I suppose as soon as you involve somebody else who has moral consideration, that changes quite a lot. Yeah. And in fact, I think that is the case. So we discussed these artificial purposes, that you could create games and set yourself goals.

That's one type of activity. I think there is also the possibility of a bunch of natural purposes remaining: purposes that would call upon us to make various kinds of efforts, not just because we create random goals for the sake of having something to do, but that exist independently of us. And a lot of those would derive from these kinds of interpersonal entanglements and various kinds of cultural entanglements. To take the simplest, most reductionistic case of it, which is not so inspiring in its own right, but you could imagine more natural versions of this.

So suppose you have person A and person B, and person A wants person B's preferences to be satisfied. They care about person B and want person B to get what they want. And if person B happens to want person A to be doing something under their own steam, then the only way that person A can achieve their goal of satisfying person B's preferences is by doing that thing themselves. They could have a robot do it, but that wouldn't satisfy person B's preferences.

So from the vantage point of person A, they now have reason to do this thing. It's not an arbitrary goal they set themselves; it's the only way they could possibly achieve their goal of satisfying person B's preferences. This particular case seems a bit artificial, but you could imagine more subtle versions of this, where there is a tradition that you feel a commitment to and that you want to honor.

And part of that tradition is that you engage in certain kinds of practices, you refrain from certain kinds of shortcuts, you respect other people's preferences to various degrees, because you want to honor them. And so there would be a bunch of stuff that maybe you need to do yourself and can't outsource. So you've managed to, over the last decade, straddle all the ways it could go wrong and all the ways that it could go right.

Toby Ord then sort of bifurcated that down the middle with The Precipice, his book. And he's got this analogy where he sees humanity walking along a precarious cliff edge, and if we fall, then everything's kind of fucked. And if we make it across, then there's this sort of beautiful meadow on the other side. How important or critical do you think the current moment is in humanity's future? How long is the precipice, in your perspective?

Yeah, I think, I mean, it is weird, because it looks like we are very close to some key juncture, which you might think is prima facie implausible. There have been thousands of generations before us, right? And if things go well, there might be millions of generations after us, or people living for cosmic durations. And out of all of these people, you and I should happen to find ourselves right next to this critical juncture where the whole future will be decided.

It is striking that this seems to be what this model of the world implies. And maybe that is an indication that there is something slightly puzzling or implausible about it, that there may be more aspects to understanding our situation than are reflected in this naive conception of the world and our position in it. And you might speculate about what that is. I mean, I have this earlier work on the simulation argument and stuff like that. But if we take the sort of naive view of reality, then it does look like we are at such a juncture.

Yeah, my metaphor would maybe be more like a balance beam, where a ball is rolling down a thin beam, and the longer it rolls, the more likely it is to fall off the beam. But it could fall on one side or the other, and that's hard to predict. But yeah, I think it probably will fall off. The idea that the normal human condition as we now understand it will just continue for hundreds of years, let alone hundreds of thousands of years, that seems to be the kind of vague idea that a lot of people have, and it just seems radically implausible to me.

It would be unlikely, in your opinion, that in a thousand years' time, five thousand years' time, human existence will reflect what our normal sort of day-to-day is now. Yeah. And I think the only plausible ways for that to happen, you could construct some scenarios. One would be if we do have some massively destructive event that knocks us back to the Stone Age or something, and then maybe within 500 years we would have climbed back up again to something resembling the current human condition. But then in that scenario, we would have spent most of the intervening time in a rather different condition.

Or another might be if you get some very strong global consensus around some particular orthodoxy or moratorium, something like the kind of bioconservative view, and then we start to ban all kinds of different technologies. So you get some sort of bureaucratic sclerosis, or a deliberate decision to say:

All right, we've come to this point, but let's not go further. So there are various scenarios in which something like that could happen. But if the basic scientific and technological push forward is allowed to continue, then it does look like we are very near developing a range of transformative technologies, AI being the most obvious of those. But even if it weren't for that, I think synthetic biology would create a bunch of other possibilities, and then nanotechnology.

I think even if AI were somehow, if you just pretended it wasn't there, I still think we would be in for profound transformations. That's interesting to consider: that if technology moves forward, even at a slow pace, even if its rate were to drop by a really significant margin from where it is now, given long enough time you end up with a radically different world, either in a way that you intended or in a way that you didn't intend. And the way that you didn't intend is probably going to be pretty bad, and the way that you intended, hopefully, is going to be the one that is pretty good. Either way, you end up with a radically different day-to-day experience for most humans.

Yeah, I'm not so sure about the first part, the claim that if it's something we didn't intend, then it would almost certainly be very bad. I mean, you might think of the world that we have currently ended up with; I'm not sure whether you could say that's what people a thousand years ago intended.

They might in fact be quite shocked by some of our habits these days. But it more just sort of happened as a result of a bunch of different people going about their business and pursuing various local aims, and this is what eventually emerged at the systemic level. Unintended consequences can still be positive. Yeah, I mean, I think the degree to which the future depends on our intentions is possibly quite limited.

There are sort of bigger dynamics at play, and we barely even understand what they are. We don't really know what we want at a big scale. Most people have their hands full just thinking about their next week, you know, what to do if their boss doesn't like them at work or their partner has a problem. This is what fills human life, and then trying to get ahead a little.

There's very little thinking by anybody, really, about where humanity should be going in a million years, what the optimal trajectory is. I think we could do with a little more thinking about that, but it's not the primary shaper of the direction of the big ship of humanity. What have you been most surprised by over the last ten years when it comes to AI development? I think just how anthropomorphic the current generation of AI models is.

The idea, first of all, that they are almost human level and that they can talk in ordinary language is already kind of interesting, but then they even share some of the quirks and psychological foibles of humans. If, ten years ago, you had come and said, well, we're going to have these AI systems, they can do all of these things, they can program and write poetry, but if you really want them to perform at their best, you need to give them a little pep talk. When you ask them a question, you're going to say, think step by step.

This is really important. I might lose my job if you get the answer wrong. And then they perform a little bit better than if you just ask them the question. People would have thought you had completely lost your marbles. And yet that's where we are today.

So that's surprising.
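A minimal sketch of the kind of prompt nudge being described, with hypothetical wording; the exact phrasing, and how much it helps, vary by model and are not claims from the conversation:

```python
# Minimal sketch of the "pep talk" / think-step-by-step prompting quirk described above.
# The wording is a hypothetical example, not a benchmarked recipe.

def build_prompt(question: str, pep_talk: bool = True) -> str:
    """Return the question as-is, or wrapped in step-by-step encouragement."""
    if not pep_talk:
        return question
    return (
        "This is really important to me.\n"
        f"{question}\n"
        "Please think step by step before giving your final answer."
    )

plain = build_prompt("A train travels 120 km in 1.5 hours. What is its average speed?", pep_talk=False)
nudged = build_prompt("A train travels 120 km in 1.5 hours. What is its average speed?")
# Empirically, many models answer the nudged variant more reliably,
# which is the anthropomorphic quirk being remarked on.
```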

I think less surprising, but still interesting, is the degree to which development so far has been continuous, rapid but incremental, a sequence of steps, each of which has significantly but incrementally improved on the previous one, and the degree to which progress is quite tightly coupled to the scale of compute being applied. So you have this big-compute hypothesis, which says, basically, that the most important determinant is not the particular architectural features of your model, but just the amount of compute used in training, and the amount of data. You get performance roughly in proportion to how many dollars you spend on training it. That's too crude, of course; you also need some skilled engineers. But we are closer to that being the case than one would have expected in the median scenario, where you might instead have thought: ah, we're going to fumble around until we find this clever algorithmic hack, and then suddenly it's going to explode, it's going to open something up.
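The big-compute idea is usually summarized as an empirical power law; the stylized form below comes from the scaling-law literature rather than from the conversation, and the constants are placeholders that get fitted from data:

```latex
% Stylized neural scaling law: training loss falls as a power law in compute C.
% L_\infty, a and \alpha are empirically fitted constants (placeholders here);
% \alpha is small and positive, so more compute keeps helping, with diminishing returns.
L(C) \;\approx\; L_{\infty} + a\,C^{-\alpha}, \qquad \alpha > 0
```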

Yeah, yeah. Whereas it's more like: just scale it up, it works better; scale it up more, it works even better. Now, it is still possible that at some point some last little missing bit will fall into place, and we could still get an intelligence explosion in those scenarios.

So we shouldn't over-index on what we've seen so far, but it's still interesting. Does that change your perspective on what is more or less likely in a takeoff scenario, on how superintelligence could come about, stuff like that? Yeah, I think it makes it somewhat more likely that there will be political forces at play. When things happen more gradually, it's easier for the public and for policymakers to realize what is happening and to try to change it. And we already see that at the geopolitical level, with the chip export restrictions and, more recently, reporting requirements for training models using more than 10^26 FLOPs. And there might well be more.
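For a sense of scale, here is a back-of-the-envelope reading of that 10^26 FLOP reporting threshold, using the common rough approximation that dense-transformer training compute is about 6 x parameters x tokens; the model sizes below are hypothetical examples, not references to any real training run:

```python
# Back-of-the-envelope check against the 1e26-FLOP reporting threshold mentioned above.
# Uses the rough heuristic: training FLOPs ~= 6 * parameter count * training tokens.
# The example model sizes are hypothetical.

REPORTING_THRESHOLD_FLOPS = 1e26

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute for a dense transformer."""
    return 6 * n_params * n_tokens

runs = {
    "hypothetical 70B-parameter model, 15T tokens": training_flops(70e9, 15e12),
    "hypothetical 1T-parameter model, 20T tokens": training_flops(1e12, 20e12),
}
for name, flops in runs.items():
    verdict = "exceeds" if flops > REPORTING_THRESHOLD_FLOPS else "stays under"
    print(f"{name}: ~{flops:.1e} FLOPs ({verdict} the 1e26 threshold)")
```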

If we continue to see increasingly powerful AI systems over a sequence of several years, there might be time for more actors to try to exert influence over this, than if it were just some lab that one day stumbles upon the key missing thing with a computer in their basement and goes all the way overnight. Then it would be more likely to be an isolated thing, where just a few people had their hands on the tiller. Have you got any idea which scenario you think is more optimal?

It's really hard to say. I think it seems probably better if whoever develops this technology first has the option, when they are starting to develop true superintelligence, to go a little bit slow in the final stages, maybe to pause for half a year or something, rather than saying, okay, now we've got it figured out, and immediately cranking all the knobs up to eleven, because maybe there are 19 other labs racing to get there first, and whoever takes any precautions just immediately becomes irrelevant and falls behind, and the race goes to whoever is most gung-ho and willing to take the biggest risks. That seems like an undesirable scenario.

And so having some ability, perhaps, for the frontier labs to coordinate, unless one is already naturally significantly ahead; it may be that a small set of leading labs should at some point be able to synchronize. That could be desirable.

I think it's very unlikely, but less unlikely than a couple of years ago, that we could end up with some kind of perma ban on AI.

I think that would be undesirable. Ultimately, this is a portal through which I think humanity will need to pass to get to the future. But we should recognize that there will be significant risks associated with this transition. The slower this happens, as well, I suppose the more opportunity there is for political policy, human fuckery, to get in and coerce and congeal. So it's very much a double-edged thing.

On the one hand, yeah, it's kind of uncomfortable either way. Some random person in some lab is just going to control the future? That sounds pretty scary. You definitely want adults to oversee this, right?

But then you think of the other end: all the security establishments of governments around the world, who know everything. Is that a much more comfortable situation, where the military is in charge? And not just one military, maybe, but several.

So either way, it's a little bit disconcerting, and I don't feel super comfortable. I don't have a very strong view at the moment as to what the most desirable trajectory is, and it might be, anyway, something we don't have super fine-grained control over. I think one can try to nudge things on the margin towards a more cooperative, inclusive, friendly and thoughtful trajectory. That seems good to do: to try to encourage this idea that the future could be good for humans and for digital minds and for animals and for as many people as possible.

And there really is that potential there. The upside is so enormous that there could be plenty, not just for one value to be realized, but for a whole range of different values and perspectives. So our first instinct, I think, should be to seek these kinds of win-win, positive outcomes. And then if, at the end of the day, there are also some irreconcilable differences, we'd have to strike some compromise there. But there's so much you can do before you get to that point that it would be tragic if you just skipped over all of that.

And went straight to the point where we fight about something; that would just be a big tragedy. What is the current state of AI safety, in your view? Obviously, ten years ago, conversations about alignment and takeoff scenarios and all of the rest of it were obscure Reddit threads and a couple of people on some weird message boards. Is it overfunded? Underfunded?

Over-resourced? Under-resourced? Where should people's attention be placed at the moment? Well, there's a lot more talent in the field now. A lot of the smartest young people I know are going into AI alignment and working on it, and all the leading labs have research teams. As I said, it's probably still under-resourced.

I think it looks more talent-constrained at the moment than funding-constrained, but to some extent funding can help. There are some questions about whether alignment work spills over into capability progress. Some of the things you would want to do for alignment, like better methods to interpret what is going on inside a model's mind and figure out exactly why it is behaving the way it is, would be useful for alignment, but they could also shed light on what's limiting performance and how to boost it. It gets pretty complex. I think some other things, like better cyber security at the leading labs to make it less likely that the weights of these models just get stolen.

That could maybe be helpful.

I think more work on alignment seems positive. Having some ability for leading labs, at a critical time, to go a little bit slow seems positive. Would that require coordination between multiple labs in order to be able to do that? Because you do have this sort of first-past-the-finishing-post dynamic.

Yeah, it depends. I think the original, older idea, which might still be relevant, is that one lab, whether it's one country running a lab or one private industry lab or whatever it is, would have some lead over the others. Just naturally, one lab might be a year or two ahead because they started earlier, were luckier or had better talent or something. And so that would create an opportunity for this leading lab to slow down for a year or two, however long their lead was, without falling behind. And it might be desirable if, rather than having a super competitive race, you had a little slack. And that kind of pause would be self-limiting, you see, because once they had paused for two years, there would be another lab catching up, and then maybe they could pause too.

But you would have to make an increasingly strong case for a pause as more and more independent actors became capable. So it would be a pause that could exist but would eventually expire, and exactly when it expired would depend on how strong the argument for AI risk was. That would create a much lower risk of ending up with a kind of perma ban where this technology is never developed. Whereas if the path to getting the ability to pause for a year is, say, setting up a big international regulatory regime, or creating a lot of stigma around AI research, developing some mass smash-the-machines type of movement, that's much more likely to spill over into something that then becomes a permanent orthodoxy or a regulatory apparatus that just has an incentive to perpetuate itself. So it's more worrying in that respect. How impressed have you been with the power of LLMs?

Do you think that they are going to be the bootloader for what we need from a superintelligence perspective? Or have you got limited hopes for how far they can climb functionally? Well, we haven't yet seen the limits of what one can do when scaling these transformer models. It's not just language; they could have other modalities as well. But they do seem very general, and a lot of the alternatives that people try turn out in the end to basically just deliver similar performance to transformers.

The transformers run well on the current generation of hardware, they parallelize very well, et cetera. And so it might be that you need a little thing on top of that. Maybe it's like the engine block, and then you need some sort of agent loop, or maybe some external memory augmentation or some other little thing, but you would still have this big transformer or something similar to it. There might be some variation, but as the basic thing that extracts statistical regularities and abstractions, that's pretty plausible, I think. It's interesting.

I certainly wouldn't have guessed ten years ago that something you can have a conversation with, and that works by predicting what would plausibly be said next, would actually be the forefront. It seemed to me, tracking the development of AI quite closely from, whatever, 2015, 2016, that your book and then subsequent conversations around AI risk kind of blew up that conversation. And then it seemed to me that maybe by 2018, 2019, 2020, AI hadn't really delivered the threat that people slightly earlier in the 2010s were worried about. And then ChatGPT comes along and this conversation gets thrust straight back into the forefront of everything. So it seemed like it had a thrust, and then a little lull, and then it's really sharply come back up again.

Yeah, I mean, one shouldn't over-index too much on any one latest little development. But I think also people's expectations change. So now it's like, wow, it's been four weeks without an immediate new release, it looks like AI winter, it was all just hype. But if you zoom out, I still think we are on an extremely rapid up-ramp, and have been since the start of the deep learning revolution in 2012 to 2014.

Nick Bostrom, ladies and gentlemen. Nick, I really appreciate you. I've been a huge fan of your work for a long time. Your book is on the list of a hundred books that everybody has to read that I've been pumping for a very long time.

Where should people go if they want to keep up to date with your work and your books and everything else? Well, I'm not active on social media, so I think my website, nickbostrom.com, is where I put my papers and everything. So that might be the best place. Nick, I appreciate you.

Thank you for today. Thank you.

Thank you.