20VC: Mistral's Arthur Mensch: Are Foundation Models Commoditising | How Do We Solve the Problem of Compute | Is There Value in the Application Layer | Open vs Closed: Who Wins and Mistral's Position

Primary Topic

This episode discusses the evolving landscape of foundation models in AI, focusing on commoditization, compute challenges, application layer value, and the dynamics between open and closed systems.

Episode Summary

In this engaging episode of 20VC, host Harry Stebbings interviews Arthur Mensch, co-founder and CEO of Mistral. They delve into the intricacies of foundation models in AI, exploring whether these models are becoming commoditized and how the AI sector is handling the compute limitations. Mensch shares insights from his experience at DeepMind and discusses Mistral's strategic approach to leveraging AI technologies, emphasizing the importance of efficiency and customization in the application layer. The conversation also covers the impact of open-source models on the industry and the balance between maintaining an open-source ethos while managing a sustainable business model.

Main Takeaways

  1. Foundation models in AI are facing commoditization, but opportunities for differentiation exist in customization and application-specific tools.
  2. Compute is a significant bottleneck in AI development, with efficient use and management of compute resources being crucial for innovation.
  3. The value in AI increasingly resides at the application layer, where specific solutions can be tailored to meet user needs.
  4. Open-source models influence the AI landscape by fostering innovation and competition, though they present challenges in creating sustainable business models.
  5. Strategic collaborations and efficient resource management are key to navigating the competitive and fast-evolving AI industry.

Episode Chapters

1: Introduction to Arthur Mensch and Mistral

Overview of Arthur Mensch's background and Mistral's role in the AI industry. Focus on how Mistral is driving innovation with foundation models. Harry Stebbings: "Arthur, can you introduce us to Mistral's mission in AI?"

2: Challenges of Commoditization and Compute

Discussion on the commoditization of AI models and how Mistral addresses compute challenges. Arthur Mensch: "Commoditization is real, but differentiation comes from how we use and apply these models effectively."

3: The Role of the Application Layer

Exploration of the value generated at the application layer and how businesses can leverage this for competitive advantage. Arthur Mensch: "The real value lies in how these models are integrated and used within specific applications."

4: Open vs. Closed Models

Insights into the pros and cons of open-source and closed AI models in the current tech ecosystem. Arthur Mensch: "Open-source helps drive innovation, but sustainable models need careful balance."

5: Future Directions and Conclusion

Speculations on the future of AI and closing thoughts on the strategic direction of the AI industry. Harry Stebbings: "Where do you see the AI industry heading in the next five years?"

Actionable Advice

  1. Leverage Customization: Focus on customizing AI applications to fit specific business needs for a competitive edge (see the brief sketch after this list).
  2. Manage Compute Resources: Optimize the use of compute resources to address scalability and efficiency challenges.
  3. Embrace Open Source: Utilize open-source models to foster innovation but develop strategies to sustain business growth.
  4. Focus on Integration: Integrate AI technologies seamlessly into business processes to enhance functionality and user experience.
  5. Stay Informed: Keep abreast of technological advancements and regulatory changes in the AI industry to maintain a competitive advantage.
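
As an illustration of points 1 and 3, here is a minimal sketch of the kind of lightweight customization the episode alludes to: attaching LoRA adapters to an open-weight model with the Hugging Face peft library, so that only a small set of adapter weights needs training on domain data. The checkpoint name and hyperparameters are placeholder assumptions, not values taken from the episode.

```python
# Illustrative sketch only: lightweight customization of an open-weight model with LoRA.
# Assumes the `transformers` and `peft` packages are installed; the checkpoint name and
# LoRA hyperparameters below are example placeholders, not recommendations from the episode.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # example open-weight checkpoint
    torch_dtype="auto",
)

lora_cfg = LoraConfig(
    r=8,                                  # rank of the low-rank adapter matrices
    lora_alpha=16,                        # scaling applied to the adapter updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the small adapter matrices are trainable

# A standard training loop (e.g. transformers.Trainer) over domain-specific examples
# would now update just the adapters while the base weights stay frozen.
```

The design point of the sketch is that the base model stays frozen; the differentiation comes from the domain data the adapters are trained on.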

About This Episode

Arthur Mensch is the Co-Founder and CEO of Mistral AI. Since its inception in May 2023, Mistral has raised over $520M in funding from investors like Andreessen Horowitz, General Catalyst, Lightspeed Venture Partners, and Microsoft, with a current valuation of $2 billion. Before founding Mistral, Arthur was a research scientist at DeepMind, one of the leading AI institutions in the world.

People

Harry Stebbings, Arthur Mensch

Companies

Mistral, DeepMind

Content Warnings:

None

Transcript

Harry Stebbings
Do you feel like you have enough cash now? I guess a startup is always fundraising. Do you think enterprises are ready for open source? The most technically savvy enterprises are definitely ready for it. In order to widen the adoption, there's definitely some tooling to be brought to the market.

What are the biggest barriers to Mistral today? We are still bottlenecked by compute for sure, but that's because we don't have much of it. We have 1.5k H100s, which is a few percent of our competitors' capacity. Can I ask finally, was it a mistake for you to not scale that quicker? You can't really scale that much quicker.

Arthur Mensch
You can't raise like 2 billion on the seed round. I mean, at least you couldn't in 2023. Which competitor do you most respect and admire?

Harry Stebbings
We are always mixing it up with new intro styles. Let me know what you think of that new intro style, and what a show we have for you today on 20VC. Mistral is one of the most exciting AI companies today, at the forefront of the foundation model charge. Joining us is Arthur Mensch, co-founder and CEO at Mistral, where he's raised over $520 million in funding from the likes of Andreessen Horowitz, General Catalyst, Lightspeed Venture Partners, and Microsoft.

And before founding Mistral, Arthur was a research scientist at DeepMind, one of the leading AI institutions in the world. But before we dive into the show with Arthur today, we're all trying to grow our businesses here. So let's be real for a second. We all know that your website shouldn't be this static asset. It should be a dynamic part of your strategy that really drives conversions.

That's marketing 101. But here's a number for you 54% of leaders say web updates take too long. That's over half of you listening right now. And that's where Webflow comes in. Their visual first platform allows you to build, launch, and manage web experiences fast.

That means you can set ambitious marketing goals and your site can rise to the challenge. Plus, Webflow allows your marketing team to scale without relying on engineering, freeing your dev team to focus on more fulfilling work. Learn why teams like Dropbox, IDEO, and Orange Theory trust Webflow to achieve their most ambitious goals today at webflow.com. And speaking of incredible products that allow your team to do more, we need to talk about Secureframe. Secureframe provides incredible levels of trust to your customers through automation.

Secureframe empowers businesses to build trust with customers by simplifying information security and compliance. Through AI and automation, thousands of fast-growing businesses, including Nasdaq, AngelList, Doodle, and Coder, trust Secureframe to expedite their compliance journey for global security and privacy standards such as SOC 2, ISO 27001, HIPAA, GDPR, and more. Backed by top-tier investors and corporations such as Google and Kleiner Perkins, the company is among the Forbes list of top 100 startup employers for 2023 and Business Insider's list of the 34 most promising AI startups of 2023. Learn more today at secureframe.com. It really is a must.

And finally, a company is nothing without its people, and that's why you need Remote.com. Remote is the best choice for companies expanding their global footprint where they don't already have legal entities. So you can effortlessly hire, manage and pay employees from around the world, all from one easy-to-use, self-serve platform. Plus you can streamline global employee management and cut HR costs with Remote's free HRIS. And hey, even if you are not looking for full-time employees, Remote has you covered with contractor management, ensuring compliant contracts and on-time payments for global contractors.

There's a reason companies like GitLab and DoorDash trust Remote to handle their employees worldwide. Go to remote.com now to get started and use the promo code 20VC to get 20% off during your first year. Remote: opportunity is wherever you are. You have now arrived at your destination. Arthur, I am so excited for this.

Harry Stebbings
JC introduced us quite a long time ago now. I've known you for a while. I've been wanting to make this happen for a while. So thank you so much for joining me today. Thank you for having me.

Arthur Mensch
It's a pleasure. The pleasure is mine, my friend. But I want to start: how would your parents or teachers have described the young Arthur? I'm just always intrigued by the characteristics and traits of the best founders. How would they have described a nine- or ten-year-old Arthur?

Yes, I was a bit curious and a bit stubborn. Not very nice to my brothers. I was the eldest of them also. I know you should ask them. I think they have good memories hopefully.

Harry Stebbings
Do you know what? Sadly your mother wasn't in our reference list so we missed that one out. But JC provided some great commentary. So I do want to start though. Also, growing up, what was your first exposure to AI?

You're a kid in France. How did you first get exposed to AI and machine learning? And what was that first passion point? That was Andrew Ng flying a helicopter backwards. It's a control problem which is not easy to solve, and I'm not sure if it was really AI related.

Arthur Mensch
I think he was saying that he was using a neural network to control all of this. But that's the, well, the first memory of me being shown what you could do with machine learning at the time. That was in 2013, I think. More recently, you spent two and a half years, three years at DeepMind.

Harry Stebbings
Can I ask, what are the biggest takeaways for you from that experience, and how did they impact how you think about building Mistral? A team of five is faster than a team of 50, except if you organize the team of 50 to be ten teams of five that are sufficiently uncoupled. That's one finding that I learned the hard way at DeepMind, and the reason why we created the company in a slightly different way, in terms of organization of the science team, and also the reason why we knew we had a chance to do interesting things with a smaller team. Can I just ask, sufficiently uncoupled, do you not lose efficiency, or is there not leakage between those silos? Does it not actually create inefficiency by having such silos?

Arthur Mensch
You have to share some things. So you share the infrastructure, you share the code base, you share findings. We're doing general purpose models. In general purpose models, you need to evolve them in different directions, so you need to make them speak different languages, you need to make them be able to code, be able to do mathematics, be able to reason. You need to add multimodality to them.

All of these things are loosely coupled. It's useful if you use the same framework for optimization, for data, for training, but you don't want to have your team spend their entire day in meetings for coordination. And it's actually pretty hard to figure out. I think so far, we've managed to scale it relatively well. The team is only 25 people, so that's actually not super challenging.

It will become more and more of a challenge. That's what I remember from DeepMind. It worked very well at the beginning. Gemini was a bit too slow, and I think they recovered sufficiently well since. We have optimized the team to be as fast as possible and to ship as fast as possible. Was it an easy decision to leave, to start Mistral?

Harry Stebbings
You're at DeepMind, one of the best institutions in the world for AI, with some incredible talent around you. Was it an easy decision? And just take me to that moment when you decided to leave to found or co-found Mistral. So it's not a zero to one decision, it's not a binary decision. You start to think like, I'm 10% leaning on leaving, and then it grows, and at some point, you cross the threshold, then you say, okay, well, I guess now that I'm sufficiently decided, there's no way in which I stay more than a few days.

Arthur Mensch
Otherwise I wouldn't be candid with my colleagues. And so that's how you get started. You say, there's no turning back. What was that point for you? That point for me was probably around March, end of March of last year, where I decided to leave on Friday and I resigned on Monday.

You can't stay if you've decided to resign. Although it's not very fair, I totally agree with you. Now, I do want to run this with some chronology. I spoke to so many of your advisors, investors, and I want to start with, actually, the first model, Mistral 7B, being one of the most popular, released a while ago now. Why do you think it was so popular?

What do you think you did so right, and what did you learn from that? I think it served two purposes. So the first was to show that there was a lot of slack in compressing models. And so from a scientific perspective, it was a good finding and a good learning for the community.

Arthur Mensch
It also filled a gap in the efficiency-to-performance 2D space of models, where there was definitely something missing. 7B is the size that allows you to run a model efficiently on your MacBook or on your smartphone. And we made it sufficiently smart so that it was still useful. So there were already 7B models before, but they weren't good enough to do interesting applications. And so by targeting this specific space, we talked to the developers immediately, the casual developers running on a gaming GPU or on their MacBook.

So it created a lot of curiosity and adoption because it was a missing spot in the performance-to-efficiency space. When you look at lessons from that and how it impacts future releases, any that really stand out for you? I guess it taught us that there was a lot of interest in efficiency rather than scale. And so that's why we continued targeting very efficient models with Mixtral 8x7B and, more recently, Mixtral 8x22B, ensuring that for a certain cost and for a certain size, we were reaching the top performance of the market. That has been our major motivation: to target efficiency while simultaneously scaling to larger and larger models.
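
As an illustrative aside, here is a minimal sketch of what running a quantized 7B model locally can look like, using the llama-cpp-python bindings. The GGUF file name is a placeholder for whatever quantized checkpoint you have downloaded; nothing in the snippet is prescribed by the conversation.

```python
# Illustrative sketch only: running a quantized 7B model locally with llama-cpp-python.
# Assumes the `llama-cpp-python` package is installed and a quantized GGUF file has been
# downloaded; the file name below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct.Q4_K_M.gguf",  # ~4-bit quantized 7B checkpoint
    n_ctx=4096,                                    # context window to allocate
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "In one sentence, why do small models matter?"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```

Quantized to roughly four bits, a 7B model occupies around 4 GB, which is what lets it fit in the memory of a recent laptop and fill the efficiency-to-performance gap Mensch describes.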

Harry Stebbings
I spoke to Sarah Guo before the show, and she said the core question, I think, is: with the focus on efficiency and the efficiency frontier, does scale matter? Well, scale matters in the sense that if you spend more training compute, you can make the models more compressed. So you do need to have some compute to compress models. But no, scale isn't the only ingredient in the recipe. You need to scale, but you also need to have proper data, otherwise you reach some data quality limit.

Arthur Mensch
You need to have proper techniques for training. I mean, people call it a compute multiplier, I guess. How do you actually make some efficiency gains that are not costing you compute, because compute is expensive? And so one of the things that we do at Mistral is to try and harvest these compute multipliers. Can I ask, in that chasm of efficiency gain without costing more compute, is there much more efficiency we can eke out?

Harry Stebbings
Is there a lot for us to eke out, or are we working really at marginal improvements already? I think it's an open question. I believe there is. I believe we can make models that are much better for a certain size, but it's as open a question as whether you can make a much better model on the same kind of data by making it bigger and training it for longer. These things, you need to discover them along the way.

Arthur Mensch
You can try and predict the kind of performance you will achieve. At the end of the day, you need to try it out. I mean, it's very much a research field. You need to do the research and you need to try things. So I asked Sam Altman this question.

Harry Stebbings
What is the end state for the model landscape? Most people say, ah, it'll become commoditized, and actually there'll be twelve players and it'll be a race to the bottom. What is the end state for models in your mind? And how do you think about the commoditization question? I think the end state is to have more features on developer platforms that allow you to do customization, that allow you to make low-latency models that serve a certain purpose, and that allow you to evaluate them and to improve them over time.

Arthur Mensch
The model is only like a tiny part. I mean, it's a central part, but it remains a tiny part of an application and what you want to do across time. And when you deploy an application that you expose to users, you want to ensure that it works, ensure that its latency reduces over time, ensure that its quality increases over time. And so I think that the end state is, models are effectively going to be a starting point for any AI application developer.

They need to be surrounded by tools, by a lifecycle management platform, basically. And that's the one thing that we started to build. General purpose models are a bit undifferentiated, but the differentiation that you need to create for your application comes from the data you put into it, the user feedback that you gather and the intelligence that you have to figure out what the application should be doing. And that is not commoditized at all. There's no recipe that allows you to go from a general purpose model to a model that is super good and better than all of the others at your specific task.

This is a missing piece in the puzzle, and that's one of the aspects where we're putting our strength on the product side. Sam and Brad said the other day that models just aren't actually that good yet, and they need to improve a lot in quality. What are the largest constraints or bottlenecks on model quality today, and what needs to change for them to improve? I think the data quality is a constraint. How do you leverage the entire world knowledge and ensure that the model follows a certain path toward learning more and more complex things?

That's a very important part, and I think it has been a neglected part. There's obviously compute, but given the amount of data we have at hand, compute is no longer the bottleneck. The bottleneck is more the data at that point, if you look at text-to-text models. And so the question is, how do you refine the data, and how do you feed very high quality data to the model itself in order to improve it over time? And I think in that setting, it becomes a bit...

One bottleneck that is associated with bringing better model performance is the question of how you evaluate these performances. You need to have very good evaluation that targets very specific topics, like you want the model to be good at helping with diagnoses in hospitals, but in French, and oftentimes you're a bit out of domain compared to the data you have. And that's where you should identify a gap and you should try and fill it out. Pushing the model capabilities also becomes a question of mapping where they're failing and figuring out ways of improving it. For instance, they're failing at mathematics.

How do you improve their mathematical thinking? How do you improve the way they demonstrate theorems? The answer to this is very different from the way you answer the question of how to improve medical diagnosis in French, for instance. Will we see large-scale, generalized models that are able to answer huge swathes of very complex problems?

Harry Stebbings
Or do you think we'll see much more vertically specific, smaller models that are much more vertically aligned? Yeah, we believe that. And actually these vertical models are not going to be out there. They're going to be built by the application makers, because the only way you can make a low-latency model that is super good at a specific task is to get rid of the general purpose aspect. Because a general purpose model is bloated, it can think about everything, but if you want your model to think thoroughly about a specific topic so that you can call it in your AI application while maintaining a good user experience with low latency...

What role do you play in that world? If it's actually in the application layer where you have that specific model creation, where that kind of value accrues, where do you play in that? But it's a very hard job to make a specialized model. So it's actually very tied to the way you create a pre-trained model. And so bringing the tools that allow you to do it in a foolproof way.

Arthur Mensch
So allowing developers to create customized models that perform very well at their task, but that don't require expert AI knowledge, which is hard to find, is definitely something we're insisting on. So I'm an investor today, and I'm pleased that you just said that there will be value accrued at the application layer. I look and I worry that, bluntly, everything is going to get steamrolled by some of the players that we mentioned. How do you answer the question of whether value will accrue at the application layer? And for me as an investor today, Arthur, you know me, how would you advise me?

There's two opposing directions. The first is that the models are getting better and better. So it means that creating a verticalized application, as long as you have the data for it and a good understanding of the use case you're facing, is going to be easier and easier if you have access to the tools that facilitate it. So that's the first aspect which would make me think that the application layer is going to grow thinner and thinner. But then there's also the fact that the models are getting cheaper and cheaper because we managed to compress them, because we make a lot of improvement on their efficiency.

And so effectively, this, plus the competitive pressure there is on the model layer, means that the price around the model, the dollar per intelligence unit, let's say, is definitely going to reduce. So there's these two aspects of growing ability and compressed price, which on one side says that the application layer is going to grow thin, and on the other side says that the model part is going to grow thin. So for us, the approach that we are taking is that the model part is still going to be big enough and that we need to build this platform on top of that, because that's where we are going to enable all of the vertical applications that will be interesting for humanity. How do you think about that positioning and brand? Because there are other players who are much more direct in saying, hey, we're going to dominate a lot of different verticals and, kind of, be afraid.

Harry Stebbings
How do you think about that, being an enabler to vertical applications or not, in that positioning? We are not a verticalized company. We started Mistral to bring value to developers and to bring freedom to developers. So when we started, there was basically one API out there, soon two, and the field of generative AI was starting to look like it would be very centralized around a couple of players. And we took this platform approach where the model that we're making and the technology that we are making, we are allowing developers to own it, to modify it.

Arthur Mensch
And so bringing freedom to developers and AI application makers is, I think, the best way of distributing generative AI as widely as possible, which is our objective as a company. Making AI ubiquitous, bringing frontier AI into everyone's hands, is the reason why we started. We did a good job at it. But this open source part was, I believe, a good enabler for the community and made people realize that they could build very interesting technology by modifying the models themselves instead of depending on the APIs of a couple of providers. Dude, what do AI developers care about?

Harry Stebbings
Everyone kind of gets on Twitter and goes, oh, did you see X's performance this week is better than Y's performance last week. What do they care about: efficiency, scale, cost? What drives that usage and decision making? They care about cost for sure. They care about customization, being able to modify the models at will. And on that aspect, I think we are only scratching the surface of what can be done.

Arthur Mensch
Like the fine-tuning aspect that has been the go-to solution is probably a little too low level compared to what we should be doing. They care about being able to deploy anywhere. So they operate in a certain space, in a certain cloud, they might be operating on-prem, they might have some edge devices to deploy to, and they want to be able to put their technology there. And so they also care about portability, which in turn offers data control. Usually, LLMs and AI become very useful when you connect them to knowledge bases or to anything that is related to a certain business.

In that respect, it becomes a very sensitive part of your application because it sees everything, it sees all of the data you have. And so enterprises, for instance, do care about ensuring that the proprietary data they have is accessed in something that they can completely secure. And that's the reason why we deployed our platform on Azure and AWS, for instance, which brings the security layer that they need. We're going to get to enterprise. Can I just ask, does brand matter in this segment?

Harry Stebbings
When we think about building brand, both in terms of developer adoption brand and corporate brand, is brand a large determinant of adoption? In this segment, brand seems to be critical. And this is something that we have learned along the way: people use certain models because they are known to be good. You can't afford to evaluate everything out there. And so having some form of community vouching is super important.

Arthur Mensch
The approach we took with openly distributed models has contributed to what I think has become, well, at least a known brand, and we believe that it's definitely going to be important. Brand is important because trust is important in that domain, and open source brings trust, in the sense that it provides some trusted brand. You mentioned the word open source. I'm going to get to that.

Harry Stebbings
I do just want to touch on that. You mentioned cost also; I want to touch on cost. How, when, and who will make marginal revenue that exceeds marginal cost in LLM-based products? You should be telling me, you're the investor, right?

That means I know nothing. Okay. I can tell you who is making the most margin at the moment, but it's probably going to evolve over time. Who's doing the most margin at the moment? Nvidia is. At that point, the cloud providers are pretty much at cost.

Arthur Mensch
LLM providers, we are not at cost, hopefully, but the margins are known to be lower than typical software margins. AI application makers, some of them, the ones that are most used, seem to be doing a pretty good margin. I think it's going to be quite a moving space. As I've said, the capacity of models makes the cost of making an application lower and lower. I don't think there's any way in which the marginal cost and the margin of the most important part of that technology, which is really the foundational layer, becomes zero, because otherwise there's definitely going to be a fairness problem. What do you mean by the fairness problem?

Harry Stebbings
Talk to me about that. Usually the value tends to accrue where most of the difficult part is and most of the defensibility is. For a while it has been on foundational models. I think it's obviously evolving with time, and there's no moat that isn't disappearing or evolving with time, but that will remain the part where most of the innovation will be made and where, well, at least a significant part of the value will accrue. Is there actually much of a barrier to creating a foundational model company today?

I know that's a really broad, stupid question in many respects, but you have so many different players now, and new ones popping up every day. Is the barrier just reducing day by day? I don't think it is. To be relevant in that space is a very hard topic. You need to be dominating on the cost-efficiency, on the efficiency-performance Pareto front.

Arthur Mensch
And there isn't. There's only a few companies that are currently well positioned, so you can try and do something, but if it's not relevant, if it's strictly dominated by another model or another technology, then you have a problem. There's a few barriers that are pretty hard to face: you need to raise sufficient capital to have enough compute and be relevant, you need to have people that know how to train models, which is still a scarce resource. And then you need to have a good brand, because, as you've said, it's highly competitive, and this is not something that comes out of thin air.

So I think there's still a lot of defensibility on the market, although there is a lot of noise, which is different. How quickly does the cost of compute go down, do you think? Because if you look at those things, actually, you said cost of compute, access to talent and brand. If we drastically bring down cost of compute, like many think we will, very quickly, you've got access to talent and brand. Two of those are more doable.

The cost of compute reduces over time, just based on hardware costs. It reduces around 30% every two years if you follow the Nvidia roadmap. The other thing that increases is the efficiency of algorithms: if you look at the way we trained models three years ago and the way we train models today, I think we have probably made something around a 100 times algorithmic improvement. That's probably where most of the gains were actually made in the last three years. Obviously, the cost of compute does reduce, but it doesn't reduce faster than the model load.

So our bet is more on efficiency, where I think there's a lot of improvement that can still be made. Given Nvidia's prominence there, and Nvidia being the one where the gains are, as you mentioned, is, bluntly, one of the single most important things not simply the quality of your relationship with the core provider, being Azure, or being Nvidia, or being one of these players? Is that not the core determinant of success today? I guess it's an important aspect.

There is a strategic dependency from the AI layer on the cloud providers. And on Nvidia, the competition is heating up as well. But it's effectively important. It's effectively useful when you develop software to also know the hardware provider, because they can help you out in optimizing for the hardware. It's useful when you're selling your developer platform to enterprises to bring that platform through their usual provider, which happens to be a cloud provider.

So there's definitely some important collaboration to be made there. When Amazon invests like $2 billion in Anthropic or whatever it was, is that not just a trade where Anthropic then spends $1.8 billion on Amazon and returns it back to them? Do you see what I mean? Is it not a bit of a misnomer? It looks like you're round-tripping.

Yes. I don't know about that deal particularly, but it makes sense from both perspectives. Can I ask, how does the unlimited availability of open-source LLMs impact the answer to the above, being marginal cost and marginal revenue? Does it change much? It moves the value a little higher than the model itself.

It moves the value to the platform and customization part, which is really, I guess, something that we're expecting, and it accelerates that process. You started off completely very open source, very much open to the community. Now you have small models open and then larger ones closed, am I right? We also have large models that are open now. It depends on the threshold for small and large, but the 8x22B is actually relatively large by any standard.

Harry Stebbings
What was behind the decision then to close some models? Is it just a business case where you need to make money opportunistically? There was an opportunity to grow the business using that asset as something that we were selling. It's still the case that we're growing our business on top of commercial models in particular. It's also a good way of cementing some strategic relationships with cloud providers, and it's going to continue to be the case.

Arthur Mensch
We still intend to be a leader in the open source part, to have some unique assets that we can license, and to have some unique platform that developers can use. It's hard when you suddenly have some models closed and you start building an enterprise team. For you as a founder now, how do you think about that balance between a research team and a sales team, and making sure that the two cultures come together? Well, I think one important thing is to create empathy. So ensure that the science team also understands the problems that the users are facing. It improves the science, because at the end of the day, the general purpose technology we are making is only general purpose if you identify the use cases.

So that comes back to the earlier discussion we had. So ensuring that the science team has some relatively direct exposure to the product and to the business team is actually important to make them understand where the model is failing and how it could be improved significantly. And on the other side, the go to market team has to understand it's a very technical sales motion because you're selling not the product, but you're selling something that is going to power the product. So you need to tell the customer how these things should be used to actually make something that brings value to the business. And that only goes through strong enablement of the go to market team.

So I think it's a challenge. They don't operate on the same scale. The science team has cycles of several months. The go to market team goes faster, shorter cycles, let's say. But I think so far we've managed to recruit go to market people that have some technical interest and technical people that have some business interest.

And I think that's how you ensure that you don't have silos, at the end of the day. One of my worries with, bluntly, this space as we move into enterprise is that brand matters so much in terms of enterprise, actually, and they already have existing agreements with Microsoft. And I worry that actually product or model quality doesn't matter as much as distribution. Microsoft just tacks new products onto existing clients.

Harry Stebbings
How do you think about that as a core challenge? And am I wrong to be worried about it? So I think it's true. Distribution is very important. A shortcut to distribution is to create demand through open source models.

Do you think open source is ready for enterprise, or do you think enterprises are ready for open source, and do they care about it enough? It depends on the enterprises, but some have been early adopters and are using a lot of budget in production. For sure they're ready. In order to bring them to the next level of putting things into large-scale production, etcetera, I think they're still lacking some product around correctly managing, load balancing, and customizing the models, because you can do it with DIY solutions, but if you want to make it robust enough and scalable enough, it's actually not easy. And if you want to actually increase the quality of the models, the custom models, the recipes are a bit hard to set.

Arthur Mensch
So the most technical sevy enterprises are definitely ready for it. And there's a few. There's actually many use cases that are in production using open source models. Now in order to widen the adoption, there's definitely some tooling to be brought to the market. Obviously, every enterprise today is sitting in a boardroom going, what's our AI strategy?

Harry Stebbings
What do you advise them, and what questions should they be asking? Start thinking about how they are going to change all of their products using AI as a premise, using the existence of very clever agents, because you can build very clever agents today; assuming that presence and working backward to understand the consequences in terms of organization. Not thinking about generative AI as a way of increasing productivity in word processing, but rather as a way to completely change the way you operate your core business, which usually involves taking models and customizing them pretty heavily to create the differentiation that you will need in like five years' time when everybody will have adopted the technology in their core business. So my question to you is, you're in France, I'm in London. We both know that European enterprises do not move very fast.

Most do not even have Slack today. My concern is that we drastically overestimate adoption in the near future and maybe underestimate it in the ten-year, 20-year future. Do you think that's the case here? And do you worry about the lethargy of a lot of enterprises, especially in Europe, in adoption? It's a general phenomenon in tech that you always overestimate the speed but underestimate the impact.

Arthur Mensch
I think it's probably occurring today. It's slightly different in the sense that there's some executive support for pushing generative AI solutions even in Europe. So there's some delay compared to the US market for sure. I wouldn't say it's very significant. It's one year maximum in terms of delay.

The challenge here is that it's a technology that can take many forms, and so trying to focus on some specific thing that you can bring to the market that has AI in it is a prioritization challenge, and so you need to be very strategic around that. I don't think this is super easy for enterprises generally. It will become easier once they try off-the-shelf solutions a bit more, once they realize that there are some developer platforms that allow them to do it without hiring very expensive and hard-to-find AI scientists in house. And so we expect that this is going to accelerate in the coming years. Do you think that we're still just playing in the experimental budget game, or do you think that we're moving into core budgets as well?

It depends. It's moving into core budget for customer support, for instance, areas where the application of AI is pretty obvious; there it's definitely moving into core budget. It's also at the experimental stage in many other functions, and for core applications in industry, the telecom industry, and healthcare, this is still in the playground, but I think it's going to evolve in the next year.

Harry Stebbings
Can I ask, as you build out enterprise, it's another expensive thing to build out on top of compute and talent. It costs real money. And I spoke to Paul at Lightspeed before, and he was mentioning to me, bluntly, how much less capital you've raised compared to a lot of your competitors, most obviously OpenAI and Anthropic. He said, in a world where capital equals compute equals quality of model, how does Mistral keep up and stay relevant?

Arthur Mensch
So the good thing is that capital is correlated with compute, and compute is correlated with quality, but it's not completely dependent on it. And as I've said, there's some strong opportunity for providing models that are the best of their class. They might be sufficient to actually solve certain use cases. That's where we're playing, in addition to playing on the scaling part, because obviously you do need to stay relevant.

You need to keep your technical team motivated. And to keep your technical team motivated, you need to give them the experimental bed. They need to make new discoveries and to progress science. And that is where you need compute, in addition to growing the model across time. I mean, we're growing our compute.

Like every company. We are convinced that we don't need to grow at the same rate, because there are a lot of barriers that are not compute related that are appearing on the way and that we're already seeing. So we think we can scale. And we are also convinced that, on the efficiency front, we are already very well positioned and we are strengthening that position. What are the biggest barriers to Mistral today? We've had a few delays with our compute providers.

For sure, that has been a barrier. So the last answer to your question is to be taken with a grain of salt. We are still bottlenecked by compute for sure, but that's because we don't have much of it. We have 1.5k H100s, which is a few percent, I think, of the capacity of our competitors. And so that's definitely a bottleneck that is going to improve significantly in the coming months.

Harry Stebbings
Can I ask bluntly, was it a mistake for you to not scale that quicker? With the benefit of hindsight now, do you wish you'd scaled it quicker? You can't really scale that much quicker, because you can't raise like 2 billion on the seed round. I mean, at least you couldn't in 2023; you can today. But you can only hire that fast, you can only scale your infrastructure to manage more GPUs that fast, and you can only raise capital that fast. So there are some acceleration constraints that are pretty hard to fight and that are pretty much the first principles of starting a business. You mentioned the scaling constraints on cash.

Does it matter where your cash comes from? Does it matter if you are European funded, Saudi funded, US funded? Do you think that matters? I guess governance matters. What is important for a young company like us is to be under the control of the founders, because there's a lot of things to be invented and the vision can only be carried by them.

Arthur Mensch
We have very good governance terms, a very simple and clean governance that makes us a for-profit company growing a business to actually push the science frontier. This is something that we're very attached to: being able to control the company and leverage our funding partners appropriately to grow in different parts of the world. In the US, in the EU, it has been critical as well. So it does matter in the sense that we want to have partners that are supportive and long term, because we are in a field that is fast moving, where we don't know yet exactly where the value will accrue.

And so being flexible and being smart is definitely a requirement. When you raise money, would you take money from Saudi or China? Good question. It depends on the terms.

China is a bit hard for us. It's even hard to operate in China. I mean, we don't operate in China, because you can't really operate in the US and China without being like a very, very large corporation. And so you need to make some choices. What chance do you think that Europe has in AI?

Harry Stebbings
I know it sounds deterministic and defeatist, and so you might be going, oh, fuck, Harry, shut up. But it's like, what chance do you think Europe has in AI? And what does it take for us to stand up as a serious AI industry within Europe? I guess the chance it has is that it's a revolution. It's changing the way we do software.

Arthur Mensch
And so, as with every revolution, it opens a lot of opportunity for new actors, and there's no reason why there shouldn't be an actor that was created in Europe that could grow pretty fast. And that's the mission that we gave ourselves. We have the talent, and capital can cross oceans without too much problem. We have the market. The market is more fragmented than in the US, for sure.

The ecosystem, the digital native ecosystem, is definitely smaller, but it exists and it's growing. There's local opportunity for business development. On the talent side, we can hire 23-, 24-year-old people that we can onboard in four months. And they operate as well as any software engineer in the Valley. So people are quite talented here.

And so if we manage to keep them and to convince them not to go to the US, we have a lot of opportunities. When we look at computers, mobile, cloud, the core technology shifts, the way that it's worked is Europe has ceded control to the US and then just taxed US companies for access to our citizens. If one's being defeatist, is it different now?

Well, Europe is paying the price of not setting up a VC system in the sixties, but setting it up like 40 years later, or even 50 years later. And by the way, the dirty secret is that the VC ecosystem in Europe is US funded. Yeah, it was. I think it still is, honestly, in large part, yes.

Harry Stebbings
There are government institutions which are backfilling it, but largely backfilling it with players who aren't very good, while the best providers in Europe are largely US funded, by top US institutions. Okay. I think, yeah, as I've said, it takes time for an ecosystem to build. So you have layers of entrepreneurs and investors that stack on top of each other. The US has 60 or 70 years of venture capital investments.

Arthur Mensch
I think Europe has only 20 years. I mean, it takes time. It takes an incompressible time to build an ecosystem. It takes also some willpower. And I think now we're seeing that willpower: we are seeing entrepreneurs creating companies, we are seeing VCs like you not going to the US, everything is positive.

It just takes time. I'm adamant that we'll manage to do something interesting. On the engineering side, do you feel like you have the depth of talent pool to hire from as you scale? We do, on the engineering side. On the AI side, we do.

We have a team in the US, though, which is working on specific topics, because for senior AI scientists, you find them more in the Valley than in France. For junior AI scientists, there's a wealth of talent in France, in Poland, in the UK. I think that's one of the strengths of the area. When you were raising money, was it very different speaking to European investors versus US investors? I guess in the seed round, no, it wasn't that different, because it was a seed round. For the Series A, which was a bigger round, it was.

European funds weren't structured to do the kind of deal that we were proposing. We didn't even have a lot of conversation, because they just couldn't get their heads around the investment that needed to be made, whereas we were a brand-new company. Yeah, I think what is lacking, and it's related to the ecosystem part in Europe, are growth funds that are able to take huge bets with lots of conviction.

And that in turn should improve over time, especially if we manage to use European wealth and channel it more into those growth funds than we do today. I think you have more hope than me on that one. That is not going to happen. We are not going to see many more European growth funds built in the next few years, for sure. Not in the next three to five.

It hinges on a few political decisions. I think it hinges on supply of capital and belief in a future European ecosystem that can contend with other large ecosystems. It's a chicken and egg problem. This could be nudged in the right direction if politics wants to do it, if a couple of companies show that you can actually have companies that grow fast in Europe, and that's what we are trying to do.

I'm not too pessimistic. I find you too pessimistic. You should come to France. I think you would get more optimistic. Do you know what?

Harry Stebbings
If a Parisian is telling me that I'm too pessimistic, then, shit, I really need to be more optimistic. My question to you is, you mentioned the speed of scaling; the hardest thing, dude, is scaling yourself with your company at the same speed. What was the hardest thing about scaling yourself as CEO with such speed of scaling of the company? I mean, we are learning on the job, effectively.

Arthur Mensch
You have organizational challenges. How do you ensure that 45 people communicate well together? How do you manage your time in terms of representation time and business development time? Because we're still at the stage of the company where we get involved a lot in the deal-making aspect. And how do you ensure that you set proper directions and maintain the team in a state of tranquility despite the amount of noise that there is on the competitive side?

The fact that direction is obviously going to be changing over time, because there's a lot of uncertainty in that field. So this is, I think, the hard part. I don't think I'm doing it properly, but we are actively trying to find sources of information to learn new things. Let's say, if you could call yourself up the night before you became CEO and founded Mistral, and give yourself some advice with the knowledge that you have now, what would you say to yourself, Arthur? Maybe stage the product development and go to market development a bit more.

We did start the go to market motion at the time where we had absolutely nothing to sell. It did work out. It did create some brand awareness despite the absence of anything. I think it might have been slightly simpler to stage things, maybe developing the product a little before developing the go to market. But it's such a fast-moving field that we did start everything a bit together, with some organization that was a bit lacking, and now we are solidifying it on the fly.

It has worked out; it hasn't been optimal for sure. And so in hindsight, you can always... I could give myself a few pieces of tactical advice on who to hire and when. Generally, I think the strategy we had one year ago hasn't changed much. We did realize that we needed more capital, that we could not operate only from Europe, and that we needed to go to the US very quickly. Those were findings that we made on the way.

I don't think it would have helped that much to know it a year ago. One do you feel like you have enough cash now? I guess the startup is always fundraising. It's a field where for the years to come, the investments are going to exceed the revenue by design, because you do need to scale and you do need to stay relevant as a frontier company. So effectively, there needs to be some investment.

The revenue is ramping up, so there will be some revenue to reinvest. But today, and for the years to come, the speed of developing research should be faster than the speed at which you can develop your go to market. Before we do a quick fire: when you look at the landscape today, which competitors do you most respect and admire? I mean, they all delivered.

We were surprised by Cohere, which recently came up with new, good models, and I think that was a surprise for us. And obviously OpenAI and Anthropic and my friends at Google are also doing a good job. So it's a competitive landscape and we respect all of them. We also all work in the same direction and eventually with the same higher goals. So it's great to have respect for one another.

Harry Stebbings
Is it too late to start one now? We see, like, Holistic starting now. Is it too late? Holistic? I know them well.

Arthur Mensch
Is it too late? I wouldn't recommend going into the foundational layer business. I know Sam didn't recommend doing that one year ago, and we did, and it seems to have, so far, changed a few things. So I think it would be arrogant for me to say that there's no chance for a new competitor to rise and beat us. Listen, my friend, I want to move into a final thing, which is just a quick fire.

Harry Stebbings
So I say a short statement, you give me your immediate thoughts. Does that sound okay? Yeah, let's do that. So what worries you most in the world today? Global warming.

Arthur Mensch
There's a race between the planet heating up and us finding solutions for it. I think AI is part of the solution; it brings more control, it brings potentially more efficiency in some of the processes, but there's effectively a race for survival. So I think this is something that we should be a bit more aware of. What have you changed your mind on most in the last twelve months? I think I've changed my mind on a lot of management premises that I had and that I had never tested in real life.

Harry Stebbings
What was the biggest one? Transparent feedback is actually super useful for a company. And so operating in an almost fully transparent manner has helped us grow without breaking. What element has been the most unexpectedly challenging in the scaling of Mistral? The amount of demand that we had to manage, which is too high for what we can handle.

Arthur Mensch
The brand success, the fact that people know us, was a bit unexpected. We knew that it would be noticed. We had no idea that people would start using us that fast. What do you do to calm down? You have a lot going on now, Arthur, and you have a lot of expectation and cash on your shoulders.

Harry Stebbings
What do you do to just... I run, I cycle. I think my partner will yell at me, but I try to take care of my daughter. Okay. You've recently become a father.

What do you know now that you wish you'd known when you first had your daughter? Very recently, I had no idea that you needed so much energy to care for small children. Where do you think AI will take the world in the next ten years? What does the future of society look like in a world where AI is embedded into everything?

Arthur Mensch
Well, it's changing the way people work significantly, in the sense that it requires people to be more creative and to bring more value beyond what can be automated. So it's a very structural change in the job market, which means that there should be some adaptations made pretty quickly in training, in education, so that people can get a sense of what is going to be expected from them in their daily job, assuming that there's some AI out there. Do you think the fears of job replacement are grossly over-exaggerated? I think they are.

I mean, it depends on who you're speaking to. I think jobs are going to be displaced for sure. Some will be replaced, some will open up. We're just trying to move humanity to a higher level of abstraction, so we can now talk to machines and machines can understand and answer in a human-like fashion. This is not so much of a bad change compared to what we were doing with computers.

I think what's happening right now is that the speed of our elevation toward a higher abstraction level is probably occurring at an unmatched rate in history. So that means that society's adaptation is going to be more challenging and needs to be anticipated. Final one for you. We do a show in 2034, ten years' time. If everything goes right, where's Mistral then?

Mistral has some very relevant models, commercial and open source, and it has a very strong developer platform that allows you to do everything that you need to create your AI application. That would be a good achievement. Arthur, listen, I've so enjoyed doing this. Thank you for putting up with me going in many different fast-moving directions. You've been incredibly patient and a brilliant guest.

Harry Stebbings
So thank you so much, my friend. Thank you for hosting me. What a fantastic guest to have on the show. I want to say a huge thank you to Arthur for being so patient

with me there and for being so open with some of those answers. If you'd like to see more, you can of course find it on YouTube by searching for 20VC. But before we leave you today, we're all trying to grow our businesses here. So let's be real for a second.

We all know that your website shouldn't be this static asset. It should be a dynamic part of your strategy that really drives conversions. That's marketing 101. But here's a number for you. 54% of leaders say web updates take too long.

That's over half of you listening right now. And that's where Webflow comes in. Their visual first platform allows you to build, launch and manage web experiences fast. That means you can set ambitious marketing goals and your site can rise to the challenge. Plus, Webflow allows your marketing team to scale without relying on engineering, freeing your dev team to focus on more fulfilling work.

Learn why teams like Dropbox, IDEO and Orange Theory trust Webflow to achieve their most ambitious goals today at webflow.com. And speaking of incredible products that allow your team to do more, we need to talk about Secureframe. Secureframe provides incredible levels of trust to your customers through automation. Secureframe empowers businesses to build trust with customers by simplifying information security and compliance through AI and automation. Thousands of fast-growing businesses, including Nasdaq, AngelList, Doodle and Coder, trust Secureframe to expedite their compliance journey for global security and privacy standards such as SOC 2, ISO 27001, HIPAA, GDPR, and more. Backed by top-tier investors and corporations such as Google and Kleiner Perkins, the company is among the Forbes list of top 100 startup employers for 2023 and Business Insider's list of the 34 most promising AI startups of 2023.

Learn more today at secureframe.com. It really is a must. And finally, a company is nothing without its people. And that's why you need Remote.com. Remote is the best choice for companies expanding their global footprint where they don't already have legal entities.

So you can effortlessly hire, manage and pay employees from around the world, all from one easy-to-use, self-serve platform. Plus, you can streamline global employee management and cut HR costs with Remote's free HRIS. And hey, even if you are not looking for full-time employees, Remote has you covered with contractor management, ensuring compliant contracts and on-time payments for global contractors. There's a reason companies like GitLab and DoorDash trust Remote to handle their employees worldwide. Go to remote.com now to get started and use the promo code 20VC to get 20% off during your first year.

Remote: opportunity is wherever you are. I so hope you enjoyed that show. As always, it means the world to me that you listen. You can check it out again on YouTube by searching for 20VC.

And stay tuned for an incredible episode this coming Wednesday with an OG of the venture space, the one and only Mark Suster at Upfront Ventures.
