Theory to Code: Building the Breakthrough zkVM Jolt

Primary Topic

This episode delves into the evolution and technical complexities of zero-knowledge virtual machines (zkVMs), focusing on a new design called "Jolt," which enhances blockchain computational efficiency without sacrificing security.

Episode Summary

In this engaging episode of "Web3 with a16z," host Robert Hackett and his guests explore the cutting-edge of blockchain technology — zkVMs. The episode features in-depth discussions with Justin Thaler, a research partner at a16z crypto and associate professor, alongside Sam Ragsdale and Michael Xu, engineers who helped bring the zkVM, Jolt, from theory to reality. The conversation navigates through the intricacies of zkVMs, the importance of snarks (cryptographic protocols), and the practical applications and challenges in implementing these technologies in blockchain systems. Jolt represents a significant advancement in making blockchain computations more efficient and developer-friendly, potentially transforming how blockchain technology is utilized in various sectors.

Main Takeaways

  1. Jolt is a major breakthrough in zkVM design, making blockchain computations faster and more efficient.
  2. The episode explains the historical evolution and the underlying technology of zkVMs and snarks.
  3. Jolt simplifies the development process, allowing developers to focus on programming without deep cryptographic knowledge.
  4. The technical challenges of transforming theoretical research into practical applications are thoroughly discussed.
  5. The potential of zkVMs like Jolt to revolutionize blockchain technology by improving security and computational efficiency is highlighted.

Episode Chapters

1. Introduction

Introduction to zkVMs and their significance in blockchain technology. Key concepts and the genesis of Jolt are outlined.

  • Robert Hackett: "Today's all new episode covers a very important and now fast developing area of technology that can help scale blockchains."

2. The Genesis of Jolt

Discussion on the collaborative effort behind Jolt, featuring insights from academia and industry.

  • Justin Thaler: "Jolt, the most performant, easy-for-developers-to-use zkVM to date."

3. Technical Deep Dive

Exploration of the technical aspects and challenges of developing Jolt.

  • Sam Ragsdale: "Snarks in general, and zkVMs in particular, have a lot of potential to let that computer do more work than it can do today."

4. Practical Applications and Future Outlook

Consideration of practical applications and the future impact of zkVMs on the technology landscape.

  • Michael Xu: "The way that you keep the entire world computer in check, that everybody can trust, is that all of the users, all of the nodes on the network, check all of the work of all of their peers."

Actionable Advice

  1. Explore the potential of zkVMs to enhance blockchain applications in your projects.
  2. Understand the significance of cryptographic protocols like snarks to improve security and efficiency.
  3. Keep abreast of the latest developments in blockchain technology to leverage new tools like Jolt.
  4. Consider the implications of blockchain security in your development practices.
  5. Engage with the blockchain developer community to share knowledge and best practices.

About This Episode

People

Justin Thaler, Sam Ragsdale, Michael Xu

Companies

a16z crypto

Books

None

Guest Name(s):

None

Content Warnings:

None

Transcript

Robert Hackett

Hello, and welcome to Web3 with a16z, a show about building the next era of the Internet by the team at a16z crypto, which includes me, host Robert Hackett. Today's all-new episode covers a very important and now fast-developing area of technology that can help scale blockchains, but that also has many uses beyond blockchains as well. That category of technology is verifiable computing, and specifically snarks. So today we dig into zkVMs, or zero-knowledge virtual machines, which use snarks, and we discuss a new design for them that the guests on this episode helped develop, work that resulted in Jolt, the most performant, easy-for-developers-to-use zkVM to date. The conversation that follows covers the history and evolution of the field, the surprising similarities between snark design and computer chip architecture, the tensions between general-purpose versus application-specific programming, and the challenges of turning abstract research theory into concrete engineering practice.

Our guests include Justin Thaler, research partner at a16z crypto and associate professor of computer science at Georgetown University, who came up with the insights underpinning Jolt along with collaborators from Microsoft Research, Carnegie Mellon University, and New York University. He's the first voice you'll hear after mine, followed by Sam Ragsdale, investment engineer at a16z crypto, and Michael Xu, research engineer at a16z crypto, both of whom brought Jolt from concept to code. As a reminder, none of the following should be taken as tax, business, legal, or investment advice. See a16zcrypto.com/disclosures for more important information, including a link to a list of our investments. So the benefit of a blockchain is that everybody in the world has access to this computer, and therefore has both the incentive and potential ability to attack the computer.

Justin Thaler

And yet they can't. So despite the blockchain being an incredibly expensive, incredibly weak computer, it's still a useful computer. It's a computer that accomplishes something that no other computer in the world could achieve before Bitcoin launched in 2009, or whatever year it was. Anyone in the world has maximal incentive to attack that computer and convince the computer that they have a billion dollars when they don't, and the blockchain can withstand that. So it's a very slow, very expensive computer that's nonetheless useful because of this guarantee.

Sam Ragsdale

And you said a weak computer, but you're referring specifically to processing power there, because it's very strong in certain respects, which is the ability to make commitments around who can tamper with it, and the fact that it's going to run as intended. I'd say it's very secure. The computer works in the world's most adversarial possible setting, but that comes at a cost. And that cost is, it's very difficult to get this computer to do a lot of processing as a result. And so snarks in general, and zkVMs in particular, have a lot of potential to let that computer do more work than it can do today without sacrificing those security and decentralization properties.

Justin Thaler

And the reason being that you can have anyone in the world totally untrusted adversarial entities do a lot of the processing and merely prove to the world computer, to the blockchain, that the processing was done correctly. Okay. So even though no one in the world trusts this entity that's doing the processing well, they don't have to. So a snark is a cryptographic protocol that lets the untrusted entities come up with a very short proof that they did this work correctly. The proof is very short and very fast to check.

It might be just a few hundred bytes and take just a couple of milliseconds to check. And that's what the S stands for in snark, too. It's succinct, right? Yes. Succinct means exactly that:

the proofs are short and fast to check. And so the world computer, the blockchain nodes out there, only have to store those proofs and check the proofs. They don't have to do all the work that the proof is proving was done correctly. Now, what is a zkVM? It is a particular kind of snark, where the statement being proven in a zkVM is in the form of a computer program expressed in the assembly language of a virtual machine, called a VM.

That's just a simple computer. And to unpack the implication of that technical definition: it's a very usable snark, because a developer who wants to deploy a zero-knowledge proof doesn't have to know anything about how the snark itself works. So they can just write the computer program the same way they would if there were no snark involved. And the zkVM just lets them generate proofs, lets anyone in the world generate proofs, that they ran that program correctly. So the developer just has to know how to program.

Doesn't have to know anything about the snark. Can I dumb it down a little bit? Please make it as dumb as possible. Okay, cool. I'm good at that.
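To make that developer experience concrete, here is a minimal Rust sketch of the flow just described. The `prove` and `verify` functions and the `Proof` type are hypothetical stand-ins, not Jolt's actual API; the point is only that the guest program is ordinary code and the proof machinery wraps around it.

```rust
// Hypothetical sketch of the developer-facing flow of a zkVM; the names below
// (`Proof`, `prove`, `verify`) are illustrative stand-ins, not Jolt's real API.

// The "guest" program: ordinary Rust, no cryptography in sight.
fn guest_program(balances: &[u64]) -> u64 {
    balances.iter().copied().sum()
}

// What a zkVM host conceptually provides. A real proof attests that
// `guest_program` was executed faithfully; this stub only records the output.
struct Proof {
    claimed_output: u64,
}

fn prove(balances: &[u64]) -> Proof {
    // A real zkVM runs the program inside the VM and emits a succinct proof.
    Proof { claimed_output: guest_program(balances) }
}

fn verify(proof: &Proof, expected: u64) -> bool {
    // A real verifier checks a few hundred bytes in milliseconds instead of
    // re-running the program; here we just compare the claimed output.
    proof.claimed_output == expected
}

fn main() {
    let balances = [10u64, 20, 30];
    let proof = prove(&balances);
    assert!(verify(&proof, 60));
    println!("claimed output: {}", proof.claimed_output);
}
```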

Sam Ragsdale

Assume zero knowledge. Zero knowledge. Perfect. I think the way to think about it is that decentralized systems in general are ones that push the authority of the system to the edges. Blockchains are very good examples of these.

Michael Xu

The way that you keep the entire world computer in check, that everybody can trust, is that all of the users, all of the nodes on the network, check all of the work of all of their peers. What this literally means: in the case of Ethereum, there are about 600,000 nodes, the number jumps up and down on a daily basis, but around 600,000 nodes, and they run this single-threaded process where they check all of the work of their peers and make sure they're being honest. And if they ever see that their peers are not being honest, there's a dispute resolution strategy such that everybody can continue participating in this computer and know that all of their peers are being honest. As a sweeping statement of what that gets you: that gets you, in Chris Dixon's words, computers that can make commitments. A computer that you can build on in the future, knowing that it'll have the same properties tomorrow as it does today, at least with fairly specific guarantees about economic and compute security.

But if you listen to what I said, I said that 600,000 computers do the exact same work over and over again. Like, from some non-technical but utilitarian perspective, you should see that as inefficient and a problem. And so, like, can we do better? And before snarks, the answer was sort of no, right? Like, the only way that you could get a computer with this property is if everybody does this redundant work.

And that's sort of like your ticket to ride in the network, your ticket to get these properties. But what that means also is that the entire throughput of the network, Ethereum in this case, or Solana or Bitcoin or whatever, is limited, the whole thing, by the minimum computer that you want to support. Ethereum makes this limit rather low. Solana chooses to make the limit higher.

But still, what they're doing is the same. They're checking the same program over and over again. And so a fair question to ask might be, do you have to pick one? Do you have to pick either the decentralization of the network, allowing a lot of people to participate with a lot of diverse hardware, or good throughput? Until now, the answer has been yes.

But starting in 2016, people have proposed snarks as a solution. And the idea of a snark is that you can take the validation program itself and basically compress it. You can make it much smaller by doing a lot of additional work, applying a snark to it. And then when you distribute the proofs, the validators thereafter have to do a lot less work, but the trust assumption hasn't been lost. So the prover can run the validation program on their behalf, compress it, create a proof, and then the validators at the edges of the network can check the proof instead, with a lot less work than actually running the validation program.

But there have been a lot of issues with these. And let me just clarify, too. You posed this tension between decentralization and centralization for a given blockchain network, and also how efficient it can be. And maybe beforehand there was this idea that in order to achieve real efficiency, you needed to be more centralized, whereas decentralization was going to hurt you and you were going to pay a cost for that. But now we have this new way where it seems like the technology is enabling us to have a middle path where you can actually have decentralization, but it can also be quite efficient, hopefully.

Yeah, that would be the dream. But I think there were a bunch of issues with snarks when they first came out. I mean, they're a new technology like any other. Michael, do you want to talk about maybe the three main issues that snarks ran into starting in 2016? Yeah. I mean, for, like, a third perspective, I kind of think of snarks as existing on this continuum.

Arasu Arun

On one end you have, like, Sigma protocols, which are these super restrictive but also very efficient snarks. They can prove these very specific algebraic relations between numbers, but, you know, to massage that into something practically useful is difficult. And then on the complete other end of the spectrum are zkVMs, which are fully flexible. You get all the devex benefits of having a full VM. But the flip side of that is that they tend to come with a bunch of performance overhead and a bunch of complexity, which has historically been a barrier to their adoption.

Sam Ragsdale

When you say devex, you're basically saying that developers like having this virtual machine environment because it's very flexible, it's generally programmable, and they can do with it what they would like. There are fewer restrictions. Yeah, exactly. So you can just write programs in whatever high-level language you'd like, as long as it compiles to the particular VM bytecode supported by that zkVM. And in particular, the other programming style was this algebraic circuit thing, which required not only different programming languages but also a fully different programming model.

Michael Xu

I tried to learn a few of these as well and wrote some circuits in them, and they're just highly unwieldy. I analogize zkVMs to what happened with JavaScript, where JavaScript was one of the worst web technologies. It's slow, it's interpreted, there's this whole mess that happens there. But because so many people know how to use it and it's so easy to learn, it means that people are much more productive in it. People are writing server backends in JavaScript or TypeScript today, despite it being the least performant language to do that.

Just because the devex is so much better and more people can contribute to a large codebase. I think the same thing ends up happening if you say you either have to make snarks in algebraic circuits and they'll be fast, or you have to make slow snarks in a zkVM but you can use Rust or Python or whatever. I think the benefit of Jolt is that the zkVM isn't that much slower. You probably pay some cost for doing that rather than hand wiring it, but that is well overcome by the devex improvements and the productivity improvements of your engineers. I would add that in the snark setting, developer experience is not just a developer experience issue, it is a security issue. When you deploy a snark, the prover proves it ran a computer program correctly on some data.

Justin Thaler

In blockchain settings, often what that computer program is doing is validating a bunch of digital signatures, testing that the transactions were actually authorized by the entities initiating them. What prevents people from just taking money out of your account on the blockchain is the fact that you have to sign the transaction. They can't just put it out into the world computer, like, oh, send all this person's money to me, if that person didn't authorize it. Snarks are cryptographic protocols, and they're used to protect valuable assets. So if the computer program the prover proves it ran is not the computer program that the developer thinks it is, then the prover is attesting to having done the wrong thing, having not actually validated those digital signatures, and that is what can lead to billions of dollars of losses.

In fact, even today, these snarks are ostensibly securing billions of dollars of value. They're very complicated, especially if you're hand-writing circuits. These circuits always have bugs in them, and that means they're just insecure. Well, yeah, humans write bugs in JavaScript, but it's a lot easier to write bugs into arithmetic circuits, which look something like hand wiring a computer from the sixties. I think with zkVMs, not only is it just nicer for the developers, but it pushes the surface area for bugs into a more constrained place.

So just imagine we live in a world, and I hope to live in this world a few years from now, where the Jolt zkVM has been formally verified for correctness, whatever that means. So we just know there are no bugs in the proof system itself. Okay, that means that whenever somebody wants to use Jolt, apply it to a new computer program that's validating some new digital signatures or something, the only surface area for new bugs is in the computer program the developer wrote. If the developer wrote that computer program in a high-level language like Rust or something, we should have a lot of formal verification tooling to bring to bear on making sure that that computer program matches the specification the developer intended, or whatever.

It's just much better if all you have to do is formally verify a Rust program. We showed that we don't need new tooling for formal verification. The code is easier for a human to audit. Everything is just much better. The big problem today, standing in the way of this, other than, you know, the three years of effort it would take to get high assurance that our Jolt implementation is actually bug-free, is performance.

Okay, so with these zkVMs, the prover proving it correctly ran the computer program is just vastly more expensive than just running the computer program with no proof. So I'd say the two issues today with zkVMs, and why, you know, they haven't already taken over the world, are that the prover is super slow, and that they're very complicated codebases and it's going to take years of work to make sure they're bug-free. So, speed and security. How do we tackle both of those problems? Well, we tried to take major steps towards both with Jolt.

Now, they are still only partial steps, but we think that Jolt has introduced new techniques that not only speed up the prover but also substantially simplify the implementation. Sam and Michael can say more, but I think the entire Jolt implementation is about 25,000 lines of Rust code, and that includes the prover implementation. I think the previous zkVMs were more like 50,000 and up, even 100,000. There's a lot more performance, at least, if not also simplicity, to squeeze out of this. We're just getting started on that front.

Michael Xu

I think, starting with the security side, the way that I would think about a zkVM is that it allows you to program in high-level languages. Some zkVMs have a custom subset of the languages. Others that target RISC-V, for example, will allow you to program in any high-level language that can be compiled down to RISC-V. RISC-V is an open source instruction set used for running computer programs on chips. It's an alternative to x86 or ARM or any of those,

Sam Ragsdale

x86 being what Intel pioneered, and kind of what runs PCs and data servers. Yes, exactly. And specifically, RISC-V can be targeted by LLVM, which is the backend for most of the high-level languages we know and love today. Which means that most high-level languages can eventually target RISC-V, just as they can target x86 or ARM, et cetera. RISC-V has generally been chosen because it is a relatively small instruction set.

Michael Xu

So the variant that we currently support is RV32I. That's about 50 instructions. The equivalent in ARM is something like 550 instructions. And that just means there's some fixed engineering cost that has to be paid by the zkVM developer per instruction. And so the smaller the instruction set, the shorter the build time.
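For a sense of scale, here is an illustrative, deliberately incomplete slice of the RV32I base set written as a Rust enum. The grouping is just one way an implementer might organize it; the full base set is only a few dozen instructions, which is what keeps the per-instruction engineering cost of a RISC-V zkVM manageable.

```rust
// Illustrative, incomplete slice of the RV32I base instruction set, grouped
// roughly the way a zkVM implementer would think about it. Each variant is one
// unit of fixed engineering work for the zkVM developer.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum Rv32iInstr {
    // Register-register arithmetic and logic
    Add, Sub, And, Or, Xor, Sll, Srl, Sra, Slt, Sltu,
    // Register-immediate variants
    Addi, Andi, Ori, Xori, Slti, Sltiu, Slli, Srli, Srai,
    // Upper immediates, jumps, and branches
    Lui, Auipc, Jal, Jalr, Beq, Bne, Blt, Bge, Bltu, Bgeu,
    // Loads and stores
    Lb, Lh, Lw, Lbu, Lhu, Sb, Sh, Sw,
}

fn main() {
    // A handful of examples; the whole base set is on this order of size.
    for instr in [Rv32iInstr::Add, Rv32iInstr::Lw, Rv32iInstr::Beq] {
        println!("{:?}", instr);
    }
}
```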

Sam Ragsdale

So what is the reason for going with RISC-V? Is this a bet that RISC-V is going to be, like, the predominant chip architecture moving forward? Definitely not. It's simply that the RISC-V instruction set is very small. There's a fixed cost that you have to pay in engineering time per instruction, and as a result, the smaller the instruction set, the shorter the time to market, and the less surface area for bugs, honestly, as well.

Arasu Arun

Yeah, RV32I is like the smallest instruction set that you can compile high-level languages to. We briefly looked at, like, WASM and MIPS, which are also candidates, and some zkVM teams are targeting those. But even those are more complex than the RV32I instruction set. And then the other consideration is that you have the high-level language support, as Sam was saying.

So languages like Go and Rust are able to be compiled down to RISC-V instructions, which ultimately achieves the devex that we want. To keep going with the security point, though: zkVMs allow you to write snarks of programs written in normal high-level languages, so it's easier for a developer to write a correct program, because they're not writing in some strange programming model. But that algebraic model still lives inside the prover and the verifier of the zkVM. So still, in order for most zkVMs to function, the zkVM developer has done the gross work of making a snark, and there's still a chance, an equally high chance, that that gets screwed up. The simpler the internals of the zkVM are, the more likely you are to be able to trust it and the higher security it has.

Michael Xu

Most of the other zkVM work that is done is based on STARKs, and they end up looking quite similar under the hood, whereas Jolt uses Lasso, which was launched with the Jolt paper in August 2023, and it uses lookups for pretty much everything. We won't get too into the weeds on that, but it basically means that the programming model for the underlying VM, not the one that gets exposed to developers, is much simpler. And so rather than, again, programming in these weird constraint systems and algebraic circuits and all of that stuff, you end up writing about 50 lines of Rust code per VM instruction, so in this case per RISC-V instruction, per RV32I instruction. And our thinking is that that simplicity, beyond the simplicity of the high-level language, Rust or Go or whatever, that can be compiled to RV32I, the simplicity of the underlying snark also allows you to trust this thing substantially more.

Arasu Arun

Yeah, and maybe just as an anecdote: a couple of weeks ago we realized that we wanted to do Lasso lookups for the RISC-V store and load instructions, and that required implementing those lookups. It ends up being like five lookups that we had to implement. I was able to crank those out in an hour or two and be pretty confident that they were working correctly. We have these Rust macros that basically let us do a light fuzz test comparing the implementation of the lookup table against, effectively, a native Rust implementation of that instruction. There's no new abstract programming model such as algebraic circuits or AIRs or any of this stuff.
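A minimal sketch of the pattern being described, assuming nothing about Jolt's internals: an instruction implemented as a decomposed lookup (here 32-bit XOR split into 8-bit chunks, so the full table never has to exist), checked against the native Rust operation with a light fuzz test. The function names are illustrative.

```rust
// Sketch only: a decomposed "lookup" for 32-bit XOR, compared against the
// native operation. In a real lookup argument the chunk tables are what the
// prover reasons about; the full 32-bit table is never materialized.

/// Evaluate 32-bit XOR by decomposing both operands into 8-bit chunks.
fn xor_via_chunks(x: u32, y: u32) -> u32 {
    let mut out = 0u32;
    for i in 0..4 {
        let xc = (x >> (8 * i)) & 0xff;
        let yc = (y >> (8 * i)) & 0xff;
        // Each chunk lookup is tiny (conceptually a 2^16-entry table).
        out |= (xc ^ yc) << (8 * i);
    }
    out
}

/// Light fuzz test comparing the decomposed lookup against native XOR,
/// in the spirit of the macro-generated checks mentioned above.
fn fuzz_check(iters: u32) {
    // Tiny deterministic xorshift PRNG so the sketch has no dependencies.
    let mut state = 0x1234_5678_9abc_def0u64;
    let mut next = || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state as u32
    };
    for _ in 0..iters {
        let (x, y) = (next(), next());
        assert_eq!(xor_via_chunks(x, y), x ^ y);
    }
}

fn main() {
    fuzz_check(10_000);
    println!("decomposed XOR matches native XOR");
}
```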

Michael Xu

Pretty simple, fairly hard to screw up Rust code. Tried and true. I want to talk about Jolt from the inception of the idea all the way through to the execution. Maybe we can talk about the intellectual underpinnings here, where the ideas that would manifest themselves as Jolt actually first came from. I can give a brief overview on this.

Justin Thaler

So my entire research career in snark design, dating back to 2009, basically has been focused on a certain technique called the sumcheck protocol, which is due to Lund, Fortnow, Karloff, and Nisan, some of the very, very early work on verifiable computing from the late 1980s and early 1990s. So some of the earliest, most important theoretical results on this technology led to the sumcheck protocol right out of the gate, right? A lot of my work has been recognizing the power of this protocol and implementing it and making it practical. And ever since then, I've been trying to get people to appreciate its power and use it, and it's been a bit of an uphill battle.

Sam Ragsdale

Justin, what did you see in the sumcheck protocol that other people overlooked or failed to appreciate? My perspective is that a key determinant of prover costs in snarks is the amount of cryptography the prover has to do. Okay, so in these snarks, the prover commits to a lot of data. What that means is the prover would like to send a very big proof to the verifier, but that would violate the succinctness and make verification very expensive. So that's not okay.

Justin Thaler

So instead it uses cryptography to not send all that data to the verifier, but make it as if it did. Okay. That's what commitments are all about, and that's where you get the succinctness from. Okay, cryptography is expensive. It is slow to commit to all that data.

On top of that, the more data the prover commits to, the more the prover later has to prove to the verifier that all the data it committed to was well formed, that it actually committed to the data it was meant to commit to. That's my perspective. Cryptography is expensive. You want to minimize how much cryptography you have to use in the snark. So you're actually able to achieve these speedups and this concision because you're removing cryptography from the equation.

Sam Ragsdale

There's this give and take between the provers and the verifiers and how much computation each has to do. And cryptography is this funny little subset of computer science where you want everything to run really slowly, but you're like, okay, let's strip that out and actually not rely on those. You want any successful attack to be very slow, but the honest prover you want to be fast. Despite that, it's still the honest prover: the more cryptography it uses, the slower the honest prover is.

Justin Thaler

The sumcheck protocol, I view it as a way to force the prover to do work so the verifier doesn't have to do the work, but without using cryptography. So it's cheap, it's fast for the prover. Okay, you cannot get a snark without some cryptography. So I view the sumcheck protocol as a way to minimize the amount of cryptography you have to use to get a snark. But the reason I was captured by it was just, it's simple, it's beautiful, it's powerful, it just blew me away. I first learned about it in a course as an undergrad, and I'm so fortunate.

There was a complexity theory book by Sanjeev Arora and Boaz Barak that was just coming out then. Now it's like the standard for the field, and they put this pretty advanced concept into the book, and it's just beautiful. That's great. We'll put a link to that book in the show notes. Sounds good.

Anyone who wants 400 pages of complexity theory, definitely recommend it. So, when I got to grad school, I started trying to implement the protocol and make it useful and fast in a particular context. It wasn't until much later that we fully understood that you could get a snark by taking this sumcheck protocol, which has no cryptography, and combining it with commitment schemes, which are cryptographic; put them together and you get a snark. And as a result, I think I had a different perspective on the techniques in the area as it developed. And I've read every snark paper that's ever been written with the sumcheck protocol in my head.

And very, very frequently, it's the case that the paper could be reduced to a few sentences of the form "invoke the sumcheck protocol," if the authors had had the sumcheck protocol in their head. It gives you a very keen appreciation for the power of this protocol that you can maybe only get by always trying to apply it. It's the one hammer you reach for every single time, and you see, oh, everything's a nail for this thing. People who were designing snarks just were not aware of it. They had not studied complexity theory to the degree that you had? So many of them had not.

Not all of them. I think as snarks were becoming practical, people came at it from several different angles. But certainly I was more aware of it and its power than most. Okay. And I think this is largely because of the way the literature developed; it took a funny path.

So, first, people introduced the notion of interactive proofs, in which there's no cryptography allowed. The security notion required is this information-theoretic security, where it doesn't matter if the prover has trillions and trillions and trillions and trillions of years to try to trick the verifier, it still can't succeed. And this is sort of mimicking, like, two mathematicians where one is trying to convince the other that a theorem is true, but doesn't want to write down the proof line by line. So instead, the skeptical mathematician is asking questions and getting responses and asking more questions and getting more responses. That's an interactive proof.

No cryptography. That's where the sumcheck protocol was developed, in that initial setting that we were looking at. And then after that, people considered a variant, sounds kind of funny, where there's more than one prover. So here the typical analogy is you take two suspects in a crime and you don't want to interrogate them together, because then they could try to coordinate their story on the fly or something.

As they both hear the question, they can figure out how to respond to it together in some coordinated way. So you put them in two different rooms and interrogate them separately. And when you do that, the proof system becomes more powerful, it turns out, because the two suspects can't coordinate with each other as they learn what the interrogation actually is. Anybody who's familiar with a police procedural drama is going to get that. That makes sense, too.

Yeah, it's funny. The analogy, I think, is not quite perfect, but it gets the gist across anyway. And then the final sort of thing considered is what people call probabilistically checkable proofs. So here you go back to the notion of a proof that's written down. There's no interaction, there's no back and forth between the verifier and the prover.

The verifier is just going to ignore almost the entire proof. It's only going to read like three symbols of the proof. So the proof might fill up a whole book, but the verifier is just going to turn to three random pages, read like one letter on each page, and decide if the proof is valid or invalid from those three letters. So it's insane that you can actually prove complicated statements with such a restricted verifier, but you can. And these probabilistically checkable proofs turned out to be, like, really fascinating.

They actually led to an entire area of theoretical computer science called hardness of approximation, which has nothing to do with snarks and verifiable computing and all of that. And theorists just went crazy for them, as they should have. And it turned out that the first snarks that people developed were kind of a variant of PCPs that made them much simpler, in a sense, and the simplicity came at a price. We don't have to get into that.

And to remove those downsides that came from the PCPs, people have been gradually working their way back towards the interactive proofs. And so, in my view, the working their way back has been sort of partial. And to make it complete, you actually want interaction, both in the interactive proof, the non-cryptographic part of the protocol, and in the cryptographic part of the protocol, the commitment scheme. So I think it's already become pretty popular and well accepted that it's useful to have interaction in the commitment scheme, but the sumcheck protocol is the final step to get interaction, actually quite a bit of interaction, into the non-cryptographic part as well. So that's the sumcheck protocol.
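For readers who want the object itself, here is a compressed statement of the sumcheck protocol of Lund, Fortnow, Karloff, and Nisan referenced throughout this discussion.

```latex
% The prover claims that a low-degree n-variate polynomial g over a field F
% sums to H over the Boolean hypercube:
\[
  H \;=\; \sum_{b_1 \in \{0,1\}} \sum_{b_2 \in \{0,1\}} \cdots \sum_{b_n \in \{0,1\}} g(b_1, b_2, \ldots, b_n).
\]
% In round i the prover sends the univariate polynomial
\[
  g_i(X) \;=\; \sum_{b_{i+1}, \ldots, b_n \in \{0,1\}} g(r_1, \ldots, r_{i-1}, X, b_{i+1}, \ldots, b_n),
\]
% the verifier checks that g_i(0) + g_i(1) matches the previous round's claim
% (with g_1(0) + g_1(1) = H in the first round) and replies with a random
% challenge r_i drawn from F. After n rounds the verifier needs only the single
% evaluation g(r_1, \ldots, r_n). No cryptography appears anywhere; the
% commitment scheme enters only when sumcheck is compiled into a snark.
```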

Sam Ragsdale

It makes me wonder how many other advances are just lying out there to be had just from the original literature. But people have delved into such other niche areas sprouting from it. Yeah, it does make you wonder. And, of course, the intellectual lineage here is all entangled because there was a path in one direction and a path in another, and people pick ideas up from one line of work and apply it to another. So it is all entangled.

Justin Thaler

But, yeah. It kind of reminds me of the development of the field of AI, where you've got, like, a seminal paper in the 1940s that kind of lays the foundation of neural networks, but it isn't realized until very recently. Yeah, and I think that's accurate. And this is quite common in all sorts of scientific and intellectual disciplines, where things come back around. Some idea kind of goes out of favor for, like, 30 years, and people think it's just not the right hammer.

And then someone has a new insight that addresses the issues, and boom, this method strikes back and is at the forefront again. This was a little bit of a funny thing, where I implemented the sumcheck protocol back in probably 2010, maybe even earlier, 2009. We used that code, Michael and I, when we were learning sumcheck and learning Lasso back in the day. We used Justin's 2010 single-file C code as our starting point for writing Lasso. Oh, I didn't realize that was from 2010.

Arasu Arun

I thought you'd, like, reimplemented it or rewrote it recently. No, I haven't even looked at that code since I graduated. I think I was amazed you were able to find the files, actually. Yeah. Yeah.

Justin Thaler

They'd probably be best lost to history at this point. Sounds like they were valuable, though. They were valuable. It was also just, like, you get an insight into, you know, how this stuff was come up with and how Justin thinks and all that. But it was entertaining that it came from a single C file.

Michael Xu

The field was in there, the sumcheck was in there, there was an implementation of the vectors in there, all that stuff. In 2012, we put this on CUDA and Nvidia GPUs. Yeah, I guess this would have been pretty early CUDA days. And I think I had implemented a beautiful protocol called the GKR protocol,

Justin Thaler

named after its authors, Goldwasser, Kalai, and Rothblum. And I had one C file that was like 2,000 lines implementing the protocol. And then to get it to run on Nvidia, it grew to 6,000 lines, all in one C file or something. So you guys are lucky you didn't look at that, that's for sure. So I've been advocating for the power of this protocol and trying to explain this perspective that it's really a hammer to minimize the use of cryptography. And cryptography is expensive, and it's just been hard to get people to see this, I guess.

I don't quite know. It didn't catch on. And over the last, I don't know, 18 months or something, it really has caught on. And that's been both gratifying and exciting. People are seeing this now, so that's great.

And so, sorry, 20 minutes ago you asked me a question; to come back to that, the lineage of Jolt. All right. There's an important primitive that Jolt builds on, which is called a lookup argument. In particular, the lookup argument Lasso, which we introduced last year.

Lasso came about because people who were not using the sumcheck protocol had recognized the importance of what are called lookup arguments, and developed very, very sophisticated and clever lookup arguments, not using the sumcheck protocol. We've been talking about this, about lookup arguments. I just want to ask: if I were to distill this down to its most basic essence, something that somebody in elementary school could understand, is this effectively times tables for snarks? I'm going to memorize five times six and five times eight or whatever so I don't have to keep computing it every time. I can just instantly serve up the answer.

That's a pretty good analogy. Yeah. Maybe logarithm tables are another analogy. I figured I'd go even simpler. I think that's perfect, actually.

Sam Ragsdale

Okay, cool. You write it all out once, and then you just look at it later. One of the additional cool properties of it is that when you look at it, you can check that it's right. Like, a human can check that it's right very easily. But the one place where the analogy sort of fails, and this is really important in Lasso and Jolt, is that in many settings in the lookup arguments, no one ever has to write down the whole table.

Justin Thaler

So you sort of only pay a price in terms of performance for the lookups that actually happen, for the stuff that's actually looked up. So it's like, if you have two numbers that you could have wanted to multiply together but just never multiply them, you just wouldn't pay anything for that. So it's not that you're showing the solution, it's just that you're proving that you applied this methodology to put these two numbers together and spit out some answer. So I'm going to try to give an actually precise, technical answer that I hope is also a little bit intuitive.

Actually, I'm planning a blog post on this. This is a perspective that I think will be novel, even to very technical people who study these snarks, too. So, a typical snark that doesn't use a lookup argument: if the prover wants to prove it evaluated this function, in our example here this function just takes as input two numbers and multiplies them, but in general it might be slightly more complicated than that. If the prover wants to prove that it computed this function correctly, most snarks will make the prover actually prove

it ran an algorithm to compute that function step by step. So it's like the prover will prove, to multiply these two numbers, that it actually ran this procedure to put them together and get the product. Yeah, it added eight plus eight plus eight plus eight or whatever to get, like, five times eight. Yeah.

Or, you know, it went through digit by digit and did the grade-school thing: multiply these two digits together, compute the carry digits. So it's forcing the prover to run a particular algorithm to get the right answer. So it's like you're tying the prover's hands very tightly; you're making it run this very particular procedure to get the right answer. And it's secure. It has to run that procedure to generate a correct proof, but that comes at a cost.

It's like now the honest prover, its hands are also tied kind of tight, so it's not free to move as much as it would like. So it's just slower. Right. With the lookup argument, and this only turns out to work well

if you evaluate the function many, many, like a billion times or something, or a million times or something, the prover is free to compute the answer any way it wants. Okay, so you're not tying the prover's hands as tightly. All the verifier cares about is that it gets the right answer. It doesn't care how the prover got the right answer. And as a result, the honest prover can be more efficient.

Because its hands are free, it can move in a less restricted way to get to a convincing proof. And so that's the benefit of lookup arguments over the traditional way people designed snarks before they were using lookup arguments. And then the connection of a lookup argument like Lasso to a zkVM is the following. What does a VM do? A virtual machine, a CPU.

These are all words for the same thing. It repeatedly executes the fetch-decode-execute logic of a simple CPU. This is just how people design CPUs in computer chips. The computer program runs for many, many steps, but each step is very simple. Jolt very closely mimics that three-step procedure.

But the main point is that a CPU is just doing the same simple thing over and over again: fetch, decode, execute; fetch, decode, execute; fetch, decode, execute; trillions and trillions and trillions of times. Lookup arguments are very efficient snarks when you do the same simple thing over and over and over again. So it's a match made in heaven. So of course your zkVM should be done with lookups.
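To make the "same simple thing over and over" structure concrete, here is a toy fetch-decode-execute loop in Rust, illustrative only and not Jolt's actual RISC-V emulator. A zkVM proves that every iteration of exactly this kind of loop was carried out faithfully, which is why per-step lookup arguments fit so naturally.

```rust
// Toy VM: fetch an instruction, decode it, execute it, repeat. Every step is
// simple; the execution trace is just this step iterated many times.

#[derive(Clone, Copy)]
enum Instr {
    Addi { rd: usize, rs: usize, imm: u32 }, // rd = rs + imm (wrapping)
    Bne { rs: usize, rt: usize, target: usize }, // branch if rs != rt
    Halt,
}

struct Vm {
    pc: usize,
    regs: [u32; 8],
    program: Vec<Instr>,
}

impl Vm {
    fn run(&mut self) {
        loop {
            // Fetch the instruction at the program counter.
            let instr = self.program[self.pc];
            // Decode and execute: each case is a small, fixed piece of logic.
            match instr {
                Instr::Addi { rd, rs, imm } => {
                    self.regs[rd] = self.regs[rs].wrapping_add(imm);
                    self.pc += 1;
                }
                Instr::Bne { rs, rt, target } => {
                    self.pc = if self.regs[rs] != self.regs[rt] { target } else { self.pc + 1 };
                }
                Instr::Halt => return,
            }
        }
    }
}

fn main() {
    // Count register 1 down from 5 to 0.
    let mut vm = Vm {
        pc: 0,
        regs: [0, 5, 0, 0, 0, 0, 0, 0],
        program: vec![
            Instr::Addi { rd: 1, rs: 1, imm: u32::MAX }, // subtract 1 via wrapping add
            Instr::Bne { rs: 1, rt: 0, target: 0 },
            Instr::Halt,
        ],
    };
    vm.run();
    assert_eq!(vm.regs[1], 0);
    println!("final registers: {:?}", vm.regs);
}
```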

And so that's the ethos of Jolt. It's why I have quite high conviction that it's, if not the right way to design zkVMs, at least one of the right ways. But I think it's a very natural approach to the space. I think it's an improvement over current approaches, and we'll see how it evolves. How big of a coincidence is it that this lookup approach happens to exactly match the typical functioning of a computer and CPU?

So I've been stunned myself at just how tightly the parallels seem to go between the right way to design snarks and the right way to design real hardware. This is particularly surprising because people definitely didn't think this would be the case until recently. And I'd say even now, many people still don't think this is the case. So snarks have performance limitations; the prover is still very slow. And six months ago, a year ago, two years ago, the prover was even slower.

People thought that they had to design new VMs, new instruction sets, tailored to the limitations of the snarks, trying to strike a delicate balance between the primitive instructions being sophisticated enough that they can get a lot of work done, so you don't need a billion instructions just to add two numbers together or something like that, but somehow simple enough, or tailored enough to the snarks, that the snarks could prove each instruction quickly. So there's this attempt at a delicate balance, designing these new VMs that were designed with snarks in mind and that no computer architecture person would ever design themselves. What Lasso and Jolt strongly indicate, to me anyway, is that exactly the properties that make some instruction sets really nice for real hardware also make them really nice for snarks. And that seems like, feels like, a bit of a cosmic coincidence.

But as a result, it seems that the parallels between what's good for real hardware and how we should design snarks go quite deep. I think you can understand a lot of the snark design space if you already understand the basics of how we design today's computing infrastructure without any snarks around. I think with the one giant caveat that currently we're over a 256-bit field, which is not very computer friendly. Say, for example, a lot of the current performance strategies rely on GPUs. They rely on the work done by the gaming world and now the AI world that is driving people to make faster and bigger and cheaper GPUs.

Michael Xu

At scale, those things largely target 16- or 32-bit floats, either way, very small data sizes. A lot of the other snark approaches can sort of piggyback on that effort and the billions or trillions of dollars put towards making those things fast, whereas our field size is still massive at this point; it's a 256-bit field. It would be really nice if we could go back to 2003 and convince the gaming people that they needed 256-bit shaders so that we could also have the acceleration today. But I guess there's a path as well to getting rid of that constraint that is not hardware friendly.

Justin Thaler

So I can give a little more context for what Sam just said. And I'd love to know what has to be done to make this compatible with GPUs, given how much energy and enthusiasm is going toward that right now. Yeah, great question. So I'll give some context, and I'll try to answer that question, and then Michael and Sam should add on top of that. Sam just talked about 256-bit numbers and a field. So what is that?

So, like, all snarks work over something called a field. It's not really important exactly what that is, but it's basically the numbers that arise in the proof system, right? So the proof is literally just a bunch of numbers, and the size of the field tells you how big those numbers can get. Okay. And Sam is saying that the way Jolt is implemented now, the numbers can get as big as two to the 256.

So it takes 256 zeros and ones just to write the numbers down. The reason the numbers can get that big is because right now we're using a certain kind of cryptography in Jolt; it's a commitment scheme, and the only place cryptography comes into these snarks is the commitment scheme. Essentially, we're using something called elliptic curve cryptography there. And elliptic curves work over these big fields, and so that is not the nicest for current hardware.

Michael Xu

Particularly with a 256-bit field: if you compile that to x86 or ARM, where most of these things will be run (not the zkVM's own instruction set, but the actual machine where the prover runs will probably be x86 or ARM), the 256-bit field multiplication or addition, which is the core operation that gets done billions or trillions of times inside of the snark, ends up being between 80 and 100 native CPU instructions. That's how many cycles it takes to execute a single one of those really core operations in Jolt, whereas some of the smaller fields, 32-bit or 64-bit, can be between one and five native instructions, which means one to five clock cycles of that CPU. And so that's the hardest thing, the hardest wall that Jolt runs into. Of course, Jolt's algorithm is very different than these other things, and as a result, the number of total field multiplications or additions, field operations, you do is smaller.

And so we catch up, despite the fact that there's a 40x-ish overhead for our field. But it is the least hardware-friendly thing that we have going on in Jolt. And so, yeah, to say just a little bit more about that: there has been a narrative in, like, the snark design and engineering community that using elliptic-curve-based commitments is slow. In large part, this sounds very plausible, because the numbers arising are so big. Multiplying two of those numbers, as Sam just said, is like 80 times more expensive than multiplying two numbers on your computer that fit in a 64-bit data type.
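A rough sketch of where that roughly 80x gap comes from on 64-bit hardware (the exact counts quoted above depend on the reduction algorithm used): a 64-bit field multiplication is one widening multiply plus a reduction, while a 256-bit field element spans four 64-bit limbs, so even the schoolbook product alone needs sixteen widening multiplies before any modular reduction.

```rust
// Illustrative cost comparison, not a production field implementation.

/// Multiplication in a 64-bit prime field: one widening multiply plus a
/// reduction. (With a structured prime the reduction is a few instructions;
/// the generic % below is only for clarity.)
fn mul_mod_64(a: u64, b: u64, p: u64) -> u64 {
    ((a as u128 * b as u128) % p as u128) as u64
}

/// Schoolbook product of two 256-bit values held as four little-endian 64-bit
/// limbs: sixteen widening multiplies plus carry propagation, and a real field
/// implementation still has to reduce the 512-bit result modulo the prime.
fn mul_256_wide(a: [u64; 4], b: [u64; 4]) -> [u64; 8] {
    let mut out = [0u64; 8];
    for i in 0..4 {
        let mut carry = 0u128;
        for j in 0..4 {
            let t = out[i + j] as u128 + a[i] as u128 * b[j] as u128 + carry;
            out[i + j] = t as u64;
            carry = t >> 64;
        }
        out[i + 4] = carry as u64;
    }
    out
}

fn main() {
    println!("{}", mul_mod_64(3, 5, 0xffff_ffff_0000_0001)); // Goldilocks-style prime
    println!("{:?}", mul_256_wide([7, 0, 0, 0], [6, 0, 0, 0]));
}
```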

Justin Thaler

So the numbers get really big. It's really slow to multiply them. You should avoid that, right? That's been the narrative. Roughly.

One thing this turns out to miss, or I think a few things it misses, is that with snarks based on the sumcheck protocol, fewer of these numbers arise. So the prover just does fewer multiplications. And so even though each multiplication is very expensive, the prover is doing fewer of them, and you can come out ahead. And there are other reasons that also offset the cost of these multiplications being so expensive, such that in the end Jolt, with elliptic-curve-based commitment schemes, turns out to be faster, modestly so, than the current fastest zkVMs today that all avoid curves. But to be very clear,

Michael Xu

the ideal case would be that we get both, right? We get the sumcheck protocol, which means the number of field operations we do to prove the same statement is lower, and we don't pay 40 to 80 times overhead for each of those field operations. And so that's the ideal future that we want. Does this require extracting elliptic curve cryptography from the methodology here?

Justin Thaler

It's a drop-in replacement. You can use any commitment scheme. And so today Jolt is using elliptic-curve-based commitment schemes, and you can just hot swap one commitment scheme for another. So the other commitment schemes, the ones the zkVM projects especially have been deploying today, the only cryptographic primitive they use is something called hashing. What happens in those commitment schemes is the prover does, like, add and multiply a lot of numbers, but then after it's done adding and multiplying a lot of numbers, it does something called hashing them, which kind of scrambles all the numbers up in a way that is like cryptographic compression.

And so this is one of the algorithmic techniques that you're thinking about employing? Yes. So we know how to design these commitment schemes either from elliptic curves or from hashing. Right now, we're just using the elliptic curve ones.

We can use the hashing ones. When we go to use the hashing ones, there are downstream complications, or just side effects of it. The benefit of using the hashing ones is largely what Sam said before about being able to work over these small fields to keep the numbers nice and small, essentially. And so to fully exploit that in the Jolt context, because Jolt is a very different approach than existing zkVMs, requires effort. But my expectation is that after we go through all that work, on CPUs, on our machines and laptops, Intel machines, Jolt will be about 10x faster, I hope, than it is today.
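Architecturally, "hot swap one commitment scheme for another" means the prover core is written against an abstract interface. Here is a hedged Rust sketch of that shape, with hypothetical trait and type names rather than Jolt's actual ones.

```rust
// Hypothetical interface sketch: the sumcheck-based core is generic over the
// commitment scheme, so a curve-based or hashing-based scheme can be swapped
// in without touching the rest of the prover.

trait CommitmentScheme {
    type Commitment;
    type Opening;

    /// Commit to a vector of field elements (u64 here purely for brevity).
    fn commit(&self, values: &[u64]) -> Self::Commitment;
    /// Open the committed vector at one index.
    fn open(&self, values: &[u64], index: usize) -> Self::Opening;
    fn verify(&self, comm: &Self::Commitment, index: usize, value: u64, opening: &Self::Opening) -> bool;
}

/// Stand-in scheme so the sketch compiles; a real instantiation would be a
/// curve-based or hash-based polynomial commitment.
struct DummyScheme;

impl CommitmentScheme for DummyScheme {
    type Commitment = Vec<u64>;
    type Opening = ();

    fn commit(&self, values: &[u64]) -> Self::Commitment {
        values.to_vec() // placeholder: a real commitment is succinct
    }
    fn open(&self, _values: &[u64], _index: usize) -> Self::Opening {}
    fn verify(&self, comm: &Self::Commitment, index: usize, value: u64, _opening: &Self::Opening) -> bool {
        comm.get(index) == Some(&value)
    }
}

/// The prover core is generic over the scheme.
fn prove_with<S: CommitmentScheme>(scheme: &S, trace: &[u64]) -> (S::Commitment, S::Opening) {
    (scheme.commit(trace), scheme.open(trace, 0))
}

fn main() {
    let scheme = DummyScheme;
    let trace = [1u64, 2, 3];
    let (comm, opening) = prove_with(&scheme, &trace);
    assert!(scheme.verify(&comm, 0, 1, &opening));
}
```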

After we go through that change. That's a huge leap in performance. It is. I mean, my view is, I think the Jolt prover is about 500,000 times slower today than just running the computer program. That is about six orders of magnitude.

So 10x is like one order of magnitude. So we turn that six into a five. I mean, that's a big deal. Even then, it'll still be a lot. I do want to just clarify one thing.

I think that switching over to these hashing-based techniques, in the way that we plan to, won't necessarily be friendly to GPUs. So it's possible that if you really want to use GPUs, you actually should stick with Jolt and the elliptic-curve-based schemes. And there are some other benefits of using elliptic curves that don't necessarily come out in the numbers we can profile today, but they will come out as we build Jolt out more and so forth. So I don't think that switching over to hashing is like, you switch and you throw the curves away and you never use them.

I think these things will coexist. And it's very possible that if you want to use GPUs in the future, you stick with the curves and you pay for these 256-bit numbers that arise. But it's still pretty fast, because you're taking pretty good advantage of the GPUs. Again, the 256-bit numbers are, like, never nice, but they're also not the end of the world. And then you'll also reap the benefits.

I'm being completely vague about it: you pay that price, you get some benefits. You don't get no benefits for the price. I want to double back on one of the things Justin dropped there, that the Jolt prover overhead, the amount of extra time that you need to compute the proof versus computing the native program, is about 500,000 times, and that's today. At some point in the future, maybe that goes down to 50,000 times with some of these algorithmic improvements.

Michael Xu

If you circle back to what I said at the beginning, one of the, like, comically core utility things we do in blockchain networks, or Ethereum, is that we have 600,000 different validators that run the same program. They run the exact same compute back to back. Even if the prover overhead is significant, there are other cool properties you get. But if the prover overhead is higher, let's just say it was 10^8 or 10^7, as it used to be, you know, 10 million to 100 million, then it doesn't make so much sense to do this pre-compression step, not from a technical perspective, but from a purely utilitarian perspective. To have a prover do 10 million times more work and then distribute that to 600,000 nodes, you might as well have just had all the different nodes do the original work; they might as well have just gotten the original program and each of them run it.

And if you sum the work that was done across all 600,000 nodes, it's still less than one machine doing 10-million-times overhead. In a certain sense, slow provers, there are not that many uses for them. But as you bring the prover overhead down, snarks become more and more applicable over time. So this is why we're so focused on this. There aren't many use cases where people are willing to spend the 10-million-times overhead, which translates to literal dollars on some cloud spend; or your machine could be doing other things, it could be mining bitcoin or something like that.
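A back-of-the-envelope version of that comparison, using the figures quoted above (around 600,000 validators, and a prover overhead that has ranged from roughly 10^7 down toward 10^5 or so):

```latex
% Let T be the native cost of the validation program, N the number of
% validators, and C the prover overhead factor.
\[
  \text{full replication} \;\approx\; N \cdot T,
  \qquad
  \text{prove once, verify everywhere} \;\approx\; C \cdot T + N \cdot v,
\]
% where v is the tiny per-node verification cost. With N \approx 6 \times 10^5
% and C = 10^7, the proving term alone exceeds the total work of replication;
% at C \approx 5 \times 10^5 the two are comparable in raw compute, and every
% further drop in C widens the set of workloads worth proving.
```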

As you bring that prover overhead down, the number of applications for snarks grows dramatically. We don't know exactly what it grows to, what the new applications will be, but we're excited about bringing that number down. So to your point, how confident are we that we can get these performance improvements that are going to make this a worthwhile endeavor? I'm thinking back to when I was a kid and thinking about getting a terabyte of data storage, and that being some fantastically big number and just super expensive, and now you can get that on a thumb drive. Is that the kind of pattern matching that we're looking at here?

Sam Ragsdale

Are we expecting improvements of that sort? Or is it going to be a real slog where we have to fight for every 10x improvement? So I think definitely. Well, whether or not it'll be a slog, I guess, is subjective, and it depends on who's doing it, whether it's at the engineering or the academic level, but I think so. STARKs started getting invested in heavily in 2018 from an R&D perspective, and we've seen a steady march of improvements there.

Michael Xu

I think that you can expect the same thing out of the new strategy, the sumcheck strategy, this multivariate land, where people will hopefully continue improving things. This is one of the goals of us releasing it: we wanted more people to see that this stuff can be state of the art, and please come help us make it more state of the art, do all of the incremental improvements from here, because there are certainly things we haven't seen. The majority of the backend work was done by Michael, myself, and Arasu, who's on the Jolt paper. And so three of us have spent the time staring deeply at this thing over seven months or so, whereas the STARKs have probably hundreds of engineers working on them.

And that's great, right? We want more people looking at this stuff, and we think that showing that a new approach can work, and that so few people have looked at it, means that there's probably a lot of low-hanging fruit. I mean, it's easier to find a fruit to squeeze, if you will, and we can get those steady improvements. Then I think, further, there's obvious stuff like just hardware improvement. That'll only happen if there's clear demand for it.

But so far there's clear demand for snarks. People spend about, I don't know, $10 billion a year settling on Ethereum, and so there's $10 billion a year of block space. If you can make that cheaper and more reliable and more decentralized and all this stuff, then people will probably make hardware for it as well. So we expect a market will form there in some capacity as well. I can elaborate on a couple points there.

Justin Thaler

One is, further improvements in snark design are going to come from three things: better protocols, just like better algorithms, faster protocols where things are just faster and the prover does less work; then better engineering; and finally specialized hardware. So better engineering might be for a CPU, which is not at all specialized to running a snark prover, but you might take the final step of developing an ASIC, which costs tens of millions of dollars or something and takes years to build out, to take that final step of having hardware where all it does is snark proving, completely specialized to snark proving. So these things will all sort of come together and build on each other. And of course, my work, maybe until Jolt, where I didn't write a single line of code but I sat there and commiserated with Michael and Sam and Arasu as they suffered through it, has been on the first thing, protocol design.

So I think we're going to get more improvements there. I think we've seen major improvements just in the last year. I mean, you have Lasso, you have Jolt. That partially inspired the folks from Ulvetanna, Ben Diamond and Jim Posen, who introduced the new hashing-based commitment scheme we plan to use, called Binius. I think that's a very exciting development.

Yeah, I think we're seeing these sort of algorithmic, protocol-level developments happen at a very exciting rate. I was actually not so optimistic that this would happen. If you had asked me two years ago, I'd be like, people have to use the sumcheck protocol, that'll get us a speedup, but that's about it; it's going to be incremental. Beyond that, we're seeing what I think is non-incremental, and then eventually we will have the ASICs.

My view there is that it's so expensive and takes so much effort to build those ASICs that you don't want to go through that effort before the protocols have sort of converged a little bit. This is one reason why I've been sort of anxious to make my views on snark performance known. If I'm right, we should switch, to a large extent, how we design these snarks. And if I'm really right, we can then converge. We'll see.

I mean, that's a very optimistic scenario, that people generally agree this is one of the best ways to do things. I'm probably going to regret these words if it doesn't work out, but if you do converge enough, well, now it makes sense to invest in the ASIC, and then you potentially get an extra final two orders of magnitude off the prover time at the very end. So you can hope for one to two more orders of magnitude from the protocol level, another 5x from the engineering level, and then 100x from the ASICs, and you can figure out how many orders of magnitude that takes off from where we are today and get to the ultimate slowdown for the prover compared to just native execution with no proof. This kind of reminds me, and this might be an analogy that's really far afield, but since The Three-Body Problem is on TV now, just thinking back to the books, there's this one character who wants to influence the path of how spaceships and propulsion systems are going to unfold. And so he sort of takes matters into his own hands to make sure it develops in the way that he thinks is the right way.
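To make that arithmetic explicit, here is a back-of-the-envelope composition of the estimates just mentioned; this simply restates the rough ranges from the conversation, it is not a measurement:

```latex
% Composition of the speculative speedup factors listed above
% (rough estimates from the conversation, not benchmarks).
\[
  \underbrace{10\text{--}100\times}_{\text{protocols}}
  \times
  \underbrace{5\times}_{\text{engineering}}
  \times
  \underbrace{100\times}_{\text{ASICs}}
  \;\approx\; 5\times 10^{3}\ \text{to}\ 5\times 10^{4}\times
  \ \text{further prover speedup.}
\]
```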

Sam Ragsdale

No spoilers. People die. But.

Arasu Arun

Anyway, no sumcheck-related deaths so far that we know of. No people were harmed in the making of this sumcheck-based approach. Yeah, I have many more gray hairs than I did when we started this. But other than that, I don't see them. You look spry as a spring chicken.

The Zoom filter. Zoom filter. And I'm dyeing it now. Nice. Michael, speaking of genius engineering, I want to make sure we hear from you: is there anything that we've touched on here you want to elaborate on?

Justin Thaler

Yeah. So we talked about how Lasso and Jolt come from a different theoretical lineage than a lot of popular systems today, which are primarily based on FRI and STARKs. There are a bunch of theoretical reasons that we were expecting Jolt to be so performant, but when it came time to actually engineer and implement Jolt, we realized that the existing tooling, the existing infrastructure, wasn't ready to leverage those theoretical advantages. There were engineering things we had to do to make sure we were actually reaping the benefits of those theoretical improvements. Right.

Sam Ragsdale

Theory is one thing, but actually putting it into application and practice is a whole other animal. Yeah, exactly. So one of the reasons why Jolt is so performant is that the prover only has to commit to these small numbers. Well, maybe I should just clarify. Sorry, Michael.

Justin Thaler

A major benefit of Lasso and Jolt is that the prover only commits to small numbers, even though those numbers live in, like, a 256-bit field. Right. So the prover commits to small numbers, but elsewhere in the protocol, outside of the commitment, we do have these very, very big 256-bit numbers arising. So that's a very in-the-weeds clarification, but I just didn't want confusion.

Right. So, in theory, since the prover is working with these relatively small numbers, we're talking like 32 bits or less, it should be much more efficient than if it were working with these 256-bit numbers. And the efficiency gains should come from a bunch of different places. So, for example, when you're converting the number to a field element, in our case we're using the BN254 elliptic curve, and this is really in the weeds, but you have to do this thing called converting it to Montgomery form.

Arasu Arun

And doing that with a small number turns out to be, I think, like a quarter or 30% faster than if you had to do it with a full 256-bit number. But this is not something that arkworks, which is the cryptography library that we use, was taking advantage of, because prior to Lasso and Jolt you didn't really have these systems with the property of primarily working with small numbers. So there was really no reason to implement these then-esoteric efficiency optimizations, and we had to basically roll all these things ourselves. And I mean, I think we saw this across the board, starting when we planned this out after Lasso. Lasso we had built as a fork of Spartan, which is by Srinath Setty, one of the authors on both the Lasso and Jolt papers.
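To make the small-number saving concrete, here is a toy Rust sketch; it is not the arkworks code path, and the limb counts are an illustrative model of why converting a 32-bit value into a 256-bit field's Montgomery form does less multiplication work:

```rust
// Toy illustration (not the arkworks implementation) of why converting a small
// value into a 256-bit field's Montgomery form is cheaper: the conversion is
// essentially a big-integer multiplication by the constant R^2 mod p followed
// by a reduction, and a value that fits in one 64-bit limb contributes far
// fewer limb-by-limb products than a full 4-limb value.

/// Schoolbook multiplication of an n-limb input by a 4-limb constant,
/// returning how many 64x64-bit limb multiplications it performs.
fn limb_products(nonzero_limbs_of_input: usize) -> usize {
    const CONSTANT_LIMBS: usize = 4; // R^2 mod p for a 256-bit prime
    nonzero_limbs_of_input * CONSTANT_LIMBS
}

fn main() {
    // A full 256-bit input uses all 4 limbs; a u32-sized input uses only 1.
    println!("full 256-bit input: {} limb products", limb_products(4)); // 16
    println!("u32-sized input   : {} limb products", limb_products(1)); // 4
    // The reduction step costs roughly the same in both cases, which is why
    // the end-to-end saving reported in the episode is ~25-30%, not 4x.
}
```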

Michael Xu

And that required some changes here and there. When we planned out Jolt, we sort of decided we would take all of these off-the-shelf pieces, grab them, and glue them together. I think when we started Jolt, we actually said, in October or November, that all we had left to do was the glue: just slap all the off-the-shelf pieces together and call it a day, and it's going to be fast. We found that when we glued our makeshift thing together, it happened to be much slower than we were expecting.

And that's just because those dependencies weren't expecting our call pattern and our strategy; it wasn't because any of those dependencies had any problems. So take Circom, a great library that a lot of people use. It just happens that the way we were using it, calling the same circuit over and over again, was very slow. Same thing for Spartan.

Spartan is probably the fastest way to prove R1CS at this point; it's Srinath Setty's library. But again, with our call pattern, where for every cycle we run the same R1CS circuit, it's much slower than it needs to be, because you're doing work over and over again that should be amortized across the entire program. So we hit this pattern again and again after the glue phase, and we decided we basically had to vertically integrate everything, because our call pattern is so strange.
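A minimal sketch of the amortization issue being described, with entirely hypothetical names (`Circuit`, `preprocess`, and `prove` stand in for a general-purpose R1CS prover's API, not Spartan's or Jolt's; Jolt's actual fix batches the whole trace into one uniform instance, which this sketch does not show):

```rust
// Hypothetical illustration of per-call setup cost vs. amortized setup cost
// when the same circuit is proven once per VM cycle. None of these types or
// functions are the Spartan or Jolt APIs.

struct Circuit;      // the single R1CS circuit describing one VM cycle
struct Preprocessed; // circuit-dependent data (e.g., commitments to its matrices)
struct Witness;      // the per-cycle witness
struct Proof;

fn preprocess(_c: &Circuit) -> Preprocessed { Preprocessed } // expensive, circuit-only
fn prove(_pp: &Preprocessed, _w: &Witness) -> Proof { Proof } // per-cycle work

// "Glue the off-the-shelf pieces together": setup is silently repaid every cycle.
fn prove_trace_naive(circuit: &Circuit, trace: &[Witness]) -> Vec<Proof> {
    trace
        .iter()
        .map(|w| {
            let pp = preprocess(circuit); // repeated for millions of cycles
            prove(&pp, w)
        })
        .collect()
}

// Vertically integrated version: circuit-dependent work is hoisted out and shared.
fn prove_trace_amortized(circuit: &Circuit, trace: &[Witness]) -> Vec<Proof> {
    let pp = preprocess(circuit); // paid once for the whole program
    trace.iter().map(|w| prove(&pp, w)).collect()
}

fn main() {
    let trace = vec![Witness, Witness, Witness];
    let _slow = prove_trace_naive(&Circuit, &trace);
    let _fast = prove_trace_amortized(&Circuit, &trace);
}
```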

So whereas in a lot of the other lineages, maybe the STARKs and other stuff, a lot of people are working on the same thing, so people run into the same headaches and the tools have those fixes already built in, we ended up having to do a lot of this ourselves. And we're excited for a lot of people to come check out what we did and help us make it even faster, because we certainly missed a bunch of obvious stuff. So when you were first setting out, you were like, okay, here are all these off-the-shelf tools that we can use. And as you were working on it, it was like, oh, actually, we're going to have to change this.

Sam Ragsdale

We're going to have to build this from scratch. We're going to have to make customizations here. I think what we realized is that being able to use these off-the-shelf libraries and tools is in tension with performance optimization: when you're trying to really heavily optimize something, you want to take advantage of, to leverage, the structure that's unique to the problem you're trying to solve. And this was kind of a recurring theme that we encountered while working on Jolt: where can we find structure that we're not currently taking advantage of?

Arasu Arun

Like, where are we being general where we could be more specific? So this is in tension with using libraries, which are intentionally general-purpose. But again, we ran into this situation where we realized we'd get a lot more performance by vertically integrating and introducing some optimizations that would only really be applicable in the context of Jolt. There's a fun analogy here, too, to Lasso and Jolt itself. Lasso takes advantage of the repeated structure in a CPU program: there are about 50 instructions, and you do one of the 50 at each step, for a billion steps or whatever.

Michael Xu

However many steps your program runs, the proof system takes advantage of that. It looks at that structure and says, how can we simplify the whole proof system so that the prover is faster and we get all these other cool properties? And we ended up doing the same thing when optimizing the prover program: we looked at the repeated structure of what was actually happening on our silicon and asked, okay, how can we speed that up? How can we make it more obvious to the computer what it should be doing and what work is wasted, based on the structure? How can you anticipate certain inputs? How can you make your program faster based on knowledge of what the shape of the inputs looks like?
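As a toy mental model of the repeated structure Lasso leans on (this is not the Lasso protocol itself, just the shape of the statement it proves), every cycle of a simple VM is one lookup into the same enormous instruction table:

```rust
// Toy mental model: a CPU trace is a long sequence of lookups into one
// instruction table. Lasso proves such lookup batches cheaply by exploiting
// the fact that the giant (opcode, x, y) -> result table is structured and
// never has to be materialized; this sketch only shows the repeated structure.

#[derive(Clone, Copy)]
enum Op { Add, Xor, Ltu } // a RISC-V-like ISA has a few dozen of these

fn lookup(op: Op, x: u32, y: u32) -> u32 {
    // Conceptually one entry of a table indexed by (op, x, y). In Lasso the
    // table is "decomposable", so the prover only touches small subtables.
    match op {
        Op::Add => x.wrapping_add(y),
        Op::Xor => x ^ y,
        Op::Ltu => (x < y) as u32,
    }
}

fn execute(trace: &[(Op, u32, u32)]) -> Vec<u32> {
    // A billion-step program is just a billion lookups into the same table,
    // which is exactly the repetition the proof system takes advantage of.
    trace.iter().map(|&(op, x, y)| lookup(op, x, y)).collect()
}

fn main() {
    let trace = vec![(Op::Add, 7, 5), (Op::Xor, 7, 5), (Op::Ltu, 7, 5)];
    println!("{:?}", execute(&trace)); // [12, 2, 0]
}
```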

Arasu Arun

I mean, really, that's the only trick we have when optimizing software, right: trying to exploit structure. I sort of come from a background of gas optimization, or gas golfing, in Solidity, and I was kind of surprised but also happy to find out that a lot of the fundamental skills, even though it's a completely different language and a completely different programming paradigm, are still relevant and useful for optimizing Rust in the context of snarks. Yeah. So I thought I knew Rust before I started this project. I did not know Rust before I started this project.

Michael Xu

I think Michael in particular learned a lot of his Rust knowledge throughout this project, and I think we learned an enormous amount together. We really got into the weeds of how to measure performance, how to figure out what is slow and why it's slow, what system limits we're actually running into and how we can get rid of them, and how we can shape this thing so that it still has general-purpose functionality but gets the performance characteristics we want out of this repeated structure. I want to ask also, how unique is this partnership between the research side, Justin, and the engineering side, Sam and Michael? Justin, are we basically PhD students?

Sam Ragsdale

What's the level beyond PhD, then? Maybe you should get that. Oh, boy. In terms of suffering? Engineer.

Michael Xu

So I think I can speak to that a little bit. When I got into crypto, I was very much a practitioner, not an academic; that was never in the cards for me. And I think the way the engineering world communicates and the way the academic world communicates are entirely different. One of my favorite examples of this is that Justin will often say, you know, don't worry, it's trivial, and that triviality will turn out to be a month of implementation work. What he really means is that it's just a matter of getting through the details and he's certain it will work, but it's somebody else's problem. And specifically your problem. Yes, exactly.

And so, especially in the beginning, a lot of the work was just trying to figure out how to bridge that knowledge and information gap, and I think we've gotten better at this over time. On our end, that meant getting better at understanding the information from papers directly; on Justin's end, it meant being willing to not say it's trivial and to get into the weeds with us on the details. Like, actually, how do we manipulate these bits so that we can make this prover program run fast? And then, after we get it to a certain point, making sure that the entirety of the algorithm, where we spend time and how the CPU cycles are spent, is transparent to Justin and Srinath and Arasu, so that they can say, that looks too long, the relative proportion of time the algorithm is spending here looks too big, there must be a bug. And then we can go through it together, line by line, and figure out why this prover is not performing asymptotically the way Justin and Srinath and Arasu are expecting.

So I think we both started on opposite ends of the spectrum and we each came a little closer to the middle, in terms of willingness to get into the dirty details and also willingness to think more long-term and theoretically. Michael, what's been your experience? Similar to Sam, I think. I was fortunate in that, going all the way back, what got me into the crypto space was that I was very interested in number theory, and by way of that, cryptography in school. And then, really due to my own career and life planning, I ended up taking a lot of theoretical computer science classes but never actually doing any research.

Arasu Arun

So I just did software engineering internships all the way through college, so I didn't really have the option to go down the PhD route. But I was always interested in cryptography and the more theoretical side of things. Any desire to go back into academia, maybe? I never had any desire, and I still don't.

Sam Ragsdale

Justin, how about from your perspective, being able to work with engineers to bring your ideas to fruition? Yeah, this has been a dream, really. The whole thing is just sort of magical, and I couldn't have ever expected this; it's been some of the most fun I've had in my whole career. It's not clear to me that this really could have happened in many other places, and I'm super grateful that it did. So give me the immediate next steps that you guys are planning to execute on, and then also maybe just sum it up and give us the endgame too.

Like, what is the world going to look like once all of this is done, assuming it all goes successfully and according to plan? Step one, vacation. Okay, vacation. That's a good plan. Step two, dot, dot, dot.

Michael Xu

Step three, profit. Profit. Yeah, so I think there are a lot of things that we're focused on. There's some tech debt, things we left behind in the race to get this out the door, that I think we will clean up. In that process, and it's hard to describe this without getting too into the weeds, we'll be able to make the interfaces more generic so they can support many future directions.

We are still taking input from everybody who wants to use this thing. The reason we built this and made it open source was, one, to prove the idea that this sumcheck-based stuff works in a RISC-V context and can be state-of-the-art speed, but we also want people to use it. We're talking to a lot of teams that are thinking of integrating this stuff in various ways down the road, and we want to make sure our roadmap reflects what they need out of it. Because the point wasn't just to plant a flag, but rather that hopefully a lot of people will adopt this tech and people can trust zkVMs more broadly on a shorter timeline. I know you probably don't want to be too prescriptive and go divining what the future might hold, but if I could urge you to put on your speculative hat for a moment and present an application or two that you're extremely excited about, that you think could come out of this work, what would you say?

Michael's got some good ones. One that I have been coming back to is verifiable compilation. This is a feature you see on block explorers like Etherscan, where you can view the source code of a smart contract that's been deployed to a certain address. How it happens right now is that the deployer of the smart contract typically submits the source code, and Etherscan runs the compilation themselves to verify that it does in fact compile to the bytecode at that address. With zkVMs, as long as the compiler of the high-level language, whether that's solc or clang or whatever, can itself be compiled down to RISC-V, then you can prove that the compilation was done correctly.
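A hypothetical sketch of what such a guest program could look like; nothing here is the actual Jolt API, and `read_input`, `commit_output`, `compile_solidity`, and `hash` are stand-ins for the zkVM's guest I/O, a compiler like solc cross-compiled to RISC-V, and a real hash function:

```rust
// Hypothetical sketch of the verifiable-compilation idea. All names are
// illustrative stand-ins, not a real zkVM or compiler API.

fn read_input() -> Vec<u8> {
    // In a real guest this would come from the zkVM runtime.
    b"contract C { function f() public {} }".to_vec()
}

fn compile_solidity(source: &[u8]) -> Vec<u8> {
    // Stand-in for running the real compiler inside the VM.
    source.iter().rev().copied().collect()
}

fn hash(bytes: &[u8]) -> u64 {
    // Stand-in for SHA-256 or similar.
    bytes.iter().fold(0u64, |acc, &b| acc.wrapping_mul(131).wrapping_add(b as u64))
}

fn commit_output(label: &str, digest: u64) {
    // In a real guest these values would become public outputs of the proof.
    println!("{label}: {digest:#018x}");
}

fn main() {
    let source = read_input();
    let bytecode = compile_solidity(&source);
    // The proof would then attest: "running this exact compiler on a source
    // with hash H_src produces bytecode with hash H_code."
    commit_output("source hash", hash(&source));
    commit_output("bytecode hash", hash(&bytecode));
}
```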

Arasu Arun

So you can prove that a certain bytecode was compiled from some source code. And I think that has some interesting applications, both for the Etherscans of the world and for general software package registries like npm or crates.io. I think it could be interesting and could maybe help mitigate some software supply chain attacks, but we'll see. I have two other hyper-specific applications. There are the broader ones, use it for block space or something like that.

Michael Xu

But the two hyper-specific ones that I've been playing around with are things we're seeing in consumer crypto land: Farcaster, et cetera, decentralized social. And one of the things people have been talking about, and they've been talking about this with Twitter for many years as well, is a bring-your-own-feed algorithm. People are very concerned that centralized entities, Facebook, X, et cetera, own your feed and therefore own your brain. I would add TikTok to that list as well.

I think that's probably the one people are most concerned about. They own the algorithm, so they own your feed, they own your brain, and maybe they can influence you in some capacity, and you don't want a centralized entity to do that. So the narrative goes that you'll bring your own feed algorithm instead and apply it over the distributed data, whether that's Farcaster or another decentralized social protocol. One of the issues is that this feed algorithm is almost certainly an ML model, right? A bunch of floating-point weights that don't mean anything to humans.

We try very hard in the interpretability domain, and a lot of very smart people are working on that, but currently it doesn't look so good. The best strategy is basically: we sample the thing, we see what it does, and that's how we know whether or not it works. Well, you can actually do that sampling process in a zkVM, right? You could take any black-box algorithm, run Monte Carlo sampling over it to some depth, and create a succinct proof of that. And then when you're choosing your bring-your-own-feed algorithm in your client, you could pick one, and your local machine can actually check that it has certain properties.
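A hypothetical sketch of the kind of check being described; `score_post` stands in for an opaque ranking model and the property is purely illustrative. The point is that this sampling loop is ordinary code a zkVM guest could run, so a client would verify a succinct proof of the resulting statistic instead of rerunning the model:

```rust
// Hypothetical sketch: sample a black-box feed-ranking model and compute a
// property a client could insist on. None of this is a real API.

struct Post { from_followed_account: bool, engagement_bait_score: f32 }

fn score_post(p: &Post) -> f32 {
    // Opaque model weights would live here; this stand-in just mixes features.
    (p.from_followed_account as u32 as f32) * 2.0 - p.engagement_bait_score
}

/// Fraction of the top-k recommended posts that come from accounts you follow.
/// A client could require, say, >= 0.8 before accepting a feed algorithm.
fn followed_fraction_of_top_k(sampled: &mut Vec<Post>, k: usize) -> f32 {
    sampled.sort_by(|a, b| score_post(b).partial_cmp(&score_post(a)).unwrap());
    let top = &sampled[..k.min(sampled.len())];
    top.iter().filter(|p| p.from_followed_account).count() as f32 / top.len() as f32
}

fn main() {
    let mut sample = vec![
        Post { from_followed_account: true,  engagement_bait_score: 0.1 },
        Post { from_followed_account: false, engagement_bait_score: 0.9 },
        Post { from_followed_account: true,  engagement_bait_score: 0.3 },
    ];
    println!("followed fraction = {}", followed_fraction_of_top_k(&mut sample, 2));
}
```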

Sam Ragsdale

So you could actually rest assured knowing that the feed is complying with your selection and your preferences. This could almost certainly be backdoored in some capacity, but it's a direction you can think about, where you use a succinct proof to sample an otherwise black-box thing and reveal some details about it. And then a very low-power device, such as a mobile phone or even a cheap Android handset, can run this stuff and be confident, when it's connected to the network, that you're getting an experience that cares about you rather than about the platform's advertisers or something like that. That's really interesting. You said two hyper-specific applications.

Michael Xu

Okay, so I'll give another one. One of the things people are particularly concerned about recently is that, in order to get your software packages, your applications, onto people's devices, more and more you go through an app store of some description. That app store ostensibly exists to guarantee that the software has certain properties, so that all the users of those devices get a similar experience. Right. The obvious ones would be: it doesn't have malware, it doesn't have harmful content.

Define harmful content as you want to, but most reasonable people would generally agree on what those things are. Apple justifies its 15% to 30% App Store tax by saying that it's curating the store, taking care of security, making sure nothing bad gets in. Exactly. And so they have a manual process that actually is a lot of work.

Right. I don't know if it's worth 15% to 30% of all App Store revenue; I'll leave that to the lawyers to decide. But it's some significant amount of manual work for people, people with some engineering talent, to go through and pick through this thing and say, subjectively, does it meet our criteria or not? And so you ask, how might you decentralize that?

How do you remove the intermediary? Because that is also the intermediary that gets to say, and I don't want to get into a controversial area here, but that intermediary could potentially decide that certain businesses that aren't aligned with it don't meet its subjective criteria. You're kicked off the app store. You don't have access to most of the high-value consumers in the Western world, for example. And that would be a very bad thing for business.

You don't want your business to be dependent on something like that. In some future app store, you still want the property that I'm not downloading malware, but you also don't want to give up full power over your ability to run your business in the future. You could imagine an app store that's fully decentralized but requires succinct proofs of certain properties. You run a malware scan over the whole thing, whatever that looks like. You run various other checks for malicious content, et cetera.

Of course, the main concern people will have here, correctly, is that as soon as you make that target public, it becomes very gameable. As soon as you make an antivirus target public, the virus writers immediately start gaming it. There are issues there as well, but it gives you an idea of the design space you could lean into. Yeah. The dream, the vision that you're presenting, is this idea that you could drastically reduce platform risk while also getting rid of intermediaries.

And you can't really imagine people doing that with arithmetic circuits. But if it's just, you have to write Rust that checks, does this have malware? That seems like a very normal thing people are used to doing. You already write the malware checker, and if you can just compile it to a succinct thing and then deploy it to some sort of decentralized network, that's actually a very reasonable future to expect.

Sam Ragsdale

Amazing. By the way, Justin, I have one quick question too. zkVM: I know you've taken issue with people using ZK and zero knowledge to describe things. Is that still the proper way to talk about this stuff?

Justin Thaler

Yeah, I don't love the term zkVM, but I'm less concerned about it than about ZK proof, because ZK proof has a very formal technical meaning that's important for privacy: the proof leaks no information about the data the prover is proving it knows. zkVM is a bit of an informal term. I don't really think you'll find a precise definition of it, but it's basically a snark that lets the prover prove it correctly ran a computer program, step by step, specified in the assembly language of some virtual machine.

So because it's a less formal term, I think it's more clear to people that ZK doesn't mean zero knowledge here; it really means succinct. If it were up to me, I would replace it with something like SVM or SVVM, and the marketing people would kill me. SVM has already been taken by Solana, right? SVM is taken by Solana. But the key property of a zkVM is really succinctness.

So it should be succinct VM. Or it's like a VM that not only runs itself but also outputs a proof that it ran itself correctly. You need a term for that, and there's no zero knowledge there. I mean, it could be a zero-knowledge proof, or it might not be. What you really care about is this proof at the end that the VM was run correctly.

And you want that proof to be short and quickly checkable. There's no alternative term that's gained any sort of colloquial traction; zkVM is what's caught on, and I think we should keep using it for now. I can live with zkVM.

With proofs, I'm like, just call it a succinct proof. Then it's clear what you mean, and there's no danger that somebody thinks the proof is preserving privacy of the witness when it's not. With zkVM, I think people know that it doesn't actually mean ZK. I think there's less danger that it leads to some kind of privacy breach down the line or something.

Arasu Arun

Well, I've already gotten DMs asking if they could use Jolt to do, like, private auctions or something. Yeah. Okay, so it is dangerous. Maybe we should think more about a replacement term. But I haven't really made much progress with the ZK proof issue either.

Justin Thaler

Everyone says it's confusing. Well, not everyone, but a lot of people say that, and then everyone calls them ZK proofs anyway. I really try to say succinct proof, even though it takes effort. zkVM I haven't really fought that hard.

Michael Xu

What do we think about suck VM?

Sam Ragsdale

You know, it's got a ring to it. I think we're going to have to push for that. Suck VM.

Justin Thaler

I'll think about it more. Yeah. Well, maybe I'll just say one more thing. Yeah. I think the security issues cannot be overstated.

I think they are the biggest threat to snarks fulfilling their potential, and they could even make snarks a net negative for the world. Everybody sort of focuses on performance, and I think that makes sense, because performance is preventing people from using these things today. But they need to be secure, and they're not. Right now they're riddled with bugs, really.

If you use one today, you're really protected more by obscurity. There's a very small number of people in the world who can find the bugs, and hopefully those people are not going to attack the system. But the snarks are not protecting things today. The fact that it requires so much expertise to break the snark is what's protecting it, not the security property of the snark itself. So that needs to change, and I'm much more concerned about that than about performance.

I'd like to make some progress on that in the context of Jolt in the near future rather than the far future. We could have gone out on a high note, but you had to bring down the mood. Well, I think that's in character, so we'll stick with that. Well, I hope you guys all take a long vacation and get some serious R&R after all this incredible effort you put in. Appreciate it.

Michael Xu

Yep, definitely. We'll try to find some time. Yes. All right. Well, hey, thanks again for joining.

Sam Ragsdale

This has been a great conversation. Yep. Yeah. See you guys. See ya.

Michael Xu

See ya.

Justin Thaler

See ya.
