Near Protocol: 'Blockchains Cannot Scale Without Sharding!' - Illia Polosukhin & Alex Skidanov

Primary Topic

This episode discusses the importance and mechanics of sharding in blockchain scalability, focusing on the Near Protocol.

Episode Summary

In this engaging episode of Epicenter, host Meher Roy delves into the intricacies of blockchain scalability through sharding with Illia Polosukhin and Alex Skidanov, co-founders of Near Protocol. The discussion centers on how sharding is essential to handle large-scale user bases and transaction volumes without compromising on speed or security. The episode starts with an introduction to the current challenges in blockchain technology, particularly the limits of state and throughput in existing systems. Illia and Alex then explain how Near Protocol's approach to sharding differs fundamentally from other blockchain technologies by allowing dynamic node and shard management, which adapts in real time to changes in network load. The conversation also covers technical details of Near's architecture, including its consensus mechanism, node validation processes, and the innovative ways it handles cross-shard transactions.

Main Takeaways

  1. Sharding is crucial for blockchain scalability, allowing networks to handle vast amounts of transactions and state data.
  2. Near Protocol implements a unique dynamic sharding mechanism that adjusts the number of shards as network demand changes.
  3. Cross-shard transactions and state management are significant challenges that Near addresses with advanced technology and design.
  4. The system's design allows for real-time adjustments and flexibility, ensuring high performance even under increased loads.
  5. Near's approach could serve as a model for future blockchain technologies aiming to scale efficiently while maintaining security and decentralization.

Episode Chapters

1: Introduction and Background

Illia and Alex introduce themselves and outline their backgrounds, emphasizing their shift from AI to blockchain technology for better scalability and user experience. Illia Polosukhin: "My background is all Near. Everything that was before is irrelevant." Alex Skidanov: "I was at Google Research before our startup adventures."

2: The Need for Sharding

The necessity of sharding in blockchain is discussed, focusing on its role in managing large user bases and transaction volumes. Illia Polosukhin: "For a billion users, you need to actually store all that state somewhere... sharding is how you partition this computation." Alex Skidanov: "It wasn't obvious to everybody that sharding is useful until we demonstrated it with Near."

3: Technical Deep Dive

A detailed look into the technical aspects of Near's sharding implementation, including node roles and consensus mechanisms. Illia Polosukhin: "Sharding involves splitting the blockchain state into parts that can be processed independently." Alex Skidanov: "Our design allows for dynamic management of these shards, adapting to the network's needs in real-time."

Actionable Advice

  1. Understand the Basics of Sharding: Anyone interested in blockchain development should familiarize themselves with the principles of sharding.
  2. Follow Near Protocol Developments: Keeping up with updates on Near can provide insights into the evolution of scalable blockchain technologies.
  3. Participate in Blockchain Governance: Engage in governance forums like those provided by projects such as Near to influence future developments.
  4. Educate Others: Share knowledge about the benefits and challenges of sharding in blockchain technologies.
  5. Explore Decentralized Applications: Experiment with building or using dApps on sharded blockchains to understand their potential and limitations.

About This Episode

The initial scaling roadmap for Ethereum featured execution layer sharding. However, the rapid advancement of layer 2 scaling solutions in general, and zero knowledge proofs in particular, caused a restructuring of the original plan. The reason was that rollups would have required fewer changes to Ethereum's base layer, hence lower risks. Near Protocol, on the other hand, was designed from the ground up as a sharded system, capable of handling billions of transactions without sacrificing decentralization or security.

People

Illia Polosukhin, Alex Skidanov

Companies

Near Protocol

Books

None

Guest Name(s):

Illia Polosukhin, Alex Skidanov

Content Warnings:

None

Transcript

Illia Polosukhin

You need to actually store all that state somewhere, and for a billion users, it's actually going to be a lot of state. Usually everybody talks about TPS, but actually one of the bigger bottlenecks right now across the space is state. Obviously no single computer, no single node can process all that, can validate the whole network. Now you can rotate these validators, who validate which shard, all the time, every block. You can actually randomly select the set of validators. And they are able to validate this because they don't need to sync into the shard.

They don't need to process every single transaction that ever hit that shard. They only need to look at this specific block. The data availability and consensus are merged together into a single mechanic. The essence of it is you want to have a design that is not dependent on the underlying hardware itself, improving at a certain rate to be able to service user demands. You want to have mechanisms beyond that type of scaling.

This episode is brought to you by Gnosis. Gnosis builds decentralized infrastructure for the Ethereum ecosystem, with a rich history dating back to 2015 and products like Safe, CoW Swap, or Gnosis Chain. Gnosis combines needs-driven development with deep technical expertise. This year marks the launch of Gnosis Pay, the world's first decentralized payment network. With the Gnosis Card, you can spend self-custody crypto at any Visa-accepting merchant around the world.

If you're an individual looking to live more on-chain or a business looking to white label the stack, visit gnosispay.com. There are lots of ways you can join the Gnosis journey: drop in the Gnosis DAO governance forum, become a Gnosis validator with a single GNO token and low-cost hardware, or deploy your product on the EVM-compatible and highly decentralized Gnosis Chain. Get started today at gnosis.io. Chorus One is one of the biggest node operators globally and helps you stake your tokens on 45+ networks like Ethereum, Cosmos, Celestia, and dYdX. More than 100,000 delegators stake with Chorus One, including institutions like BitGo and Ledger.

Staking with Chorus One not only gets you the highest yields, but also the most robust security practices and infrastructure that are usually exclusive to institutions. You can stake directly to Chorus One's public node from your wallet, set up a white label node, or use the recently launched product OPUS to stake up to 8,000 ETH in a single transaction. You can even offer high-yield staking to your own customers using their API. Your assets always remain in your custody, so you can have complete peace of mind. Start staking today at chorus.one.

Meher Roy

Hello, everyone. Welcome to Epicenter. Today I have an amazing episode lined up for you. We're talking to Alex and Illia, the co-founders of NEAR, which is an in-production sharded blockchain with a lot of value riding on it. Specifically, we will cover how NEAR's sharding actually works, try to build it concept by concept into an integrated whole, and then understand where they are in their journey to implement sharding. Alex and Illia, welcome to Epicenter.

So at this point, we've kind of done a few episodes with both of you. Maybe you could give a short introduction to your backgrounds. Yeah, I think at this point my background is all NEAR. Everything that was before is irrelevant. But yeah, I was working on a sharded database called MemSQL before NEAR for five years, which right now is an in-production sharded database.

Alex Skidanov

And before that I was at Microsoft. Yeah, I mean, my background is actually in machine learning and AI. I was at Google Research prior to our startup adventures, and I was one of the co-authors of the paper that introduced Transformers, which is the technology powering ChatGPT, Midjourney and other AI advancements. And then with Alex, we actually started NEAR originally as an AI company and realized we needed a fast, cheap, easy-to-use, easy-to-build-on blockchain, because we wanted to use it ourselves for data crowdsourcing and some other data use cases for our AI company, and ended up pivoting to that in 2018.

Illia Polosukhin

And yeah, focusing on that ever since. Yeah, let's get into sharding and blockchain scalability. So what is sharding overall, and why has it been a generally difficult target industry-wide? I think, I mean, maybe the broader question is: if we imagine having a billion users coming in and using blockchain as a means of payments, as a means of tracking ownership, as a way to coordinate resources and efforts, you imagine that you need to have kind of a few things happening, right?

One is you need to actually store all that state somewhere. And for a billion users, it's actually going to be a lot of state. And usually everybody talks about TPS, but actually one of the bigger bottlenecks right now across the space is state and kind of its growth. And so that's probably problem number one. Problem number two is, I mean, you have a billion users who thrive by transacting.

There's, you know, hundreds of thousands of applications, obviously there's a lot of transactions flying around and a lot of processing that needs to happen, and people want to put in more and more complex smart contracts and complex logic in these. So you need to have throughput, bandwidth and processing power to do this. And so obviously no single computer, no single kind of node can process all that, can validate the whole network. And so the way to support this is, one way or the other, to partition this computation, partition this storage and partition this kind of bandwidth to receive this. And so people have kind of historically been talking about sharding because that's how web2 companies are doing this.

So Alex mentioned doing this in the web2 world; MemSQL, SingleStore now, is used by Fortune 500 companies. Google and Facebook have their own solutions. And it kind of seemed reasonable that this is an approach that you should take in blockchain. Now, there are a lot of problems that arise when you actually move into a permissionless setup compared to a permissioned setup, which web2 companies usually deal with. I also want to add: Illia said it's obvious that you need to shard processing.

Alex Skidanov

I don't think it was obvious to everybody until recently. So there were multiple blockchains whose favorite thing to say was, look at Visa, look at how many transactions Visa processes, and that's at world scale, obviously we can do it on a single computer. But as a user, you don't use Visa frequently, you use it like three times a day on a good day. So finally, now. Generally, when NEAR launched, we made multiple bets on NEAR which were not obvious to everybody.

From day one, we had named accounts where you can rotate keys, and we had sharding, and it wasn't obvious to everybody that it's useful. And now suddenly various scalable blockchains which are not sharded get congested, and they have no way of getting any more performance. And similarly, when it comes to account abstraction, Ethereum right now is switching to that. So now finally, it is becoming obvious to everybody that those were correct decisions. Yeah, I mean, to give an example, any kind of, I call it single-node blockchain, which is something where every single node in the network needs to process every single transaction and store all the state, right.

Illia Polosukhin

What it means is, as soon as, let's say, the network has a high capacity, they also have huge state growth. They have a kind of limit on how many transactions they can process because of the bandwidth and execution. And so at some point there will be more demand. And this is a very natural thing, right? The price for transactions is usually based on supply and demand. And so while transactions are cheap, at some point there will be more demand, because it's so cheap to even just spam it to try to get some financial benefit, because the transaction fees are so cheap.

And so when that happens, you don't have a way to expand capacity, so your prices start to grow for everyone, right? And so this leads to now you're kind of pricing out people who were using this blockchain originally for normal use cases, because of the spam and kind of people trying to run arbitrage for some other application. And so that's kind of just the principle where any single kind of state machine, right, a single state machine, will get overrun and start raising fees, right? So kind of, Solana is the contrasting example.

Meher Roy

Of course, even Ethereum and Bitcoin are based on the idea that the miners or the validators, and even the full nodes of these systems, have to process every transaction that is happening in the system. Solana has taken that idea and said, yes, we have a bunch of validators, I think 1,800 or something like that, currently. And every transaction that goes through the Solana system has to be processed by every one of these validators. They are assuming that, okay, these validators can be placed in data centers where networking bandwidth is very high, which means they can ingest a lot of transactions from the network at a very high rate. They also assume that the machines are very performant.

So for the work of accounting, you can assume that each machine can handle lots of transactions and do their accounting work, and then Solana would also assume that, okay, maybe the history doesn't need to be stored by these machines. They only need to store, currently, what the different accounts are and what balances they own. And what a project like Solana assumes is that the improvement in compute, in terms of bandwidth, processing power, every resource, kind of doubles on some timescale. So some resources double on a twelve-month timescale. Another one might double on a three-year timescale.

But because of this doubling, the capacity of the blockchain would keep growing at a certain rate. And the hope is that the user growth is actually slower than the doubling rate of the underlying hardware, and therefore you continue to have a cheap blockchain. Whereas in the NEAR case, NEAR takes the opposite approach, where you say user growth can be way faster than the improvement in hardware. So fundamentally, you need to move away from the paradigm of every validator or every node needing to store, first of all, what the balances of every account in the system are.

And then you also need to... So basically, no single machine may have a complete view on what the balances of every account on the NEAR system are. And then it might also be the case that there's some machine, and there's some transaction on the network, and that transaction happened, but this machine is part of processing that network, and it never actually executed the transaction itself. And so the essence of it is, you want to have a design that is not dependent on the underlying hardware itself improving at a certain rate to be able to service user demands. You want to have mechanisms beyond that type of scaling.

Illia Polosukhin

Yeah, I would add that, as I said, it's not even about users. It's actually, in a way, sadly, the economics; the economics of these blockchains is such that you need to have a way to expand capacity if you want to maintain low fees. Because at some level there will be saturation, where some subset of people are willing to pay higher fees, because they're planning to try to capture some economic value from an exchange, a token launch, whatever trading. And that in turn increases the price for everybody else. Solana and some other high-capacity networks have this, even though the idea is that we have enough capacity and it will grow over time.

The reality, what happens, is instead it gets flooded by transactions that are all trying to hit the same economic opportunity and extract its value. And so those people are willing to pay way higher fees than folks that are potentially using it for other use cases, for payments, for example, and others. And so that's kind of the point. And this is to add to the fact that the state grows and everything requires validators to continuously expand their hardware, even just to continue maintaining the network. So I think, to me, actually, the past three, four months have been a really great validation. And this is not just about Solana; Base and others too, even, you know, Base has a centralized sequencer.

It's a single server effectively, but even that cannot keep up with all the transactions that they need to process. And so that's kind of an example of just, as soon as you have enough economic activity, you start to get kind of this flood of transactions trying to capture that, and you don't have any way to either isolate it or add extra capacity for everybody else to do this. Right? And so the example I like to use is, imagine Netflix. You go to Netflix, and first of all, in the EVM ecosystem, it would ask you to choose which data center you want to watch from.

An Arbitrum data center, or an Optimism data center, a Base or a Blast one. And then when you go there, it says, actually, first of all, you need to bring your money from the other data center where you have your money, if you want to pay for this movie, pay for watching the movie here. And then the second one is, actually, because somebody else watches a very popular movie, you cannot watch this movie right now at a lower price; you need to pay more. So that's kind of the current state, right?

And what we want to do is, you know, you go, you can pick any movie and you watch it and you pay kind of a fixed fee. Right. That, like, is predictable for everyone. And so to do that, right. Similarly to how Netflix needed to use Amazon, which kind of scales under the hood, right?

And is able to build more data centers kind of ahead of the demand that Netflix has. Similarly, you need a network that is able to scale with demand. And in a way, you have the supply demand curves. And so you want to flatten the supply curve such that even as demand grows, you kind of can maintain their fixed fees. Right?

Meher Roy

So this is the distinction between burst capacity and, like, average capacity in a sense, where, like, a system might only be using, like, x capacity certain times in a year, but then suddenly, like, one application or the old system might require five x or ten x, and that burst might happen very quickly. And what you're saying is essentially that if the, if the scalability properties of a system are only dependent on the underlying machines that the validators use, then that cannot change very quickly to adjust to burst demand, like, suddenly lots of demand comes in. Machines can't be changed across the whole network that fast. So there needs to be, like, some other mechanism where a burst happens, and the system is also able to somehow respond and be able to scale dynamically on a shorter time horizon. And this is a property nearly every blockchain kind of like, lacks today, which is why you have gas congestion.

Alex Skidanov

Yeah. And so specifically, in recent months, we went from four shards to six shards, increasing our capacity by 50%, because we had a couple of applications that had massive growth. We have HOT, which grew from zero to 5 million users over like a month, a million daily actives within a month.

Illia Polosukhin

And so we actually started having congestion on those shards because of that. And so instead of just, okay, everybody is now paying higher fees, the network added more capacity. So, yeah, let's get into what the different challenges are with building a sharded blockchain. So there's a couple of them. First of all, there are certain changes to the user experience, because since nobody maintains the state of all the accounts, if the transaction has to touch multiple accounts, something needs to be done about it.

Alex Skidanov

And it's a very large design space, and generally we refer to those transactions as cross shard transactions. And that's one big challenge. The second challenge is in the network in which every node processes every transaction. If you have a node and you're at a particular state, you have very high certainty that every transaction was processed properly, because you literally processed each and every one of them. In the worst case, you can be in a situation where you're looking at the network, which is not canonical.

So maybe some other part of the network believes that another set of transactions happened on this one. But that's a very different problem from someone literally making something that doesn't make any sense according to the rules of the chain. In the sharded chain, because every node only applies a subset of transactions, you need some checks and balances which ensure that what you see is actually a correct state that results from properly executed transactions. When you start digging deep into that problem, into the problem of ensuring that everything is executed correctly, then you start facing another problem where, for almost any mechanic you can come up with that ensures that everything is executed correctly, maybe with the exception of ZK proofs, as we will see once we start digging today, you need to be able to access certain information in order to perform the validation, and that information could be made unavailable by malicious actors. And so you need to have a native mechanic on the blockchain which ensures that certain pieces of data are available to certain participants and cannot be concealed.

Right? And so those two challenges are like one of the biggest challenges that exist. There are some others. One interesting one would be that because the state is so big, you need to have a way for people to either synchronize state very quickly, or work without synchronizing state. So I think those four would be the most interesting ones, right?

Meher Roy

So, yeah, essentially, imagine yourself like being an accountant of the near blockchain. It's a massive data structure. You only have a small part of it, the transaction comes. So the first problem is, okay, you might not be able to process the transaction fully yourself, because you only have a part of that entire data structure. So you can maybe make some changes to that part, but then the transaction might hit okay, now it needs to do a change on some other part that you don't have, and that's a completely different accountant.

So you need to process only a part of the transaction, and then it needs to be, like that relay baton needs to be kind of handed over to some other party, and then they do that and so on. So that's one issue. Second issue is, if you're only handling a part of that data structure which contains all the account balances, you receive that data from someplace, and then how do you even know that that data is genuine? Right? So that's the problem of stateless validation.

Or like, how do I know that this data I'm receiving was actually processed correctly in the past? Then, kind of like, okay, if there's any mechanic to know that, any kind of certificate that would tell me that this was processed correctly in the past, then the generators of that certificate kind of need to have had access to that data in the first place. But if that data wasn't available to the generators of these certificates, then they won't be able to generate certificates. So certain data needs to be kept in a state where you can reach it in order to do something with it. Yeah. So a couple comments.

Alex Skidanov

So the first one you mentioned is that you need to process part of it and then send it to someone else. So it's also important that that message you're sending is not getting lost. Right? It has to be delivered. With certificates, you mentioned that you need to have certain data to create the certificate.

It's also, in many situations, the case that you need certain data to be able to verify the certificate. Actually, like in NEAR, in one sense there's a lot of complexity, because fundamentally the accountants in your system may not have access to the full data. And so that leads to a lot of complexity. But in a different sense, NEAR is similar to other blockchains in that it is just one data structure containing accounts and smart contracts and their data.

Meher Roy

And this is exactly the same as kind of like bitcoin and ethereum, where you're dealing in the end with a single data structure. Near is managing that data structure in a different way, but ultimately it's a single data structure that's a unifying property across all of these systems. Is that right? Yeah. You can think of near as a very large mapping from account ids to state.

Alex Skidanov

A big difference from other blockchains is that the account ID is not your key. The account ID, think of it as like a domain name, right? So like, you know, I would be alex.near. And it's not just a convenience. It's not just something that is easier to convey to other people.

It's also, like, if you think about it, if the key is what identifies your account, then if for any reason you have a reason to believe that the key is compromised, your account is gone, right? You have to create a new one. You have to try to move the assets.

Not all the assets are movable. You can imagine an NFT which is not movable by design. If it's not a movable NFT, that NFT is gone forever.

If I have a reason to believe that my key is compromised, I just change it. It also allows you to auction your account. Like, your account has a particular state that you think is of value. You can literally sell your account. There are services on NEAR that allow you to do that.

Illia Polosukhin

Right? And then there is this massive state, this mapping. But at the end of the day, it's still a mapping from an account to state, similar to Bitcoin, Ethereum, Solana, any other blockchain, but that state is sharded. Just to be clear, not Bitcoin, because of UTXOs, but yes, Ethereum, Solana, every account-based blockchain.

Alex Skidanov

But yeah, you know, a UTXO, you can think of it as an even less persistent account than just a key. And so this state of all the accounts, it's split into multiple subsets, right? Into multiple sets. So today we split by name. So there would be a contiguous set of accounts that lives on shard one, and there is a contiguous set of accounts that lives on shard two, and those boundaries are not immutable through the life of the blockchain.

And as a matter of fact, they did change multiple times. When NEAR launched, it was a single set; NEAR launched with a single shard, right. And then it was split into four. And then in recent months, two of them were split into two again. And in the future, that will be dynamic.
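
As a toy illustration of the contiguous-range idea described here: the boundary accounts and shard count below are made up for illustration; NEAR's real shard layout is protocol-defined and changes over time through resharding.

```python
import bisect

# Hypothetical boundary accounts: IDs below "g" map to shard 0, ["g", "p") to
# shard 1, ["p", "w") to shard 2, and everything else to shard 3.
SHARD_BOUNDARIES = ["g", "p", "w"]

def shard_for_account(account_id: str) -> int:
    """Route an account ID to its shard; users never see or choose this."""
    return bisect.bisect_right(SHARD_BOUNDARIES, account_id)

print(shard_for_account("alex.near"))   # 0
print(shard_for_account("meher.near"))  # 1
print(shard_for_account("zoe.near"))    # 3
# Resharding amounts to changing SHARD_BOUNDARIES; the account IDs themselves never change.
```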

In the future, the system will be changing the boundaries as, you know, the load changes, automatically. So, I mean, maybe one way to imagine it is like the postal system: in your city, most likely, your city is kind of divided into these different regions, each with a different postal code. Right? Where I live, they're called PLZ or something like that.

Meher Roy

And so each kind of like, region of the city will have a number and the city will be partitioned into various different postal codes. And you'll have like a post office in every, every code, essentially, and can imagine, like in near, it's taking that data structure, you can think of that as the city and then breaking it down into like, these areas, these shards and the dynamism is like, if you have, like, a region in the city that has a postal code and suddenly lots of letters are being sent there and they are like, oh, now, actually we need two post offices, then maybe they will divide the region in the city into two different regions with two different post office with two different numbers. And in practice, that does happen, right? Like, postal codes change over, over long horizons and in near. Similarly, like, okay, the whole blockchain is broken down into, like, these shards.

And the definition of a shard can also change in order to kind of route around the capacity demand in some way. If you think of near today, you can think of it as there is a, you know, we all observe what happens in the city and how much mail goes to every post office, right? And at some point we realize that for a particular post office, there is a lot of mail coming in and it's, you know, it's getting harder for the employees there to handle it, right? And so there could be a proposal that says, hey guys, let's build another post office half a mile away and split it like this, right? And then there's a separate entity, which is validators of the network, which either choose to go ahead with this change or not.

Alex Skidanov

And if a sufficient percentage of them, which I think is 80, wants to go with this change, then it happens. If you think of the NEAR of the future, when dynamic sharding is there, it's slightly different. As mail starts coming into the building, another building just spontaneously pops up, without any human being involved, right? More like the building splits into two and moves away.

Illia Polosukhin

Exactly, exactly. And a wall appears. Yes, the two buildings separate, like on a set of rails. And that happens without any involvement from any natural intelligence, right? It just occurs.

Alex Skidanov

And then at some point, you know, two post offices become, you know, it's getting chill there, so they kind of come together and the wall disappears. Yes, yes, exactly. But I think the important part here, as you said, is that zip codes change, which means when you're sending mail, you need to... Now, whoever is in the new zone needs to update everyone that they're in a new zip code.

Illia Polosukhin

Everybody needs to. But on NEAR, you never see the zip code. On NEAR, you say, I want mail to be sent to Illia, and NEAR figures out the zip code itself. You don't need to know it. That's the beauty.

To compare this with other approaches, like subnets, rollups, etcetera. I mean, in a way, they're trying to emulate the same thing here, right? It's like, oh, you know, this rollup is too busy. Instead of launching there, you can spin up a new rollup, and now everybody can go there and you have more capacity. But it's a very manual and expensive process.

Right? Like, each rollup costs at least a million dollars a year just to run, between sequencers, explorers, RPCs, et cetera, all the infrastructure. But it's also that now every user, every developer, every smart contract that's actually trying to use it needs to figure out how to go there, how to bridge there, what gas token is used there, etcetera. So it's a huge load on the whole understanding of the network, which we are actually addressing with chain abstraction and chain signatures as well, because we do believe this is a universal problem, and this is what we're trying to do with NEAR. It's like the capabilities of the network should be able to change dynamically, and everybody should be able to route things without thinking about the underlying infrastructure.

But on NEAR, we solved it in a very direct way by having this kind of namespace that is common for everyone and using that to route transactions or messages, mail, between different participants. Yeah, that is so cool. I actually own meher.near. And yeah, I've never needed to think about what shard it is on. So to me, I only need meher.near, and it has its journey through time: maybe it was processed in shard number one, then shard number three, and then it's changing, and I never need to know about it.

Meher Roy

Right, like, that's really cool. Yeah, I think it was on shard... Yeah, shard zero, then shard two and then shard three. It's probably on shard three right now. But yeah, you definitely don't need to know about that.

Illia Polosukhin

And like, even I am like, you know, there is a way to look it up, but we actually don't show it in the explorer usually, because, I mean, some new explorers show it sometimes, but we don't actually want people to know, because it's irrelevant information. It's like knowing which exact computer, on which rack in AWS, is, you know, providing us with this interface we're looking at right now. Yeah, yeah. So maybe an interesting imagination for NEAR is, because ultimately it's these human-memorable names at the bottom. And maybe, you know, each human-memorable name actually corresponds to a person or a company, because that's how the world is partitioned.

Meher Roy

And because the NEAR system breaks into shards, you can almost imagine a virtual collection of people that are transacting their business on a shard at a certain point. And these are the users of the shard itself. And of course, as the shard boundaries change, the collection of people transacting on a shard is also changing. But maybe if we imagine it as being stationary or constant for a certain while, which is true for NEAR, we can think of this virtual collection of people as a NEAR suburb, in a sense, that's transacting on a shard.

And then on the other side, corresponding to a shard, you need kind of servers or validators or postmen, in our early analogy, that are kind of processing the mail that's coming to that particular area, or shard. I guess one of the questions starts to become, so in any blockchain network, you have these validator machines or miner machines that are ultimately kind of like the postmen or accountants of the system that are doing the processing. And even in NEAR, I would imagine, okay, there's a set of validators that are found through proof of stake. Now, these validators sort of need to be assigned to shards.

Hey, you go and process the transactions in this shard; you, the other one, you do it there. And how does that process work? So I first want to slightly correct the first analogy. I think the second analogy was good, but the first analogy was not quite correct.

Alex Skidanov

Because even though shard boundaries do not change as frequently today as they will at some point, it has been the case from day one on NEAR that two accounts residing on the same shard has absolutely no significance for those two accounts. So I think a better analogy for that would be not a post office, but like cell towers, right? We could be next to each other, and I can call you and we will be served by the same cell tower.

Or we could be in different parts of the world, and I can call you via different cell towers, but we have no benefit. I will never know that you're on the same cell tower, and I will never care. And it is the case on NEAR that if you and I are on the same shard and I send you money, or we transact on some smart contract, or if we are in different shards, from the user's perspective, there's no difference in experience.

So fees are not affected, the performance is not affected. Sharding is completely abstracted away. And so there's no incentive, for example, to try to be on the same shard. There's no incentive to grind, for example, account IDs, or to intentionally end up in the same shard. When it comes to the second analogy, you can think about it this way.

You can think, I like going in multiple steps, and effectively saying, let's say we're designing a new blockchain and we want it to be sharded. How do we ensure security when not everybody processes every transaction? And the first idea would be, let's say we have a massive set of validators.

We set the minimum stake to be relatively low. And we say we have hundreds of thousands of validators, or even millions, I don't know. And then every shard, even though it has only a subset of validators, still has a massive set of validators. So say we have a million total, a hundred shards, and every shard has 10,000 validators. Then you can say, well, if we sample them randomly and we're relatively certain that the total set of validators has up to a certain percentage of bad guys, right? We say, we believe that the total set has up to 25% bad guys and not more, right.

Then you can do the math. And you say, well, if I sample 10,000 of them, then the probability of the percentage of bad guys exceeding 33% is so unlikely that we can consider it to be impossible; it's like one over ten to the power of some large number. And then you say, well, because there's no more than 33% of bad guys in the shard, we can just assume that they adhere to the protocol, and that any state transition they approve of, like if it has a particular percentage of signatures of those people who are validating the shard, then we believe that that state transition was valid, because the number of bad guys is limited and the good guys signed off.

So it's, you know, good to go. It has practical issues. We don't actually have a million validators, and we do want to have more than 100 shards in the limit. But it has a bigger problem, which is that the concept of a bad guy or a good guy is very abstract. At the end of the day, everybody who's on the blockchain, they want to make money.

That's the ultimate goal. Majority of validators, I'm sure there are some validators who are there to build the decentralized world of the future where everybody's a happy corgi owning their data, but in reality, majority of them are there because you stake money, or rather, you have people delegate to you. You keep the percentage, you make money correspondingly. We should think about the security in the presence of the bad guys who will try to corrupt other participants. And people talk.

There are ways for people to talk. A lot of validators are just sitting on Telegram, and it makes sense for them to be in the same Telegram groups, because they run into issues, the network is too slow, they need to know how to operate the validator. So they're all in the same Telegram channels.

They're all easily reachable. If the bad guy wants to come and say, hey, guys, I want to do this act of being a bad guy, and I need that percentage of validators to cooperate, I can DM each of you and say, this is how much money I'm willing to pay you, because there's something for me to gain from the blockchain going down. It could be some miner extractable value. It could be that I'm a Solana investor and I just want NEAR to go down and Solana to go up, etcetera. The system needs to be designed assuming that a very large percentage of the validators could be corrupted and incentivized to do something bad.

And correspondingly, let's say we have 100 or 1,000 validators and we have a small subset of them in every shard. We should expect that almost all of them will get corrupted, or even all of them. The system needs to be designed in a way that, there are those postmen in the post offices, but it could be that a bad guy enters the post office, gives all of them $1,000 and asks them to do something bad, like route a mail which was not actually sent by the originator, or something like this. And so we design systems in a way that this is prevented or made very difficult to execute, if that makes sense.

Meher Roy

So now, I don't know the mathematics of it, but if we take that, that scenario that you sketched out, first there's a million validators, and then I'm sampling. What I mean by sampling is, you know, it's a huge set. And I'm taking 10,000, and I'm choosing randomly because the blockchain ultimately needs to choose randomly. And then if I get the set of 10,000, my, my mathematical intuition says that if, like, 25% of that set of million is 250,000, they are malicious in some way, and 750,000 are kind of honest. If I am choosing 10,000 randomly, my odds of kind of choosing a set where the majority, maybe 66%, is malicious, means 6666 are kind of malicious and the rest are good guys.

I would think, like, that would happen pretty frequently, right? Like, if I'm creating these sets, like once every second, I would think every six months or every year, I'm going to get a set which is full of malicious validators. Now, so if you have 250,000 out of a million, then even sampling one third will never happen.

Alex Skidanov

If you sample 10,000 out of a million, then even sampling one third would not happen. It's extremely unlikely. Oh, it's extremely unlikely. So my intuition is wrong.

Meher Roy

And if I'm sampling 10,000 out of a set of a million, and I keep doing that, keep creating new samples... You can do it like billions of billions of billions of times. Like, you can do it every nanosecond for a billion years, and it will not happen. We actually have the math in one of our original papers, right? And, I don't remember the exact constants, but yeah, it was like ten to the minus 20, minus 30, something like that probability.
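
As a rough back-of-the-envelope check of this sampling argument, here is a short Python sketch using the illustrative numbers from the conversation (a pool of one million validators, 25% of them malicious, committees of 10,000 sampled per shard); these are not NEAR's actual parameters, and the exact figures in the papers will differ.

```python
from math import lgamma, exp

def log_choose(n: int, k: int) -> float:
    """log(n choose k) via log-gamma, to keep the numbers manageable."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def p_corrupt_committee(pool: int, bad: int, committee: int) -> float:
    """P(a randomly sampled committee is more than 1/3 malicious),
    sampling without replacement (hypergeometric tail)."""
    k_min = committee // 3 + 1  # smallest malicious count exceeding one third
    log_total = log_choose(pool, committee)
    return sum(
        exp(log_choose(bad, k) + log_choose(pool - bad, committee - k) - log_total)
        for k in range(k_min, min(bad, committee) + 1)
    )

print(p_corrupt_committee(pool=1_000_000, bad=250_000, committee=10_000))
# Vanishingly small (far below 1e-50): a corrupt committee essentially never
# arises from random sampling alone, which is why adaptive corruption after
# the sample is drawn is the threat the design actually has to handle.
```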

Illia Polosukhin

So, like, longer than the universe exists, kind of thing. But to an extent, it doesn't matter, because, like, if your intuition was correct, or if my math was wrong, which is possible, it would only make the situation worse. Right? But we're saying, hey, let's say the math is such that. And I think it is.

Alex Skidanov

Let's say the math is such that it's very unlikely to sample a large percentage of bad guys. It still doesn't matter, because good guys can become bad guys when they're sufficiently incentivized to be bad guys. And in the world of blockchains, incentives are often such that you can benefit a lot by corrupting a set of validators. Every now and then, the validators will choose to do so, right? And so correspondingly, you want to design a system where even if the set of validators in the shard is corrupted, right, either due to sampling or because they were corrupted, what we call adaptively, right?

So after the fact, then the system still operates reliably. And so there are many ways of ensuring that. It's an area of research that has been developing. And this is where, in sharding, we use... this is the biggest use of ZK proofs in sharding. Because, you know, if I can just prove that my transition is correct, then the problem is solved, right?

Instead of sending a bunch of validators, if you can just have a proof that says everything is correct, then the problem is solved. But maybe let's build this up from the bottom up, right? I mean, it starts with what we kind of discussed, right?

Illia Polosukhin

There's this account space, right? You know, think of it as a city with people, and, you know, we have the cell towers. So when people call each other, to follow the analogy, when people call each other in the city, right, it kind of bounces to their cell tower and then goes to another cell tower, to

connect them. And so the first thing that we need to do is to ensure that every second, every transaction is recorded, ordered, kind of across all shards, and that there's no way, even if everybody is corrupted in that shard, to be able to change that order, or, for other use cases, to potentially introduce something that's not valid. That's called the data availability problem. NEAR has had data availability since 2019.

Then we designed Nightshade. And then as other approaches, like rollups, et cetera, started becoming more popular, they also needed data availability, and that's where a lot of the current data availability offerings are coming to market.

Alex Skidanov

A very short primer would be, let's use rollups as an example. Let's say there's a hypothetical Optimism, and they do transactions, but they do not use zero knowledge proofs. I should mention Optimism is planning to use zero knowledge proofs, right? And so, correspondingly, using zero knowledge proofs for fraud proofs.

I see. Interesting. Interesting. And so they check in the state root to Ethereum every now and then, and they say, you know, I applied all the transactions correctly, and if you think I did not, I posted just enough cryptographic information, like an attestation, that if I did something wrong, you would be able to come and prove that it was wrong.

And if you do that, I will lose a lot of money. And so I have a strong incentive not to do so. And so then you observe the roll up, and you see that some transaction was applied incorrectly. You can go to ethereum and say, this is a transaction. It was applied wrong, and this is the proof.

But you can only do that if you actually see the transactions. If the rollup was able to operate in such a way that the validators cannot see the transactions, then the rollup would be snapshotting something, but nobody can prove anything wrong, because nobody sees what's happening. It's like in a sealed room. So data availability is effectively this concept of ensuring and proving that the transactions that you claim to apply, and the state on top of which you claim to apply those transactions, are all visible to everybody.

And that's something that NEAR had from the start; we were either first or second to have it live. I don't remember when Polkadot launched and whether they had data availability from day one. Okay, so you have a shard, and then there's a set of validators that are processing transactions in the shard. Let's first get an intuition of this set. Let's imagine, like, shard two or whatever, and the set of validators that are processing shard two. Is it a static set, or does it keep changing with time?

Meher Roy

Dynamically, it's changing with time. So the idea is, there is what right now is live and what's been live for a few years, and also what we're launching with stateless validation. I'll probably just talk about stateless validation, just for ease of explanation. So with stateless validation, there are kind of two roles that the validator can play, and all of this is rotatable. One role is the so-called chunk producer, which is somewhat similar to what in rollups is called a sequencer.

Illia Polosukhin

This is the node that receives the transactions, orders them in the block, in a chunk in their case, responsible for their shard. It sends out this chunk to others as well, and then executes the transactions and receives the result. And so, importantly, here is where the data availability comes in: when the chunk producer sends out the chunk information, they so-called erasure code it, which means they replicate this information in such a way that they send it to other cell towers, such that even if everybody who's servicing this cell tower goes offline, goes malicious, et cetera, other cell towers can completely replicate everything that happened in this cell tower. So that's kind of what data availability erasure coding is.

So there's a chunk producer. Now, there's a small set of chunk producers, kind of similar to how there's usually a single sequencer on rollups, but you don't want that because of censorship and reliability. Now, for validators, you actually have a different story, right? So, what Alex mentioned, you have this adaptive corruption problem, right? So if you have validators which are sitting there for a long time, it's possible that you go and say, like, hey, if you're in this shard and I see you in this shard, for example, I can bribe you to do something to the shard.

And then you need fraud proofs. And fraud proofs are complicated and kind of require additional timelines. And so with stateless validation, the chunk producer not only produces kind of the chunk of transactions, but also includes all of the state required to execute these transactions. And so that's the so-called state witness. And so then any other node in the network can receive this block and execute it without knowing anything else about the network, except being, like, a light client of the network.

So you receive everything, kind of, you need to validate that if you apply these transactions and you have the state, and the state was included in the previous block, then this is the result, and you can confirm that and kind of send the confirmation. And so that's kind of, I mean, in a way, stateless validation is the ability for any node to come in and say, okay, I'm ready to validate, I don't need to synchronize the network state, I don't need to maintain the state on my disk, right, which again reduces the validator node requirements. They can just make sure that everything's okay.

And so now you can rotate which validators validate which shard all the time, every block. You can actually randomly select the set of validators; they can be overlapping. You can select, you know, out of the million, you can select, you know, 100,000 per shard.

And they validate ten shards, for example, each, if there's enough capacity, or any kind of parameters. And they are able to validate this because they don't need to sync into the shard, they don't need to process every single transaction that ever hit that shard. They only need to look at this specific block. So that kind of opens up a lot, both... you can imagine, I'm actually really excited that, potentially not probably this year, but soon enough, somebody can open up a new tab, type in a URL which has a validator node in the browser, and actually start receiving blocks and validating them, because again, you don't actually need anything else.

And browsers have WebAssembly embedded to execute the transactions. That's very lightweight validation that you can rotate all the time. So I am imagining this state witness as more like, so let's say there was block n, and then the transaction came in and the chunk producer made n plus one. But is it the case that for every transaction, almost like for every transaction that they processed in that block, they are creating individual witnesses for each transaction? So if you take transaction one, they'll say, okay, transaction one modifies these two accounts.

Meher Roy

These two accounts had this balance previously; after the modification, the result is these two accounts have this balance now. So the state witness is: this was what was there before this transaction was processed, this was after the transaction was processed. But the data that's being supplied is only those two accounts that are hit by the transaction.

And so for every transaction, you are just creating these breadcrumbs, this bare minimum amount of info that is needed to kind of validate that transaction, and that's the state witness for that transaction. So every transaction that comes into the block of the shard by this chunk producer entity is kind of broken down into these individually verifiable pieces. And then those individually verifiable pieces are scattered across all of the validators of the shard, and they can kind of do the job of verifying these individual pieces one by one. And because each of these pieces can be verified individually, that's why you are able to run the validator in your browser. Because while your browser may not be a powerful machine, it can still validate a few of them.
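
A minimal sketch of that idea, with hypothetical field names and plain balances standing in for NEAR's actual state witness format (which authenticates the touched state against the previous block's state root via Merkle proofs, omitted here):

```python
from dataclasses import dataclass

@dataclass
class StateWitness:
    pre_state: dict[str, int]                  # balances of just the accounts this chunk touches
    transactions: list[tuple[str, str, int]]   # (sender, receiver, amount)
    claimed_post_state: dict[str, int]         # what the chunk producer says the result is

def validate(witness: StateWitness) -> bool:
    """Re-execute the chunk against the witness alone; no synced shard state needed."""
    state = dict(witness.pre_state)
    for sender, receiver, amount in witness.transactions:
        if state.get(sender, 0) < amount:
            return False                                    # invalid transaction in the chunk
        state[sender] -= amount
        state[receiver] = state.get(receiver, 0) + amount
    return state == witness.claimed_post_state              # sign off only if the result matches

# Example: a validator that has never synced the shard can still check this chunk.
w = StateWitness(
    pre_state={"meher.near": 50, "alex.near": 10},
    transactions=[("meher.near", "alex.near", 10)],
    claimed_post_state={"meher.near": 40, "alex.near": 20},
)
assert validate(w)
```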

Alex Skidanov

Yeah, that's right. So is it the case that this chunk producer is a more long-lived role, and then the validators of a shard is a more short-lived role? Meaning, as a validator, I'm doing some verification in this shard, then a few seconds later in a different shard, then in a third, a few seconds later in a different shard, like that. I'm constantly switching as a validator, but not as a chunk producer every few seconds? It is a little bit, it is a little bit convoluted.

We produce a block every second. But yeah, it's like every second, every block, you can be validating a different shard, because there is actually no difference. Like, you don't actually care which shard it's on, because for you it is just a block, a set of transactions with all the information you need. And so that's why you can rotate validators now every second. I mean, there are networking requirements and some, you know, data and information propagation.

Illia Polosukhin

But in principle, yes, it can rotate every second. And then for chunk producers, they rotate every epoch right now, which is 12 hours, or 12 to 14 hours. And that's where, because you need to actually sync the state. Like, if you're moving to the next shard as a chunk producer, you actually need to know everybody's balances, right? And so you need to actually download all that, make sure it's correct and consistent, and then, while you're downloading, you actually need to now receive new blocks and apply them as well.

And so you're kind of doing two jobs in parallel. You're kind of chunk producing for the shard you're in now, and you're getting ready to produce for the shard that you're in next time. And so that requires kind of a more sophisticated setup. Right? So, okay, so then you have these validators that are constantly jumping from shard to shard, validating some small pieces.

Meher Roy

And is it the case that when I'm validating a certain piece, I'm also adding my signature, saying, yes, I checked it, and it's correct, and every transaction and its witness is kind of getting more and more signatures or attestations from the validators, and that's how you're building up trust? It's not accumulating over time. All the signatures happen on the same block.

Alex Skidanov

So every block you need to do a sign off. And it's not the case that everybody signs off every block. We just need a certain majority. And, you know, because blocks are created every, every second, and the validators are running on the relatively commodity hardware, sometimes you will miss a signature. And, like, if you go to an explorer, you will see that, like, nobody has 100% uptime, people have, like, 99%.

Illia Polosukhin

Right? But, yeah, the idea is that there's a set of validators validating a particular block. We know whom we expect to sign off on the block. And then, you know, a majority, a percentage of them signs off, and the block is created. You can look at it and say, well, that many validators signed off; at this point, I know which shards they were supposed to validate.

Alex Skidanov

I know that unless someone corrupted them in 0.1 milliseconds, there's a relatively high certainty that the state transition is valid. Right? Because on the other side, the mathematics says that when I'm sampling these validators randomly, my odds of getting a completely bad set of validators, if 25% of the set in the bigger network is corrupt, is kind of low. So because of that sampling, you're able to kind of trust that, okay, you are sampling them constantly, every second you are changing the samples. But because you calculate the probability of one sample being malicious, like 66% malicious or something, as being very low, you are able to trust kind of the signatures on the witnesses of your transactions and be sure of the state of a particular shard.

Meher Roy

And meanwhile, like these chunk producers, when they are producing these blocks, they are also forwarding the data corresponding to these blocks to a set of validators. Now, this other set of validators may be different from the validators that are checking the witnesses of the transactions. They don't need to be the same set. And so, yeah, so there's two things that happen. One is you.

Illia Polosukhin

I mean, the validators that are receiving to check the witnesses, they actually receive the data as well, because they actually can validate the transaction. But also you want to send it to other shards as well, in case this whole shard fails. But also you want to route the outgoing messages. We actually combine the message routing, data availability and consensus into one process where, let's say you withdrew money from your account on shard one, and then you're sending money to Alex, who is on shard two. So now there's a message going to shard two saying you should credit Alex ten NEAR.

And so now that message is not just a message. It also includes a so-called erasure-coded part of the transaction data that this shard was producing. And so kind of this process ensures a few things. One is, everybody then goes, and when they're actually signing and confirming their own information and sending their approval, they also confirm they received the needed chunks, the needed kind of parts from other shards. That also provides us guarantees with this message delivery from shard to shard.

It provides the data availability guarantees, and it's all kind of integrated into the consensus messages that are being sent by validators to each other to actually accumulate the BFT consensus. So it's worth mentioning there's an extra role called block producers. So there's an actual blockchain, the Near blockchain. Because often when people think of sharding, and you know, many sharding blockchains do work this way, people think of multiple chains, right? So every shard is a chain.

Alex Skidanov

It is not the case on Near. On Near, there's only one blockchain, and there are block producers creating blocks on that chain. But those blocks do not contain the actual transactions, or logically they do: you can think of a block exactly the same way as you would think of a block on Ethereum, where it has a header, consensus information and a bunch of transactions, with the difference that while logically the transactions are there, physically it only contains information about what we call chunks. So one chunk per shard, or rather up to one chunk per shard.

And physically, the block does not contain those transactions. It just contains the information about the chunks that were produced. And at every particular block, some shards might miss a chunk because there's a particular chunk producer responsible at every particular moment. It could be offline, it could be busy, etcetera. And so a chunk could be missed.

But if the chunk is produced, what happens is that the chunk producer, when they produce a chunk, as Illia mentioned, they erasure-code it and they send a part of it to every block producer. And the block producer would only sign off on the block if they have the part of the chunk that is intended for them. This is where consensus and data availability are melded together: in order to reach consensus on the block, two thirds of all the block producers, weighted by stake, have to sign off on it. It's a BFT consensus.

So if there are no two thirds of signatures, we wait until we have them. We favor safety over liveness. And correspondingly, if we cannot get two thirds of signatures, we will stall.

And if you have two thirds of signatures, because the block producer would only sign off on the block if they have their part of every chunk in that block, then you know that two thirds of the block producers, for every chunk included, have their little part. And the erasure code is such that you need one third to reconstruct any chunk. So as long as you believe that no more than one third of the block producers is malicious, then if you have two thirds of signatures, in the worst case, all the malicious actors are included in those two thirds. So you have one third malicious, but you still have one third honest. And so you can reconstruct any chunk.

So every chunk is available to everybody, guaranteed, if you have consensus reached on the block. So data availability and consensus are merged together into a single mechanic. Right? So, like, this is wicked cool.
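Editor's note: a minimal sketch of the counting argument just described, assuming an illustrative equal-stake setup (the real protocol weights signatures by stake): two thirds must sign, each signer holds its own erasure-coded part, any one third of the parts can decode a chunk, and at most one third of producers are malicious.

```python
from math import ceil, floor

def chunk_recoverable(n_producers: int) -> bool:
    """Worst case behind the merged DA + consensus argument: assume every
    malicious producer managed to be among the signers and withholds its part.
    The chunk is still decodable if the honest signers alone hold enough parts."""
    signatures_needed = ceil(2 * n_producers / 3)        # BFT sign-off threshold
    malicious_max = floor(n_producers / 3)                # may withhold their parts
    honest_signers = signatures_needed - malicious_max    # parts guaranteed retrievable
    parts_to_decode = ceil(n_producers / 3)               # erasure-code threshold
    return honest_signers >= parts_to_decode

# 100 equally weighted producers: 67 signatures, at most 33 malicious among
# them, leaving 34 honest parts, and 34 parts are enough to decode the chunk.
print(chunk_recoverable(100))  # True
```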

Meher Roy

And it's also, like, hard to understand, because this is actually unlike any other system, where data availability and consensus are usually two very separate processes, whether you go from Ethereum to Celestia to all of them. But in Near, it's almost, in essence, it sort of begins with your goals and then works backwards. So the goal was to have cross-shard transactions, and generally communication, with a delay of one block. So effectively, if in a particular block a transaction initiated and it wants to do something in another shard, we want that to happen with a very high probability in exactly the next block.

Alex Skidanov

If data availability was separate from consensus, that would be extremely hard to ensure, because we need to be certain that data is available as of the moment when the block is produced, as opposed to that being a separate process. Similarly, Illia mentioned there are three things which are merged together: data availability, consensus, and the message passing. So together, the chunk that is now totally available at the moment of consensus being reached also contains the messages that need to be routed to another shard. And it is designed in such a way that it is ensured that by the time the chunk producer of that other shard is producing the chunk, they not only know that the messages exist, but they also have them.

They can immediately act upon it, and moreover, they have to act upon them, right? So a chunk producer, the chunk would not be valid if the chunk producer did not act on the messages that were sent from another shard. It could be that they don't act upon them immediately because of congestion. Like imagine everybody sending receipts to the same shard, right? So that is automatically handled, right?

It could be that the receipt is not processed immediately, but the receipt is acknowledged immediately and it's put on the queue immediately. And most of the time it is also acted upon immediately, because most of the time there's no congestion. But it is all part of the same process, where we ensure that if something happened in block n and that something wants something else to happen in another shard, that something else will very likely happen in n plus one. And maybe, to use an Ethereum and rollups analogy here: it's as if Optimism produces a block, that block is immediately sent to Ethereum validators who include it, and they also include every other rollup. Let's say we have Optimism and Arbitrum trying to send money directly between each other. So both of their blocks need to be included at the same time, immediately, in the same Ethereum block, to guarantee data availability.
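Editor's note: a toy sketch of the receipt-queue behavior described here. The class and the per-chunk gas budget are invented for illustration, not Near's actual runtime API: an incoming cross-shard receipt is acknowledged and queued right away, and executed in the next chunk unless the shard is congested.

```python
from collections import deque

class ShardQueue:
    """Illustrative model: cross-shard receipts are always acknowledged and
    enqueued when they arrive; execution may lag when the shard is congested."""

    def __init__(self, gas_per_chunk: int = 1000):
        self.queue = deque()
        self.gas_per_chunk = gas_per_chunk

    def receive_receipts(self, receipts):
        # Acknowledged immediately: a chunk that ignored these would be invalid.
        self.queue.extend(receipts)

    def produce_chunk(self):
        # Execute queued receipts until the per-chunk gas budget runs out;
        # anything left over simply waits for the next chunk.
        executed, gas_left = [], self.gas_per_chunk
        while self.queue and self.queue[0]["gas"] <= gas_left:
            receipt = self.queue.popleft()
            gas_left -= receipt["gas"]
            executed.append(receipt["id"])
        return executed

shard = ShardQueue(gas_per_chunk=1000)
shard.receive_receipts([{"id": "credit alex 10 NEAR", "gas": 400},
                        {"id": "viral NFT claim", "gas": 900}])
print(shard.produce_chunk())  # ['credit alex 10 NEAR']; the second receipt waits one more block
```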

Illia Polosukhin

Because now, kind of like, let's say Optimism sends something to Arbitrum, connecting to it. But if it's just that, right, then Arbitrum needs to read out the state from Ethereum, right? Which adds extra latency. And so what happens here is you assume that Optimism and Arbitrum, those sequencers, are also validators in the network. And so you send them, you kind of... Optimism and

Arbitrum send their blocks to each other, right? They confirm it. And kind of those attestations are now allowed to progress the blockchain forward. And so it's as if all the rollups themselves form the consensus, right? And are sending information to each other directly.

And in turn, that allows you to optimize for latency and for kind of cross-shard communication, because everybody is talking to each other directly, but then, you know, sending confirmation to the whole unified system. So I'll try to state this in kind of my own understanding of how it works. So it's like we need two different views. One is kind of like the shard view, and then we need the global view, because there's a global block. So we went through the shard view pretty well.

Meher Roy

It's like there's a huge set of validators. Some of them are assigned to a shard for a block, and then they'd be reassigned to other shards. And they are kind of like validating the parts of the transactions in the shard. These validators, sometimes they also get the responsibility of being these chunk producers, which are more like long-lasting entities. For 12 hours, for a particular shard, they are producing: okay, for this block, this is the set of transactions, and here are all of the witnesses.

So these are like long-lasting entities. So you could have like a short-lasting role, and then you could also get a long-lasting role as a validator, right? So that's kind of like the local view of the shard. Now, in the network, there's like a global view, where there is actually a single blockchain with block after block after block, like Bitcoin or Ethereum. But what that block contains is not the transactions themselves, but the chunks from the different shards that are sort of accepted post-consensus as being correct.

Like, okay, so here's block n. It contains, like, chunk x from this shard, chunk y from that shard, chunk z from that shard, and so on. And it just contains what chunks the network is considering, like, finalized. And so all of the validators are trying to build this blockchain that just contains chunks. And the validators are kind of like signing off on that.

On that block containing chunks, they are trying to add their signatures to it. And their logic is something like, what they are checking is, corresponding to each of the chunks that are part of the block: did I get the slice of data? In order to ensure data availability for the whole network. So as a validator, when I'm signing off on a block, what I'm checking is, okay, this block contains these chunks.

If it contains these chunks, I should have received the data corresponding to these chunks. Do I have it in my hard drive? Yes. Okay, so that's one sort of validation passed, and then I sign, and like, all of the other validators are signing that. So fundamentally, the network is coming to agreement that we have the data that we are going to need to reconstruct any part of the actual transactions in the block in the future.
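Editor's note: a minimal sketch of the sign-off check being described, with invented names (block_chunk_hashes, my_parts) rather than Near's actual data structures: before endorsing a block, a validator confirms it holds its own erasure-coded part for every chunk that block lists.

```python
def should_sign(block_chunk_hashes, my_parts):
    """Endorse the block only if, for every chunk it includes, I have already
    received and stored the erasure-coded part addressed to me."""
    return all(chunk_hash in my_parts for chunk_hash in block_chunk_hashes)

# Block n lists chunks from three shards; I only hold my parts for two of them.
block_chunk_hashes = ["chunk_x", "chunk_y", "chunk_z"]
my_parts = {"chunk_x": b"\x01\x02", "chunk_y": b"\x03\x04"}
print(should_sign(block_chunk_hashes, my_parts))  # False, so withhold my signature
```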

That's the thing people are coming to consensus on. So in a normal chain, like Bitcoin, when a block gets finalized, the network comes to consensus about the transactions that are processed. In Near, when a block gets finalized, the network is coming to consensus about the validators collectively agreeing that they all have the data needed to reconstruct every part of the block, should the need ever arise.

Is that right? Is that intuition right? And they also, like in the world of stateless validation, they also have the information that each of those parts were reconstructed and validated by a set of validators. Right? Right.

Not only that the data is there that can reconstruct everything in every transaction in the chunks, but also that every transaction with its state witnesses was validated by a certain number of validators of the shards where the state witnesses originated. So it's coming to consensus about data validity. That's, yeah, that's really interesting. It's really cool. Yeah, I mean, the big benefit was, and I mean, we do get this question, which is, like, you know, why Vitalik kind of shifted from sharding to a rollup architecture?

Illia Polosukhin

And I mean, first of all, obviously not talking for anybody individually, and Ethereum is not a single agent itself either. But practically speaking, to design something like this, you need to build everything from the ground up. You see how consensus, data availability, message passing and kind of validation correctness are all layered into one system. And that requires this approach, as well as the VM itself, because the VM now needs to be aware that, compared to the Ethereum EVM, where everything is available and you can always say, hey, give me the state of that other account, tell me how many tokens Alex has, in the case of a sharded system like this, that requires a cross-shard message. That requires saying, well, go find where Alex lives and ask him how many tokens he has.

You need to design everything from scratch, from the bottom up, with this understanding, given the goal we had, which is ultimate horizontal scaling that is hidden from the users and developers in many cases. For Ethereum, they do not have such a luxury. They have a working system, extremely valuable, extremely integrated everywhere. And so they needed something that is an evolution of their existing system versus a complete rebuild from scratch, hence the rollup architecture.

We use it as an analogy because a lot of it is similar. Ethereum provides data availability and consensus, rollups are able to communicate with each other, but they pretty much need to go through Ethereum to kind of validate and settle before a message can be passed. And so a lot of it is kind of spiritually the same. It's just, because of the legacy, well, now each VM is a whole separate universe of accounts, right? And so now I as a user have an account on every chain, right?

I don't have a singular balance now that I can use everywhere. Like, all of that is kind of a legacy. How do we kind of upgrade the existing system into a more scalable platform? So that's kind of really the biggest, obviously, in a way, benefit we had, which was like starting from scratch and kind of designing it with this in mind. Luxury of the clean slate.

Meher Roy

Yeah, luxury of the clean slate is what you had, right, right. So as a validator, sometimes I am taking on this role of being a long lived chunk producer for 12 hours in a particular shard. I am constantly taking the role of validating the state witnesses of a shard, and I'm being assigned from shard to shard to shard. And then every time a kind of like a block is produced in that global main network, what I'm signing off on is corresponding to that block. I should have received some data to back up the data of the state of the entire network.

The data of the entire network: I should have received some small chunk of it. I can identify it: have I received it? And then, not only should I have received data, but I should have also validated some of the witnesses in the global network, or in the shard. Have I done that?

And if I've done that, I sign off. And if a majority signs off, that is when near says, okay, we have consensus overall that the data is backed up properly and all parts of the update had witnesses and they were validated by enough validators. And therefore, like this block is correct, and then that's done and then the chain moves on. Yes. Yeah, that's right.

Alex Skidanov

And at the end of the day, all of that complexity boils down to: there are checks and balances that people do. But at the end of the day, all you care about as the observer is, do I have signatures from a sufficient set of validators? And if I do, I know that they have done all the work, or at least they claim to have done so. But that's a whole set of validators, right? So for them to be corrupted, you need to corrupt like a whole big proof-of-stake system, which has its own design.

There are certain design principles that don't allow it to be corrupted without a massive amount of money being lost. And, yeah, and so that also allows other chains to easily create light clients. So Near itself encompasses a lot of processing internally. It could also be a part of a bigger ecosystem, because building a light client for Near is a relatively straightforward process.

And Near, having a general-purpose virtual machine, can run a light client for any blockchain that allows a light client, and that allows you to have two-directional bridges that effectively say: as long as the light clients of both chains are not compromised (and light clients are very hard to compromise, despite the term light client; compromising an Ethereum light client or a Near light client is extremely hard), then for as long as those are not corrupted, Near is also part of the bigger ecosystem of more chains. So I guess, like every creator, when they make something, there's usually a certain element of the system that they wished was better. Do you have those? Like, for you individually?

Meher Roy

What parts of Near are you kind of unsatisfied about in the design? Well, I mean, there's a few things that we're still finishing up, or that are still on the roadmap, right? I think, so we mentioned dynamic resharding, right? So right now, as Alex mentioned, this is more of a kind of a governance, technical governance process.

Illia Polosukhin

But the benefit with the stateless validation approach is that, because we rotate validators every second now, you can actually change the number of shards for validators very quickly, because again, they don't really care how many shards are in the network, they don't care which shard, they just receive the block. And so the benefit here is, if we have sufficient kind of redundancy on chunk producers, we can split the chunk producer group and say, as of the next block, you don't care about half of the shard that you were in. So compared to where we are now, where we need everybody to agree, a new client to be run, validators to spin it up, and it reshards, this can happen literally instantly, where, okay, now we split into two shards.

Now I'm ignoring just half of the transactions that I may still receive, because people are still routing to me, and I'm just processing this half. And then on the chunk production side, and then on the validator side, now you're just sampling from a different number. For the listener, this is the cell tower splitting into two dynamically, or the post office splitting into two. Exactly.
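Editor's note: a toy illustration of the split just described, using a made-up rule of cutting the account namespace at a boundary; Near's real shard layout and routing are more involved. The point is that, as of the next block, a chunk producer simply drops the half of the key space it is no longer responsible for.

```python
def split_shard(boundary: str):
    """Return a membership test for the left half of a shard that just split
    at `boundary` (accounts are ordered by name in this toy model)."""
    def belongs_to_me(account_id: str) -> bool:
        return account_id < boundary
    return belongs_to_me

# As of the next block, this chunk producer keeps accounts below 'm' and
# ignores transactions still being routed to it for the other half.
mine = split_shard("m")
incoming = ["alex.near", "illia.near", "meher.near", "viral-nft.near"]
print([account for account in incoming if mine(account)])  # ['alex.near', 'illia.near']
```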

And this can happen immediately, which is a huge benefit for a spike in usage, where some application just went viral, an NFT or token claim or whatever. And you can say, hey, let's just pull it out into a separate shard, let it go nuts, and then maybe merge it back in like a day or so. So that's definitely a huge kind of benefit there. And for context, I mean, stateless validation is something we published in February, and, I mean, we've been building it over the last year and it should be launching within the next few months, and then we can start working on dynamic resharding. Now, from my perspective, there are a few things beyond that that we should be working on.

One is, and Alex actually worked on some design for this earlier, leaderless chunk production. So right now there's one chunk producer at a time that is responsible for producing a chunk. And I mean, they kind of rotate and they're randomly assigned. But still, if that chunk producer is offline, now we have a gap on that time slot, and you drop in TPS, you have higher latency for the users. And so the idea definitely is, they can also be DDoSed, which is something that happens on other networks right now.

Alex Skidanov

Not because we're somehow resilient to that; the bad guys chose another network to DDoS, not ours. But it can happen eventually. So leaderless chunk production and consensus is something we need to implement, and we have a pretty clear way of achieving it. So, by leaderless chunk production: when you say that, I'm kind of reminded of Algorand, where the essential idea is, like when you think of a network, like when you think of a Cosmos network, the blockchain is rolling along.

Meher Roy

N blocks have been produced. Everybody knows who should be producing block n plus one, right? It's publicly known. And if they fail to produce, the network waits and then realizes they failed to produce. And then the network also knows, if that guy fails to produce a block, then this other guy should produce the block.

And there's kind of like almost a line, a queue, of who has the right to produce. And so that's probably what you mean when you were saying leader. So if you have a shard, you have multiple, like, multiple chunk producers. But which one exactly produces the chunk for this block? It's currently known in Near, just like Cosmos.

But what you would like is a system where there are n chunk producers, and then when a certain block rolls along, one of the chunk producers realizes they have some kind of winning lottery ticket and they can produce the chunk. So this is, no, this is not so. You're explaining Algorand, and it's not leaderless. I'm explaining Algorand. So is that the vision for leaderless?

Alex Skidanov

No, there's still a leader in what you explained. So the only difference is that the leader is not known in advance. Right. So it partially solves the problem. It's much harder to DDoS, for example. You cannot DDoS someone you don't know.

Illia Polosukhin

Right. And so by the time you know that they won the lottery ticket... And we actually do, the way we think of it is similar. So in Algorand, you actually don't know in advance that you won a lottery ticket. Effectively what happens is that everybody looks at their lottery ticket and sees the number, and the highest number wins, but you don't know if you have the highest number because you don't know the numbers of others.

Alex Skidanov

If you did, and others did, then it would be no different from Cosmos, then everybody would know, right? So instead you say, well, I have a sufficiently high number for me to believe that I might be the winner. So I will produce the block. I will publish it. Maybe someone else will publish too.

And, you know, given my number was 97 out of 100, there's a good chance that I was the highest, right? But maybe there was a 98. This approach has a minor problem, that still multiple blocks will have to be broadcast, right? Because, effectively, either the threshold is so high that every now and then nobody will broadcast a block, because nobody won a ticket above the threshold, or it's sufficiently low that multiple blocks will be broadcast. But in our case, something interesting to think about is that at the end of the day, the chunk will be broadcast to everybody, like a little part of it.

And so everybody will have to receive it. But within the network, within the set of the chunk producers, they would have to send the chunk in its entirety for validation with the validators of the shard. So you can think of a system where, at a high level, what happens is that every single chunk producer generates a chunk, not just some people who are beyond a certain threshold. Everybody produces a chunk, and then everybody sends an erasure-coded part to every person in the shard, to every chunk producer. And so at the end of the day, the network overhead is not chunk size times number of participants.

It's still proportional to the chunk size.

But now you have as many chances as you have chunk producers, and then you still do the lottery ticket. Then you reveal your lottery ticket, and if you won, your chunk is accepted. But there's no issue with choosing the threshold and maybe spamming the network with multiple chunks that have to be exchanged in their entirety. So that's the high-level idea.
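Editor's note: a rough sketch of this leaderless scheme, with invented helpers and a plain hash standing in for whatever verifiable random function the real design would use. Every online chunk producer builds a chunk and broadcasts only erasure-coded parts (so each producer's outbound traffic stays on the order of one chunk), the chunk whose producer reveals the highest ticket becomes canonical, and a chunk is skipped only if nobody produced one at all.

```python
import hashlib

def lottery_ticket(producer_id: str, block_height: int) -> int:
    """Stand-in for a VRF: a deterministic but unpredictable-looking number
    per producer and height, revealed alongside the producer's chunk."""
    digest = hashlib.sha256(f"{producer_id}:{block_height}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def canonical_producer(online_producers, block_height):
    """Everyone online produces a chunk and broadcasts erasure-coded parts;
    the highest revealed ticket wins. The chunk is skipped only if nobody
    produced one, rather than whenever a single assigned leader is offline."""
    if not online_producers:
        return None
    return max(online_producers, key=lambda p: lottery_ticket(p, block_height))

producers = ["cp1", "cp2", "cp3", "cp4"]
print(canonical_producer(producers, block_height=42))      # one winner among the four
print(canonical_producer(producers[:1], block_height=42))  # 'cp1', a lone producer still wins
print(canonical_producer([], block_height=42))             # None, only then is the chunk skipped
```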

But the high-level idea is that now a chunk will literally never be skipped; as long as there's one producer who did produce one, the chunk will not be skipped. And it's a slight improvement over the Algorand idea, where you have a threshold but still exchange the whole block. Yeah, this is really, this is really fascinating. So I think the essence of the design being that in Bitcoin or in Ethereum, when I have a block, I have to forward that entire block to everybody.

Meher Roy

And there's a certain, like, diameter of the network. So I'm a validator, and there's some validator to which I have the worst connection, like there's multiple hops, and that entire block must now be sent across the diameter of the network to that worst-connected validator. But in Near, it's almost like I have a more efficient microphone or megaphone. So I produced a block, I cut it into lots of pieces, and I only need to send these small pieces to all of the validators. So there's some validator to which I have the worst connection, but I only have to send a small piece through that worst connection, through that diameter of the network.

And because this broadcast is only piece-wise, and this broadcast is kind of efficient, you can afford to have a design where, in a shard, all of the chunk producers produce a chunk, they broadcast their pieces, and then they compare: okay, which of these has the highest amount of randomness, and that becomes like the canonical chunk for that block. Yes, that'd be the chunk of the block. Yeah.

Alex Skidanov

And it depends on that person I have the worst connection to, depending on why they need my block. Right. If they're just a block producer and they just need to sign off, it's sufficient for them to have just that little piece. They don't need to reconstruct the block or a chunk. But if they do need the whole chunk, it's still more performant than sending a whole chunk through that one connection: I send little pieces, and then that person will collect little pieces from different entities.

So it's still a faster way to propagate information. And I guess the advantage here for us in building that leaderless chunk production is that we already have the concept of this erasure code. We already send small pieces, we already have the mechanic to gather them. So it's a much less invasive change than for a network where today blocks are being sent fully. For them to implement the same feature would mean implementing a whole lot of new mechanics, while in our case, it's just plugging into existing machinery.

I wanted to touch on this: you asked a question which I think is interesting, and Illia covered it from a very different perspective, of sort of the things that design-wise are not great and we would like to be different. I think something I've thought about a lot from the day we launched Near is that the account model for sharding is quite different. You cannot have flash loans, for example, easily, because accounts are on different shards and everything takes a hop.

We've been trying to solve this problem. We were trying to find a way to have atomic transactions since day one, with many different designs. And it's a drawback of sharding. But what's interesting is that I think slowly we're coming to the realization that, long term, not having sharding is not an option. So in essence, the highest-throughput blockchains today which are not sharded are getting congested, and there's only so much more they can squeeze.

They work day and night to remove all the suboptimalities, but at best they will squeeze out another 20%, let's say 50%, and that will get congested again. The adoption of blockchain today is a fraction of what we want it to be. For the whole ecosystem to be considered successful, sharding will have to happen. And when sharding happens, people will have to deal with this disadvantage of the different account model. Near in this case is positioned extremely well, because from day one every application on Near was built with that account model in mind.

So we have tools, we have understanding. Developers in the ecosystem are used to working in this setup, while in the rest of the ecosystem people are still operating in this atomic-transactions mindset, which they will have to abandon eventually, because at some point, if their application is to scale and the applications they depend upon are to scale, they will not be able to maintain these atomic guarantees and scale to the usage. So they will have to abandon this mindset. And we're positioned uniquely in the sense that everything that is built on Near, this rich ecosystem of applications, is built in a future-proof way. As we create more shards, every application on Near gets to take advantage of that.

While any application that is built on what we call a synchronous runtime will have to be rewritten. I think that actually the really interesting part is, because Near is kind of asynchronous, and every account and every contract is designed to be independent, it actually doesn't matter if that account is on Near or not. And so that's why a lot of the chain abstraction ideas are emerging, because, well, it actually doesn't matter: for Ref Finance, an AMM on Near, it doesn't matter if the token is on Near, on Base, on Solana. You can actually deposit any token and Ref Finance will be able to handle it. And so, generally speaking, Near's design is that every account, every contract, handles assets that live elsewhere in an asynchronous way.

Illia Polosukhin

Right. And obviously it's good to have them on Near because you get like 1-second communication time, so the latency is very low, and, you know, the account space is nice, but they can be living somewhere else. And if we have kind of a message-passing way of sending this, then, you know, Near smart contracts know how to deal with that. And all our standards, like our fungible token standard, are designed with callbacks and kind of message passing in mind.

And so again, compared to right now in EVMs, where they're trying to figure out how to do cross-layer-two communication, the challenge really lies in all of the standards. Like, imagine trying to send an ERC-20 from one chain to another. There's no standard address space or anything that supports that. It's also synchronous.

The expectation is that that transaction will execute in the same block, but actually it needs to be scheduled, the message passed, settled, sent somewhere else, revalidated, et cetera. So really, it was a very non-trivial trade-off, right? And we kind of, I would say, had a period of time where we were like, did we make the right trade-off?

But right now, yeah, we're literally seeing the validation of our thesis kind of throughout web three. Yeah. So the point being, something like a flash loan, in the Near design, it's hard. It's not that flash. Yeah.

Alex Skidanov

It's more like an Iron Man loan. Yeah, yeah, it's not that flash. So at some point in 2021 or something like that, it must have seemed that, oh, Ethereum has flash loans, but Near wouldn't. And that's a problem, or a perceived problem. But when you look at, like, Ethereum in 2028, which is Ethereum plus all of these L2s and L3s, you can't have flash loans across that entire ecosystem, and you can't have flash loans across Near.

Meher Roy

So it's fine, right? Like, it's a downside, but... Not really, exactly. The modern DeFi of 2028 will be what people have been building in DeFi on Near for the past two, three years. And as Illia mentioned, this entire ecosystem of L2s and L3s will also be part of the Near ecosystem because of chain abstraction, right? Any account on any chain can, with a very thin abstraction layer, be perceived as a Near account, just with higher latency.

Alex Skidanov

Right. So a Near account to a Near account will be 1 second always, but a Near account to an Optimism account, it's going to be exactly the same protocol on the Near side, but the latency will be higher because there is communication between them. Yeah, I'm not going to go into chain abstraction, because I feel like we've already covered so much, and I did cover chain abstraction in the previous episode with Illia. So, I mean, I'm avoiding that because things start to become already too complex.

The curious listener can Google Near chain abstraction. Yeah, I just wanted to give an understanding of why, like, chain abstraction didn't come from nowhere. Chain abstraction was the mentality we took with Near when we designed Near. It's just, like, now we expanded that to kind of all chains, but it's still kind of, you know, the thinking we put into designing Near is still there. Now, how do we apply it to the whole of web three?

Illia Polosukhin

Again, you can think of Optimism as just being another shard of Near. And this is where actually, for example, ZK proofs and AggLayer, what Polygon is working on, are all coming together. Because if we can unify security, right, if we can provide a common security layer, then on top of this (and again, we have Near DA, we can actually settle the DA of other layer twos), well, now they're actually not that different from other shards of Near. I mean, there are differences in sequencing and production, so there are things to handle under the hood.

But again, we can kind of extend the layer of abstraction that we provide to users and developers to cover that and say: actually, if there are ZK proofs and data availability, we can actually say the security is the same. And now we can message-pass and we can do other pieces this way. That's the idea: how do I apply the same methodology and user experience, developer experience, but then expand it to more of the rest of web three? So, from my perspective, when I look at Ethereum and Ethereum's roadmap, and then Near and Near's roadmap, one of the things that stands out to me is that in Ethereum, the roadmap is based on scaling via these L2s and L3s. But the relationship between ether, the asset, and the core asset of the L2 can be synergistic at times, but it can be non-synergistic at other times.

Meher Roy

Right? Like so, so it's like the l two pays the main Ethereum chain for certain services, and the service intended is usually data availability. But it can be the case that the l two generates 100 million in fees, but it only pays 500,000 in fees to the main chain.

This relationship is kind of great if the L2 is building a completely new market that Ethereum never had imagined. I don't know, some AI decentralized app comes along, an L2 capitalizes on it, builds it, they get 100 million in transaction fees and pay 500,000 to the Ethereum main chain. It's great for the Ethereum main chain, because there's a new revenue stream coming. It may only be 500,000, but it's new.

The interesting case becomes when some app that was massively popular on the Ethereum main chain, generating millions in fees, ends up thinking it's better to migrate to that L2. And so they might be making like 10 million in fees on the main chain, and then they migrate to the L2, and then the fees get cut to a million, and then Ethereum is only making 100,000 on it. So it was making 10 million in fees, and now it's only making 100,000 in fees because the L2 ecosystem exists. And that relationship, from the Ethereum ether holders' perspective, isn't so ideal, right? Because you're losing a dapp that might have been cultivated by the Ethereum network over years, and now it's kind of migrated away.

And in practice, this has happened with something like dYdX. But what's really cool about Near is that this sort of system doesn't exist. The relationship between Near and the shards is that the NEAR token owns the revenues made by all shards. In order to scale, it doesn't need to have these complex economic games present between a main chain and an execution layer, which is very much there in Ethereum. And I believe that this will become a relevant feature of Ethereum's ecosystem politics in the future, and Near will just not have any of it.

Cool. So, yeah, I guess we can keep it at that. And it was great to have both of you on the podcast. Maybe we should have another one to discuss how Alex is planning to use recent developments in AI technology and what he's building there. I was going to say that it's not a coincidence that AI stands for Alex and...

Illia Polosukhin

Illia. Yeah, I mean, thanks for having us. Obviously, this is a highly technical topic that I think has been really hard to explain in general. I mean, we've been trying to do this for years now. But the core idea is also, like, it was really hard to prove it out when you just launched, because when you just launch, you don't have anything. So there are no users, so there's no need for sharding.

And I think, like, we have that problem in web three where everybody was claiming scale, but until you actually have, like, a real-world, you know, massive user base to actually transact, just a general improvement was enough. And I think only in the last, probably like three to six months, we've seen on Near, for example, multiple kind of million-user applications launching. Near right now has somewhere between 1.5 to 2 million daily active users, which is more than any other blockchain right now. We have more transactions, usually, than all the layer twos combined, at least on Sundays. And so, like, that's kind of where this started to prove out. Right.

And for context, we're still under Solana's transaction numbers. But Solana counts the consensus transactions, so I don't know how we compare on the actual number. Yeah. But generally speaking, yeah, we have more daily active users than Solana, and most days Tron, because Tron is kind of the second biggest right now.

But again, the point is that as we started to see this growth, we started to see congestion on some of the shards. And the idea was that we can just expand capacity without increasing fees, versus every other blockchain, including Tron, actually, for example, because their transaction fees went from being very cheap to now being like, you know, 30 to 50 cents for their users, even though they are running with a small subset of validators, a kind of modified chain in the end. So I think that's kind of where we're starting to see these things play out. And obviously, again, it took a while for the ecosystem to mature, for the applications to build and launch, as well as for them to gain users.

But now we're starting to see this story really play out. And obviously it's exciting, and it's also now a really good time to tell the story and kind of explain how it works. Cool. Then I'll catch you again on Epicenter, Illia. And Alex, thank you.

Meher Roy

Thank you for being there. Thank you. Thank you. Thank you for joining us on this week's episode. We release new episodes every week.

You can find and subscribe to the show on iTunes, Spotify, YouTube, SoundCloud, or wherever you listen to podcasts, and if you have a Google Home or Alexa device, you can tell it to listen to the latest episode of the Epicenter podcast. Go to epicenter.tv and subscribe for a full list of places where you can watch and listen. And while you're there, be sure to sign up for the newsletter, so you get new episodes in your inbox as they're released. If you want to interact with us, guests, or other podcast listeners, you can follow us on Twitter, and please leave us a review on iTunes.

It helps people find the show, and we're always happy to read them. So thanks so much and we look forward to being back next week.