# State Witness size limit

## Background
Currently the State Witness is only implicitly limited by gas. In some cases, large contributors to State Witness size are not charged enough gas, which might result in a State Witness that is too big for the network to distribute to all validators in time.
## Proposed solution

### MVP
Limiting the State Witness size is not required for the Stateless Validation MVP/prototype. Also, (1) shows that current mainnet receipts result in a reasonable State Witness size, so this won't be an issue for prototyping.
### Short Term
In the short term (before launching Stateless Validation on mainnet) we need to implement a soft limit for the State Witness size on the runtime side (similar to compute costs). See this comment for more details. This would help protect against bringing down the network with receipts that are specifically crafted to produce a large State Witness.
### Long Term
I believe that in the long term we need to adjust our gas costs to reflect contributions to the State Witness size. This means reintroducing TTN (touching trie node) charges for reads, charging for contract code size on function calls, etc.
## Resources

- (1) Zulip thread with current witness size analysis
- (2) https://github.com/near/nearcore/issues/9378
Note from the onboarding discussion: another approach is to add State Witness size to the compute costs. It should work well enough for the short term and be fairly close to what we want in the long term.
It seems that there are three kinds of objects that contribute to state witness size:
1. Incoming receipts and receipt proofs
2. New transactions
3. `PartialState` produced by executing receipts
We can't really do anything about 1) because there's no global congestion control, which means the queue of incoming and delayed receipts is unbounded, so the size of `source_receipt_proofs` is unbounded as well :/ We'll have to live with this until global congestion control is implemented.
With 2) the situation is better. We control which transactions get added to a chunk, so we could add a size limit for new transactions. In `prepare_transactions` there's already a gas limit and a time limit; we can add a similar size limit: once the added transactions take up more than X MB, we stop adding new ones. AFAIU receipts produced by converting transactions should be rather small, so these receipts shouldn't be a big concern. There's also local congestion control, which helps a bit: it stops adding new transactions when the number of delayed receipts gets too high. But it doesn't really limit the size, so we need an explicit size limit as well.
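A minimal sketch of what such a size budget could look like. All names and types here are illustrative, not the actual `prepare_transactions` signature in nearcore:

```rust
// Hypothetical sketch: stop selecting transactions once their total
// serialized size exceeds a budget, mirroring the existing gas and
// time limits in prepare_transactions.

#[derive(Clone)]
struct SignedTransaction {
    bytes: Vec<u8>, // serialized transaction
}

/// Select transactions for a chunk, respecting a total-size budget.
fn prepare_transactions(
    pool: &[SignedTransaction],
    size_limit: usize,
) -> Vec<SignedTransaction> {
    let mut selected = Vec::new();
    let mut total_size = 0usize;
    for tx in pool {
        let tx_size = tx.bytes.len();
        // Stop once adding the next transaction would exceed the budget.
        if total_size + tx_size > size_limit {
            break;
        }
        total_size += tx_size;
        selected.push(tx.clone());
    }
    selected
}

fn main() {
    let pool: Vec<SignedTransaction> = (0..10)
        .map(|_| SignedTransaction { bytes: vec![0u8; 1000] })
        .collect();
    // With a 4 KB budget only four of the 1 KB transactions fit.
    println!("{}", prepare_transactions(&pool, 4000).len()); // 4
}
```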
We can limit 3) by executing receipts until the `PartialState` gets too large. `TrieRecorder` records how much `PartialState` was produced when executing a receipt, and we can use this information to limit the total size of `PartialState`. The easiest way would be to add a `size_limit` similar to the `gas_limit` and `compute_limit`: once `PartialState` gets too large, stop processing receipts and move the remaining ones to the delayed queue: https://github.com/near/nearcore/blob/33b5bd7a753a90588f7ea986e0b85f20c8c800e0/runtime/runtime/src/lib.rs#L1485
I think this would be good enough for normal, non-malicious traffic, but this kind of limit isn't enough by itself. In Jakob's analysis he found that a single receipt can access as many as 36 million trie nodes, which would produce hundreds of megabytes of `PartialState`. This means that we also need a per-receipt limit: if executing a receipt produces more than X MB of `PartialState`, the receipt is invalid and its execution fails, just like with the 300 TGas limit.
This will be a breaking change: some contracts that worked before could break after the limit is introduced. But I think it's necessary to add it; I don't see any way around it.
There's also the question of what the size limit itself should be. In Jakob's analysis he proposed 45 MB, but that requires a significant amount of bandwidth: sending a 45 MB `ChunkStateWitness` to 30 validators would require at least a 10 Gbit/s connection (!). We've already seen validators start having trouble with 16 MB witnesses, so this limit has to be chosen carefully. The limit also can't be too small, because that would make the per-receipt size limit very small.
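For contrast with the soft limit, the per-receipt hard limit would fail the receipt itself rather than defer work. A trivial sketch, with the outcome type and names purely hypothetical:

```rust
// Hypothetical hard limit: a receipt whose execution records more
// PartialState than the per-receipt limit fails instead of committing.

#[derive(Debug, PartialEq)]
enum ReceiptOutcome {
    Success,
    // Analogous to a receipt failing at the 300 TGas limit.
    SizeLimitExceeded,
}

fn check_receipt_size(recorded_bytes: usize, per_receipt_limit: usize) -> ReceiptOutcome {
    if recorded_bytes > per_receipt_limit {
        ReceiptOutcome::SizeLimitExceeded
    } else {
        ReceiptOutcome::Success
    }
}

fn main() {
    let limit = 20 * 1024 * 1024; // e.g. a 20 MB per-receipt limit
    // A receipt recording ~300 MB of PartialState would be rejected.
    println!("{:?}", check_receipt_size(300 * 1024 * 1024, limit));
    println!("{:?}", check_receipt_size(1024, limit));
}
```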
My rough plan of action would be:
- Use `TrieRecorder` to measure how much `PartialState` each receipt produces. Run some traffic and see what it looks like. Add metrics.
- Add a `size_limit` when applying receipts: the basic limit which stops processing receipts when the size of `PartialState` gets too large. This could be enough to run mainnet traffic smoothly.
- Add a size limit for new transactions: stop adding transactions when they get too large.
- Implement a per-receipt size limit on `PartialState`. This would require careful analysis: it'd be good to go over the blockchain and see if there are any contracts which require > 20 MB of `PartialState` to run. Those could break after introducing the limit, so we must estimate what the impact would be, warn developers, etc.
- Adjust gas costs to reflect how much `PartialState` is produced by executing a receipt. Accessing trie nodes should be as expensive as the resulting size increase warrants.
A quick and hacky size limit example, which stops applying receipts when the size recorded by `TrieRecorder` goes above 5 MB: https://github.com/jancionear/nearcore/commit/6dd9d4fa5bd161558e109d7b6943207e2b057a6c
Updating the project thread.
I've merged PR https://github.com/near/nearcore/pull/10703, which adds a soft limit for storage proof size, as highlighted in point 3 of @jancionear's comment. The next step I was thinking of pursuing is the hard limit for each contract, as per the research work done by Jakob. Based on that, I had a conversation with Simonas.
Simonas suggested that while this is totally doable, we should definitely consider the consequences of adding this restriction on contracts. Historically we've maintained a stance of keeping contracts backward compatible, and adding this restriction could cause some contracts to fail.
We should probably get some statistics on the size of data touched by contracts and (1) whether there are any existing contracts on mainnet already running that may break and (2) whether there are any historic/dormant contracts that may break.
(1) is easily doable as we can just add metrics to the mirrored mainnet traffic. Marcelo is the right point of contact for this. (2) on the other hand is quite a bit of work, but this too has been done in the past. I'm not personally sure whether the work is worth it for our case.
At the end of the day this also boils down to decisions by upper management, and we should definitely keep Bowen in the loop and let him know about the proposed changes. That said, we should do our research before going to him. As next steps, I propose we add metrics like P50, P99, P999, and P100 to figure out the size of data touched by contracts and whether any contracts would break (probably not).
### Technical side of things
- `runtime/near-vm-runner/src/logic/logic.rs` is the file we need to take a look at.
- Within that, `storage_read` is the function the runtime uses to interact with the trie storage, and we can probably explore it further to track the size of the storage touched, not just the node count.
- Later, while implementing the hard limit, we can keep track of this, return a runtime error (or a failed contract execution) if the hard limit is hit, and charge the gas.
- Simonas mentioned we probably don't have metrics within `logic.rs`, so we may have to expose the aggregated size as a return value from the VM.
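One possible shape of that aggregation, sketched with stand-in types (the struct, fields, and methods here are all hypothetical, not the real `logic.rs` API):

```rust
// Minimal stand-in for the VM logic: aggregate bytes touched in
// storage_read and surface the total to the runtime afterwards.

use std::collections::HashMap;

#[derive(Default)]
struct VMLogic {
    store: HashMap<Vec<u8>, Vec<u8>>,
    // The hypothetical aggregate we'd return from the VM.
    touched_storage_bytes: u64,
}

impl VMLogic {
    fn storage_write(&mut self, key: &[u8], value: &[u8]) {
        self.store.insert(key.to_vec(), value.to_vec());
    }

    /// Like the storage_read host function, but tracking bytes touched,
    /// not just the trie-node count.
    fn storage_read(&mut self, key: &[u8]) -> Option<Vec<u8>> {
        let value = self.store.get(key).cloned();
        if let Some(v) = &value {
            self.touched_storage_bytes += (key.len() + v.len()) as u64;
        }
        value
    }
}

fn main() {
    let mut logic = VMLogic::default();
    logic.storage_write(b"abc", b"hello");
    logic.storage_read(b"abc");
    // 3 key bytes + 5 value bytes touched.
    println!("{}", logic.touched_storage_bytes); // 8
}
```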
cc. @jancionear
Dependent issues:
- #10890
- #10780
- #11019