snarkVM icon indicating copy to clipboard operation
snarkVM copied to clipboard

[Draft]Tweaks Ledger to prevent race conditions

Open ghostant-1017 opened this issue 2 years ago • 3 comments

Motivation

I believe this modification can help avoid some strange race conditions at the higher level (snarkOS). In snarkOS, it's possible to lock Ledger as needed, for example, during the proposal stage or the synchronization phase. I am making some adaptive modifications to snarkOS and will submit a PR later

Test Plan

(Write your test plan here)

Related PRs

(Link your related PRs here)

ghostant-1017 avatar Nov 22 '23 03:11 ghostant-1017

Looks like a reasonable approach so far :+1:. I'm in favor of aggregating things that get modified together and of having more methods with &mut.

ljedrz avatar Nov 22 '23 09:11 ljedrz

I like the intuition. I would still favor using a lock here over the Context even though the Ledger is lockable currently. As you know, code changes, and I think the penalty of doubly-locking is not significant for such a critical operation. Safety should take priority over a (minor) performance hit. WDYT?

howardwu avatar Nov 24 '23 23:11 howardwu

I like the intuition. I would still favor using a lock here over the Context even though the Ledger is lockable currently. As you know, code changes, and I think the penalty of doubly-locking is not significant for such a critical operation. Safety should take priority over a (minor) performance hit. WDYT?

For snarkOS, I think the lock granularity should be controlled at the Ledger level, similar to using RwLock<Box<dyn LedgerService>>. This approach has several implications:

  • When reading, it ensures that we do not read an inconsistent state between two blocks (with the exception of CurrentCtx, for example, Transaction).
  • When writing, it guarantees that no other threads will access the Ledger apart from the current thread.

In terms of performance considerations: In reality, we only acquire a WriteLock approximately every 5 seconds to call advance_to_next_block. Other threads can concurrently read the Ledger through a ReadLock without significant impact.

While reviewing the code, I often find myself wondering what would happen if we didn’t use a WriteLock for the Ledger and allowed other threads to interact differently. However, I agree with your point that this would bring a lot code changes and need to ensure that the Ledger does not cause deadlocks. Currently, for Arc<dyn LedgerService>, we don’t need any locks since all the locking is handled within snarkVM.

ghostant-1017 avatar Nov 26 '23 12:11 ghostant-1017