sui Add `TransactionManager` to execute certified transactions based on input object availability

Background

This change adds a TransactionManager component that manages dependencies between pending certified transactions and objects in their input. TransactionManager accepts certified transactions from various sources e.g. consensus output currently and checkpoint sync in future. It builds dependencies between objects and transactions, and publishes transactions that have all input available locally. The goal of the change is to avoid the inefficiency and instability in the existing execution logic, by guaranteeing availability of input objects before execution, and simplify execution logic.

Transaction Manager

Here is a diagram for how TransactionManager communicates with other components.

transaction_manager drawio

Right now, TransactionManager is only activated in validators. The next step is to use TransactionManager in fullnode as well.

Shared locks

Since TransactionManager requires input objects' versions, shared objects in a certificate must be locked before sending to TransactionManager. Currently, SharedObjectLockNotSetError is often observed in logs, so this is not yet guaranteed. Enforcing this invariant involves a few changes:

Drop the logic to remove shared object locks after a transaction commits. I'm unable to convince myself the existing logic of removing shared locks are correct. And even with 2 and 3 below, I still observed SharedObjectLockNotSetError in tests. Later after dust settles down with the new TransactionManager, we can implement the logic to clear shared locks table at epoch boundaries.
In handle_consensus_transaction(), take shared locks and add certificates to the pending_certificates table in one batch write, then send the certificate to TransactionManager. Otherwise if a certificate is added to the pending queue or TransactionManager first, it can get executed immediately, without taking the shared locks.
Certificates were sent to the pending queue (add_pending_certificates()) quite liberally before. Most of these corner cases are removed to avoid sending a certificate containing shared objects to TransactionManager, if no shared locks are taken for the certificate.

Execution Driver and Node Sync

The new execution driver runs AuthorityState::handle_certificate() to process certificates directly, instead of sending digests to node sync. The driver shares similarities with the execution logic in node sync. Eventually we may want to merge the two. But right now, keeping the validator execution logic simple seems to be more beneficial.

There is no longer a retry path for certificates missing parents, since it is no longer possible. We will rely on other mechanisms (e.g. Narwhal) to catch up on missing certificates.

#4990 #5514

Nov 02 '22 08:11 mwtian

Overall, this is fantastic! To summarize the important points from my comments:

We need to be deliberate about how pending certificates are persisted and when it is ok to forget about them. This code seems to muddy those waters slightly.

I believe TransactionManager should be aware of the unsequenced state and model it explicitly.

I think there are a few data consistency problems here, (deletion of shared locks, notification of object writes).

Thanks for very detailed and informative review! There are indeed issues I missed at the beginning, e.g. pending_execution / pending_certificates table is actually part of the data consistency and recovery guarantee. I moved writing to the pending_certificates table into the same batch write that updates consensus tables and execution index.

For shared locks, I understand it should mostly work. But we are indeed seeing SharedObjectLockNotSetError from logs currently. And even after implementing (2) and (3) from the shared lock section above (still removing shared locks after commit), I still observed the error within TransactionManager::enqueue(). There are code paths that make thinking about the correctness of shared locks difficult. For example in hanele_certificate_and_effects() no transaction recovery lock is taken, so many instances can run in parallel. When effects of the transaction exist, no shared lock is taken regardless of existence of shared locks. It may take a few days to make everything work correctly all the day.

And I don't see the benefit of clearing shared locks right now. They should not occupy as much disk space as some other data. I think cleaning up temporary transaction data can be sequenced to be worked on, after we have a solid reconfiguration implementation. Then some of the storage cleanup work will be trivial as part of the epoch cleanup. Also, I think we need proper rocksdb transaction or our own custom transaction implementation. Otherwise it is really error prone to manage data consistency with multiple reads and writes.

Let me know what you think and we can chat off thread.

Nov 08 '22 16:11 mwtian

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated
explorer	✅ Ready (Inspect)	Visit Preview	Nov 9, 2022 at 11:52PM (UTC)

Nov 09 '22 05:11 vercel[bot]

@mystenmark as we discussed yesterday, I looked into how to make sure shared object locks can be written and cleared correctly per transaction, instead of per epoch as I proposed in this PR. I believe we must cover writing and clearing the shared object locks within a per-transaction critical section. I can think of two ways to implement this:

Reuse the transaction lock mechanism we have. Since the lock is a scoped object right now, we need to acquire and pass it through TransactionManager. And we may need to avoid waiting on the lock on critical path, e.g. handle_consensus_transaction().
Simplify transaction locking by using TransactionManager to ensure one execution of a transaction at a time. The existing logic of skipping execution if effects exist still applies within the TransactionManager lock. Transaction recovery will just be retrying pending certificates. This should be correct since we need to clear the pending certificate for each certificate, similar to unlocking the transaction. And this would simplify the transaction logic further.

What do you think @mystenmark @lxfind?

Nov 09 '22 17:11 mwtian

sui sui copied to clipboard

Add `TransactionManager` to execute certified transactions based on input object availability

Background

Transaction Manager

Shared locks

Execution Driver and Node Sync

sui
sui copied to clipboard