sui-node: 2-phase tx commit
An attempt to address data consistency shortcomings in transaction execution. The loss of the Big Atomic Write led to a number of bugs, which generally stemmed from intermediate representations of tx output making their way to the WAL after failures during write, and from retry attempts failing while trying to re-execute against inconsistent object versions.
The fix is to split transaction execution into two phases: Uncommitted and Executed (with a third, "committed to permanent store" phase implicit). Recovery can then use the WAL directly as the source of truth for the object versions that need to be persisted. As a side effect, this eliminates consistency issues with sharding the permanent store, as the WAL can be host-specific or can be a separate distributed service available to all executors.
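To make the phases concrete, here is a minimal in-memory sketch of the idea, not the code in this PR. All names (`Node`, `WalEntry`, `begin`, `mark_executed`, `recover`) and the `u64` stand-ins for digests, object IDs, and versions are hypothetical. It also assumes that entries still in the Uncommitted phase are kept in the WAL and handed back for re-execution, which the description above does not spell out.

```rust
use std::collections::HashMap;

// Hypothetical simplified types; the real node uses richer ones.
type TxDigest = u64;
type ObjectId = u64;
type Version = u64;

/// Phase of a transaction as recorded in the WAL.
#[derive(Debug, Clone, PartialEq)]
enum WalEntry {
    /// Execution started, but no complete output has been recorded.
    Uncommitted,
    /// Execution finished; the full set of output object versions is recorded.
    Executed(Vec<(ObjectId, Version)>),
}

#[derive(Default)]
struct Node {
    wal: HashMap<TxDigest, WalEntry>,
    /// Stand-in for the permanent store: latest version per object.
    store: HashMap<ObjectId, Version>,
}

impl Node {
    /// Phase 1: record the transaction before executing it.
    fn begin(&mut self, tx: TxDigest) {
        self.wal.insert(tx, WalEntry::Uncommitted);
    }

    /// Phase 2: replace the entry with the complete output in one step,
    /// so a partial output is never visible in the WAL.
    fn mark_executed(&mut self, tx: TxDigest, outputs: Vec<(ObjectId, Version)>) {
        self.wal.insert(tx, WalEntry::Executed(outputs));
    }

    /// Recovery: persist every Executed entry straight from the WAL and
    /// return the digests that are still Uncommitted (assumed to be retried).
    fn recover(&mut self) -> Vec<TxDigest> {
        let mut to_retry = Vec::new();
        let entries: Vec<(TxDigest, WalEntry)> = self.wal.drain().collect();
        for (tx, entry) in entries {
            match entry {
                WalEntry::Executed(outputs) => {
                    for (id, version) in outputs {
                        self.store.insert(id, version);
                    }
                }
                WalEntry::Uncommitted => {
                    // Keep the entry so a later retry can complete it.
                    self.wal.insert(tx, WalEntry::Uncommitted);
                    to_retry.push(tx);
                }
            }
        }
        to_retry.sort_unstable();
        to_retry
    }
}
```

In this sketch, recovery never re-executes a transaction that already reached Executed; it only copies the recorded versions into the store, so a retry cannot run against half-written object versions.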
I wonder if there is any way to test this?
@lxfind I have a test that is currently failing in another PR, waiting for this.
The best thing to do would be to see if @amnn has a child-object test case or knows how to write one easily.
As luck would have it, I just landed a test involving child objects, here:
https://github.com/MystenLabs/sui/blob/main/crates/sui-core/src/unit_tests/authority_tests.rs#L2213
(Also the test before it.) I wrote it to exercise revert_state_update, but if you remove the revert call and everything after it, it's just a test that the child object transactions committed successfully.
Alternatively, we have the sui adapter transactional tests for dynamic fields:
https://github.com/MystenLabs/sui/tree/main/crates/sui-adapter-transactional-tests/tests/dynamic_fields