go-spacemesh
database: atomic apply and revert
Description
when a layer is applied or reverted, the call path of the goroutine spans three components:
mesh -> conservative state -> vm
two issues:
- inconsistent data: each component updates the database independently. if any component errors out, the other components don't undo their updates.
- database deadlock: each component can start a database transaction and then call into another component that starts its own database transaction.
i made a proposal to pass the *sql.Tx object through the apply/revert call path, but concerns were raised about making such a move without more data, such as code instrumentation with metrics to track db saturation (e.g. how much time we spend waiting for a write or for a db handle) and metrics to track request rates.
fixing this issue https://github.com/spacemeshos/go-spacemesh/pull/3258 can also add to the confidence, but there's more work to be done there too.
@dshulyak @pigmej
I still believe it will be a huge mess if open transactions are passed across modules.
What I would do instead is check why we need to update state in mesh and the conservative cache after a revert, and try to eliminate those updates.
After the latest refactoring, only results are not saved/reverted atomically with the rest of the state. I am not sure if that's possible (definitely not easily) due to optimistic filtering.