namada icon indicating copy to clipboard operation
namada copied to clipboard

Make shielded syncing a separate command

Open batconjurer opened this issue 1 year ago • 4 comments

Describe your changes

Previously, the scanning of MASP notes was done as part of various MASP commands. This moves that logic into a separate command so that it can be done out of band.

Indicate on which release or other PRs this topic is based on

#2363

Checklist before merging to draft

  • [ ] I have added a changelog
  • [x] Git history is in acceptable state

batconjurer avatar Jan 22 '24 15:01 batconjurer

Thanks for these changes, I think I understand the general flow. But maybe I am missing some of the higher level considerations that motivated this approach to separating MASP synchronization into a separate command. Some questions:

* Have alternatives to interrupts been considered? Like continuously saving/appending new transactions to a file?

* Given interrupt usage and SyncStatus modality, will the MASP functionality be easily usable from the SDK on all platforms (including the web)?

* How do the `ShieldedContext` lock guard in the SDK and `SyncStatus` marker type interact? Is there any overlap or redundancy here?

* How do we handle shielded context file consistency when multiple clients (from separate processes) are running?
  • I considered having a daemon process that would do such a thing. I think it's more work and will be more complicated to manage multiple process reading and writing to the same shielded context. It is something I'd like eventually, but Adrian didn't consider it a prioriy.
  • This functionality will only work in the CLI. I will pull the syncing logic up into apps to reflect this. For the web, something bespoke will need to be written
  • The SyncStatus is there so the syncing and non-syncing related functions are actually implemented on separate types. This was especially helpful in making sure I had gotten rid of all the fetch calls in client code. This wasn't caught by the lock guards.
  • I hadn't considered this point because my impression was that the existing Namada trait implementations covered these problems. If that isn't the case, indeed I will need to do more work checking this. In particular, I won't allow more than one process access to a shielded context at a time. This can be relaxed later.

batconjurer avatar Jan 24 '24 09:01 batconjurer

  • I considered having a daemon process that would do such a thing. I think it's more work and will be more complicated to manage multiple process reading and writing to the same shielded context. It is something I'd like eventually, but Adrian didn't consider it a prioriy.

I was thinking that there could be alternatives to interrupts other than daemon processes. Like for instance, we could do things as before but atomically commit the ShieldedContext to file storage (using ShieldedUtils::save) whenever we reach consistent states (like a transaction fetched into the context, or a transaction scanned into the context). This way if Ctrl-C is pressed or there's an OS crash, the last saved state is still preserved. I say this because reasoning about interrupts, when they can occur, how to leave/save the context in a consistent state, how to maintain correctness as more functionality is added to the shielded pool, and how to integrate it into web clients and other contexts may be more challenging than a model without channels and receivers.

murisi avatar Jan 24 '24 16:01 murisi

  • The SyncStatus is there so the syncing and non-syncing related functions are actually implemented on separate types. This was especially helpful in making sure I had gotten rid of all the fetch calls in client code. This wasn't caught by the lock guards.

I see, this is a valid approach. That being said, though it may be undesirable in to do a sync just before displaying balances in our client. I wouldn't say that it is necessarily illogical (to the point of being a type error) for users of the SDK to interleave syncing and non-syncing operations for their own applications.

murisi avatar Jan 24 '24 16:01 murisi

When working with you on the note scanning algorithm, I noticed two things:

  1. The SyncStatus caused friction in our SDK usage when we were trying to save the ShieldedContext. We eventually worked around this issue by cloning the ShieldedContext before saving it (which should not be necessary since Borsh only requires a ShieldedContext reference to do serialization). We also had to introduce a PhantomData<SyncStatus> since the SDK code does not actually depend on the sync status.
  2. We have essentially moved the interruption logic out of the SDK and into the apps crate. This means that the SDK no longer really deals with partiatially completed synchronizations in its interface.

In light of the above, it may be worth reconsidering if and how we do SyncStatus. If we are doing it for aesthetic reasons, then it should be fine. However if we are doing it to enforce correctness, then we should establish (in the absence of interrupts) what sequence of save, fetch, load, and gen_shielded_transfer operations will corrupt the internal state of the ShieldedContext (and render it incorrect/unusable) and therefore require the overhead of the type system to prevent. My general fear is that we may be ossifying the apps crate's latest ShieldedContext usage pattern (synchronization subcommand completely separated from other MASP actions) into the SDK unnecessarily, thereby creating friction for other SDK use cases (for instance a non-interactive program that repeatedly fetches and does useful shielded work).

If the SyncStatus must remain, then we should also consider moving it into the apps crate where such information may be more relevant. Or alternatively, we could move these SyncStatus API changes into a separate PR since this partitioning of ShieldedContext (rather than the creation of a separate syncing command) seems to form the bulk of this PR's diff. Doing this could make reviewing the sync subcommand changes easier. That being said, I'm happy to review whichever approach you decide upon.

murisi avatar Jan 29 '24 14:01 murisi

Closing since this has already been done.

brentstone avatar Apr 05 '24 17:04 brentstone