taproot-assets icon indicating copy to clipboard operation
taproot-assets copied to clipboard

tapgarden: batch write boundaries and batch state when transitioning from planter to caretaker

Open jharveyb opened this issue 11 months ago • 3 comments

Related to #820.

Tl;dr it's unclear what behavior we want around batch persistence to disk vs. the batch still being the pendingBatch if errors occur before or during finalization.

AFAIU, batch finalization means that no more seedlings can be added to the batch. The batch is still mutated before broadcast to add elements like the anchor tapscript sibling, funding TX, etc.

My understanding of the current behavior:

Pending Batch

  • A batch has state BatchStatePending until FinalizeBatch is called.
  • When new seedlings are added to a batch, we update the batch both in-memory and on disk: https://github.com/lightninglabs/taproot-assets/blob/761ccca181599c40e56206869d57cff32346a983/tapgarden/planter.go#L944

Finalizing, before the Caretaker starts

  • If we fail to handle a finalize argument, like the manual fee rate or tapscript sibling, the pendingBatch is unchanged, and still in state BatchStatePending.
  • The batch state is updated to BatchStateFrozen both in-memory and on disk before starting a caretaker: https://github.com/lightninglabs/taproot-assets/blob/761ccca181599c40e56206869d57cff32346a983/tapgarden/planter.go#L762
  • A copy of the batch is used to start the caretaker.

Finalizing, while Caretaker is progressing to Broadcast

  • The caretaker batch is updated both in-memory and on disk for each state transition of the caretaker.
  • If the caretaker fails before broadcast, we delete the caretaker and clear the pendingBatch held by the planter. The batch is already on disk as frozen, so the user can retry the batch by resubmitting all seedlings manually. https://github.com/lightninglabs/taproot-assets/blob/761ccca181599c40e56206869d57cff32346a983/tapgarden/planter.go#L659

Finalizing, after Broadcast

  • The planter clears its pendingBatch, and the caretaker is solely responsible for handling the batch after TX confirmation. https://github.com/lightninglabs/taproot-assets/blob/761ccca181599c40e56206869d57cff32346a983/tapgarden/planter.go#L680

So, any error before finalization should leave the planter's pendingBatch populated with the state as BatchStatePending. The batch state should be BatchStateFrozen before starting a caretaker. Any caretaker error before batch broadcast should leave the planter's pendingBatch empty, with the previous batch on disk as BatchStateFrozen, or some later batch state before BatchStateBroadcast.

This relates to #820 as it means that failures from the new FundBatch and SealBatch calls should not clear the pending batch, nor modify its batch state. This also means that the batch should be funded and sealed before being frozen and starting a caretaker.

If we agree on that behavior, then we'll need to update the planter tests to match. This involves changing some assumptions about the batch state when funding or fee estimation failures occur, as right now those come from the caretaker after finalization. As part of #820, those would occur before finalization.

jharveyb avatar Mar 11 '24 19:03 jharveyb

Related, right now we still have a batch minting ticker present as an alternate way to start batch finalization. AFAICT this is still active in the default configuration, and also still used in the minter unit tests. This affects some of the assertions in the tests, as finalizing via the ticker does clear the pending batch before the caretaker broadcasts the minting TX successfully.

https://github.com/lightninglabs/taproot-assets/blob/761ccca181599c40e56206869d57cff32346a983/tapgarden/planter.go#L529

Based on previous discussion:

https://github.com/lightninglabs/taproot-assets/pull/492#discussion_r1323663872

We should actually remove the ticker entirely and update the test assumptions to match.

jharveyb avatar Mar 11 '24 19:03 jharveyb

From offline discussion with Oli:

#820 (or a smaller PR) should change the planter s.t. after finalizing and before broadcast, the pending batch should only be cleared if broadcast succeeds. So caretaker error should not trigger clearing of the pending batch.

This makes it easier to retry batch finalization in various error cases, without having to manually resubmit all batch elements. We'll need to test that resubmission, as the batch could be on disk with a state between Frozen and Broadcast (but the failed caretaker will be deleted).

jharveyb avatar Mar 12 '24 17:03 jharveyb

Ticker is removed in #827, write boundaries can be fixed in a follow-up PR.

jharveyb avatar Mar 19 '24 18:03 jharveyb