On rollbacks (lack of idea)

Open waalge opened this issue 11 months ago • 4 comments

Reading https://cardanosolutions.github.io/kupo/#section/Rollbacks-and-chain-forks/How-can-your-application-deal-with-rollbacks

I cannot extract from the text how I handle a utxo, previously reported, that is not present on chain after a rollback.

Say:

  • I sync (request /matches),
  • Kupo responds with a UTxO.
  • There is a rollback and this response content is now no longer on the chain.
  • I sync again, with ETag and If-None-Match set

The response is not a 304. It has a new ETag and X-Most-Recent-Checkpoint, and (say) an empty array of UTxOs.

I cannot see how I can tell that my cached UTxO no longer exists (without also continuously syncing the checkpoints and implementing this client side). I think that in such cases I want Kupo to first respond with something like a 410 Gone, perhaps with a body containing some of the most recent checkpoints leading up to the X-Most-Recent-Checkpoint.

What am I missing?

waalge avatar Apr 01 '25 08:04 waalge

😳 https://github.com/CardanoSolutions/kupo/issues/87

waalge avatar Apr 01 '25 08:04 waalge

Kupo doesn't keep track of UTxOs that are dropped during a rollback, so there's no way it could tell you that something is gone. Which is why the general strategy is to notify you that something might have changed. Whether or not it impacts your app is for you to decide.

The way I would typically handle this is indeed by using ETag + If-None-Match and, on non-304 responses, parsing and comparing the new response with the old one to figure out which UTxOs are gone or newly inserted. Knowing which UTxOs are newly inserted is relatively straightforward using Bloom filters: you can construct a Bloom filter from the first response and test every key of the new set against it upon receiving it. A negative result is a definitive no, meaning the UTxO was not in the first set (a positive result may still be a false positive).
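
A minimal sketch of the Bloom-filter check described above, in Python. The `BloomFilter` class, its sizing, and the `transaction_id#output_index` key format are illustrative assumptions; none of this is provided by Kupo.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def definitely_new(old_keys, new_keys):
    """Return the keys from new_keys that are certainly absent from old_keys."""
    seen = BloomFilter()
    for key in old_keys:
        seen.add(key)
    return [key for key in new_keys if key not in seen]


# Hypothetical UTxO keys in "transaction_id#output_index" form.
old = ["aa11#0", "bb22#1"]
new = ["bb22#1", "cc33#0"]
fresh = definitely_new(old, new)
```

Every key in `fresh` is guaranteed to be newly inserted, and a key from the first response can never end up in it. The reverse does not hold: a false positive can hide a genuinely new key, so the filter size needs to match the expected number of keys.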

Now, the other way around is more complicated and an active research area 😅! It's typically referred to as "set reconciliation" and one approach to it is to use invertible bloom filters (IBF).

There's always the option to keep a hashset/binary-set on the receiving end with all the keys, but that means extra (possibly unbounded) storage. So not ideal unless you know you're dealing with a small number of keys only.
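
For the small-key-count case, the hashset option can be sketched as a plain set difference. This is only an illustration: the `transaction_id` and `output_index` field names are assumed to identify a match, and `diff_matches` is a hypothetical helper, not part of Kupo.

```python
def utxo_key(match):
    # Assumed field names identifying a UTxO in a match entry.
    return (match["transaction_id"], match["output_index"])


def diff_matches(cached, fresh):
    """Compare two responses; return (gone, added) as sets of UTxO keys."""
    cached_keys = {utxo_key(m) for m in cached}
    fresh_keys = {utxo_key(m) for m in fresh}
    gone = cached_keys - fresh_keys   # reported before, absent now
    added = fresh_keys - cached_keys  # absent before, reported now
    return gone, added


# The scenario from the issue: one cached UTxO, then an empty response.
cached = [{"transaction_id": "aa11", "output_index": 0}]
gone, added = diff_matches(cached, [])
```

Here `gone` holds the key of the UTxO that disappeared and `added` is empty. The cost is keeping every key on the client, which is the storage concern raised above.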

KtorZ avatar Apr 01 '25 09:04 KtorZ

🙈 That escalated quickly!

My question is malformed, and premised on a misunderstanding of what ought to happen on handling an ETag with If-None-Match.

In our case, the utxos act like an "almost append only" data set. I want "give me new things since X". Kupo could say "Since X there are these new things" or "X does not exist". Currently Kupo does not distinguish between these and just replies with everything if there is anything.

I could use a query with a lower time bound, and check the overlap (with fancy apparatus or otherwise). Or I could first query the checkpoint at my last slot and, if my last ETag matches the block, proceed with a lower bound of my last slot. Both seem suboptimal and come with "what ifs": what if my overlap is too small? What if a rollback happens between the checkpoint response and the match request? And I'd still need to manually handle the "else" case in the second version. If Kupo could handle this internally, which it has better context to do, then this wouldn't be my problem.
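
The second version could look roughly like the sketch below. It assumes the client remembers the slot and block hash of its last sync, and it takes two hypothetical callables, `get_checkpoint(slot_no)` and `get_matches(created_after=...)`, standing in for the actual HTTP requests; their names and return shapes are made up for illustration.

```python
def resume_sync(last_point, get_checkpoint, get_matches):
    """Resume from last_point = (slot_no, header_hash), or do a full sync.

    get_checkpoint(slot_no) is assumed to return a dict with a "header_hash"
    key, or None when there is no checkpoint at that slot.
    """
    if last_point is not None:
        slot_no, header_hash = last_point
        current = get_checkpoint(slot_no)
        if current is not None and current["header_hash"] == header_hash:
            # The remembered block is still on chain: fetch only newer matches.
            return "incremental", get_matches(created_after=slot_no)
    # The "else" case: the remembered block is gone, so start over.
    return "full", get_matches(created_after=None)
```

This does not remove the race: a rollback between the `get_checkpoint` call and the `get_matches` call still goes unnoticed unless the client also compares the checkpoint reported alongside the matches.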

The behaviour I want is more akin to how the mini protocol works (I think).

waalge avatar Apr 01 '25 10:04 waalge

Maybe If-Range header? This would handle the optimistic case well.

The pessimistic case is still awkward, since Kupo would still reply 200 with everything, following the spec, rather than doing what I'd prefer: return an error and negotiate with the client where to sync from.

waalge avatar Apr 01 '25 10:04 waalge