Understanding the behavior of DEFERRED_SYNC

Open deepakkarki opened this issue 2 years ago • 0 comments

I have a BookKeeper setup that uses EBS on Kubernetes. To reduce write latency, it was suggested to me on the BookKeeper Slack that I use the "ack before sync" mode, i.e. set the WriteFlag.DEFERRED_SYNC flag on the ledger.

But when I enabled DEFERRED_SYNC locally, some of my tests started failing. One test in particular did the following -

1. Create a new ledger
2. Write 100 entries to it
3. Wait for 100 acks
4. Close the ledger
5. Make sure we can read back 100 entries on a (recovery) read

The test was failing on the read path, claiming the "ledger was empty". I started debugging, and after reading more of the source code I found out (to the best of my knowledge) that the following was happening -

1. The ledger was being written to and the acks were being sent, as expected.
2. Once we collected the acks, we closed the ledger. At this point the local state of the ledgerHandle still had the LAC as `-1` (and not 99).
3. Closing makes a metadata update, marking the ledger status as closed and setting LAC = -1.
4. When we try to read, the reader gets the metadata from ZK, sees that LAC = -1 and assumes the ledger is empty.
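
To make the sequence above concrete, here is a minimal in-memory sketch of it. This is not BookKeeper code: `ToyLedger`, `ToyMetadata`, `appendAndAck` and `readableEntries` are hypothetical names invented for illustration, and the model assumes that close simply copies the writer's local LAC into the metadata and that a reader trusts that value.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model of the failure sequence. All names here are hypothetical;
// none of this is the real BookKeeper client.
class DeferredSyncLacModel {
    enum Flag { DEFERRED_SYNC }

    // Stand-in for the ledger metadata kept in ZK.
    static final class ToyMetadata {
        boolean closed = false;
        long lastEntryId = -1;
    }

    // Stand-in for the writer-side ledger handle.
    static final class ToyLedger {
        final Set<Flag> writeFlags;
        final ToyMetadata metadata = new ToyMetadata();
        long lastAcked = -1;        // highest entry id that has been acked
        long lastAddConfirmed = -1; // local LAC

        ToyLedger(Set<Flag> writeFlags) {
            this.writeFlags = writeFlags;
        }

        // Steps 1 and 2: the entry is acked, but the local LAC only
        // advances when DEFERRED_SYNC is not set.
        void appendAndAck() {
            lastAcked++;
            if (!writeFlags.contains(Flag.DEFERRED_SYNC)) {
                lastAddConfirmed = lastAcked;
            }
        }

        // Step 3: closing records whatever the local LAC currently is.
        void close() {
            metadata.closed = true;
            metadata.lastEntryId = lastAddConfirmed;
        }
    }

    // Step 4: a reader that only looks at the closed metadata.
    static long readableEntries(ToyMetadata metadata) {
        return metadata.lastEntryId + 1;
    }

    static long writeCloseAndCount(Set<Flag> flags, int entries) {
        ToyLedger ledger = new ToyLedger(flags);
        for (int i = 0; i < entries; i++) {
            ledger.appendAndAck();
        }
        ledger.close();
        return readableEntries(ledger.metadata);
    }

    public static void main(String[] args) {
        // prints "no flags: 100"
        System.out.println("no flags: "
                + writeCloseAndCount(EnumSet.noneOf(Flag.class), 100));
        // prints "DEFERRED_SYNC: 0"
        System.out.println("DEFERRED_SYNC: "
                + writeCloseAndCount(EnumSet.of(Flag.DEFERRED_SYNC), 100));
    }
}
```

In this model the second case reports 0 readable entries even though all 100 writes were acked, which matches the "ledger was empty" symptom.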

Now step 2 was the surprising bit, because without the DEFERRED_SYNC flag the LAC does get updated locally with every ack. Digging further, I found this was happening because of this piece of code -

// In Bookkeeper client - LedgerHandle.java (sendAddSuccessCallbacks)

if (!writeFlags.contains(WriteFlag.DEFERRED_SYNC)) {
    this.lastAddConfirmed = pendingAddsSequenceHead;
}

Looks like the LAC is deliberately not being updated for some reason. I looked back at the PR that introduced this change and could not find a hint as to why.

After getting some more context, I realized that calling ledgerHandle.force() before close() does fix this issue. But this whole experience left a few questions unanswered, and I would be grateful if someone here could answer them -
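
As a rough illustration of why force() before close() helps, here is a small in-memory sketch. It is not the BookKeeper client: `ToyWriter` and its methods are hypothetical, and the model assumes that a completed force means every entry acked so far is durable, so the local LAC can advance to the last acked entry.

```java
// Toy model of "force before close". All names are hypothetical;
// this is a sketch under the assumptions stated above, not real client code.
class ForceBeforeCloseModel {
    static final class ToyWriter {
        long lastAcked = -1;         // highest entry id that has been acked
        long lastAddConfirmed = -1;  // local LAC
        long closedLastEntryId = -1; // value recorded in metadata on close
        boolean closed = false;

        // With DEFERRED_SYNC, an ack alone does not move the LAC.
        void appendAndAck() {
            lastAcked++;
        }

        // Assumption: after force, everything acked so far is durable,
        // so the LAC catches up to the last acked entry.
        void force() {
            lastAddConfirmed = lastAcked;
        }

        // Close records the current local LAC.
        void close() {
            closed = true;
            closedLastEntryId = lastAddConfirmed;
        }
    }

    static long entriesVisibleAfterClose(int entries, boolean forceBeforeClose) {
        ToyWriter writer = new ToyWriter();
        for (int i = 0; i < entries; i++) {
            writer.appendAndAck();
        }
        if (forceBeforeClose) {
            writer.force();
        }
        writer.close();
        return writer.closedLastEntryId + 1;
    }

    public static void main(String[] args) {
        // prints "close only: 0"
        System.out.println("close only: " + entriesVisibleAfterClose(100, false));
        // prints "force then close: 100"
        System.out.println("force then close: " + entriesVisibleAfterClose(100, true));
    }
}
```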

1. Why does BK not update the LAC when DEFERRED_SYNC is set? I guess it is to have some reliability guarantees behind the LAC? (Even then, I think the force() should be implicit when closing a ledger that has DEFERRED_SYNC set.)
2. If the above is for reliability / data sanity reasons, why is there no similar behavior when the syncData flag is false? It has pretty similar semantics, and when syncData is set to false it triggers the same code path as DEFERRED_SYNC.
3. Also, this breaks our tailing (non-recovery) reads. Nothing can be tailed until the ledger is closed and updated with a LAC.

I apologize if I have some fundamental misunderstanding of the system - I'm still pretty new to BookKeeper!

Aug 14 '23 05:08 deepakkarki