
When a client adds entries synchronously to an opened ledger and a bookie crashes, the client may get stuck.

Open M1eyu2018 opened this issue 1 year ago • 6 comments

BUG REPORT When a client adds entries synchronously to an opened ledger and a bookie crashes, the client may get stuck.

Release

I am running BookKeeper release 4.14.1; the bug also reproduces on release 4.16.4.

Describe the bug

When a client adds entries synchronously to an open ledger and a bookie crashes, the ensemble change for the crashed bookie may be triggered twice. The first ensemble change is triggered when the third responses fail with 'Bookie handle was not available'. A moment later, a second ensemble change is triggered when the remaining third responses fail with 'Bookie operation timeout'. Because the same crashed bookie is being replaced a second time, no bookie is actually replaced, so unsetSuccessAndSendWriteRequest is never called. As a result, the success callback for the entry currently being added is never sent and the client gets stuck.

Example

In this example, a client adds 81920 entries for a ledger of 10M with a 3-3-2 policy, and the ensemble is (A,B,C).

1. At the beginning, entry#0-#6773 are written normally.
2. While entry#6774 is being added, bookie A crashes for some reason, such as a power outage or running 'kill -9' on the bookie A process id.
3. Two successful responses are still received for each entry, so the client can continue adding entry#6774-#11604.
4. Before entry#11605 is added, the third responses for entry#6774-#11604 come back one after another. Because the failure is 'Bookie handle was not available', failed bookie A is put into delayedWriteFailedBookies.
5. When entry#11605 is added, maybeHandleDelayedWriteBookieFailure is called. Because delayedWriteFailedBookies is not empty, an ensemble change begins.
6. After two successful responses for entry#11605 are received, sendAddSuccessCallbacks is called. However, pendingAddOp.submitCallback is not called until the ensemble change finishes.
7. When the ensemble change finishes, bookie A has been replaced by bookie D. The success callback for entry#11605 is sent and adding entries continues.

So far, the logic is correct. But there will be a problem below.

8. entry#11606-#42623 are written normally to (D,B,C) after the ensemble change.
9. Before entry#42624 is added, the third responses for entry#6774-#11604 that had not yet come back arrive one after another. This time the failure is 'Bookie operation timeout', and failed bookie A is put into delayedWriteFailedBookies again.
10. When entry#42624 is added, maybeHandleDelayedWriteBookieFailure is called. Because delayedWriteFailedBookies is not empty, an ensemble change begins again.
11. After three successful responses for entry#42624 are received from (D,B,C), sendAddSuccessCallbacks is called. However, pendingAddOp.submitCallback is not called until the ensemble change finishes.
12. This time, failed bookie A is supposed to be replaced again, but the ensemble is already (D,B,C), so no bookie is replaced. The success callback for entry#42624 cannot be sent because unsetSuccessAndSendWriteRequest is not called.
13. Because entries are added synchronously, the client gets stuck.
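The sequence above can be replayed with a small toy model. This is only a sketch of the behavior as described in this report, not the BookKeeper client code: `ToyLedgerHandle`, `recordDelayedWriteFailure`, `onAckQuorum` and `finishEnsembleChange` are hypothetical names, and the other names are borrowed from the steps above purely as labels.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: this is NOT org.apache.bookkeeper.client.LedgerHandle.
class ToyLedgerHandle {
    final List<String> ensemble = new ArrayList<>(List.of("A", "B", "C"));
    final Set<String> delayedWriteFailedBookies = new HashSet<>();
    // Entries that reached ack quorum but whose callback has not been delivered.
    final Deque<Long> pendingAcked = new ArrayDeque<>();
    // Entries whose success callback was delivered to the caller.
    final List<Long> completed = new ArrayList<>();
    boolean changingEnsemble = false;

    // A late third response for some earlier entry failed on `bookie` (steps 4 and 9).
    void recordDelayedWriteFailure(String bookie) {
        delayedWriteFailedBookies.add(bookie);
    }

    // Runs at the start of each add (steps 5 and 10).
    void maybeHandleDelayedWriteBookieFailure() {
        if (!delayedWriteFailedBookies.isEmpty()) {
            changingEnsemble = true;
        }
    }

    // The entry reached ack quorum (steps 6 and 11).
    void onAckQuorum(long entryId) {
        pendingAcked.add(entryId);
        sendAddSuccessCallbacks();
    }

    // Callbacks are held back while an ensemble change is in progress.
    void sendAddSuccessCallbacks() {
        if (changingEnsemble) {
            return;
        }
        while (!pendingAcked.isEmpty()) {
            completed.add(pendingAcked.poll());
        }
    }

    // End of the ensemble change (steps 7 and 12). Returns true if a bookie was replaced.
    boolean finishEnsembleChange(String replacement) {
        boolean replaced = false;
        for (String failed : delayedWriteFailedBookies) {
            int idx = ensemble.indexOf(failed);
            if (idx >= 0) {
                ensemble.set(idx, replacement);
                replaced = true;
            }
        }
        delayedWriteFailedBookies.clear();
        changingEnsemble = false;
        if (replaced) {
            // Stands in for unsetSuccessAndSendWriteRequest, which per this
            // report is what eventually lets the held-back callback go out.
            sendAddSuccessCallbacks();
        }
        // Nothing replaced: nobody flushes pendingAcked, so the caller waits forever.
        return replaced;
    }
}
```

Replaying steps 4-7 in this model delivers the callback for entry 11605, while replaying steps 9-12 leaves entry 42624 in `pendingAcked`, which corresponds to the blocked synchronous add.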

To Reproduce

1. Create a BookKeeper client.
2. Open a ledger.
3. Add entries synchronously.
4. While entries are being added, run 'kill -9' on one bookie process id.
5. The client may get stuck forever.

How to fix

In my opinion, there are two solutions:

1. After each ensemble change, always call sendAddSuccessCallbacks. This ensures that a success callback that was held back because an ensemble change was running is sent once the ensemble change finishes.
2. Before an ensemble change begins, check whether the failed bookie is still in the current ensemble. If it is not, skip the ensemble change so that the success callback for the entry currently being added can be sent normally in writeComplete.
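The second proposal amounts to filtering out stale failures before deciding whether to start an ensemble change. A minimal sketch of that check follows, assuming failures are tracked as a map from ensemble index to bookie id; `StaleFailureFilter` and `stillInEnsemble` are hypothetical names, plain strings stand in for bookie addresses, and nothing here is taken from the BookKeeper source.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating proposal 2; not part of BookKeeper.
class StaleFailureFilter {
    // Keeps only the failures whose bookie still occupies the same slot
    // in the current ensemble. An empty result means every reported
    // failure is stale and no ensemble change is needed.
    static Map<Integer, String> stillInEnsemble(List<String> currentEnsemble,
                                                Map<Integer, String> failedByIndex) {
        Map<Integer, String> live = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> e : failedByIndex.entrySet()) {
            int idx = e.getKey();
            boolean inRange = idx >= 0 && idx < currentEnsemble.size();
            if (inRange && currentEnsemble.get(idx).equals(e.getValue())) {
                live.put(idx, e.getValue());
            }
        }
        return live;
    }
}
```

In the example above, a timeout reported for bookie A at index 0 against the ensemble (D,B,C) would be filtered out, so the second ensemble change would never start and the callback for entry#42624 would not be held back.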

M1eyu2018, Apr 02 '24 09:04

PTAL, thanks. @hangc0276 @ivankelly @horizonzy @shoothzj @wenbingshen

thetumbled, Apr 02 '24 09:04

Thanks for the report, I will check it.

horizonzy, Apr 07 '24 07:04

Nice catch!

horizonzy, Apr 10 '24 08:04

Nice Catch!

wenbingshen, Apr 12 '24 07:04

Is this similar or related to #4097?

lhotari, Apr 16 '24 11:04

I hit the same issue. I merged this PR and tested it, but it does not work. When triggerLoop=false, the PendingAddOp is not re-sent, so it is not true that the "successful callback of current adding entry can be sent in function writeComplete normally", and the client still gets stuck. @M1eyu2018

keyboardbobo, Nov 21 '24 06:11