Missing ledgers when inducing network packet loss in Bookkeeper 4.15
BUG REPORT
Describe the bug
During our tests of Pravega (https://cncf.pravega.io/), we have detected a regression after upgrading to Bookkeeper 4.15 (https://github.com/pravega/pravega/pull/6676). The regression is consistently reproducible when running a fault injection test (30% network packet loss). Note that this kind of test consistently passes with Bookkeeper 4.14, which is why we believe this is a regression or a behavior change in the new Bookkeeper release.
A reproduction of this issue goes as follows:
- Pravega creates a ledger, in this case ledger 3246:
2022-07-20T14:45:25.706422066+00:00 stdout F 2022-07-20 14:45:25,706 7043990 [ZKC-connect-executor-0-SendThread(zookeeper-client:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply session id: 0x2016bc13e0a001c, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 127,1 replyHeader:: 127,38654707364,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,#426f6f6b69654d65746164617461466f726d617456657273696f6e933affffff8c18210218020ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff12813238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f674964122313360ffffffe2ffffffe5ffffffcdffffff8bffffff81ffffff93ffffffd8ffffff9c10,v{s{31,s{'world,'anyone}}},0 response:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246
2022-07-20T14:45:25.706687124+00:00 stdout F 2022-07-20 14:45:25,706 7043990 [ZKC-connect-executor-0-EventThread] INFO o.a.b.client.LedgerCreateOp - Ensemble: [bookkeeper-bookie-2-11580, bookkeeper-bookie-0-28816] for ledger: 3246
2022-07-20T14:45:25.706725535+00:00 stdout F 2022-07-20 14:45:25,706 7043990 [core-13] INFO i.p.s.s.i.bookkeeper.BookKeeperLog - Log[13]: Created Ledger 3246.
2022-07-20T14:45:47.426112317+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [ZKC-connect-executor-0-SendThread(zookeeper-client:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply session id: 0x2016bc13e0a001c, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 145,4 replyHeader:: 145,38654707423,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,F response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933affffff8c18210218020ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff12813238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f674964122313360ffffffe2ffffffe5ffffffcdffffff8bffffff81ffffff93ffffffd8ffffff9c10,s{38654707364,38654707364,1658328325290,1658328325290,0,0,0,0,172,0,38654707364}
2022-07-20T14:45:47.426207948+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [ZKC-connect-executor-0-EventThread] DEBUG o.a.b.meta.AbstractZkLedgerManager - Registered ledger metadata listener ReadOnlyLedgerHandle(lid = 3246, id = 672259987) on ledger 3246.
- Probably after some network failures induced by the test, we see that this ledger is unregistered by the Bookkeeper client (has it actually been removed from Zookeeper?):
2022-07-20T14:45:47.426475848+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [ZKC-connect-executor-0-SendThread(zookeeper-client:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply session id: 0x2016bc13e0a001c, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 146,4 replyHeader:: 146,38654707423,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,T response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933affffff8c18210218020ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff12813238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f674964122313360ffffffe2ffffffe5ffffffcdffffff8bffffff81ffffff93ffffffd8ffffff9c10,s{38654707364,38654707364,1658328325290,1658328325290,0,0,0,0,172,0,38654707364}
2022-07-20T14:45:47.426637438+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.b.client.PendingReadLacOp - Read LAC complete with enough validResponse for ledger: 3246 LAC: -1
2022-07-20T14:45:47.426737033+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [core-13] DEBUG o.a.b.meta.AbstractZkLedgerManager - Unregistered ledger metadata listener ReadOnlyLedgerHandle(lid = 3246, id = 672259987) on ledger 3246.
2022-07-20T14:46:16.612224510+00:00 stdout F 2022-07-20 14:46:16,612 7094896 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.b.proto.PerChannelBookieClient - Could not write Add request to bookie 192.169.1.60/192.169.1.60:3181 for ledger 3246, entry 0
2022-07-20T14:46:16.612271701+00:00 stdout F 2022-07-20 14:46:16,612 7094896 [BookKeeperClientWorker-OrderedExecutor-7-0] ERROR o.a.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L3246 E0
2022-07-20T14:46:16.616440402+00:00 stdout F 2022-07-20 14:46:16,616 7094900 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.b.proto.PerChannelBookieClient - Could not write ReadLAC request to bookie null for ledger 3246, entry -1
- After that, Pravega still assumes that the ledger is there (as Pravega itself has not deleted it), but any further operation on this ledger fails:
2022-07-20T14:46:16.630253410+00:00 stdout F 2022-07-20 14:46:16,629 7094913 [core-30] DEBUG o.a.bookkeeper.client.LedgerHandle - Failed to close ledger 3246 :
2022-07-20T14:46:16.630795088+00:00 stdout F 2022-07-20 14:46:16,630 7094914 [core-30] ERROR i.p.s.s.i.bookkeeper.BookKeeperLog - Log[13]: Unable to close LedgerHandle for Ledger 3246.
2022-07-20T14:46:16.630795088+00:00 stdout F io.pravega.segmentstore.storage.DurableDataLogException: Unable to close ledger 3246.
- This means that, upon finding the ledger missing, the Segment Container in Pravega gets stuck and cannot recover:
2022-07-20T15:59:48.793802385+00:00 stdout F 2022-07-20 15:59:48,793 11507077 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.bookkeeper.client.PendingReadOp - No such entry found on bookie. L3246 E0 bookie: bookkeeper-bookie-0-28816
2022-07-20T15:59:48.793815178+00:00 stdout F 2022-07-20 15:59:48,793 11507077 [BookKeeperClientWorker-OrderedExecutor-7-0] ERROR o.a.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L3246 E0-E0, Sent to [bookkeeper-bookie-0-28816, bookkeeper-bookie-2-11580], Heard from [] : bitset = {}, Error = 'No such ledger exists on Bookies'. First unread entry is (-1, rc = null)
2022-07-20T15:59:48.793906376+00:00 stdout F 2022-07-20 15:59:48,793 11507077 [core-1] INFO i.p.s.s.c.StreamSegmentContainerMetadata - SegmentContainer[13]: Exit RecoveryMode.
...
2022-07-20T15:59:48.794030332+00:00 stdout F 2022-07-20 15:59:48,794 11507078 [core-8] INFO i.p.s.s.r.StreamSegmentReadIndex - ReadIndex[13-123]: Closed.
2022-07-20T15:59:48.794605613+00:00 stdout F 2022-07-20 15:59:48,794 11507078 [core-1] ERROR i.p.s.server.logs.DurableLog - DurableLog[13] Recovery FAILED.
2022-07-20T15:59:48.794605613+00:00 stdout F org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException: No such ledger exists on Bookies
2022-07-20T15:59:48.794605613+00:00 stdout F at org.apache.bookkeeper.client.BKException.create(BKException.java:74)
2022-07-20T15:59:48.794605613+00:00 stdout F at org.apache.bookkeeper.client.PendingReadOp.submitCallback(PendingReadOp.java:654)
2022-07-20T15:59:48.794605613+00:00 stdout F at org.apache.bookkeeper.client.PendingReadOp$LedgerEntryRequest.fail(PendingReadOp.java:176)
2022-07-20T15:59:48.794605613+00:00 stdout F at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:397)
2022-07-20T15:59:48.794605613+00:00 stdout F at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.logErrorAndReattemptRead(PendingReadOp.java:440)
- On the Bookies' side, we can see that the fence request is received by one Bookie (bookie-2) at 2022-07-20T14:48:11:
2-bookie-9c88965c-2022-07-20T14-49-58Z.log.gz:2022-07-20T14:48:11.157496389+00:00 stdout F 14:48:11,157 INFO Ledger fence request received for ledger: 3246 from address: /192.168.0.228:57782
2-bookie-9c88965c-2022-07-20T14-49-58Z.log.gz:2022-07-20T14:48:11.160872045+00:00 stdout F 14:48:11,160 INFO Ledger fence request received for ledger: 3246 from address: /192.168.0.228:57782
- And that ledger seems to be present in the metadata of ledgers, at least for some time (look for L3246):
2-bookie-9c88965c-2022-07-20T14-55-24Z.log.gz:2022-07-20T14:51:48.716729331+00:00 stdout F 14:51:48,716 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000 finished:false header:: 36287,8 replyHeader:: 36287,38654709412,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000,F response:: v{'L3195,'L3070,'L3071,'L3192,'L2784,'L3078,'L3199,'L3079,'L3076,'L3197,'L3077,'L3198,'L3081,'L0377,'L3089,'L3087,'L0372,'L3052,'L3173,'L3053,'L3295,'L3050,'L3171,'L3172,'L3293,'L3290,'L3170,'L3291,'L0463,'L3058,'L0464,'L3059,'L3056,'L3177,'L3057,'L3178,'L0461,'L3054,'L3175,'L3055,'L3176,'L3297,'L3063,'L3184,'L3064,'L3061,'L3182,'L3062,'L3183,'L3181,'L3069,'L3067,'L3188,'L3068,'L3065,'L3066,'L3310,'L1814,'L3317,'L3203,'L3324,'L3322,'L3202,'L3323,'L3200,'L0734,'L1705,'L0737,'L3209,'L0736,'L3207,'L3328,'L3208,'L3329,'L3205,'L0732,'L3327,'L0739,'L0738,'L3096,'L3094,'L3095,'L3092,'L3093,'L3091,'L0388,'L0702,'L0704,'L0703,'L2327,'L0709,'L0706,'L0705,'L0708,'L0707,'L0390,'L3302,'L2452,'L0713,'L0712,'L3308,'L0714,'L3306,'L3307,'L0711,'L0710,'L3305,'L3115,'L3236,'L3116,'L3237,'L0762,'L3113,'L3234,'L3114,'L3233,'L3230,'L3110,'L3231,'L0768,'L0404,'L0767,'L0769,'L0764,'L3119,'L0763,'L3238,'L0765,'L3118,'L0771,'L3126,'L1983,'L3127,'L3248,'L3245,'L3125,'L3246,'L3122,'L3243,'L3123,'L3244,'L3120,'L3000,'L0418,'L2957,'L3009,'L2955,'L3007,'L3214,'L0740,'L2485,'L3211,'L3330,'L0745,'L3218,'L0741,'L0744,'L3104,'L3105,'L3226,'L3102,'L3103,'L3224,'L3101,'L2130,'L0757,'L0756,'L2816,'L0759,'L0758,'L3108,'L3107,'L3030,'L3151,'L3031,'L3152,'L3273,'L3150,'L3038,'L3159,'L2984,'L3278,'L2982,'L3037,'L3279,'L3155,'L2980,'L3156,'L3277,'L2064,'L3032,'L3153,'L3274,'L3033,'L3154,'L3275,'L2989,'L0444,'L2988,'L0447,'L2985,'L2986,'L2073,'L3041,'L3162,'L3283,'L3042,'L3163,'L3160,'L3281,'L3040,'L3161,'L3280,'L0694,'L0696,'L2992,'L3047,'L3048,'L3169,'L3166,'L3167,'L0692,'L3043,'L3164,'L3165,'L2999,'L2996,'L3130,'L3137,'L3017,'L3014,'L3135,'L3015,'L3136,'L3012,'L3134,'L3010,'L3131,'L3252,'L3132,'L2969,'L2966,'L3018,'L3139,'L3019,'L3140,'L3020,'L3141,'L3148,'L3269,'L0671,'L2610,'L2973,'L3149,'L3025,'L2971,'L3147,'L3024,'L3145,'L3142,'L3143,'L2979,'L2735,'L2974,'L3029}
2-bookie-9c88965c-2022-07-20T14-55-24Z.log.gz:2022-07-20T14:51:48.736027964+00:00 stdout F 14:51:48,735 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 36521,4 replyHeader:: 36521,38654709412,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,F response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933a7c8210218ffffff9f92002833238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f6749641223133600,s{38654707364,38654708391,1658328325290,1658328491164,2,0,0,0,155,0,38654707364}
2-bookie-9c88965c-2022-07-20T14-55-52Z.log.gz:2022-07-20T14:55:21.070507213+00:00 stdout F 14:55:21,070 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000 finished:false header:: 36739,8 replyHeader:: 36739,38654709965,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000,F response:: v{'L3195,'L3070,'L3071,'L3192,'L2784,'L3078,'L3199,'L3079,'L3076,'L3197,'L3077,'L3198,'L3081,'L0377,'L3089,'L3087,'L0372,'L3052,'L3053,'L3295,'L3050,'L3171,'L3172,'L3293,'L3170,'L3291,'L0463,'L3058,'L0464,'L3059,'L3056,'L3177,'L3057,'L3178,'L0461,'L3054,'L3175,'L3055,'L3176,'L3063,'L3184,'L3064,'L3061,'L3182,'L3062,'L3183,'L3181,'L3069,'L3067,'L3188,'L3068,'L3065,'L3066,'L1814,'L3317,'L3203,'L3324,'L3322,'L3202,'L3323,'L3200,'L0734,'L1705,'L0737,'L3209,'L0736,'L3207,'L3208,'L3329,'L3205,'L0732,'L3327,'L0739,'L0738,'L3096,'L3094,'L3095,'L3092,'L3093,'L3091,'L0388,'L0702,'L0704,'L0703,'L2327,'L0709,'L0706,'L0705,'L0708,'L0707,'L0390,'L3302,'L2452,'L0713,'L0712,'L3308,'L0714,'L3306,'L3307,'L0711,'L0710,'L3305,'L3350,'L3115,'L3236,'L3116,'L3237,'L3358,'L0762,'L3113,'L3234,'L3114,'L3353,'L3233,'L3354,'L3230,'L3351,'L3110,'L3231,'L3352,'L0768,'L0404,'L0767,'L0769,'L0764,'L3119,'L0763,'L3238,'L0765,'L3118,'L3360,'L0771,'L3126,'L3368,'L1983,'L3248,'L3245,'L3366,'L3125,'L3246,'L3122,'L3243,'L3364,'L3123,'L3244,'L3365,'L3120,'L3000,'L3363,'L0418,'L2957,'L3009,'L2955,'L3007,'L3214,'L0740,'L2485,'L3211,'L3330,'L0745,'L3218,'L0741,'L0744,'L3338,'L3104,'L3346,'L3105,'L3226,'L3347,'L3102,'L3103,'L3224,'L3345,'L3101,'L2130,'L3341,'L0757,'L0756,'L2816,'L0759,'L0758,'L3108,'L3348,'L3107,'L3349,'L3030,'L3151,'L3031,'L3152,'L3273,'L3150,'L3390,'L3038,'L3159,'L2984,'L3278,'L2982,'L3037,'L3279,'L3155,'L2980,'L3156,'L3277,'L2064,'L3032,'L3153,'L3274,'L3033,'L3154,'L3275,'L2989,'L0444,'L2988,'L0447,'L2985,'L2986,'L2073,'L3041,'L3162,'L3042,'L3163,'L3160,'L3281,'L3040,'L3161,'L3280,'L0694,'L0696,'L2992,'L3047,'L3048,'L3169,'L3166,'L3167,'L0692,'L3043,'L3164,'L3165,'L2999,'L2996,'L3130,'L3372,'L3370,'L3137,'L3379,'L3017,'L3014,'L3135,'L3377,'L3015,'L3136,'L3378,'L3012,'L3375,'L3134,'L3010,'L3131,'L3252,'L3373,'L3132,'L3374,'L2969,'L2966,'L3018,'L3139,'L3019,'L3140,'L3382,'L3020,'L3141,'L3383,'L3380,'L3381,'L3148,'L3269,'L0671,'L2610,'L2973,'L3149,'L3025,'L3388,'L2971,'L3147,'L3389,'L3024,'L3387,'L3142,'L3384,'L3143,'L3385,'L2979,'L2735,'L2974,'L3029}
2-bookie-9c88965c-2022-07-20T14-55-52Z.log.gz:2022-07-20T14:55:21.090055084+00:00 stdout F 14:55:21,090 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 36970,4 replyHeader:: 36970,38654709965,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,F response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933a7c8210218ffffff9f92002833238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f6749641223133600,s{38654707364,38654708391,1658328325290,1658328491164,2,0,0,0,155,0,38654707364}
2-bookie-9c88965c-2022-07-20T14-55-52Z.log.gz:2022-07-20T14:55:21.095978533+00:00 stdout F 14:55:21,095 INFO Following ledgers: [3329, 771, 3076, 3077, 3081, 3338, 2064, 3089, 3346, 3091, 3092, 3348, 1814, 3094, 3350, 2327, 3095, 3351, 3096, 3352, 2073, 3354, 3101, 3102, 3104, 3105, 3107, 3108, 3114, 3115, 3116, 3118, 3119, 2610, 3122, 3123, 3125, 3130, 3131, 3132, 3134, 3135, 3136, 3139, 3140, 3141, 3143, 3147, 3148, 3150, 3151, 3152, 3153, 3155, 3156, 3159, 3160, 3162, 3163, 3164, 3165, 3166, 3167, 3169, 3171, 3172, 3175, 3176, 3177, 3182, 3183, 3184, 372, 3199, 3202, 388, 3205, 390, 3208, 3209, 3211, 2957, 3214, 2452, 3224, 3226, 2971, 3230, 671, 3231, 3233, 3236, 1705, 3243, 2988, 3244, 3245, 3246, 2735, 692, 2485, 694, 2999, 3000, 702, 447, 703, 704, 707, 708, 710, 3014, 712, 713, 3273, 714, 3274, 3277, 3278, 463, 3025, 3281, 3029, 3031, 3032, 3033, 3291, 732, 3037, 734, 3038, 3295, 736, 3040, 737, 3041, 738, 3042, 739, 3043, 740, 741, 3047, 744, 745, 3305, 3050, 3306, 3307, 3052, 3308, 3053, 3054, 3055, 3057, 3058, 756, 3061, 3317, 758, 759, 3063, 3064, 3065, 3066, 763, 3068, 765, 3069, 767, 3071, 3327] of bookie: [bookkeeper-bookie-0-28816] are identified as underreplicated
2-bookie-9c88965c-2022-07-20T15-00-53Z.log.gz:2022-07-20T14:56:19.912265378+00:00 stdout F 14:56:19,912 DEBUG Reading reply session id: 0x1016bac9b630005, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000 finished:false header:: 6,8 replyHeader:: 6,38654710371,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000,F response:: v{'L3195,'L3070,'L3071,'L3192,'L2784,'L3078,'L3199,'L3079,'L3076,'L3197,'L3077,'L3198,'L3081,'L3401,'L0377,'L3400,'L3089,'L3087,'L0372,'L3407,'L3408,'L3405,'L3406,'L3403,'L3404,'L3052,'L3053,'L3295,'L3050,'L3171,'L3172,'L3293,'L3170,'L3291,'L0463,'L3058,'L0464,'L3059,'L3056,'L3177,'L3057,'L3178,'L0461,'L3054,'L3175,'L3055,'L3176,'L3063,'L3184,'L3064,'L3061,'L3182,'L3062,'L3183,'L3181,'L3069,'L3067,'L3188,'L3068,'L3065,'L3066,'L1814,'L3317,'L3203,'L3324,'L3322,'L3202,'L3323,'L3200,'L0734,'L1705,'L0737,'L3209,'L0736,'L3207,'L3208,'L3205,'L0732,'L3327,'L0739,'L0738,'L3096,'L3094,'L3095,'L3092,'L3093,'L3091,'L3412,'L3413,'L0388,'L3410,'L3411,'L0702,'L0704,'L3418,'L0703,'L3419,'L2327,'L3416,'L3417,'L3414,'L3415,'L0709,'L0706,'L0705,'L0708,'L0707,'L0390,'L3302,'L3423,'L3424,'L3421,'L3422,'L2452,'L3420,'L0713,'L0712,'L3308,'L0714,'L3306,'L3307,'L0711,'L3425,'L0710,'L3305,'L3426,'L3350,'L3115,'L3236,'L3116,'L3237,'L3358,'L0762,'L3113,'L3234,'L3114,'L3353,'L3233,'L3354,'L3230,'L3351,'L3110,'L3231,'L3352,'L0768,'L0404,'L0767,'L0769,'L0764,'L3119,'L0763,'L3238,'L0765,'L3118,'L3360,'L0771,'L3126,'L3368,'L1983,'L3248,'L3245,'L3366,'L3125,'L3246,'L3122,'L3243,'L3364,'L3123,'L3244,'L3365,'L3120,'L3000,'L3363,'L0418,'L2957,'L3009,'L2955,'L3007,'L3214,'L0740,'L2485,'L3211,'L3330,'L0745,'L3218,'L0741,'L0744,'L3338,'L3104,'L3346,'L3105,'L3226,'L3347,'L3102,'L3103,'L3224,'L3345,'L3101,'L2130,'L3341,'L0757,'L0756,'L2816,'L0759,'L0758,'L3108,'L3348,'L3107,'L3349,'L3030,'L3151,'L3393,'L3031,'L3152,'L3273,'L3391,'L3150,'L3392,'L3038,'L3159,'L2984,'L3278,'L3399,'L2982,'L3037,'L3279,'L3155,'L2980,'L3156,'L3277,'L3398,'L2064,'L3032,'L3153,'L3274,'L3395,'L3033,'L3154,'L3275,'L3396,'L2989,'L0444,'L2988,'L0447,'L2985,'L2986,'L2073,'L3041,'L3162,'L3042,'L3163,'L3160,'L3281,'L3040,'L3161,'L3280,'L0694,'L0696,'L2992,'L3047,'L3048,'L3169,'L3166,'L3167,'L0692,'L3043,'L3164,'L3165,'L2999,'L2996,'L3130,'L3372,'L3137,'L3017,'L3014,'L3135,'L3015,'L3136,'L3378,'L3012,'L3134,'L3010,'L3131,'L3252,'L3132,'L3374,'L2969,'L2966,'L3018,'L3139,'L3019,'L3140,'L3382,'L3020,'L3141,'L3148,'L3269,'L0671,'L2610,'L2973,'L3149,'L3025,'L3388,'L2971,'L3147,'L3389,'L3024,'L3142,'L3384,'L3143,'L3385,'L2979,'L2735,'L2974,'L3029}
2-bookie-9c88965c-2022-07-20T15-00-53Z.log.gz:2022-07-20T14:56:19.932194546+00:00 stdout F 14:56:19,932 DEBUG Reading reply session id: 0x1016bac9b630005, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000 finished:false header:: 12,8 replyHeader:: 12,38654710371,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000,F response:: v{'L3195,'L3070,'L3071,'L3192,'L2784,'L3078,'L3199,'L3079,'L3076,'L3197,'L3077,'L3198,'L3081,'L3401,'L0377,'L3400,'L3089,'L3087,'L0372,'L3407,'L3408,'L3405,'L3406,'L3403,'L3404,'L3052,'L3053,'L3295,'L3050,'L3171,'L3172,'L3293,'L3170,'L3291,'L0463,'L3058,'L0464,'L3059,'L3056,'L3177,'L3057,'L3178,'L0461,'L3054,'L3175,'L3055,'L3176,'L3063,'L3184,'L3064,'L3061,'L3182,'L3062,'L3183,'L3181,'L3069,'L3067,'L3188,'L3068,'L3065,'L3066,'L1814,'L3317,'L3203,'L3324,'L3322,'L3202,'L3323,'L3200,'L0734,'L1705,'L0737,'L3209,'L0736,'L3207,'L3208,'L3205,'L0732,'L3327,'L0739,'L0738,'L3096,'L3094,'L3095,'L3092,'L3093,'L3091,'L3412,'L3413,'L0388,'L3410,'L3411,'L0702,'L0704,'L3418,'L0703,'L3419,'L2327,'L3416,'L3417,'L3414,'L3415,'L0709,'L0706,'L0705,'L0708,'L0707,'L0390,'L3302,'L3423,'L3424,'L3421,'L3422,'L2452,'L3420,'L0713,'L0712,'L3308,'L0714,'L3306,'L3307,'L0711,'L3425,'L0710,'L3305,'L3426,'L3350,'L3115,'L3236,'L3116,'L3237,'L3358,'L0762,'L3113,'L3234,'L3114,'L3353,'L3233,'L3354,'L3230,'L3351,'L3110,'L3231,'L3352,'L0768,'L0404,'L0767,'L0769,'L0764,'L3119,'L0763,'L3238,'L0765,'L3118,'L3360,'L0771,'L3126,'L3368,'L1983,'L3248,'L3245,'L3366,'L3125,'L3246,'L3122,'L3243,'L3364,'L3123,'L3244,'L3365,'L3120,'L3000,'L3363,'L0418,'L2957,'L3009,'L2955,'L3007,'L3214,'L0740,'L2485,'L3211,'L3330,'L0745,'L3218,'L0741,'L0744,'L3338,'L3104,'L3346,'L3105,'L3226,'L3347,'L3102,'L3103,'L3224,'L3345,'L3101,'L2130,'L3341,'L0757,'L0756,'L2816,'L0759,'L0758,'L3108,'L3348,'L3107,'L3349,'L3030,'L3151,'L3393,'L3031,'L3152,'L3273,'L3391,'L3150,'L3392,'L3038,'L3159,'L2984,'L3278,'L3399,'L2982,'L3037,'L3279,'L3155,'L2980,'L3156,'L3277,'L3398,'L2064,'L3032,'L3153,'L3274,'L3395,'L3033,'L3154,'L3275,'L3396,'L2989,'L0444,'L2988,'L0447,'L2985,'L2986,'L2073,'L3041,'L3162,'L3042,'L3163,'L3160,'L3281,'L3040,'L3161,'L3280,'L0694,'L0696,'L2992,'L3047,'L3048,'L3169,'L3166,'L3167,'L0692,'L3043,'L3164,'L3165,'L2999,'L2996,'L3130,'L3372,'L3137,'L3017,'L3014,'L3135,'L3015,'L3136,'L3378,'L3012,'L3134,'L3010,'L3131,'L3252,'L3132,'L3374,'L2969,'L2966,'L3018,'L3139,'L3019,'L3140,'L3382,'L3020,'L3141,'L3148,'L3269,'L0671,'L2610,'L2973,'L3149,'L3025,'L3388,'L2971,'L3147,'L3389,'L3024,'L3142,'L3384,'L3143,'L3385,'L2979,'L2735,'L2974,'L3029}
2-bookie-9c88965c-2022-07-20T15-00-53Z.log.gz:2022-07-20T14:56:32.051382550+00:00 stdout F 14:56:32,051 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 40548,4 replyHeader:: 40548,38654710417,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,F response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933a7c8210218ffffff9f92002833238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f6749641223133600,s{38654707364,38654708391,1658328325290,1658328491164,2,0,0,0,155,0,38654707364}
- But the ledger does not exist when we attempt to read from it:
0-bookie-adfe5832-2022-07-20T18-23-58Z.log.gz:2022-07-20T15:59:48.625362817+00:00 stdout F 15:59:48,625 INFO No ledger found while reading entry: 0 from ledger: 3246
2-bookie-9c88965c-2022-07-20T17-26-08Z.log.gz:2022-07-20T15:59:48.624934442+00:00 stdout F 15:59:48,624 INFO No ledger found while reading entry: 0 from ledger: 3246
...
2-bookie-9c88965c-2022-07-20T17-26-08Z.log.gz:2022-07-20T16:39:42.753841016+00:00 stdout F 16:39:42,753 INFO No ledger found while reading entry: 0 from ledger: 3246
In my understanding, from the Pravega point of view ledger 3246 is created and never explicitly deleted by Pravega itself, which means that there has been data loss, most likely induced by some corner case exposed via network packet loss. Under normal conditions, with no induced packet loss, things work fine. But that does not mean that a similar network failure could not happen in a real cluster and reproduce the same issue. Again, we have been doing fault injection tests with Bookkeeper 4.14 and probably previous versions as well (e.g., Bookkeeper 4.11), and this is the first time we encounter this behavior.
To Reproduce
Steps to reproduce the behavior:
- Write to Bookkeeper using Pravega (https://github.com/pravega/pravega).
- Introduce 30% network packet loss with LitmusChaos (see https://www.youtube.com/watch?v=EU_g8cKa1G8 for a demo on how to reproduce this experiment)
- Consistently, Pravega gets a No Such Ledger Exception in some containers using Bookkeeper 4.15 (with Bookkeeper 4.14, the system is able to work under such conditions).
Expected behavior
Bookkeeper 4.15 should behave like Bookkeeper 4.14 (and previous versions) in how ledgers are managed.
Screenshots
n/a
Additional context
At the moment, there are 2 changes that could be related to this issue (but I cannot confirm that they actually cause the problem):
- Upgrade of Zookeeper library from 3.6.x to 3.8.x.
- This PR in 4.15 and not in 4.14 seems related to how ledgers are managed: https://github.com/apache/bookkeeper/pull/3361
A couple of questions:
- is the ledger znode present on ZK ?
- which are your replication parameters: ES, WQ, AQ ?
- how many bookies do you have ?
another question:
- how many entries are supposed to be in the ledger ? only 1 ?
Thanks @eolivelli, reposting the answer in Slack for better visibility: https://pravega-io.slack.com/archives/C0151LSM46L/p1662115749173509
is the ledger znode present on ZK ?
The cluster is not in that state anymore, but if the Bookkeeper client gets a BKException$BKNoSuchLedgerExistsException, my understanding is that it should not be there. However, I see that ledger 3246 is created at 2022-07-20 14:45:25,706 and then we can see in the Bookie logs some queries on this ledger's metadata ZNodes:
2-bookie-9c88965c-2022-07-20T14-55-24Z.log.gz:2022-07-20T14:51:48.716729331+00:00 stdout F 14:51:48,716 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000 finished:false header:: 36287,8 replyHeader:: 36287,38654709412,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000,F response:: v{'L3195,'L3070,'L3071,'L3192,'L2784,'L3078,'L3199,'L3079,'L3076,'L3197,'L3077,'L3198,'L3081,'L0377,'L3089,'L3087,'L0372,'L3052,'L3173,'L3053,'L3295,'L3050,'L3171,'L3172,'L3293,'L3290,'L3170,'L3291,'L0463,'L3058,'L0464,'L3059,'L3056,'L3177,'L3057,'L3178,'L0461,'L3054,'L3175,'L3055,'L3176,'L3297,'L3063,'L3184,'L3064,'L3061,'L3182,'L3062,'L3183,'L3181,'L3069,'L3067,'L3188,'L3068,'L3065,'L3066,'L3310,'L1814,'L3317,'L3203,'L3324,'L3322,'L3202,'L3323,'L3200,'L0734,'L1705,'L0737,'L3209,'L0736,'L3207,'L3328,'L3208,'L3329,'L3205,'L0732,'L3327,'L0739,'L0738,'L3096,'L3094,'L3095,'L3092,'L3093,'L3091,'L0388,'L0702,'L0704,'L0703,'L2327,'L0709,'L0706,'L0705,'L0708,'L0707,'L0390,'L3302,'L2452,'L0713,'L0712,'L3308,'L0714,'L3306,'L3307,'L0711,'L0710,'L3305,'L3115,'L3236,'L3116,'L3237,'L0762,'L3113,'L3234,'L3114,'L3233,'L3230,'L3110,'L3231,'L0768,'L0404,'L0767,'L0769,'L0764,'L3119,'L0763,'L3238,'L0765,'L3118,'L0771,'L3126,'L1983,'L3127,'L3248,'L3245,'L3125,'L3246,'L3122,'L3243,'L3123,'L3244,'L3120,'L3000,'L0418,'L2957,'L3009,'L2955,'L3007,'L3214,'L0740,'L2485,'L3211,'L3330,'L0745,'L3218,'L0741,'L0744,'L3104,'L3105,'L3226,'L3102,'L3103,'L3224,'L3101,'L2130,'L0757,'L0756,'L2816,'L0759,'L0758,'L3108,'L3107,'L3030,'L3151,'L3031,'L3152,'L3273,'L3150,'L3038,'L3159,'L2984,'L3278,'L2982,'L3037,'L3279,'L3155,'L2980,'L3156,'L3277,'L2064,'L3032,'L3153,'L3274,'L3033,'L3154,'L3275,'L2989,'L0444,'L2988,'L0447,'L2985,'L2986,'L2073,'L3041,'L3162,'L3283,'L3042,'L3163,'L3160,'L3281,'L3040,'L3161,'L3280,'L0694,'L0696,'L2992,'L3047,'L3048,'L3169,'L3166,'L3167,'L0692,'L3043,'L3164,'L3165,'L2999,'L2996,'L3130,'L3137,'L3017,'L3014,'L3135,'L3015,'L3136,'L3012,'L3134,'L3010,'L3131,'L3252,'L3132,'L2969,'L2966,'L3018,'L3139,'L3019,'L3140,'L3020,'L3141,'L3148,'L3269,'L0671,'L2610,'L2973,'L3149,'L3025,'L2971,'L3147,'L3024,'L3145,'L3142,'L3143,'L2979,'L2735,'L2974,'L3029}
2-bookie-9c88965c-2022-07-20T14-55-24Z.log.gz:2022-07-20T14:51:48.736027964+00:00 stdout F 14:51:48,735 DEBUG Reading reply session id: 0x3016baeab7f000f, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 36521,4 replyHeader:: 36521,38654709412,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,F response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933a7c8210218ffffff9f92002833238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f6749641223133600,s{38654707364,38654708391,1658328325290,1658328491164,2,0,0,0,155,0,38654707364}
I don't know if the fact that this request queries /pravega/pravega/bookkeeper/ledgers/00/0000/L3246 and gets some actual payload in response means that it actually exists in Zookeeper.
which are your replication parameters: ES, WQ, AQ ?
In this experiment, the configuration is ensembleSize=2, writeQuorumSize=2, ackQuorumSize=2.
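For reference, this is a minimal sketch (ZooKeeper connect string and password are placeholders, not our actual configuration) of how a ledger with these parameters would be created through the classic BookKeeper client API:

import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerHandle;

public class CreateLedgerExample {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper connect string.
        BookKeeper bk = new BookKeeper("zookeeper-client:2181");
        // ensembleSize=2, writeQuorumSize=2, ackQuorumSize=2, as in this experiment.
        LedgerHandle lh = bk.createLedger(2, 2, 2, DigestType.CRC32, "placeholder-password".getBytes());
        System.out.println("Created ledger " + lh.getId());
        lh.close();
        bk.close();
    }
}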
how many bookies do you have ?
The Bookkeeper service is configured to keep 4 Bookies.
how many entries are supposed to be in the ledger ? only 1 ?
Pravega rolls over ledgers when they reach their max size (1GB by default) or when there is some container recovery (due to fencing). So this particular ledger had been created but contained 0 entries at the moment the issue happened. It would have contained many more entries had the issue not appeared.
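Purely to illustrate the rollover rule described above, a hypothetical sketch (RolloverPolicy and the constant name are made up for illustration, this is not Pravega's actual code):

final class RolloverPolicy {
    private static final long MAX_LEDGER_SIZE_BYTES = 1024L * 1024 * 1024; // 1 GB default

    // Roll over to a new ledger when the current one reaches its max size,
    // or when the Segment Container is recovering (e.g. the ledger was fenced).
    static boolean shouldRollOver(long currentLedgerLength, boolean containerRecovering) {
        return currentLedgerLength >= MAX_LEDGER_SIZE_BYTES || containerRecovering;
    }
}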
at a first glance it looks to me that N bookies are answering that they do not know the ledger
Could be; according to this log, the ledger should have been created using 2 Bookies:
2022-07-20T14:45:25.706687124+00:00 stdout F 2022-07-20 14:45:25,706 7043990 [ZKC-connect-executor-0-EventThread] INFO o.a.b.client.LedgerCreateOp - Ensemble: [bookkeeper-bookie-2-11580, bookkeeper-bookie-0-28816] for ledger: 3246
But in the Bookie logs, I mainly see 2-bookie-9c88965c with logs related to ledger 3246. The test inducing network drops may be exposing a potential issue that is hard to discover otherwise.
Not sure if it has something to do with this PR: ISSUE 3220 - Autorecovery does not process underreplicated empty ledgers, https://github.com/apache/bookkeeper/pull/3239. If network packet loss causes a bookie to be deemed unavailable, the ledgers on that unavailable bookie become underreplicated and those ledgers will be closed by the autorecovery service.
We found the same issue on the Pulsar side: https://github.com/apache/pulsar/issues/17520. We tested bk 4.15.0 and the unit test passed, but with bk 4.15.1 the unit test failed. I reverted PR https://github.com/apache/bookkeeper/pull/3239 in my local env, and the unit test still failed, with another exception.
2022-09-08T10:57:49,000 - ERROR - [ForkJoinPool.commonPool-worker-6:ServerCnx@1312] - Try add schema failed, remote address /127.0.0.1:65225, topic persistent://pulsar/ns/testReplicationWithSchema-54c4f655-5443-49d0-8ef2-c65eb8a70c74, producerId 1
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger - ledger=3 - operation=Failed to open ledger
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:332) ~[?:?]
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:347) ~[?:?]
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1141) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
at org.apache.pulsar.broker.service.schema.BookkeeperSchemaStorage.lambda$openLedger$40(BookkeeperSchemaStorage.java:596) ~[classes/:?]
at org.apache.bookkeeper.client.LedgerOpenOp.openComplete(LedgerOpenOp.java:248) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.client.LedgerOpenOp$1.safeOperationComplete(LedgerOpenOp.java:216) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.client.LedgerOpenOp$1.safeOperationComplete(LedgerOpenOp.java:206) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.util.OrderedGenericCallback.operationComplete(OrderedGenericCallback.java:62) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.proto.BookkeeperInternalCallbacks$TimedGenericCallback.operationComplete(BookkeeperInternalCallbacks.java:189) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.client.ReadOnlyLedgerHandle.lambda$recover$5(ReadOnlyLedgerHandle.java:295) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
at org.apache.bookkeeper.client.LedgerRecoveryOp.submitCallback(LedgerRecoveryOp.java:159) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.client.LedgerRecoveryOp.access$100(LedgerRecoveryOp.java:37) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.client.LedgerRecoveryOp$1.readLastConfirmedDataComplete(LedgerRecoveryOp.java:136) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.client.ReadLastConfirmedOp.readEntryComplete(ReadLastConfirmedOp.java:155) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion$1.readEntryComplete(PerChannelBookieClient.java:1880) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.lambda$errorOut$0(PerChannelBookieClient.java:1896) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.proto.PerChannelBookieClient$CompletionValue$1.safeRun(PerChannelBookieClient.java:1689) ~[bookkeeper-server-4.15.1.jar:4.15.1]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) ~[bookkeeper-common-4.15.1.jar:4.15.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.77.Final.jar:4.1.77.Final]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: org.apache.pulsar.broker.service.schema.exceptions.SchemaException: Error while recovering ledger - ledger=3 - operation=Failed to open ledger
at org.apache.pulsar.broker.service.schema.BookkeeperSchemaStorage.bkException(BookkeeperSchemaStorage.java:702) ~[classes/:?]
... 23 more
@eolivelli @zhaohaidao @hangc0276 We have done another set of failure injection tests to narrow down the issue. These are the results:
- Bookkeeper 4.14 server and Bookkeeper 4.14 client: Tests pass
- Bookkeeper 4.15 server and Bookkeeper 4.14 client: Tests pass
- Bookkeeper 4.15 server and Bookkeeper 4.15 client: Issue reproduced
- Bookkeeper 4.14 server and Bookkeeper 4.15 client: Issue reproduced
With this, I think we can narrow the regression down to the Bookkeeper client only, in Bookkeeper 4.15. Hopefully this helps us shortlist the set of PRs that could be related to this problem.
I have done some testing on the master branch.
I have created some tests that simulate a "timeout" while performing recovery operations, and in every case I tested the BK client returns BKLedgerRecoveryException.
After re-reading the logs in this issue, it seems that:
- the LAC on the ledger 3246 is at least 0 but no bookie has the entry
- in the same process (assuming that the logs come from the same JVM) we have a ReadOnlyLedgerHandle on ledger 3246 but we are also writing to the ledger, so we have a WriteHandle
I did a "git diff" between the branch-4.14 and master branches on the bookkeeper-server/src/main/java/org/apache/bookkeeper/client directory and I can't find any change that can be related to this problem.
@RaulGracia do you have "auto-recovery" running into your cluster?
Not sure if it has something to do with this PR: ISSUE 3220 - Autorecovery does not process underreplicated empty ledgers #3239. If network packet loss causes a bookie to be deemed unavailable, the ledgers on that unavailable bookie become underreplicated and those ledgers will be closed by the autorecovery service.
@zhaohaidao your idea is interesting.
The fact that the client tries to read from entry 0 even if the ledger is empty means that something CLOSED the ledger and set the LAC to 0 (and not -1).
I am saying that LAC should be at least 0 because of these logs:
2022-07-20T15:59:48.794605613+00:00 stdout F 2022-07-20 15:59:48,794 11507078 [core-1] ERROR i.p.s.server.logs.DurableLog - DurableLog[13] Recovery FAILED.
It seems (please @RaulGracia confirm) that DurableLog is scanning the ledger, and it receives an error during a "readEntries" operation. readEntries fails fast if you try to read past the local value of LAC, but here we see that the BK client is trying to perform the PendingReadOp, so locally LAC is >= 0.
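To make the LAC argument concrete, here is a minimal reader sketch with the classic API (connect string and password are placeholders): reading entry 0 is only attempted when the LAC obtained after open/recovery is >= 0, otherwise readEntries would fail fast.

import java.util.Enumeration;
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerEntry;
import org.apache.bookkeeper.client.LedgerHandle;

public class ReadAfterLacCheck {
    public static void main(String[] args) throws Exception {
        BookKeeper bk = new BookKeeper("zookeeper-client:2181"); // placeholder connect string
        // openLedger runs recovery and closes the ledger, fixing its LAC in the metadata.
        LedgerHandle lh = bk.openLedger(3246L, DigestType.CRC32, "placeholder-password".getBytes());
        long lac = lh.getLastAddConfirmed(); // -1 means the ledger is empty
        if (lac >= 0) {
            // Reading entry 0 only makes sense when LAC >= 0.
            Enumeration<LedgerEntry> entries = lh.readEntries(0, lac);
            while (entries.hasMoreElements()) {
                System.out.println("read entry " + entries.nextElement().getEntryId());
            }
        }
        lh.close();
        bk.close();
    }
}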
It is also interesting that the Writer tried to write entry 0:
Could not write Add request to bookie 192.169.1.60/192.169.1.60:3181 for ledger 3246, entry 0
but it wasn't able to even send the write request. That should lead to an "ensemble change" and to choosing a new bookie. If you have 4 bookies then we should see something more in the logs, and the operation should move forward: the BK client should have tried to write the same entry to another set of bookies. In the LedgerMetadata we should have multiple ensembles.
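One way to check that (a sketch only, connect string and password are placeholders) is to dump the ensembles recorded in the LedgerMetadata without fencing the ledger; after an ensemble change we would expect more than one entry in this map:

import java.util.Map;
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerHandle;

public class PrintEnsembles {
    public static void main(String[] args) throws Exception {
        BookKeeper bk = new BookKeeper("zookeeper-client:2181"); // placeholder connect string
        // openLedgerNoRecovery avoids fencing the ledger while we inspect it.
        LedgerHandle lh = bk.openLedgerNoRecovery(3246L, DigestType.CRC32, "placeholder-password".getBytes());
        // One map entry per ensemble: key = first entry id written to that ensemble.
        for (Map.Entry<Long, ?> e : lh.getLedgerMetadata().getAllEnsembles().entrySet()) {
            System.out.println("from entry " + e.getKey() + " -> " + e.getValue());
        }
        lh.close();
        bk.close();
    }
}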
@RaulGracia I saw the logs in step 2:
2022-07-20T14:45:47.426475848+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [ZKC-connect-executor-0-SendThread(zookeeper-client:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Reading reply session id: 0x2016bc13e0a001c, packet:: clientPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 serverPath:/pravega/pravega/bookkeeper/ledgers/00/0000/L3246 finished:false header:: 146,4 replyHeader:: 146,38654707423,0 request:: '/pravega/pravega/bookkeeper/ledgers/00/0000/L3246,T response:: #426f6f6b69654d65746164617461466f726d617456657273696f6e933affffff8c18210218020ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff12813238a19626f6f6b6b65657065722d626f6f6b69652d322d3131353830a19626f6f6b6b65657065722d626f6f6b69652d302d32383831361003834204825a16ab6170706c69636174696f6e127507261766567615a15af426f6f6b4b65657065724c6f674964122313360ffffffe2ffffffe5ffffffcdffffff8bffffff81ffffff93ffffffd8ffffff9c10,s{38654707364,38654707364,1658328325290,1658328325290,0,0,0,0,172,0,38654707364}
2022-07-20T14:45:47.426637438+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.b.client.PendingReadLacOp - Read LAC complete with enough validResponse for ledger: 3246 LAC: -1
2022-07-20T14:45:47.426737033+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [core-13] DEBUG o.a.b.meta.AbstractZkLedgerManager - Unregistered ledger metadata listener ReadOnlyLedgerHandle(lid = 3246, id = 672259987) on ledger 3246.
2022-07-20T14:46:16.612224510+00:00 stdout F 2022-07-20 14:46:16,612 7094896 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.b.proto.PerChannelBookieClient - Could not write Add request to bookie 192.169.1.60/192.169.1.60:3181 for ledger 3246, entry 0
2022-07-20T14:46:16.612271701+00:00 stdout F 2022-07-20 14:46:16,612 7094896 [BookKeeperClientWorker-OrderedExecutor-7-0] ERROR o.a.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L3246 E0
2022-07-20T14:46:16.616440402+00:00 stdout F 2022-07-20 14:46:16,616 7094900 [BookKeeperClientWorker-OrderedExecutor-7-0] DEBUG o.a.b.proto.PerChannelBookieClient - Could not write ReadLAC request to bookie null for ledger 3246, entry -1
There are two things I noticed:
- The ledger seems closed after it received the ReadLAC response.
Read LAC complete with enough validResponse for ledger: 3246 LAC: -1
2022-07-20T14:45:47.426737033+00:00 stdout F 2022-07-20 14:45:47,426 7065710 [core-13] DEBUG o.a.b.meta.AbstractZkLedgerManager - Unregistered ledger metadata listener ReadOnlyLedgerHandle(lid = 3246, id = 672259987) on ledger 3246
- The Bookie channel is null in the callback. That looks like, when the callback is triggered, the channel is already closed.
Could not write ReadLAC request to bookie null for ledger 3246, entry -1
And it looks like in Pravega the ledger handle is closed again, so the error happens again (closed in step 2 and closed again in step 3):
2022-07-20T14:46:16.630253410+00:00 stdout F 2022-07-20 14:46:16,629 7094913 [core-30] DEBUG o.a.bookkeeper.client.LedgerHandle - Failed to close ledger 3246 :
2022-07-20T14:46:16.630795088+00:00 stdout F 2022-07-20 14:46:16,630 7094914 [core-30] ERROR i.p.s.s.i.bookkeeper.BookKeeperLog - Log[13]: Unable to close LedgerHandle for Ledger 3246.
2022-07-20T14:46:16.630795088+00:00 stdout F io.pravega.segmentstore.storage.DurableDataLogException: Unable to close ledger 3246.
I saw in the previous comments that you are using 4.14.1, and I found there is a change related to this which was released in 4.14.3.
Could you please also run the test against 4.14.1 and 4.14.3?
cc @hangc0276 @eolivelli
@zymap thanks a lot for your analysis. Yes, you are right, we are using 4.14.2 concretely, as the last commit we have from mainline is . And we do not have that PR #2794 in the version with which Bookkeeper works well in our fault injection tests. Maybe we could try another test with the latest 4.14 commit, to see if the same issue is reproduced.
@RaulGracia is there any chance that the lossy network caused some kind of split brain or data inconsistency on the ZK side? e.g. I recently dealt with a situation where, out of 5 ZK nodes, one was missing a specific znode, so the clients that were lucky enough to read from that node didn't see the data. Can you browse the BK metadata using zkCli on all zk nodes to confirm that the metadata for the missing ledger is either present (and the same) or missing on all nodes?
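For the per-node check, something along these lines should work (or simply zkCli get on each server; the hostnames below are placeholders for the individual ZK servers). Connecting to each server separately shows whether they all agree on the ledger znode:

import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class CheckLedgerZnodePerServer {
    public static void main(String[] args) throws Exception {
        String path = "/pravega/pravega/bookkeeper/ledgers/00/0000/L3246";
        // Placeholder hostnames: one connection per ZK server so we can compare their views.
        String[] servers = {"zookeeper-0:2181", "zookeeper-1:2181", "zookeeper-2:2181"};
        for (String server : servers) {
            ZooKeeper zk = new ZooKeeper(server, 30000, event -> { });
            try {
                Stat stat = zk.exists(path, false);
                System.out.println(server + " -> " + (stat == null ? "MISSING" : "present, mzxid=" + stat.getMzxid()));
            } finally {
                zk.close();
            }
        }
    }
}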
@dlg99 thanks Andrey. While this situation can indeed happen (especially when injecting failures at the network level), my understanding is that it would happen with any version of Bookkeeper, right? However, our observation is that when executing this kind of test, only Pravega using the Bookkeeper 4.15 client gets into this situation where ledgers that are supposed to be there are not; when using Bookkeeper 4.14.2, the issue does not reproduce. Note that we have done this test many times with Bookkeeper 4.14.2, while the few times we executed it with the Bookkeeper 4.15 client we always got into that situation.
We are currently executing the same test using Bookkeeper client 4.14.3, to validate the hypothesis of @zymap https://github.com/apache/bookkeeper/issues/3466#issuecomment-1251816693
@RaulGracia My thought process was along the lines of updated ZK client making the problem easier to repro if that's ZK related. I experimented locally and tried to inject errors in unit tests but had no luck.
FYI @zymap @eolivelli @dlg99 we have performed the same test again with the Bookkeeper 4.14.3 client as suggested by @zymap, but we have not been able to reproduce the issue this time. So, at the moment, with the evidence we have, we cannot say that #2794 is the root cause of the problem.
@RaulGracia I am trying to wrap my head around how https://github.com/apache/bookkeeper/pull/2794 could result in a missing ledger handle.
One thing off with the PR is that it calls the blocking lh.close() from the callback running on the main threadpool. AFAICT there are no deadlocks because whatever else happens runs on the ZK side, but ZK can be quite slow because of the injected network errors. So here is the proposed change: https://github.com/apache/bookkeeper/pull/3513 (waiting for the tests to finish up)
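For illustration, the non-blocking variant would look roughly like this (a sketch only, not the actual change in #3513): instead of calling the blocking close() on the callback thread, the close is handed off via asyncClose and the result handled in another callback.

import org.apache.bookkeeper.client.LedgerHandle;

public class NonBlockingCloseSketch {
    // Sketch: avoid blocking the BookKeeper client's ordered executor (which can stall
    // when ZooKeeper is slow) by closing the handle asynchronously.
    static void closeWithoutBlocking(LedgerHandle lh) {
        lh.asyncClose((rc, handle, ctx) -> {
            if (rc != 0) {
                System.err.println("Failed to close ledger " + handle.getId() + ", rc=" + rc);
            }
        }, null);
    }
}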
Could the lost ledger handle be caused by the way Pravega handles errors in ledger open?
Thanks a lot @dlg99, I can certainly review #3513. But after the latest experiments we did with Bookkeeper 4.14.3, which includes #2794, we couldn't reproduce the issue, so I'm not sure that that PR is contributing to the issue described here. In any case, that is independent of the value that #3513 brings by itself.
Could the lost ledger handle be caused by the way Pravega handles errors in ledger open?
The point is that the very same Pravega code has no issues using the Bookkeeper 4.14.2 client while injecting network failures, and we consistently get issues like the one described here when upgrading to the Bookkeeper 4.15 client. For this reason, I thought that the problem may be related to some change in the Bookkeeper client within branch-4.15.
I think that, to rule out that the issue is present in the whole branch-4.14, we will reproduce the fault injection tests with the latest 4.14 release; if the issue cannot be reproduced, we know that branch-4.14 is fine in this regard (and we do not need to inspect cherry-picked PRs). If the issue is reproduced, the number of commits to analyze would be much smaller.
I am working on this. I am preparing a set of Pravega docker images to pass to Dell-EMC Quality Eng team who can run the tests.
The docker images contain a build of Pravega compatible with BK 4.15 (thanks to @RaulGracia) but with the BK client compiled at different points in the git history of 4.15.0.
This way we will see which commit broke the compatibility.
It will take a lot of time because those system tests take at least 24h before failing.
To wrap up this issue: we have done fault injection tests of Pravega with several Bookkeeper client versions and we have not been able to reproduce the original issue (see details here). We have also upgraded Pravega master (future 0.14 release) to Bookkeeper 4.15.3. If the issue is reproduced again, we will reopen the issue.
Hi all, I think we could have another reproduction of this problem, this time using Bookkeeper 4.14.2 client/server (therefore, the issue reported here may be present in other versions of Bookkeeper, not only in Bookkeeper 4.15).
This is the information I have been able to collect so far (the missing ledger this time is 3538):
- Ledger is created correctly:
2022-12-05 15:21:11,541 8491908 [core-23] INFO i.p.s.s.i.bookkeeper.BookKeeperLog - Log[29]: Created Ledger 3538.
- Pravega can write and process properly up to entry 52:
2022-12-05 15:26:53,197 8833564 [BookKeeperClientWorker-OrderedExecutor-2-0] DEBUG o.a.bookkeeper.client.PendingAddOp - Submit callback (lid:3538, eid: 52). rc:0
2022-12-05 15:26:53,197 8833564 [core-17] DEBUG i.p.s.s.logs.OperationProcessor - OperationProcessor[29]: CommitSuccess (TxnId = 51, LastFullySerializedSN = 505834, LastStartedSN = 505834, Address = Sequence = 17179869235, LedgerId = 3538, EntryId = 51, Length = 35244).
2022-12-05 15:26:53,197 8833564 [core-28] DEBUG i.p.s.s.logs.OperationProcessor - OperationProcessor[29]: CommitSuccess (TxnId = 52, LastFullySerializedSN = 506120, LastStartedSN = 506120, Address = Sequence = 17179869236, LedgerId = 3538, EntryId = 52, Length = 783197).
- After that, we see that, probably due to the faults induced by the network fault-injection test, there is an ensemble change that impacts entry 53 onwards for that ledger:
2022-12-05 15:27:07,801 8848168 [BookKeeperClientWorker-OrderedExecutor-2-0] DEBUG o.a.bookkeeper.client.LedgerHandle - [EnsembleChange(ledger:3538, change-id:0000000001)][attempt:1] changing ensemble from: [bookkeeper-bookie-2-13057, bookkeeper-bookie-1-21445] to: [bookkeeper-bookie-2-13057, bookkeeper-bookie-0-27781] starting at entry: 53
- We keep getting some errors and acknowledgements writing entries from 53 onwards, until we get this error:
2022-12-05 15:27:15,431 8855798 [ZKC-connect-executor-0-EventThread] ERROR o.a.b.client.MetadataUpdateLoop - UpdateLoop(ledgerId=3538,loopId=1ec0500f) Error writing metadata to store
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /pravega/pravega/bookkeeper/ledgers/00/0000/L3538
2022-12-05 15:27:15,432 8855799 [BookKeeperClientWorker-OrderedExecutor-2-0] WARN o.a.bookkeeper.client.LedgerHandle - [EnsembleChange(ledger:3538, change-id:0000000001)][attempt:1] Exception changing ensemble
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /pravega/pravega/bookkeeper/ledgers/00/0000/L3538
2022-12-05 15:27:15,433 8855800 [BookKeeperClientWorker-OrderedExecutor-2-0] ERROR o.a.bookkeeper.client.LedgerHandle - Closing ledger 3538 due to ZKException: Error while using ZooKeeper
- After that, the zNode for that very same ledger 3538 seems to be deleted:
2022-12-05 15:27:19,429 8859796 [ZKC-connect-executor-0-SendThread(zookeeper-client:2181)] DEBUG org.apache.zookeeper.ClientCnxn - Got WatchedEvent state:SyncConnected type:NodeDeleted path:/pravega/pravega/bookkeeper/ledgers/00/0000/L3538 for session id 0x3004cfb266c0003
2022-12-05 15:27:19,429 8859796 [ZKC-connect-executor-0-EventThread] DEBUG o.a.b.meta.AbstractZkLedgerManager - Received watched event WatchedEvent state:SyncConnected type:NodeDeleted path:/pravega/pravega/bookkeeper/ledgers/00/0000/L3538 from zookeeper based ledger manager.
2022-12-05 15:27:19,429 8859796 [ZKC-connect-executor-0-EventThread] DEBUG o.a.b.meta.AbstractZkLedgerManager - No ledger metadata listeners to remove from ledger 3538 after it's deleted.
- And it seems deleted because, after that, we just see NoNodeException when we try to access the Zookeeper metadata of this ledger:
2022-12-05 15:27:23,218 8863585 [ZKC-connect-executor-0-EventThread] DEBUG o.a.b.meta.AbstractZkLedgerManager - No such ledger: 3538
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /pravega/pravega/bookkeeper/ledgers/00/0000/L3538
2022-12-05 15:27:23,218 8863585 [ZKC-connect-executor-0-EventThread] ERROR o.a.b.client.MetadataUpdateLoop - UpdateLoop(ledgerId=3538,loopId=4e9e5246) Failed to read metadata from store
2022-12-05 15:27:23,219 8863586 [core-19] ERROR i.p.s.s.i.bookkeeper.BookKeeperLog - Log[29]: Unable to close LedgerHandle for Ledger 3538.
io.pravega.segmentstore.storage.DurableDataLogException: Unable to close ledger 3538.
Once the issue was reported, I tried to dig a little more with the Pravega and Bookkeeper CLIs:
- From the Pravega side, if we inspect BookkeeperLog[29], we clearly see that ledger 3538 is not accessible:
> bk details 29
{
"key": "log_summary",
"value": {
"logId": 29,
"epoch": 9,
"version": 511,
"enabled": true,
"ledgers": 4,
"truncation": "Sequence = 12884902323, LedgerId = 3491, EntryId = 435"
}
}
{
"key": "ledger_details",
"value": {
"ledger": 3491,
"seq": 3,
"status": "NotEmpty",
"lac": 971,
"length": 0,
"numBookies": 3,
"numFragments": 2,
"ensembleSize": 2,
"writeQuorumSize": 2,
"ackQuorumSize": 2,
"ensembles": "0: [bookkeeper-bookie-0-27781,bookkeeper-bookie-1-21445],972: [bookkeeper-bookie-3-9007,bookkeeper-bookie-1-21445]"
}
}
Exception executing BK details command: Unable to open-read ledger 3538.
Ledger 3538: Seq = 4, Status = NotEmpty. BK: Unable to open-read ledger 3538.
{
"key": "ledger_details",
"value": {
"ledger": 3720,
"seq": 8,
"status": "Empty",
"lac": -1,
"length": 0,
"numBookies": 2,
"numFragments": 1,
"ensembleSize": 2,
"writeQuorumSize": 2,
"ackQuorumSize": 2,
"ensembles": "0: [bookkeeper-bookie-2-13057,bookkeeper-bookie-3-9007]"
}
}
{
"key": "ledger_details",
"value": {
"ledger": 3738,
"seq": 9,
"status": "Unknown",
"lac": -1,
"length": 0,
"numBookies": 2,
"numFragments": 1,
"ensembleSize": 2,
"writeQuorumSize": 2,
"ackQuorumSize": 2,
"ensembles": "0: [bookkeeper-bookie-0-27781,bookkeeper-bookie-3-9007]"
}
}
- Using the Bookkeeper CLI, if we list all the ledgers in the system, we can see ledger 3538:
bin/bookkeeper shell listledgers
...
17:39:21,474 INFO ledgerID: 3537
17:39:21,474 INFO ledgerID: 3538 <<<<<< HERE IT IS
17:39:21,474 INFO ledgerID: 3539
...
- However, it is not possible to get its contents or metadata:
===== LEDGER: 3538 =====
master key : 850bf1071c5e3d8c24235676f8816ae0cbe2f14f
size : 0
entries : 0
isFenced : true
===== LEDGER: 3538 =====
bin/bookkeeper shell readledger -ledgerid 3538
17:37:18,747 INFO Successfully connected to bookie: bookkeeper-bookie-1-21445 [id: 0x45c77aa8, L:/10.128.0.116:33870 - R:10.129.0.171/10.129.0.171:3181]
17:37:18,748 INFO connection [id: 0x45c77aa8, L:/10.128.0.116:33870 - R:10.129.0.171/10.129.0.171:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
17:37:18,755 ERROR Read of ledger entry failed: L3538 E0-E0, Sent to [bookkeeper-bookie-2-13057, bookkeeper-bookie-1-21445], Heard from [] : bitset = {}, Error = 'No such ledger exists on Bookies'. First unread entry is (-1, rc = null)
17:37:18,761 ERROR Error reading entry 0 from ledger 3538
org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException: No such ledger exists on Bookies
at org.apache.bookkeeper.client.SyncCallbackUtils.finish(SyncCallbackUtils.java:83) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-build-223.jar:4.14.3-build-223]
bin/bookkeeper shell ledgermetadata -ledgerid 3538
17:42:00,323 WARN An exception was thrown while closing send thread for session 0x2004d6096890377.
org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x2004d6096890377, likely server has closed socket
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77) ~[org.apache.zookeeper-zookeeper-3.6.2.jar:3.6.2]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[org.apache.zookeeper-zookeeper-3.6.2.jar:3.6.2]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1275) [org.apache.zookeeper-zookeeper-3.6.2.jar:3.6.2]
Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: java.util.concurrent.ExecutionException: org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsOnMetadataServerException: No such ledger exists on Metadata Server
I still don't know how inducing delays at the network level in the pods can, very rarely, lead to this situation. I'm also attaching the relevant log lines from the Pravega Segment Store for ledger 3538 (ledger-3538.log).