Fix IllegalArgumentException("inconsistent range") thrown by ConcurrentSkipListSet in ReplicationWorker
Motivation
When the content of LedgerMetadata is as follows:
Versioned(value=LedgerMetadata{formatVersion=3, ensembleSize=3, writeQuorumSize=3, ackQuorumSize=2, state=CLOSED, length=42, lastEntryId=1, digestType=CRC32C, password=base64:, ensembles={0=[10.167.101.44:3181, 10.145.144.76:3181, 10.145.136.51:3181], 1=[10.170.112.33:3181, 10.170.140.51:3181, 10.170.92.28:3181], 2=[10.171.7.2:3181, 10.172.180.82:3181, 10.172.149.89:3181]}, customMetadata={component=base64:bWFuYWdlZC1sZWRnZXI=, pulsar/managed-ledger=base64:cHVibGljL2RlZmF1bHQvcGVyc2lzdGVudC9kaWNoYXRfcHJvZF9ldmVudF9sb2ctcGFydGl0aW9uLTI=, pulsar/cursor=base64:Y2dfZGljaGF0X3Byb2RfZXZlbnRfbG9n, application=base64:cHVsc2Fy}}, version=13)
- The ledger is closed.
- The firstEntryId of the last fragment is 2, but the lastEntryId of the ledger is 1.
- All bookies of the first fragment are offline (10.167.101.44:3181, 10.145.144.76:3181, 10.145.136.51:3181), so reading the entry with entryId=0 will fail.
- One bookie of the last fragment is offline, for example 10.171.7.2:3181.
Given this state, ReplicationWorker will try to replicate the first and the last fragments. The tryReadingFaultyEntries function is called before replicating:
boolean tryReadingFaultyEntries(LedgerHandle lh, LedgerFragment ledgerFragment)
After replication of the first fragment fails, that fragment is skipped. At this point, the value of unableToReadEntriesForReplication is <ledgerId=0, entryIdsUnableToRead=<0>>.
When the last fragment is replicated, tryReadingFaultyEntries throws an IllegalArgumentException("inconsistent range"), which in turn causes the ReplicationWorker thread to exit.
The log is as follows:
2025-02-05 20:16:16,041 [ DEBUG ] ReplicationWorker - Founds fragments [Fragment(LedgerID: 449738, FirstEntryID: 0[0], LastKnownEntryID: 0[0], Host: [10.145.136.51:3181, 10.167.101.44:3181, 10.145.144.76:3181], Closed: true), Fragment(LedgerID: 449738, FirstEntryID: 1[1], LastKnownEntryID: 1[1], Host: [10.170.92.28:3181, 10.170.112.33:3181, 10.170.140.51:3181], Closed: true), Fragment(LedgerID: 449738, FirstEntryID: 2[-1], LastKnownEntryID: 1[-1], Host: [10.171.7.2:3181, 10.172.149.89:3181, 10.172.180.82:3181], Closed: true)] for replication from ledger: 449738
From the log, we can see that the FirstEntryID of the last fragment is greater than the LastKnownEntryID, which will cause the subSet function of the ConcurrentSkipListSet class to throw an IllegalArgumentException.
The exception log is as follows:
2025-02-05 16:51:23,786 [ ERROR ] BookieThread - Uncaught exception in thread ReplicationWorker java.lang.IllegalArgumentException: inconsistent range at java.util.concurrent.ConcurrentSkipListMap$SubMap.<init>(ConcurrentSkipListMap.java:2404) ~[?:?] at java.util.concurrent.ConcurrentSkipListMap.subMap(ConcurrentSkipListMap.java:1884) ~[?:?] at java.util.concurrent.ConcurrentSkipListSet.subSet(ConcurrentSkipListSet.java:416) ~[?:?] at org.apache.bookkeeper.replication.ReplicationWorker.tryReadingFaultyEntries(ReplicationWorker.java:316)
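The failure can be reproduced outside BookKeeper with a few lines of plain Java. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the four-argument subSet overload is the one involved. It shows that subSet accepts an equal lower and upper bound but rejects a lower bound greater than the upper bound, which is the case for the last fragment (FirstEntryID 2, LastKnownEntryID 1).

```java
import java.util.concurrent.ConcurrentSkipListSet;

public class InconsistentRangeDemo {
    // Hypothetical helper: returns true if subSet rejects the bounds
    // [first, last] with an IllegalArgumentException.
    static boolean subSetThrows(long first, long last) {
        // Stand-in for the set of entry ids that could not be read.
        ConcurrentSkipListSet<Long> unreadable = new ConcurrentSkipListSet<>();
        unreadable.add(0L);
        try {
            unreadable.subSet(first, true, last, true);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when first > last ("inconsistent range").
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(subSetThrows(0L, 0L)); // first fragment: valid range
        System.out.println(subSetThrows(2L, 1L)); // last fragment: inverted range
    }
}
```

Because the exception is unchecked and nothing catches it in the worker loop, it propagates to the thread's uncaught exception handler, as the log above shows.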
Changes
When the fragment's FirstEntryID is greater than LastKnownEntryID, tryReadingFaultyEntries directly returns true.
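A minimal sketch of the guard, under stated assumptions: the class and method names below are hypothetical and this is not the actual ReplicationWorker code. It only illustrates checking the bounds before calling subSet, treating an inverted range as "no known unreadable entries in this fragment".

```java
import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class EmptyFragmentGuard {
    // Hypothetical helper: returns true when no entry id in [first, last]
    // is recorded as unreadable.
    static boolean noKnownUnreadableEntries(ConcurrentSkipListSet<Long> unreadable,
                                            long first, long last) {
        if (first > last) {
            // Inverted range: the fragment covers no entries, so return
            // early instead of letting subSet throw.
            return true;
        }
        NavigableSet<Long> inFragment = unreadable.subSet(first, true, last, true);
        return inFragment.isEmpty();
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<Long> unreadable = new ConcurrentSkipListSet<>();
        unreadable.add(0L); // entry 0 could not be read earlier
        System.out.println(noKnownUnreadableEntries(unreadable, 0L, 0L));
        System.out.println(noKnownUnreadableEntries(unreadable, 1L, 1L));
        System.out.println(noKnownUnreadableEntries(unreadable, 2L, 1L));
    }
}
```

With the guard in place, the inverted range of the last fragment no longer reaches subSet, so the worker thread is not terminated by that fragment.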
Can we fix the wrong fragment in getUnderreplicatedFragments instead of guarding against it here? I don't know how such fragments showed up.
getUnderreplicatedFragments returns this fragment (index=2). When the admin.replicateLedgerFragment method processes it, it skips ahead and tries to repair the fragment's ensemble. The next time the ledger is replicated, getUnderreplicatedFragments will not return this fragment (index=2), because the fragment no longer contains offline bookies.
The fragment (index=2) should be replicated successfully even if the fragment (index=0) has not been replicated successfully.