
Failed to restart cluster with old "untouched" tablespaces

Open diegosalvi opened this issue 2 years ago • 2 comments

BUG REPORT

  • Please describe the issue you observed:

I have a long-running cluster with many tablespaces. Some of them aren't really used (no transactions, no data changes). They are checkpointed every 15 minutes with the same log position (no new changes). After a reboot, the cluster couldn't boot those unused, unchanged tablespaces anymore, stuck in a recovery/fail loop:

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: Actual ledgers list:LedgersInfo{activeLedgers=[4072702], firstLedger=374, zkVersion=51} tableSpace q73 (e415f271141e4420a622eeade68b30d1)

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: recovery from latest snapshotSequenceNumber:(4072702,2) tableSpace q73 (e415f271141e4420a622eeade68b30d1), node pulcino, fencing false

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog isRecoveryAvailable
INFO: Actual ledgers list:LedgersInfo{activeLedgers=[4072702], firstLedger=374, zkVersion=51} tableSpace q73 (e415f271141e4420a622eeade68b30d1)

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: Ledger 4072702: CLOSED {0=[rzv-es03-mass.rozzano.diennea.lan:3181]} created by pulcino, LastEntryId 145,768 Length 2,915,380

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: Tablespace q73 (e415f271141e4420a622eeade68b30d1), Recovering from ledger 4072702, first=2 lastAddConfirmed=145768 written by pulcino

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: q73 (e415f271141e4420a622eeade68b30d1) From entry 2, to entry 1,002 (0 %) read time 16 ms

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: q73 (e415f271141e4420a622eeade68b30d1) From entry 1,003, to entry 2,003 (0.687 %) read time 16 ms

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: q73 (e415f271141e4420a622eeade68b30d1) From entry 2,004, to entry 3,004 (1.373 %) read time 15 ms
  
21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: q73 (e415f271141e4420a622eeade68b30d1) From entry 3,005, to entry 4,005 (2.06 %) read time 15 ms

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: q73 (e415f271141e4420a622eeade68b30d1) From entry 4,006, to entry 5,006 (2.747 %) read time 15 ms

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
INFO: q73 (e415f271141e4420a622eeade68b30d1) From entry 5,007, to entry 6,007 (3.434 %) read time 15 ms

21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog recovery
SEVERE: Fatal error during recovery of q73 (e415f271141e4420a622eeade68b30d1)
org.apache.bookkeeper.client.BKException$BKReadException: Error while reading ledger
        at org.apache.bookkeeper.client.BKException.create(BKException.java:62)
        at org.apache.bookkeeper.client.PendingReadOp.submitCallback(PendingReadOp.java:640)
        at org.apache.bookkeeper.client.PendingReadOp$LedgerEntryRequest.fail(PendingReadOp.java:171)
        at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:393)
        at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.logErrorAndReattemptRead(PendingReadOp.java:436)
        at org.apache.bookkeeper.client.PendingReadOp.readEntryComplete(PendingReadOp.java:582)
        at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion$1.readEntryComplete(PerChannelBookieClient.java:1836)
        at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleReadResponse(PerChannelBookieClient.java:1917)
        at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleV3Response(PerChannelBookieClient.java:1892)
        at org.apache.bookkeeper.proto.PerChannelBookieClient$3.safeRun(PerChannelBookieClient.java:1447)
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)


21-07-26-14-29-46       herddb.cluster.BookkeeperCommitLog Jul 26, 2021 2:29:46 PM herddb.cluster.BookkeeperCommitLog close
SEVERE: closed
  • What did you do? Fortunately, in this case the data wasn't needed anymore, so I just dropped the tablespaces manually (to avoid log pollution and log size growth).

  • What did you expect to see? The server boots the unchanged tablespaces

  • What did you see instead? The activator still attempted to boot those tablespaces, every time failing at some point during "recovery" with a missing ledger.

Please note that there was no need for any recovery: all data was fully checkpointed, with no pending actions or transactions in the log.
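
For context, the "recovery" here just replays the entries of the ledger referenced by the tablespace metadata. Below is a minimal standalone sketch of that kind of read using the plain BookKeeper client (the ZooKeeper address, ledger password and class name are placeholders, this is not HerdDB's actual recovery code): when no bookie can serve the entries anymore, the read fails exactly like in the stack trace above.

    import java.util.Enumeration;

    import org.apache.bookkeeper.client.BKException;
    import org.apache.bookkeeper.client.BookKeeper;
    import org.apache.bookkeeper.client.BookKeeper.DigestType;
    import org.apache.bookkeeper.client.LedgerEntry;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class ReplayLedgerSketch {

        public static void main(String[] args) throws Exception {
            // connect to the ZooKeeper ensemble used by the cluster (placeholder address)
            BookKeeper bk = new BookKeeper("localhost:2181");
            // open the ledger recorded in the tablespace metadata (placeholder password)
            LedgerHandle lh = bk.openLedgerNoRecovery(4072702L, DigestType.CRC32, "password".getBytes());
            try {
                // replay the entries from the snapshot position to the tail of the ledger;
                // if no bookie can serve them anymore this throws BKException$BKReadException,
                // which is what aborts the tablespace boot in the log above
                Enumeration<LedgerEntry> entries = lh.readEntries(2, lh.getLastAddConfirmed());
                while (entries.hasMoreElements()) {
                    entries.nextElement(); // a real recovery would apply each entry here
                }
            } finally {
                lh.close();
                bk.close();
            }
        }
    }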

diegosalvi · Jul 28 '21 13:07

I think we should avoid this issue altogether... It makes no sense to attempt a recovery from a very old position in time when there is really nothing to recover. We should "advance" the log position in the metadata to a recent position every time we make a checkpoint. With no new data we should try to advance to the next "current" position (we just have to be sure not to skip a concurrent incoming update/insert/delete).
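
A rough sketch of that idea, with purely illustrative names (not HerdDB's actual classes): the checkpoint takes the same lock that serializes writes, so advancing the recorded position can never skip a concurrent incoming change.

    import java.util.concurrent.locks.ReentrantLock;

    public class AdvanceCheckpointSketch {

        // minimal stand-in for a commit log position (ledger id + offset)
        record LogPosition(long ledgerId, long offset) {
            boolean after(LogPosition other) {
                return ledgerId > other.ledgerId
                        || (ledgerId == other.ledgerId && offset > other.offset);
            }
        }

        private final ReentrantLock writeLock = new ReentrantLock();
        private LogPosition checkpointPosition = new LogPosition(374, 0);

        // invoked by the periodic checkpoint (every 15 minutes in the report above)
        void checkpoint(LogPosition currentLogTail) {
            // hold the same lock that serializes writes, so we never record a
            // position that skips a concurrent incoming update/insert/delete
            writeLock.lock();
            try {
                if (!currentLogTail.after(checkpointPosition)) {
                    return; // the log did not move, nothing to advance
                }
                flushDataPages();                      // persist dirty pages as usual
                checkpointPosition = currentLogTail;   // advance the position in the metadata
                dropLedgersBefore(currentLogTail);     // older ledgers are no longer needed
            } finally {
                writeLock.unlock();
            }
        }

        private void flushDataPages() { /* write dirty pages to the data store */ }

        private void dropLedgersBefore(LogPosition pos) { /* trim obsolete ledgers */ }
    }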

diegosalvi · Jul 28 '21 14:07

I cannot understand this problem, it should not happen by design and we should have test cases. What is the error on the bookie? Did you drop some bookie without "decommissioning" it?

We are not deleting the "active ledgers", so the Leader is always able to recover. If a follower has an old snapshot, it downloads a fresh snapshot from the Leader.
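
For reference, a tiny sketch of the boot behaviour described above (illustrative names only, not HerdDB's actual code); the first branch is the path that failed in the log of this report, because the ledger was still listed in the active ledgers but its entries could no longer be read.

    import java.util.List;

    public class BootDecisionSketch {

        enum Mode { LEADER, FOLLOWER }

        // snapshotLedgerId is the ledger referenced by the local snapshot,
        // activeLedgers is the list kept in the tablespace metadata
        static void bootTablespace(Mode mode, long snapshotLedgerId, List<Long> activeLedgers) {
            if (activeLedgers.contains(snapshotLedgerId)) {
                // the retained log still covers the snapshot: replay from it
                // (this is the path that failed above, because the entries were unreadable)
                replayFrom(snapshotLedgerId);
            } else if (mode == Mode.FOLLOWER) {
                // follower whose local snapshot is older than the retained log:
                // download a fresh snapshot from the leader instead of replaying
                downloadSnapshotFromLeader();
            } else {
                // by design the leader should never get here, because the active
                // ledgers are not deleted; reaching it means ledger data was lost
                throw new IllegalStateException("snapshot ledger " + snapshotLedgerId + " is gone");
            }
        }

        private static void replayFrom(long ledgerId) { /* replay the log entries */ }

        private static void downloadSnapshotFromLeader() { /* fetch a snapshot over the network */ }
    }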

eolivelli · Sep 16 '21 05:09