
Bookie transitions to read-only mode due to NullPointerException while flushing mem table.

Open zyj5340 opened this issue 1 year ago • 0 comments

BUG REPORT: While flushing the mem table, the EntryMemTable.flush() method throws a NullPointerException. This causes the bookie to transition to read-only mode, and it never recovers to read/write mode.

Describe the bug: In the FileInfo.readAbsolute() method, execution passes the first check (fc == null) but then hits a NullPointerException at fc.read(bb, start). We suspect that fc is closed and set to null by another thread between the two synchronized blocks, and that this causes the problem.

ERROR LOG:

2024-04-04 04:26:10,349Z [SortedLedgerStorage-0] ERROR org.apache.bookkeeper.bookie.SortedLedgerStorage - Exception thrown while flushing skip list cache.
java.lang.NullPointerException: Cannot invoke "java.nio.channels.FileChannel.read(java.nio.ByteBuffer, long)" because "this.fc" is null
        at org.apache.bookkeeper.bookie.FileInfo.readAbsolute(FileInfo.java:426) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.FileInfo.read(FileInfo.java:396) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.LedgerEntryPage.readPage(LedgerEntryPage.java:196) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.IndexPersistenceMgr.updatePage(IndexPersistenceMgr.java:650) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.IndexInMemPageMgr.grabLedgerEntryPage(IndexInMemPageMgr.java:449) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.IndexInMemPageMgr.getLedgerEntryPage(IndexInMemPageMgr.java:414) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.IndexInMemPageMgr.putEntryOffset(IndexInMemPageMgr.java:573) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.LedgerCacheImpl.putEntryOffset(LedgerCacheImpl.java:107) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.processEntry(InterleavedLedgerStorage.java:539) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.SortedLedgerStorage.process(SortedLedgerStorage.java:288) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.EntryMemTable.flushSnapshot(EntryMemTable.java:255) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at org.apache.bookkeeper.bookie.EntryMemTable.flush(EntryMemTable.java:205) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at ...gerStorage.java:317) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3]
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
        at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]

To Reproduce: It is difficult to reproduce; it happens only occasionally.

Additional context: We would like to add a second null check on fc inside the second synchronized block: if fc is null, we clear the ByteBuffer and return 0. However, we do not know whether this change would cause other problems, or whether there is a better way to fix this bug. The method in question is org.apache.bookkeeper.bookie.FileInfo#readAbsolute, shown here with the proposed check marked by the "fix code" comments:

    /**
     * Read data from position <i>start</i> to fill the byte buffer <i>bb</i>.
     * If <i>bestEffort </i> is provided, it would return when it reaches EOF.
     * Otherwise, it would throw {@link org.apache.bookkeeper.bookie.ShortReadException}
     * if it reaches EOF.
     *
     * @param bb
     *          byte buffer of data
     * @param start
     *          start position to read data
     * @param bestEffort
     *          flag indicates if it is a best-effort read
     * @return number of bytes read
     * @throws IOException
     */
    private int readAbsolute(ByteBuffer bb, long start, boolean bestEffort)
            throws IOException {
        checkOpen(false);
        synchronized (this) {
            if (fc == null) {
                return 0;
            }
        }
        int total = 0;
        int rc = 0;
        while (bb.remaining() > 0) {
            synchronized (this) {
                // fix code start
                if (null == fc){
                    bb.clear();
                    return 0;
                }
                // fix code end
                rc = fc.read(bb, start);
            }
            if (rc <= 0) {
                if (bestEffort) {
                    return total;
                } else {
                    throw new ShortReadException("Short read at " + getLf().getPath() + "@" + start);
                }
            }
            total += rc;
            // should move read position
            start += rc;
        }
        return total;
    }
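
For readers who want to try the pattern outside BookKeeper, here is a minimal, self-contained sketch of the proposed check. It is not BookKeeper code: the class name GuardedChannelReader and its close() method are hypothetical stand-ins for FileInfo, and a plain IOException replaces ShortReadException so that the example compiles on its own. It only illustrates that checking fc and using it inside the same synchronized block, on every loop iteration, avoids the NullPointerException once the channel has been closed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for FileInfo, for illustration only; not BookKeeper code.
class GuardedChannelReader {
    private FileChannel fc; // guarded by "this"; null once closed

    GuardedChannelReader(Path path) throws IOException {
        this.fc = FileChannel.open(path, StandardOpenOption.READ);
    }

    // Plays the role of the concurrent close suspected in the report.
    synchronized void close() throws IOException {
        if (fc != null) {
            fc.close();
            fc = null;
        }
    }

    int readAbsolute(ByteBuffer bb, long start, boolean bestEffort) throws IOException {
        int total = 0;
        while (bb.remaining() > 0) {
            int rc;
            synchronized (this) {
                // Check and use fc under the same lock, on every iteration.
                if (fc == null) {
                    bb.clear();
                    return 0;
                }
                rc = fc.read(bb, start);
            }
            if (rc <= 0) {
                if (bestEffort) {
                    return total;
                }
                // The real method throws ShortReadException; a plain
                // IOException keeps this sketch self-contained.
                throw new IOException("Short read at offset " + start);
            }
            total += rc;
            start += rc;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("fileinfo-sketch", ".bin");
        Files.write(path, new byte[] {1, 2, 3, 4});
        GuardedChannelReader reader = new GuardedChannelReader(path);
        System.out.println("open: " + reader.readAbsolute(ByteBuffer.allocate(4), 0, false));
        reader.close();
        System.out.println("closed: " + reader.readAbsolute(ByteBuffer.allocate(4), 0, false));
        Files.delete(path);
    }
}
```

One thing the sketch makes visible: if the channel is closed after some bytes have already been read, returning 0 reports no progress even though total is nonzero. Whether callers such as LedgerEntryPage.readPage handle that correctly is not established here.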

zyj5340, Apr 09 '24 08:04