neon
neon copied to clipboard
deal with wal records bigger than wal_seg_size
Got some context on this? Where do we not handle them correctly?
That was about find_end_of_wal()
in the safekeeper -- when it faces continuation record it tries to iterate backwards on WAL to reconstruct whole record and check CRC and assumes that record starts in that previous segment. IIUC there is a code to that fixes that behavior for cases when we need deeper look behind: https://github.com/zenithdb/zenith/commits/safe_flush_ptr That issue about extracting and testing that
cc @petuhovskiy , you've mentioned that you're testing this already, but the test hadn't failed :) can you share some more details? thanks!
cc @petuhovskiy , you've mentioned that you're testing this already, but the test hadn't failed :) can you share some more details? thanks!
https://github.com/zenithdb/zenith/commit/4468797a237ae500e735f702ffcb63acfca47a89 This is the commit with the test, that wasn't failing for me at the time, one can look at it to create failing test
I'm wondering if there is some bug in find_end_of_wal_segment
's logic here? In my test xlp_rem_len
in adjacent pages in a partial WAL record differ by either 8152 bytes (8192 - 40 = XLOG_BLCKSZ - XLOG_SIZE_OF_XLG_LONG_PHD
) 8168 bytes (8192 - 24 = XLOG_BLCKSZ - XLOG_SIZE_OF_XLG_SHORT_PHD
), not 8192 exactly. So it seems to me that xlp_rem_len
does not account for page headers, while the code assumes it does and it's safe to just offset inside the WAL file instead of re-reading all pages/blocks (not sure what's the right term here).
So my test does not actually need to make records bigger than wal_seg_size
, bigger than XLOG_BLCKSZ
is already enough.
Needs further investigation.
Progress so far:
- [x] Add some corner case unit tests for
find_end_of_wal
- [x] Correctly skip contrecords in the last segment
- [x] Handle WAL ending in a big multi-segment record. That requires reading previous segments (only the first few bytes of each, though) if we want to validate this last record.
- [x] Address bunch of review comments from #1574 (including my own, one can even craft a test case for it), they may reveal more bugs and corner cases. Maybe reuse of Pageserver's code is a good idea, unless it does lots of parsing which we want to avoid in Safekeeper.