MDEV-34830: LSN in the future is not being treated as serious corruption
- [x] The Jira issue number for this PR is: MDEV-34830
Description
The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written.
On crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery=6 can be used.
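As a rough illustration of the invariant (not the actual InnoDB code), the check amounts to comparing the LSN stamped on a page with the end of the log that recovery was able to parse. The names `lsn_t`, `page_lsn_status` and `classify_page_lsn` below are hypothetical and exist only for this sketch:

```cpp
#include <cassert>   // only used by usage_example() below
#include <cstdint>

// Hypothetical alias for a log sequence number.
using lsn_t = std::uint64_t;

// Hypothetical outcome of the check; not an InnoDB type.
enum class page_lsn_status { OK, FUTURE_LSN };

// Write-ahead logging implies that every change on a data page is covered
// by a durably written log record, so the LSN stored on the page can never
// exceed the end of the log that recovery parsed.
inline page_lsn_status classify_page_lsn(lsn_t page_lsn, lsn_t recovered_lsn)
{
  return page_lsn > recovered_lsn ? page_lsn_status::FUTURE_LSN
                                  : page_lsn_status::OK;
}

// Sample values, chosen arbitrarily for illustration.
inline void usage_example()
{
  assert(classify_page_lsn(900, 1000) == page_lsn_status::OK);
  assert(classify_page_lsn(1000, 1000) == page_lsn_status::OK);
  assert(classify_page_lsn(1001, 1000) == page_lsn_status::FUTURE_LSN);
}
```

With this PR, the `FUTURE_LSN` outcome is meant to be a hard error rather than a warning in the error log.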
recv_sys_t::max_page_lsn: Replaces recv_max_page_lsn.
recv_sys_t::early_batch: Whether apply(false) is executing. Before the final recovery batch, we will not have read the log records until the end and therefore will not know the final LSN.
recv_lsn_checks_on: Remove.
recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain about the LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit 762bcb81b5bf9bbde61fed59afb26417f4ce1e86 the dblwr=true copies of pages may legitimately be "too new".
recv_sys_t::check_page_lsn(): Validate FIL_PAGE_LSN during recovery. Update max_page_lsn if needed. Do not flag an error if early_batch.
recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is large enough for recovery. Invoke recv_sys_t::check_page_lsn() on the chosen LSN so that "LSN in the future" can be flagged.
buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds.
buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.
buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future.
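The interplay between the early batch, the maximum page LSN and the choice of a doublewrite copy described in the list above can be sketched as follows. This is a simplified model under stated assumptions, not the code in this PR: `recv_sketch`, `scanned_lsn`, `pick_dblwr_copy` and `min_lsn` are hypothetical names, and each valid doublewrite copy is reduced to its FIL_PAGE_LSN value.

```cpp
#include <algorithm>
#include <cassert>   // only used by usage_example() below
#include <cstdint>
#include <optional>
#include <vector>

using lsn_t = std::uint64_t;  // hypothetical alias

// Hypothetical, heavily simplified stand-in for the recovery state.
struct recv_sketch
{
  lsn_t scanned_lsn = 0;    // end of the log parsed so far (assumption)
  lsn_t max_page_lsn = 0;   // largest page LSN seen so far
  bool early_batch = false; // true before the final recovery batch

  // Returns true if the page LSN is acceptable. The maximum is always
  // tracked, but "LSN in the future" is only reported once the final
  // LSN is known, that is, outside the early batch.
  bool check_page_lsn(lsn_t page_lsn)
  {
    max_page_lsn = std::max(max_page_lsn, page_lsn);
    if (early_batch)
      return true;
    return page_lsn <= scanned_lsn;
  }
};

// Among the LSNs of valid copies, pick the smallest one that is still
// at least min_lsn (the assumed lower bound needed for recovery).
inline std::optional<lsn_t>
pick_dblwr_copy(const std::vector<lsn_t> &valid_copy_lsns, lsn_t min_lsn)
{
  std::optional<lsn_t> best;
  for (lsn_t lsn : valid_copy_lsns)
    if (lsn >= min_lsn && (!best || lsn < *best))
      best = lsn;
  return best;
}

inline void usage_example()
{
  recv_sketch r;
  r.scanned_lsn = 1000;
  r.early_batch = true;
  assert(r.check_page_lsn(2000));   // tolerated during the early batch
  assert(r.max_page_lsn == 2000);   // but still remembered
  r.early_batch = false;
  assert(!r.check_page_lsn(2000));  // flagged in the final batch

  auto chosen = pick_dblwr_copy({500, 1200, 900, 1500}, 800);
  assert(chosen && *chosen == 900);
  assert(r.check_page_lsn(*chosen)); // chosen LSN is then validated
}
```

Picking the smallest sufficient LSN keeps the chosen copy as close as possible to what the log can still be applied to; the chosen value is then passed through the same "LSN in the future" check as any other page.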
Release Notes
InnoDB crash recovery was sloppy and would wrongly start up if ib_logfile0 was corrupted and could not be parsed up to a far enough log sequence number. Instead of refusing to start, it could incur further corruption while writing "log sequence number is in the future" messages into the server error log. Also, some glitches in the InnoDB doublewrite buffer recovery were fixed.
The fix will implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery=6 can be used.
How can this PR be tested?
./mtr --parallel=auto innodb.doublewrite innodb.corrupted_during_recovery
I had previously tested this with innodb.log_file_size_online on #3458 with an injected error:
diff --git a/storage/innobase/log/log0log.cc b/storage/innobase/log/log0log.cc
index d7aae556ce0..2433a119a86 100644
--- a/storage/innobase/log/log0log.cc
+++ b/storage/innobase/log/log0log.cc
@@ -808,7 +808,7 @@ void log_t::resize_write_buf(const byte *b, size_t length) noexcept
}
ut_a(os_file_write_func(IORequestWrite, "ib_logfile101", resize_log.m_file,
- b, offset, length) == DB_SUCCESS);
+ resize_buf, offset, length) == DB_SUCCESS);
}
/** Write buf to ib_logfile0.
The test would occasionally fail like this:
2024-08-29 14:05:14 0 [ERROR] InnoDB: The log was only scanned up to 7208383, while the current LSN at the time of the latest checkpoint 7208383 was 7413115 and the maximum LSN on a data page was 0!
Basing the PR against the correct MariaDB version
- [ ] This is a new feature or a refactoring, and the PR is based against the main branch.
- [ ] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.
This fixes a bug that affects 10.5 as well, but only starting with 10.6 (commit 0b47c126e31cddda1e94588799599e138400bcf8) do we have robust handling of corrupted pages in place.
PR quality check
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [ ] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.