percona-server
PS-9144: Missing rows after ALGORITHM=INPLACE ALTER under same workload as PS-9092
https://perconadev.atlassian.net/browse/PS-9144
Problem:
An ALTER TABLE that rebuilds an InnoDB table using the INPLACE algorithm can occasionally lose rows if purge runs concurrently on the table being altered.
Analysis:
A new implementation of parallel ALTER TABLE INPLACE was introduced in InnoDB in MySQL 8.0.27. Its code is used for online table rebuild even in the single-threaded case. The implementation iterates over all rows in the table, in the general case handling different subtrees of the B-tree in different threads. This iteration has to be paused from time to time to commit the InnoDB MTR and release the page latches it holds. The pause is necessary to give way to concurrent actions on the B-tree being scanned, and it is also done before flushing rows of the new version of the table from the in-memory buffer to the B-tree. To resume iteration after such a pause, the persistent cursor position saved before the pause is restored.
The cause of the problem described above lies in the PCursor::restore_position() method. This method is used for two purposes in the code:
1) To resume iteration after it was paused in the scenario described above.
2) To initialize the cursor when a thread starts iterating through the subtree it was assigned to process.
In scenario 2) we restore the cursor to a record which has not been processed yet. If that record, whose position was saved when subtrees/ranges were assigned to threads, has been purged in the meantime, the cursor is restored to the preceding record. The cursor then needs to be moved to the next record so that we don't start processing our subtree from a record belonging to a different thread. PCursor::restore_position() handles this by detecting that the saved record was purged and moving to the next record in that case.
However, in scenario 1) we restore the cursor to a record which has already been processed, and the calling code steps to the next record right after the restore. So moving to the next record when the saved record was purged, as PCursor::restore_position() does and as is necessary in scenario 2), results in a double step forward, and the scan misses a record.
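The double step can be illustrated with a toy model. This is only a sketch, not InnoDB code: the index is modeled as a std::set<int>, and the names Index, restore() and scan() are hypothetical. It assumes a restore lands on the saved key if it still exists and on its predecessor otherwise, as described above.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Toy stand-in for a B-tree index: an ordered set of keys (hypothetical).
using Index = std::set<int>;

// Models restoring a saved position: lands on `saved` if it still exists,
// otherwise on its predecessor. `found` reports which of the two happened.
// Precondition: some key smaller than `saved` exists when `saved` is gone.
Index::const_iterator restore(const Index &index, int saved, bool &found) {
  auto it = index.lower_bound(saved);
  found = (it != index.end() && *it == saved);
  if (!found) {
    assert(it != index.begin());
    --it;
  }
  return it;
}

// Scans a copy of the index and returns the keys it processed. After
// processing `pause_after` the scan pauses, `purged` is removed (the
// concurrent purge), and the position is restored.
// extra_step_if_purged == true models the pre-fix behaviour.
std::vector<int> scan(Index index, int pause_after, int purged,
                      bool extra_step_if_purged) {
  std::vector<int> seen;
  auto it = index.begin();
  while (it != index.end()) {
    seen.push_back(*it);
    if (*it == pause_after) {
      const int saved = *it;  // position of an already processed record
      index.erase(purged);    // purge runs while the scan is paused
      bool found = false;
      it = restore(index, saved, found);
      // Step that is only correct when starting a range (scenario 2).
      if (!found && extra_step_if_purged) ++it;
    }
    if (it != index.end()) ++it;  // caller's normal step to the next record
  }
  return seen;
}
```

With keys 10, 20, 30, 40, a pause after 20 and a purge of 20, the pre-fix variant processes 10, 20, 40 and skips 30, while the variant without the extra step processes all four keys.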
Fix:
This patch solves the problem by using different logic for restore of cursor position in these two cases.
For case 1) we simply restore the saved position using btr_pcur_t::restore_position(). If the record the cursor is supposed to point to has been purged in the meantime, this method points the cursor to the preceding record. The calling code then iterates to the next record (i.e. the successor of the purged record) after the restore.
For case 2) we keep the pre-fix behavior: if the record whose position was originally saved has been purged, we correct the cursor position after restoring it by moving to the next record in the subtree to be processed. The PCursor::restore_position() method, which implements handling of this case, has been renamed to PCursor::restore_position_for_range() and greatly simplified.
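The split into two restore paths can be sketched on a toy ordered set. This is an illustration under assumptions, not the patch itself: Index, Restored, restore_basic(), restore_after_pause() and restore_for_range() are hypothetical names, and restore_basic() only mimics the "saved record or its predecessor" behavior described above.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Toy stand-in for a B-tree index (hypothetical, not InnoDB code).
using Index = std::set<int>;

struct Restored {
  Index::const_iterator pos;  // where the cursor ended up
  bool on_saved;              // true if the saved record still exists
};

// Shared helper: position on `saved` if present, else on its predecessor.
// Precondition: a smaller key exists when `saved` has been purged.
Restored restore_basic(const Index &index, int saved) {
  auto it = index.lower_bound(saved);
  if (it != index.end() && *it == saved) return {it, true};
  assert(it != index.begin());
  return {std::prev(it), false};
}

// Case 1: resume after a pause. `saved` was already processed, so no
// correction is made here; the caller steps to the next record itself.
Index::const_iterator restore_after_pause(const Index &index, int saved) {
  return restore_basic(index, saved).pos;
}

// Case 2: start of a range. `saved` has not been processed yet, so if it
// was purged, move forward off the predecessor (which belongs to another
// thread's range) onto the first record of our own range.
Index::const_iterator restore_for_range(const Index &index, int saved) {
  const Restored r = restore_basic(index, saved);
  return r.on_saved ? r.pos : std::next(r.pos);
}
```

In this model, with keys 10, 30, 40 left after key 20 was purged, restore_after_pause(index, 20) lands on 10 and the caller's step reaches 30, while restore_for_range(index, 20) lands on 30 directly. Either way 30 is visited exactly once.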