server MDEV-34515: Contention between secondary index UPDATE and purge due to large innodb_purge_batch_size

[x] The Jira issue number for this PR is: MDEV-34515, MDEV-34520

Description

Various performance deficiencies were found in the purge and MVCC subsystem while analyzing a regression of Sysbench oltp_update_index when using --tables=1 --table_size=1. In the end, the contention was fixed by setting innodb_purge_batch_size=128, which coincides with the number of persistent rollback segments that are available for transactions.

We will change the default value to innodb_purge_batch_size=127 in order to remain compatible with innodb_undo_tablespaces>1. When dedicated undo tablespaces are enabled, the first rollback segment will be disabled and only 127 rollback segments will be available.
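The arithmetic behind the new default can be written down as a small sketch. It is illustrative only: the constant and function names are hypothetical, not InnoDB identifiers, and the only facts it encodes are the two stated above (128 persistent rollback segments, with the first one disabled when innodb_undo_tablespaces>1).

```cpp
#include <cassert>
#include <cstdint>

// Figure taken from the description above; the name is hypothetical.
constexpr std::uint32_t kPersistentRollbackSegments = 128;

// Rollback segments available to transactions, assuming (as the description
// states) that the first segment is disabled when innodb_undo_tablespaces > 1.
constexpr std::uint32_t usable_rollback_segments(std::uint32_t undo_tablespaces)
{
  return undo_tablespaces > 1 ? kPersistentRollbackSegments - 1
                              : kPersistentRollbackSegments;
}

// A batch size that does not exceed the usable segment count in either
// configuration: the smaller of the two counts.
constexpr std::uint32_t conservative_purge_batch_size()
{
  return usable_rollback_segments(2) < usable_rollback_segments(0)
             ? usable_rollback_segments(2)
             : usable_rollback_segments(0);
}
```

Under these assumptions the conservative value is 127, which matches the proposed default.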

In addition, we will fix a number of lower-level performance problems that were identified during the analysis. These will hopefully avoid a regression due to the reduced innodb_purge_batch_size in any workload.

Release Notes

To avoid a performance anomaly on workloads that UPDATE secondary indexes, the default value of innodb_purge_batch_size will be reduced from 1000 to 127. Also, the performance of the purge of committed history was improved.

How can this PR be tested?

Run Sysbench oltp_update_index with --tables=1 --table_size=1, as noted in MDEV-34515.

Basing the PR against the correct MariaDB version

[ ] This is a new feature or a refactoring, and the PR is based against the latest MariaDB development branch.
[x] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

[x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
[ ] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

Aug 02 '24 14:08 dr-m

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Aug 02 '24 14:08 CLAassistant

There is a consistent failure of one test on one builder:

innodb.innodb_defrag_concurrent          w35 [ fail ]
        Test ended at 2024-08-12 14:46:16
CURRENT_TEST: innodb.innodb_defrag_concurrent
--- /home/buildbot/amd64-debian-11-msan/build/mysql-test/suite/innodb/r/innodb_defrag_concurrent.result	2024-08-12 13:55:13.000000000 +0000
+++ /home/buildbot/amd64-debian-11-msan/build/mysql-test/suite/innodb/r/innodb_defrag_concurrent.reject	2024-08-12 14:46:15.834980248 +0000
@@ -77,6 +77,7 @@
 test.t1	optimize	status	OK
 check table t1 extended;
 Table	Op	Msg_type	Msg_text
+test.t1	check	Warning	InnoDB: Clustered index record not found for index `third` of table `test`.`t1`: COMPACT RECORD(info_bits=32, 2 fields): {[4]    (0x800007D0),[4]  FQ(0x80004651)}
 test.t1	check	status	OK
 select count(*) from t1;
 count(*)
Result length mismatch

All 3 failures are similar:
https://buildbot.mariadb.org/#/builders/572/builds/11878/steps/7/logs/stdio
https://buildbot.mariadb.org/#/builders/572/builds/11896/steps/7/logs/stdio
https://buildbot.mariadb.org/#/builders/572/builds/11900/steps/7/logs/stdio

I think that we can safely ignore this failure. The defragmentation (mis)feature was removed in 7ca89af6f8faf1f8ec6ede01a9353ac499d37711 in a later major release, and the error is benign: it is complaining that a delete-marked record in a secondary index is missing a counterpart in the clustered index. For any delete-marked record in a secondary index, InnoDB would always look up the primary key in the clustered index and be prepared for no match. So, there shouldn’t be any correctness issue, only a performance issue.
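The reasoning in this comment can be illustrated with a toy model. This is a sketch under stated assumptions, not InnoDB code: the types and function names are hypothetical, and a std::map stands in for the clustered index. It shows only the point made above, that a reader following a delete-marked secondary index record must tolerate finding no matching clustered index row.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-ins (hypothetical, not InnoDB structures).
struct ClusteredRow { std::string payload; };
struct SecondaryRec {
  std::uint32_t key;   // secondary index key
  std::uint32_t pk;    // primary key stored in the secondary index record
  bool delete_marked;  // delete-mark flag of the secondary index record
};

using ClusteredIndex = std::map<std::uint32_t, ClusteredRow>;

enum class Outcome { RowFound, BenignOrphan, MissingForLiveRecord };

// Look up the primary key in the clustered index; nullptr means no match.
const ClusteredRow *lookup_clustered(const ClusteredIndex &clust,
                                     const SecondaryRec &rec)
{
  ClusteredIndex::const_iterator it = clust.find(rec.pk);
  return it == clust.end() ? nullptr : &it->second;
}

// Classify a secondary index record. A delete-marked record without a
// clustered index counterpart is treated as harmless, as the comment argues.
// How a missing match for a record that is not delete-marked is handled
// is outside the scope of this sketch.
Outcome resolve(const ClusteredIndex &clust, const SecondaryRec &rec)
{
  if (lookup_clustered(clust, rec))
    return Outcome::RowFound;
  return rec.delete_marked ? Outcome::BenignOrphan
                           : Outcome::MissingForLiveRecord;
}
```

In this model the orphan costs one extra lookup that finds nothing, which corresponds to the "performance issue only" assessment above.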

Aug 13 '24 13:08 dr-m

On one system, I got the following tests failing on a CMAKE_BUILD_TYPE=Debug build.

gcol.innodb_virtual_basic gcol.innodb_virtual_index gcol.gcol_rollback

The tests pass with the parent of e9e083bd8e78db23a54a0a5ab2d271eb86c444c7, or with that commit plus the following patch:

diff --git a/storage/innobase/row/row0vers.cc b/storage/innobase/row/row0vers.cc
index 6b1fe87630e..27132bdeb8d 100644
--- a/storage/innobase/row/row0vers.cc
+++ b/storage/innobase/row/row0vers.cc
@@ -528,10 +528,6 @@ row_vers_build_cur_vrow_low(
 			 = DATA_MISSING;
 	}
 
-	ut_ad(mtr->memo_contains_page_flagged(rec,
-					      MTR_MEMO_PAGE_S_FIX
-					      | MTR_MEMO_PAGE_X_FIX));
-
 	version = rec;
 
 	/* If this is called by purge thread, set TRX_UNDO_PREV_IN_PURGE

I will investigate later.

Aug 23 '24 14:08 dr-m

There is a consistent failure of one test on one builder: innodb.innodb_defrag_concurrent on amd64-debian-11-msan, with the same log and the same 3 buildbot links as quoted in the earlier comment.

I think that we can safely ignore this failure.

Because I can reproduce the failure rather easily on my local system without using MemorySanitizer, I took a deeper look at this. The problem is not reproducible when executing a normal OPTIMIZE TABLE (without defragmentation). Because a variant of the problem is reproducible after removing the OPTIMIZE TABLE, this should be a simple case of MDEV-29823:

CURRENT_TEST: innodb.innodb_defrag_concurrent
--- /mariadb/10.6/mysql-test/suite/innodb/r/innodb_defrag_concurrent.result	2024-08-27 10:34:10.711965386 +0300
+++ /mariadb/10.6/mysql-test/suite/innodb/r/innodb_defrag_concurrent.reject	2024-08-27 10:34:59.188564498 +0300
@@ -74,6 +74,7 @@
 disconnect con4;
 check table t1 extended;
 Table	Op	Msg_type	Msg_text
+test.t1	check	Warning	InnoDB: Unpurged clustered index record in table `test`.`t1`: COMPACT RECORD(info_bits=32, 8 fields): {[4]  % (0x80002516),[6]     ;(0x00000000F33B),[7]/    / (0x2F000001C52FBA),[256]AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA(0x414141414141414141414141414141414141414141414141414141414141414141414
 test.t1	check	status	OK
 select count(*) from t1;
 count(*)

Result length mismatch

48becffd072fd90f53e4d61c02f665a7c5b58f32 addresses this by removing the CHECK TABLE attribute EXTENDED, so that there will be no complaints about orphan records.

Aug 27 '24 05:08 dr-m
