MDEV-34431: Avoid spin loops on page I/O waits
- [x] The Jira issue number for this PR is: MDEV-34431
Description
Recent performance tests suggest that spin loops on index_lock and block_lock are hurting performance, especially in cases when progress is blocked by an I/O wait. The simplest case would be that the waited-for block is io-fixed (being read into the buffer pool or written back to data files). A more complex case is that a thread that is waiting for I/O on another block or on the log file is holding a conflicting latch on an index tree or a buffer page.
Disabling the spin loops altogether for index_lock and block_lock seems to yield the best overall result.
The best scalability was observed with thread_handling=pool-of-threads which defaults to creating my_getncpus() connection handler threads.
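For reference, a configuration along these lines would select that handler (illustrative excerpt; the pool size defaults to the number of CPUs, i.e. my_getncpus()):

```ini
# my.cnf excerpt (illustrative)
[mysqld]
thread_handling=pool-of-threads
# thread_pool_size defaults to the number of available CPUs
```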
Note: For latches that are not being held while waiting for any I/O, such as trx_t::mutex, spin loops may still be helpful. In fact, we tried disabling all spin loops altogether, and got worse performance.
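To illustrate the trade-off, here is a minimal, self-contained sketch (not MariaDB source) of a spin-then-block latch. When the holder is stalled on I/O, the spin rounds can never succeed and are pure wasted CPU, which is why this patch sets them to zero for index_lock and block_lock:

```cpp
#include <atomic>
#include <thread>

// Illustrative sketch only: a latch that spins a bounded number of
// rounds before falling back to a blocking wait.
struct spin_then_wait_lock
{
  std::atomic<bool> locked{false};
  unsigned spin_rounds; // 0 disables spinning, as this patch does

  explicit spin_then_wait_lock(unsigned rounds) : spin_rounds(rounds) {}

  bool try_lock()
  {
    bool f= false;
    return locked.compare_exchange_strong(f, true);
  }

  void lock()
  {
    for (unsigned spin= spin_rounds; spin; spin--)
      if (try_lock())
        return; // spinning paid off: the holder released quickly
    // Fall back to a blocking wait; sketched as yield here, while the
    // real code uses futex-like primitives (futex, WaitOnAddress()).
    while (!try_lock())
      std::this_thread::yield();
  }

  void unlock() { locked.store(false); }
};
```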
On Microsoft Windows, we will ignore the parameter innodb_sync_spin_loops and rely on the built-in user space spin loop logic in SRWLOCK and WaitOnAddress().
ssux_lock_impl<false>::rd_lock(): Jump straight to rd_wait() without one step of spinning.
ssux_lock_impl<false>::rd_wait(): Invoke rd_lock_try() before entering the wait.
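The two changes above can be sketched as follows. This is a hypothetical, simplified model whose names mirror the patch; the bodies are illustrative and are not the actual ssux_lock_impl code:

```cpp
#include <atomic>
#include <cstdint>

// Simplified model of a shared (reader) lock acquisition path.
struct ssux_model
{
  std::atomic<uint32_t> readers{0};

  bool rd_lock_try()
  {
    uint32_t r= readers.load();
    return readers.compare_exchange_strong(r, r + 1);
  }

  void rd_wait()
  {
    // New behaviour: invoke rd_lock_try() before entering the wait
    // (the real code would block on a futex or WaitOnAddress() here).
    while (!rd_lock_try())
      ;
  }

  void rd_lock()
  {
    // Old behaviour performed one spin step here; new behaviour jumps
    // straight to rd_wait() on failure.
    if (!rd_lock_try())
      rd_wait();
  }

  void rd_unlock() { readers.fetch_sub(1); }
};
```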
Release Notes
Spin loops on waits for InnoDB index tree or buffer page latches have been disabled, because they are often a waste of CPU when running I/O bound workloads.
On Microsoft Windows, the parameter innodb_sync_spin_loops will be ignored.
How can this PR be tested?
Sysbench or HammerDB on a workload that is larger than the buffer pool. This should be most prominent when there are synchronous waits for pages in the buffer pool.
Measuring the CPU usage while running the test case in MDEV-32067 should show some impact.
Basing the PR against the correct MariaDB version
- [ ] This is a new feature or a refactoring, and the PR is based against the main branch.
- [x] This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.
PR quality check
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [ ] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.
I filed #4172 for an 11.4 version of this, comprising this change as well as a second change, which I originally developed and tested on 10.6 as d2f77d68a9558a263cb03acce8cb5edeb9675964. I think that only the first change has an acceptable level of risk for the 10.6 release. The 11.4 version of the latter change is simpler, thanks to some earlier code removal and cleanup.
I suspect that the reason for timeouts on AppVeyor is 9d34322052682664e0400ae8f12f1f713f219393, which @vaintroub suggested. We would seem to need a more sophisticated fix to disable the additional spin loop layer on Microsoft Windows. For instance, the following template specializations would need to skip the for loops altogether on _WIN32 if the AcquireSRWLockShared() and AcquireSRWLockExclusive() include spin loop logic on their own:
```cpp
#if defined _WIN32 || defined SUX_LOCK_GENERIC
template<> void srw_lock_<true>::rd_wait() noexcept
{
  const unsigned delay= srw_pause_delay();

  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (rd_lock_try())
      return;
  }

  IF_WIN(AcquireSRWLockShared(&lk), rw_rdlock(&lk));
}

template<> void srw_lock_<true>::wr_wait() noexcept
{
  const unsigned delay= srw_pause_delay();

  for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
  {
    srw_pause(delay);
    if (wr_lock_try())
      return;
  }

  IF_WIN(AcquireSRWLockExclusive(&lk), rw_wrlock(&lk));
}
#endif
```
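A self-contained model of what "skip the for loops altogether on _WIN32" could look like (illustrative only, not MariaDB source; spins_performed is added purely to make the behaviour observable, and the blocking call is modelled as a retry loop):

```cpp
#include <atomic>

// Sketch: compile the extra spin layer out where the OS lock (SRWLOCK)
// is assumed to spin in user space already.
struct model_srw_lock
{
  std::atomic<bool> locked{false};
  unsigned spins_performed= 0;

  bool rd_lock_try()
  {
    bool f= false;
    return locked.compare_exchange_strong(f, true);
  }

  void rd_wait(unsigned srv_n_spin_wait_rounds)
  {
#ifndef _WIN32
    // Portable build: keep the spin loop (srw_pause() in the real code).
    for (auto spin= srv_n_spin_wait_rounds; spin; spin--)
    {
      spins_performed++;
      if (rd_lock_try())
        return;
    }
#endif
    // On Windows this would be AcquireSRWLockShared(&lk); elsewhere a
    // blocking rw_rdlock(&lk). Modelled here as a plain retry loop.
    while (!rd_lock_try())
      ;
  }
};
```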
https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435045 and https://ci.appveyor.com/project/RasmusJohansson/server/builds/52435868 (for the revert 87e2123ee42e6be2b5ec5e778c7f1f433f4b2051) prove that @vaintroub’s Windows specific patch indeed has a problem. Several tests were timing out (240 seconds) when the patch is present.
There is a hidden endless loop in the existing code, in srw_mutex_impl