MDEV-33894: Resurrect innodb_log_write_ahead_size
- [x] The Jira issue number for this PR is: MDEV-33894
Description
In commit 685d958e38b825ad9829be311f26729cccf37c46 (MDEV-14425) we assumed that the physical block size (typically 512 or 4096 bytes) would be an optimal ib_logfile0 write size. In fact, some file systems and block devices may benefit from using much larger writes, for example due to transparent compression.
Release Notes
We will resurrect a parameter so that read-on-write of ib_logfile0 can be avoided:
SET GLOBAL innodb_log_write_ahead_size=1048576;
This parameter must be a power of 2 between the physical block size and innodb_log_buffer_size, and innodb_log_file_size must be an integer multiple of it. Any attempt to set an incorrect value will be rejected with an error. On SET GLOBAL innodb_log_file_size (log resizing), innodb_log_write_ahead_size may silently be set to a smaller value.
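The constraints above can be sketched as a small check. This is only an illustration of the rule as worded here, not the server's actual validation code; the function name and the example sizes (a 4096-byte block, a 16 MiB log buffer, a 96 MiB log file) are assumptions made for the example.

```python
def is_valid_write_ahead_size(value, block_size, log_buffer_size, log_file_size):
    """Check the constraints described above; all sizes are in bytes.

    Hypothetical helper for illustration only, not part of MariaDB.
    """
    # A power of two is positive and has exactly one bit set.
    if value <= 0 or value & (value - 1):
        return False
    # It must lie between the physical block size and the log buffer size.
    if not block_size <= value <= log_buffer_size:
        return False
    # The log file size must be an integer multiple of it.
    return log_file_size % value == 0


MiB = 1024 * 1024
# Example sizes below are assumptions, not server defaults taken from this PR.
print(is_valid_write_ahead_size(1 * MiB, 4096, 16 * MiB, 96 * MiB))  # True
print(is_valid_write_ahead_size(3000, 4096, 16 * MiB, 96 * MiB))     # False
```

A value that fails any of the three checks corresponds to a SET GLOBAL that would be rejected with an error.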
How can this PR be tested?
./mtr innodb.log_file_size_online
On many 64-bit Linux platforms, you should ensure that the test is not run on /dev/shm, so that pread and pwrite based log access will be used instead of “fake PMEM”. Another tmpfs or shmfs location is fine:
rm -fr var
mkdir /run/user/"$UID"/var
ln -s /run/user/"$UID"/var .
./mtr innodb.log_file_size_online
Alternatively, this can be tested on /dev/shm when building with cmake -DWITH_INNODB_PMEM=OFF.
Some stress testing with SET GLOBAL of innodb_log_file_size and innodb_log_write_ahead_size would be very useful. The server should run a write-heavy workload and be killed and restarted.
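As a starting point for such a stress test, the statements could be generated randomly. The following is a minimal sketch, assuming a 4096-byte physical block size and an innodb_log_buffer_size of at least 1 MiB; the function name, the size ranges and the 50/50 mix are all illustrative choices, and the workload, kill and restart steps are left out.

```python
import random


def random_set_global_statements(count, block_size=4096,
                                 max_write_ahead=1 << 20, seed=None):
    """Return `count` random SET GLOBAL statements (hypothetical helper).

    Assumes block_size and max_write_ahead are powers of two.
    """
    rng = random.Random(seed)
    # Candidate write-ahead sizes: powers of two from block_size
    # up to max_write_ahead, inclusive.
    sizes = []
    size = block_size
    while size <= max_write_ahead:
        sizes.append(size)
        size *= 2
    statements = []
    for _ in range(count):
        if rng.random() < 0.5:
            value = rng.choice(sizes)
            statements.append(
                f"SET GLOBAL innodb_log_write_ahead_size={value};")
        else:
            # Keep the log file size a multiple of the largest candidate,
            # so that every candidate write-ahead size divides it.
            file_size = rng.randint(4, 64) * max_write_ahead
            statements.append(
                f"SET GLOBAL innodb_log_file_size={file_size};")
    return statements


for statement in random_set_global_statements(5, seed=1):
    print(statement)
```

Passing a fixed seed makes a failing sequence reproducible, which helps when a crash only shows up after a particular order of resizes.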
Additionally, mariadb-backup --backup should be tested while SET GLOBAL innodb_log_write_ahead_size is being executed. (Remember that backup is expected to hang if SET GLOBAL innodb_log_file_size is executed, because it would fail to switch to track the resized log file.)
All testing should be conducted while the InnoDB redo log interface is not memory-mapped.
Basing the PR against the correct MariaDB version
- [ ] This is a new feature and the PR is based against the latest MariaDB development branch.
- [x] This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.
PR quality check
- [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
- [ ] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.
The experiment ef96260d56853357a97e93cde0bee393a619b4fc seems to break things massively.
While configuring a new MariaDB instance running on ZFS, I stumbled over the parameter innodb-log-write-ahead-size.
According to this guide, we should set the parameter to match the recordsize of the ZFS device (128k in our case):
innodb_log_write_ahead_size=16384: In order to prevent torn pages and avoid read-on-write overhead, we should set this to the underlying file system block size. If we are keeping the InnoDB logs in the data directory (as is the default), we should set this to what we set the ZFS recordsize to.
I notice that the current maximum value of the parameter in 10.11.9-MariaDB-deb12-log is 4096.
As I think their recommendation makes sense, it would be great if larger sizes (128k+) could be supported in future releases.