MDEV-33894: Resurrect innodb_log_write_ahead_size

Open dr-m opened this issue 1 year ago • 2 comments

  • [x] The Jira issue number for this PR is: MDEV-33894

Description

In commit 685d958e38b825ad9829be311f26729cccf37c46 (MDEV-14425) we assumed that the physical block size (typically 512 or 4096 bytes) would be an optimal ib_logfile0 write size. In fact, some file systems and block devices may benefit from using much larger writes, for example due to transparent compression.

Release Notes

We will resurrect a parameter, so that read-on-write to the ib_logfile0 can be avoided:

SET GLOBAL innodb_log_write_ahead_size=1048576;

This parameter must be a power of 2 between the physical block size and innodb_log_buffer_size, and innodb_log_file_size must be an integer multiple of it. Any attempt to set an incorrect value will be rejected with an error. On SET GLOBAL innodb_log_file_size (log resizing), innodb_log_write_ahead_size may silently be set to a smaller value.
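
The constraints above can be sketched as a small standalone check. This is only an illustration of the stated rules, not the server's actual validation code; the function name, its parameters, and the sizes in the usage lines are hypothetical.

```python
def is_valid_write_ahead_size(size, physical_block_size,
                              log_buffer_size, log_file_size):
    """Check the rules stated in the release note (illustrative only).

    All arguments are byte counts. The names are assumptions made for
    this sketch; they are not identifiers from the MariaDB source.
    """
    # A power of 2 is positive and has exactly one bit set.
    if size <= 0 or size & (size - 1) != 0:
        return False
    # Must lie between the physical block size and the log buffer size.
    if not (physical_block_size <= size <= log_buffer_size):
        return False
    # The log file size must be an integer multiple of the value.
    return log_file_size % size == 0


# Hypothetical sizes: 4 KiB blocks, 16 MiB log buffer, 96 MiB log file.
BLOCK, BUFFER, LOGFILE = 4096, 16 << 20, 96 << 20
print(is_valid_write_ahead_size(1048576, BLOCK, BUFFER, LOGFILE))
print(is_valid_write_ahead_size(3000, BLOCK, BUFFER, LOGFILE))
```

With these assumed sizes the first call accepts 1 MiB, and the second rejects 3000 because it is not a power of 2.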

How can this PR be tested?

./mtr innodb.log_file_size_online

On many 64-bit Linux platforms, you should ensure that the test is not run on /dev/shm, so that pread and pwrite based log access will be used instead of “fake PMEM”. Another tmpfs or shmfs location is fine:

rm -fr var
mkdir /run/user/"$UID"/var
ln -s /run/user/"$UID"/var .
./mtr innodb.log_file_size_online

Alternatively, this can be tested on /dev/shm when building with cmake -DWITH_INNODB_PMEM=OFF.

Some stress testing with SET GLOBAL of innodb_log_file_size and innodb_log_write_ahead_size would be very useful. The server should run a write heavy workload and be killed and restarted.

Additionally, mariadb-backup --backup should be tested while SET GLOBAL innodb_log_write_ahead_size is being executed. (Remember that backup is expected to hang if SET GLOBAL innodb_log_file_size is executed, because it would fail to switch to track the resized log file.)

All testing should be conducted while the InnoDB redo log interface is not memory-mapped.

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature and the PR is based against the latest MariaDB development branch.
  • [x] This is a bug fix and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • [x] I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • [ ] For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

dr-m avatar Jun 11 '24 14:06 dr-m

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

CLAassistant avatar Jun 11 '24 14:06 CLAassistant

The experiment ef96260d56853357a97e93cde0bee393a619b4fc seems to break things massively.

dr-m avatar Jun 24 '24 14:06 dr-m

While configuring a new MariaDB instance running on ZFS, I came across the parameter innodb-log-write-ahead-size.

According to this guide, we should set the parameter to match the recordsize of the zfs dev (128k in our case):

innodb_log_write_ahead_size=16384: In order to prevent torn pages and avoid read-on-write overhead, we should set this to the underlying file system block size. If we are keeping the InnoDB logs in the data directory (as is the default), we should set this to what we set the ZFS recordsize to.

I noticed that the current maximum value of the parameter in 10.11.9-MariaDB-deb12-log is 4096.

Since I think their recommendation makes sense, it would be great if larger sizes (128k+) could be supported in future releases.

jkrauss82 avatar Nov 04 '24 16:11 jkrauss82