bcachefs

Compatibility with MariaDB

Open kezyr opened this issue 2 years ago • 4 comments

Hello, I set up a small server with bcachefs, and the same volume is also used as the rootfs. This was my format command:

bcachefs format \
--block_size=4k \
--label=ssd.ssd1 /dev/nvme0n1p2  \
--label=ssd.ssd2 /dev/nvme1n1p2 \
--label=hdd.hdd1 /dev/sda \
--label=hdd.hdd2 /dev/sdb \
--replicas=2 \
--foreground_target=ssd \
--promote_target=ssd \
--background_target=hdd

Arch Linux, Angie (nginx), and some Docker images all work.

When I installed MariaDB and tried to initialize the databases (needed before the first run), I got this crash:

...
2024-02-03 14:56:17 0 [Note] InnoDB: Completed initialization of buffer pool
2024-02-03 14:56:17 0 [Note] InnoDB: Setting file './ibdata1' size to 12.000MiB. Physically writing the file full; Please wait ...
2024-02-03 14:56:17 0 [Note] InnoDB: File './ibdata1' size is now 12.000MiB.
2024-02-03 14:56:17 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: 512 bytes should have been written at 12288 from (unknown file), but got only 0. Retrying.
2024-02-03 14:56:17 0 [Warning] InnoDB: Retry attempts for writing partial data failed.
2024-02-03 14:56:17 0 [ERROR] InnoDB: Write to file ib_logfile0 failed at offset 12288, 512 bytes should have been written, only 0 were written. Operating system error number 2. Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.
2024-02-03 14:56:17 0 [ERROR] InnoDB: Error number 2 means 'No such file or directory'
2024-02-03 14:56:17 0 [Note] InnoDB: Some operating system error numbers are described at https://mariadb.com/kb/en/library/operating-system-error-codes/
2024-02-03 14:56:17 0 [ERROR] [FATAL] InnoDB: write("ib_logfile0") returned I/O error
240203 14:56:17 [ERROR] mysqld got signal 6 ;
Sorry, we probably made a mistake, and this is a bug.
...

I spent some time investigating and found that if I change the MariaDB settings from these defaults:

# The default setting of 1 is required for full ACID compliance. Logs are written and flushed to disk at each transaction commit. 
innodb_flush_log_at_trx_commit = 1

# fsync or 0: InnoDB uses the fsync() system call to flush both the data and log files. fsync is the default setting. 
innodb_flush_method = fsync

to this new setting:

# With a setting of 2, logs are written after each transaction commit and flushed to disk once per second. Transactions for which logs have not been flushed can be lost in a crash. 
innodb_flush_log_at_trx_commit = 2

# O_DIRECT_NO_FSYNC: InnoDB uses O_DIRECT during flushing I/O, but skips the fsync() system call after each write operation. 
innodb_flush_method = O_DIRECT_NO_FSYNC

then the issue seems to be solved: there is no error and I can start the mariadb service. Any other value of innodb_flush_log_at_trx_commit still crashes. innodb_flush_method = nosync also works, but every other flush method crashes.

I don't know whether this is a bcachefs issue or a MariaDB issue, so I also opened an issue with MariaDB. It looks like a problem with fsync(), and probably some kind of compatibility issue.

Is there anything I can do to gather more information that would help track down what is wrong?

kezyr, Feb 03 '24 15:02

In the MariaDB issue we found that if, instead of the setting changes above, I keep the defaults and set

innodb_log_file_buffering=ON

then the issue is gone. It looks like there is an incompatibility with O_DIRECT, because innodb_log_file_buffering=ON disables O_DIRECT for the log file.
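One way to check the O_DIRECT suspicion outside MariaDB might be a small probe like the one below. It is only a sketch under assumptions: it assumes GNU dd (for oflag=direct), the function name probe_direct_write and the scratch file name are made up for illustration, and the 1 MiB scratch file size is arbitrary. It issues a single 512-byte direct write at byte offset 12288, the same size and offset that appear in the InnoDB log above. It may not reproduce the failure, since MariaDB's actual I/O path differs.

```shell
#!/bin/sh
# Hypothetical probe: one 512-byte O_DIRECT write at byte offset 12288.
# Assumes GNU dd; other dd implementations may not support oflag=direct.
probe_direct_write() {
    dir=${1:-.}                      # directory on the filesystem under test
    file="$dir/odirect-probe.bin"    # hypothetical scratch file name

    # Create a 1 MiB scratch file with ordinary buffered I/O.
    dd if=/dev/zero of="$file" bs=4096 count=256 2>/dev/null || return 2

    # seek is counted in output blocks of bs bytes: 24 * 512 = 12288.
    # conv=notrunc keeps the rest of the file intact.
    if dd if=/dev/zero of="$file" bs=512 count=1 seek=24 \
          oflag=direct conv=notrunc 2>/dev/null; then
        echo "direct 512-byte write at offset 12288: ok"
        rc=0
    else
        echo "direct 512-byte write at offset 12288: failed"
        rc=1
    fi

    rm -f "$file"
    return "$rc"
}
```

Calling it with a directory on the bcachefs volume, for example probe_direct_write /var/lib/mysql (that path is an assumption, substitute the real datadir), and then with a directory on another filesystem gives a rough comparison. If the direct write fails, running the same dd command under strace should show which system call fails and with which errno.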

kezyr, Feb 03 '24 22:02

@kezyr Would you mind retesting? I just switched over to current bcachefs/master and I cannot seem to reproduce this problem anymore.

noradtux, May 28 '24 21:05

I just got bitten by this on 6.9.4.

As @kezyr suggested

innodb_log_file_buffering=ON

works around the issue.

clipcarl, Jun 15 '24 02:06

This is interesting because I've run MariaDB on bcachefs on this very server in the past. Some months ago I switched the filesystem used by MariaDB from bcachefs to xfs because of another, unrelated issue. Today I switched it back from xfs to bcachefs and immediately hit this issue for the first time.

clipcarl, Jun 15 '24 03:06