
Add an option to open with O_EXCL

Open keithr-mext opened this issue 1 year ago • 5 comments

On Linux, it can be difficult to determine whether a drive is in use by other parts of the system, since it can be mounted in a non-default namespace or opened directly by an application. (This page describes some of the many hard-to-find ways a drive can be in use.) However, opening a block device with the O_EXCL flag causes the kernel to return EBUSY if the drive is in use.

From open(2):

In general, the behavior of O_EXCL is undefined if it is used without O_CREAT. There is one exception: on Linux 2.6 and later, O_EXCL can be used without O_CREAT if pathname refers to a block device. If the block device is in use by the system (e.g., mounted), open() fails with the error EBUSY.

There should be an option to tell fio to use this flag, similar to the direct option.
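As a rough illustration of the behavior the man page describes, here is a minimal sketch of an exclusive open on a block device. The helper name `open_blockdev_excl` is made up for this example and is not part of fio.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not fio code): open a block device with O_EXCL
 * added to the caller's flags. Returns an open fd on success, or -1
 * with errno preserved. EBUSY means the kernel considers the device
 * in use (e.g., it or one of its partitions is mounted).
 */
static int open_blockdev_excl(const char *path, int flags)
{
	int fd = open(path, flags | O_EXCL);

	if (fd < 0) {
		int saved = errno; /* fprintf may clobber errno */

		if (saved == EBUSY)
			fprintf(stderr, "%s: device is in use (EBUSY)\n", path);
		else
			fprintf(stderr, "%s: open failed: %s\n", path, strerror(saved));
		errno = saved;
	}
	return fd;
}
```

A caller would pass something like `O_RDWR` as `flags` and treat an `EBUSY` failure as "do not touch this device".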

keithr-mext avatar Sep 26 '24 19:09 keithr-mext

Fio already checks whether the device is mounted, and will fail a write workload unless allow_mounted_write has been set to 1 (it defaults to 0). Beyond that, O_EXCL doesn't really bring much: it will not prevent an exclusive open if someone else already has the device opened read+write, except if that open was also done with O_EXCL. And vice versa, an O_EXCL open will not prevent another read/write open.

axboe avatar Sep 26 '24 21:09 axboe

I just clobbered a device by running fio on the base device (/dev/nvme1n1) while some of its partitions were mounted. The system had been rebooted and Linux shuffled the block device numbers: before the reboot nvme0 was the root disk and nvme1 was my test disk, and I didn't notice that the reboot had swapped them. Fio's mounted-device check didn't catch that, because I gave it /dev/nvme1n1 rather than /dev/nvme1n1p1 as the filename; the check would have worked in the latter case. Opening the base device with O_EXCL would have failed with EBUSY because of the mounted partitions.

It's also possible on Linux to mount a device in another namespace, which fio's device_is_mounted() check won't catch, but O_EXCL will. O_EXCL won't catch every possible conflict, as you point out, but it will catch several that the current check misses.

keithr-mext avatar Sep 26 '24 21:09 keithr-mext

That's a good point. I guess we could just augment the writeable open check by setting O_EXCL as well. That should provide more coverage, and would've caught your case.

axboe avatar Sep 26 '24 21:09 axboe

I have a fork in which I'm working on fixing this. What I've got currently is an exclusive-open option, but if it would be better to handle it a different way, such as always using O_EXCL for write workloads, let me know.

keithr-mext avatar Sep 26 '24 21:09 keithr-mext

I don't like adding yet another option for this, since we already have one and it defaults to disallowing it. So I'd say if allow_mounted_write is zero, then just set O_EXCL whenever O_WRONLY or O_RDWR would be set for opening a file.
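A minimal sketch of that rule, assuming a standalone helper rather than fio's actual open path; the function name `writable_open_flags` and its boolean parameter are invented here for illustration only.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not fio code): given the flags a file would be
 * opened with, add O_EXCL when the open is writable and mounted writes
 * are not allowed. Read-only opens are returned unchanged.
 */
static int writable_open_flags(int flags, bool allow_mounted_write)
{
	int acc = flags & O_ACCMODE; /* isolate O_RDONLY/O_WRONLY/O_RDWR */

	if (!allow_mounted_write && (acc == O_WRONLY || acc == O_RDWR))
		flags |= O_EXCL;
	return flags;
}
```

Since open(2) only defines O_EXCL without O_CREAT for block devices, a real change would probably also need to check that the target is a block device before adding the flag; this sketch leaves that out.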

axboe avatar Sep 26 '24 21:09 axboe