Storing qemu images on bcachefs breaks xfs in VM
Hi there,
thanks for fixing #717 so quickly. I only recently upgraded my kernel and I have not seen the issue anymore :+1:
Now for this bug report: even with the latest mainline kernel I can still reproduce my issue with storing VM images on bcachefs. This is probably a fringe use case, but in the end I guess bcachefs should support this to be a full-featured FS.
Summary: The Proxmox hypervisor currently has no native driver for bcachefs, but it would still be nice to use the normal QEMU file storage on a bcachefs filesystem. So I tried to set this up in Proxmox and install a CentOS VM onto bcachefs storage, but the installation fails. I then tried to get a slightly simpler reproduction case.
VM Host
OS: Proxmox VE 8.2.7 / Debian 12 Bookworm
Kernel: Ubuntu Mainline PPA 6.12.1 (6.12.1-061201-generic)
Bcachefs Tools: v1.13.0 tag build from source
bcachefs show-super:
Device: HGST HDN728080AL
External UUID: cca5bc65-fe77-409d-a9fa-465a6e7f4eae
Internal UUID: ca668445-d05c-47f8-8b05-92c30245a167
Magic number: c68573f6-66ce-90a9-d96a-60cf803df7ef
Device index: 0
Label: NAS_DATA
Version: 1.13: inode_has_child_snapshots
Version upgrade complete: 1.13: inode_has_child_snapshots
Oldest version on disk: 1.4: member_seq
Created: Fri Jul 5 14:09:12 2024
Sequence number: 128
Time of last write: Sat Nov 30 20:34:55 2024
Superblock size: 7.45 KiB/1.00 MiB
Clean: 0
Devices: 5
Sections: members_v1,crypt,replicas_v0,disk_groups,clean,journal_seq_blacklist,journal_v2,counters,members_v2,errors,ext,downgrade
Features: zstd,journal_seq_blacklist_v3,reflink,new_siphash,inline_data,new_extent_overwrite,btree_ptr_v2,extents_above_btree_updates,btree_updates_journalled,reflink_inline_data,new_varint,journal_no_flush,alloc_v2,extents_across_btree_nodes
Compat features: alloc_info,alloc_metadata,extents_above_btree_updates_done,bformat_overflow_done
Options:
block_size: 4.00 KiB
btree_node_size: 256 KiB
errors: continue [fix_safe] panic ro
metadata_replicas: 2
data_replicas: 2
metadata_replicas_required: 1
data_replicas_required: 1
encoded_extent_max: 64.0 KiB
metadata_checksum: none [crc32c] crc64 xxhash
data_checksum: none [crc32c] crc64 xxhash
compression: zstd
background_compression: none
str_hash: crc32c crc64 [siphash]
metadata_target: none
foreground_target: ssd
background_target: hdd
promote_target: ssd
erasure_code: 0
inodes_32bit: 1
shard_inode_numbers: 1
inodes_use_key_cache: 1
gc_reserve_percent: 8
gc_reserve_bytes: 0 B
root_reserve_percent: 0
wide_macs: 0
promote_whole_extents: 1
acl: 1
usrquota: 0
grpquota: 0
prjquota: 0
journal_flush_delay: 1000
journal_flush_disabled: 0
journal_reclaim_delay: 100
journal_transaction_names: 1
allocator_stuck_timeout: 30
version_upgrade: [compatible] incompatible none
nocow: 0
members_v2 (size 736):
Device: 0
Label: hdd1 (1)
UUID: 141032c8-2583-4306-b4c1-412696d46be5
Size: 7.28 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 256 KiB
First bucket: 0
Buckets: 30523541
Last mount: Sat Nov 30 20:27:20 2024
Last superblock write: 128
State: rw
Data allowed: journal,btree,user
Has data: journal,user,cached
Btree allocated bitmap blocksize: 1.00 B
Btree allocated bitmap: 0000000000000000000000000000000000000000000000000000000000000000
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 1
Label: hdd2 (2)
UUID: d038124b-d4a5-4deb-bdd1-eb423c9189c8
Size: 7.33 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 256 KiB
First bucket: 0
Buckets: 30758228
Last mount: Sat Nov 30 20:27:20 2024
Last superblock write: 128
State: rw
Data allowed: journal,btree,user
Has data: journal,user,cached
Btree allocated bitmap blocksize: 1.00 B
Btree allocated bitmap: 0000000000000000000000000000000000000000000000000000000000000000
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 2
Label: hdd3 (3)
UUID: 09811319-852f-4ac1-a1a9-8aef619df346
Size: 7.28 TiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 256 KiB
First bucket: 0
Buckets: 30523541
Last mount: Sat Nov 30 20:27:20 2024
Last superblock write: 128
State: rw
Data allowed: journal,btree,user
Has data: journal,user,cached
Btree allocated bitmap blocksize: 1.00 B
Btree allocated bitmap: 0000000000000000000000000000000000000000000000000000000000000000
Durability: 1
Discard: 0
Freespace initialized: 1
Device: 3
Label: ssd1 (5)
UUID: 074844ac-70c4-4cd7-a302-fa1946985849
Size: 631 GiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 256 KiB
First bucket: 0
Buckets: 2582576
Last mount: Sat Nov 30 20:27:20 2024
Last superblock write: 128
State: rw
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Btree allocated bitmap blocksize: 32.0 MiB
Btree allocated bitmap: 0000000000000000000000001111111111111111111111111111111111111111
Durability: 1
Discard: 1
Freespace initialized: 1
Device: 4
Label: ssd2 (6)
UUID: 4dd47f69-b955-4de5-b9b9-2a6dc60ca16c
Size: 165 GiB
read errors: 0
write errors: 0
checksum errors: 0
seqread iops: 0
seqwrite iops: 0
randread iops: 0
randwrite iops: 0
Bucket size: 256 KiB
First bucket: 0
Buckets: 674860
Last mount: Sat Nov 30 20:27:20 2024
Last superblock write: 128
State: rw
Data allowed: journal,btree,user
Has data: journal,btree,user,cached
Btree allocated bitmap blocksize: 8.00 MiB
Btree allocated bitmap: 0000000000000000000000111111111111111111111111111111111111111111
Durability: 1
Discard: 1
Freespace initialized: 1
errors (size 40):
fs_usage_cached_wrong 1 Mon Oct 7 16:09:57 2024
fs_usage_replicas_wrong 2 Mon Oct 7 16:09:57 2024
The VM was created in Proxmox using default settings for storage and image. This runs QEMU/KVM like this:
/usr/bin/kvm -id 103 -name test,debug-threads=on -no-shutdown -chardev socket,id=qmp,path=/var/run/qemu-server/103.qmp,server=on,wait=off -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 -mon chardev=qmp-event,mode=control -pidfile /var/run/qemu-server/103.pid -daemonize -smbios type=1,uuid=073b59e0-198d-4896-afae-9e1982164f4a -smp 4,sockets=1,cores=4,maxcpus=4 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg -vnc unix:/var/run/qemu-server/103.vnc,password=on -cpu qemu64,+aes,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+pni,+popcnt,+sse4.1,+sse4.2,+ssse3 -m 2048 -object iothread,id=iothread-virtioscsi0 -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device pci-bridge,id=pci.3,chassis_nr=3,bus=pci.0,addr=0x5 -device vmgenid,guid=3160b218-4ba2-42e6-bfb7-5ef0e4df3131 -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device VGA,id=vga,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on -iscsi initiator-name=iqn.1993-08.org.debian:01:907ae15e667 -drive file=/var/lib/pve/local-btrfs/template/iso/Fedora-Workstation-Live-x86_64-41-1.4.iso,if=none,id=drive-ide2,media=cdrom,aio=io_uring -device ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=100 -device virtio-scsi-pci,id=virtioscsi0,bus=pci.3,addr=0x1,iothread=iothread-virtioscsi0 -drive file=/mnt/data/services/pve//images/103/vm-103-disk-0.qcow2,if=none,id=drive-scsi0,format=qcow2,cache=none,aio=io_uring,detect-zeroes=on -device scsi-hd,bus=virtioscsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=101 -netdev type=tap,id=net0,ifname=tap103i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on -device 
virtio-net-pci,mac=BC:24:11:74:0B:4F,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256,bootindex=102 -machine type=pc+pve0
VM
OS: Fedora 41 workstation Live CD
Kernel: 6.11.4-301.fc41.x86_64
- Format /dev/sda using fdisk and create one partition. This works fine.
- Run mkfs.xfs. This fails:
root@localhost-live:~# mkfs.xfs /dev/sda1
meta-data=/dev/sda1 isize=512 agcount=4, agsize=2097024 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1 bigtime=1 inobtcount=1 nrext64=1
data = bsize=4096 blocks=8388096, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=16384, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_sb bno 0x0/0x1, err=121
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on (unknown) bno 0x1fff838/0x2, err=121
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_sb bno 0x0/0x1, err=121
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_agf bno 0x1/0x1, err=121
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_agfl bno 0x3/0x1, err=121
mkfs.xfs: pwrite failed: Remote I/O error
libxfs_bwrite: write failed on xfs_agi bno 0x2/0x1, err=121
mkfs.xfs: writing AG headers failed, err=121
After this, the following errors can be found in the VM dmesg:
[ 740.241536] sd 2:0:0:0: [sda] tag#212 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 740.241540] sd 2:0:0:0: [sda] tag#212 Sense Key : Illegal Request [current]
[ 740.241542] sd 2:0:0:0: [sda] tag#212 Add. Sense: Invalid field in cdb
[ 740.241544] sd 2:0:0:0: [sda] tag#212 CDB: Write(10) 2a 00 00 00 08 00 00 00 01 00
[ 740.241545] critical target error, dev sda, sector 2048 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[ 740.242534] sd 2:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 740.242538] sd 2:0:0:0: [sda] tag#62 Sense Key : Illegal Request [current]
[ 740.242540] sd 2:0:0:0: [sda] tag#62 Add. Sense: Invalid field in cdb
[ 740.242542] sd 2:0:0:0: [sda] tag#62 CDB: Write(10) 2a 00 02 00 00 38 00 00 02 00
[ 740.242543] critical target error, dev sda, sector 33554488 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[ 740.242740] sd 2:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 740.242742] sd 2:0:0:0: [sda] tag#0 Sense Key : Illegal Request [current]
[ 740.242752] sd 2:0:0:0: [sda] tag#0 Add. Sense: Invalid field in cdb
[ 740.242754] sd 2:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 00 00 08 00 00 00 01 00
[ 740.242755] critical target error, dev sda, sector 2048 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[ 740.244685] sd 2:0:0:0: [sda] tag#214 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 740.244687] sd 2:0:0:0: [sda] tag#214 Sense Key : Illegal Request [current]
[ 740.244689] sd 2:0:0:0: [sda] tag#214 Add. Sense: Invalid field in cdb
[ 740.244690] sd 2:0:0:0: [sda] tag#214 CDB: Write(10) 2a 00 00 00 08 01 00 00 01 00
[ 740.244691] critical target error, dev sda, sector 2049 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[ 740.244842] sd 2:0:0:0: [sda] tag#215 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 740.244843] sd 2:0:0:0: [sda] tag#215 Sense Key : Illegal Request [current]
[ 740.244844] sd 2:0:0:0: [sda] tag#215 Add. Sense: Invalid field in cdb
[ 740.244845] sd 2:0:0:0: [sda] tag#215 CDB: Write(10) 2a 00 00 00 08 03 00 00 01 00
[ 740.244846] critical target error, dev sda, sector 2051 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
[ 740.244980] sd 2:0:0:0: [sda] tag#216 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 740.244981] sd 2:0:0:0: [sda] tag#216 Sense Key : Illegal Request [current]
[ 740.244983] sd 2:0:0:0: [sda] tag#216 Add. Sense: Invalid field in cdb
[ 740.244984] sd 2:0:0:0: [sda] tag#216 CDB: Write(10) 2a 00 00 00 08 02 00 00 01 00
[ 740.244984] critical target error, dev sda, sector 2050 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 2
There are no messages in the VM Host dmesg.
Interestingly, other filesystems seem to work better. Manually creating an ext4 fs, mounting it, and creating/deleting files worked. So I tried a default Fedora installation, which uses only btrfs and ext4. This works as well: it installs just fine and the installed OS boots. I did not do any further testing, though.
So this seems to be somewhat xfs specific. If there's any additional info that could help to debug this further, please let me know.
I experience this kind of trouble when using QEMU with qcow2 images on many different filesystems. When using converted raw images, all problems go away. But then of course I lose all the qcow2 goodies, the worst loss being online backup options.
I've tried using qcow2 images for VMs with the nocow option set in bcachefs at the filesystem and file level, but that made the whole filesystem crash (well, there are bugs), so things got even worse.
I'm searching for an option here; I'll try whether more sophisticated images with a raw backing file solve the issue for now.
Anyone know if this bug is still happening?
Hey Kent, thanks for looking into this. I can still reproduce the issue with kernel 6.14.0-2-pve. Error messages, dmesg etc. are all exactly the same. If it helps, I can update to 6.16 from ubuntu mainline ppa and test it next weekend.
Please do let me know the 6.16 results; if it still happens there I'll see what I can see
Still happens on 6.16.0-061600-generic Ubuntu PPA kernel.
So far, I only tested with the Proxmox GUI. When trying to produce a simple qemu reproduction case, I first couldn't reproduce the bug. Some more debugging then showed this only happens with the cache=none option. Here's a 'simple' test case:
# In a folder on a bcachefs mount
# Use SystemRescue, as we can boot everything using the serial console there
wget https://fastly-cdn.system-rescue.org/releases/12.01/systemrescue-12.01-amd64.iso -O boot.iso
qemu-img create -f qcow2 disk.qcow2 1G
qemu-system-x86_64 \
-enable-kvm \
-m 2048 \
-drive file=disk.qcow2,format=qcow2,cache=none \
-cdrom boot.iso \
-boot d \
-serial mon:stdio \
-display none
# Select "Boot SystemRescue with serial console"
mkfs.xfs /dev/sda
This produces the following output:
meta-data=/dev/sda isize=512 agcount=4, agsize=65536 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=1
= reflink=1 bigtime=1 inobtcount=1 nrext64=1
= exchange=0
data = bsize=4096 blocks=262144, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1, parent=0
log =internal log bsize=4096 blocks=16384, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on (unknown) bno 0x1fff00/0x100, err=5
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on (unknown) bno 0x0/0x100, err=5
mkfs.xfs: Releasing dirty buffer to free list!
found dirty buffer (bulk) on free list!
mkfs.xfs: pwrite failed: Input/output error
libxfs_bwrite: write failed on xfs_sb bno 0x0/0x1, err=5
mkfs.xfs: Releasing dirty buffer to free list!
mkfs.xfs: libxfs_device_zero write failed: Input/output error
The QEMU documentation does not say much about cache modes (search for cache=cache). The Proxmox Wiki says cache=none makes QEMU use O_DIRECT semantics. So maybe it's an issue with direct I/O?
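If the direct-I/O theory is right, the failure should be reproducible without QEMU at all: open a file on the bcachefs mount with O_DIRECT and issue a write whose length is not a multiple of the filesystem block size. A minimal probe sketch (hypothetical helper, Linux-only; the exact errno returned depends on the filesystem):

```python
import mmap
import os

def try_direct_write(path, size):
    """Attempt an O_DIRECT write of `size` bytes at offset 0.

    Returns None on success, or the errno on failure. O_DIRECT requires
    the buffer address, file offset and I/O length to be aligned to the
    logical block size, so a 512-byte write may be rejected on a 4k-block
    filesystem while a page-sized write goes through.
    """
    buf = mmap.mmap(-1, max(size, mmap.PAGESIZE))  # page-aligned buffer
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
        try:
            os.write(fd, memoryview(buf)[:size])
            return None
        finally:
            os.close(fd)
    except OSError as e:
        return e.errno
    finally:
        buf.close()
```

Running this with size=512 and then size=4096 against a file on the bcachefs mount should show whether the kernel rejects the unaligned write the same way QEMU's cache=none writes appear to be rejected.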
I'd also like to add that this problem does not seem to be XFS-specific. mkfs.ext4 hangs on Writing superblocks and filesystem accounting information: and mkfs.fat exits with unable to synchronize /dev/sda:Input/output error.
Side note: replacing mkfs calls with
echo 1 > /dev/sda
cat /dev/sda
also seems to highlight the problem. When the qcow2 image is stored on a different filesystem, cat prints the 1 that was written to the fresh image file; when it is stored on bcachefs, nothing is printed.
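The guest-side echo/cat test boils down to a write-then-read-back round trip. The same idea expressed host-side against the image file, as a sketch (hypothetical helper, not from this thread):

```python
import os

def write_read_probe(path, marker=b"1\n"):
    """Write a marker at offset 0, fsync, then read it back through a
    fresh file descriptor. Returns True if the data round-trips, which
    is what the echo/cat probe checks inside the VM."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, marker)
        os.fsync(fd)
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read(len(marker)) == marker
```

On a healthy setup this returns True; the report above suggests the guest-visible equivalent silently drops the write when the image sits on bcachefs with cache=none.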
EDIT:
The mkfs.xfs command also fails when using a raw image instead of a QCOW one in the VM. Additionally, the whole problem does not occur when bcachefs is used in a single-device setup. I didn't use nocow when testing.
bcachefs commit: 02f551db4b1d5a845382bb5d9b3ca29344fd7fa3 bcachefs-tools: 1.25.3+1c551b0
I can reproduce this, but it only happens with 4k block sizes, not 512, and only with the cache=none flag:
dd if=/dev/zero of=b0.img bs=1M count=4096
losetup --sector-size 4096 /dev/loop0 b0.img
bcachefs format /dev/loop0
mkdir /mnt/bcachefs_test
mount /dev/loop0 /mnt/bcachefs_test
pushd /mnt/bcachefs_test
wget https://fastly-cdn.system-rescue.org/releases/12.01/systemrescue-12.01-amd64.iso -O boot.iso
qemu-img create -f qcow2 disk.qcow2 1G
qemu-system-x86_64 \
-enable-kvm \
-m 2048 \
-drive file=disk.qcow2,format=qcow2,cache=none \
-cdrom boot.iso \
-boot d \
-serial mon:stdio \
-display none
Then run mkfs.xfs /dev/sda in the VM.
VM disks with an NTFS guest filesystem are also affected.
I tried to run winapps (which is a wrapper around dockur/windows) with an NTFS qcow2 disk on a bcachefs filesystem. The Windows installer cannot create the partition table or the NTFS partition during installation.
Moving the qcow2 disk image to an XFS partition immediately solved the problem.
My kernel version is 6.16.4 from NixOS package.
For what it's worth I can report I've been running VMs without qcow2, just raw .img backed by bcachefs for a fairly long time and not seen any corruption. 4k block sizes too. Perhaps qcow2 is the defining factor here.
lots of things competing for top of the todo list, but this is getting up there
This sounds quite curious. This issue is data corruption, which I would have thought is as serious as it gets for a filesystem. But you're saying there are lots of potentially more serious things. Maybe this is just unfortunate wording
That doesn't look like corruption to me; I see I/O errors, and corruption is silent.
The other thing that makes it less concerning is that it only occurs under very specific circumstances, meaning it's unlikely users will be taken by surprise while in the middle of something. It's also not a regression, those do get jumped on right away.
yes, everyone wants their bug addressed right away, but if every single known bug was already fixed then it wouldn't be marked experimental anymore :)
@jpf91
The Proxmox Wiki says cache=none makes qemu use O_DIRECT semantics. So maybe it's an issue with direct IO?
Yes, I think so.
cache=none, i.e. O_DIRECT, is also known to be problematic with btrfs when datacow is enabled:
https://bugzilla.redhat.com/show_bug.cgi?id=1914433 https://bugzilla.kernel.org/show_bug.cgi?id=99171#c16
@koverstreet, I would like to confirm this error on the latest Proxmox with PVE kernel 6.17 and the latest bcachefs DKMS package + tools.
I reformatted my bcachefs mirror on two ordinary HDDs (512-byte sectors) with a 4k bcachefs block size (as noted at https://github.com/koverstreet/bcachefs/issues/791#issuecomment-3264226556, it is 4k-block-size specific) and migrated my virtual machine with a 5 GB XFS-formatted virtual disk (qcow2 with cache=none, io_uring and iothread=1) back.
Deleting a file from there immediately resulted in this I/O error in the VM kernel:
[ 2122.583393] sd 1:0:0:1: [sdb] tag#74 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 2122.583403] sd 1:0:0:1: [sdb] tag#74 Sense Key : Illegal Request [current]
[ 2122.583408] sd 1:0:0:1: [sdb] tag#74 Add. Sense: Invalid field in cdb
[ 2122.583411] sd 1:0:0:1: [sdb] tag#74 CDB: Write(10) 2a 08 00 50 04 fb 00 00 0a 00
[ 2122.583413] critical target error, dev sdb, sector 5244155 op 0x1:(WRITE) flags 0x29800 phys_seg 1 prio class 2
[ 2122.583482] critical target error, dev sdb, sector 5244155 op 0x1:(WRITE) flags 0x29800 phys_seg 1 prio class 2
[ 2122.583550] XFS (sdb1): log I/O error -121
[ 2122.583601] XFS (sdb1): Filesystem has been shut down due to log error (0x2).
[ 2122.583652] XFS (sdb1): Please unmount the filesystem and rectify the problem(s).
This indeed only seems to affect XFS; I also cannot reformat the virtual disk, getting "Illegal Request" immediately.
I have a second identical virtual 5 GB disk mounted with ext4, which does NOT show this behaviour.
There are no errors on the host level in dmesg.
I checked bcachefs with:
bcachefs data scrub /bcachefs
bcachefs fsck /dev/sda2
bcachefs fsck /dev/sdb2
I had no problems with a 512-byte bcachefs block size before, and I tried hard to break it without success, for example via O_DIRECT testing tools (see https://lore.kernel.org/linux-bcachefs/[email protected]/T/#u ). By the way, any hint how we can check mirror consistency? Does scrub show when there is different data on disk1 and disk2?
My guess is this has to do with direct I/O and sector/block size alignment.
ChatGPT brought me to this one:
https://bugs.launchpad.net/fuel/+bug/1316266
and
# mkfs.xfs -b size=4k -s size=4k -f /dev/sdb1
works, whereas
# mkfs.xfs -b size=4k -s size=512 -f /dev/sdb1
fails with the errors reported.
I guess mkfs.xfs defaults to a 512-byte sector size in the VM, because QEMU emulates the virtual disk with 512-byte sectors by default:
root@debian13-1:/# cat /sys/block/sdb/queue/logical_block_size
512
root@debian13-1:/# cat /sys/block/sdb/queue/physical_block_size
512
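To summarize the hypothesis this thread converges on: under O_DIRECT, the file offset, the I/O length, and the user buffer address must all be multiples of the backing store's logical block size, which on this host is bcachefs's 4k block size. A guest issuing 512-byte-sector writes then fails exactly when a request is not 4k-sized and 4k-aligned. A toy model of that rule (my interpretation, not actual bcachefs code):

```python
def direct_write_ok(offset, length, buf_addr, logical_block_size):
    """O_DIRECT alignment rule: every component of the I/O must be a
    multiple of the logical block size of the backing store."""
    return all(x % logical_block_size == 0
               for x in (offset, length, buf_addr))

SECTOR = 512  # what the emulated disk advertises to the guest

# The failing write from the guest dmesg: sector 2048, one 512-byte sector.
# Its byte offset (1 MiB) happens to be 4k-aligned, but its length is not,
# so a 4k-block host rejects it while a 512-byte-block host accepts it.
guest_write = dict(offset=2048 * SECTOR, length=1 * SECTOR, buf_addr=0)
```

This would also explain why mkfs.xfs -s size=4k works: with 4k sectors, every request the guest generates is 4k-sized and 4k-aligned.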