Regression: Temporary USB issues cause ZFS lockup until forced reboot, and sometimes data loss!
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Gentoo |
| Distribution Version | amd64 stable |
| Linux Kernel | 5.10.33 |
| Architecture | amd64 |
| ZFS Version | 2.0.4 |
| SPL Version | 2.0.4 |
Describe the problem you're observing
Occasionally the USB bus has issues and one or more USB block devices disappear. The bus quickly recovers, but ZFS gets stuck and never recovers until I force an unclean reboot (a clean shutdown cannot complete).
At least sometimes, I can still access the filesystem, but with high latency. Every time with 2.0.4, however, zfs snapshot and zpool clear hang indefinitely.
Note: 0.8.6 worked okay. zpool clear allowed me to close and re-open the LUKS device, then another zpool clear restored ZFS to full functionality.
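For reference, that sequence looked roughly like this (a sketch, not an exact transcript; hd2018b2 is used for illustration and the underlying device path is omitted):

```sh
# Rough sketch of the recovery that worked on 0.8.6; the mapping name and the
# underlying device path are stand-ins, the pool name is the one from this report.
zpool clear zpool0                             # first clear once the USB bus is back
cryptsetup close hd2018b2                      # close the affected LUKS mapping
cryptsetup open "$UNDERLYING_DEVICE" hd2018b2  # re-open it (device path not shown)
zpool clear zpool0                             # second clear restores full functionality
```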
Describe how to reproduce the problem
Wait for the USB bus to have an issue. I presume I could just unplug the drive too, but haven't confirmed that.
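An untested way to simulate this without physically unplugging might be to drop the device at the SCSI layer and rescan afterwards (sdX and hostN are placeholders):

```sh
# Untested idea for forcing the condition without waiting for the bus to act up:
# remove the disk at the SCSI layer, then ask the host adapter to rediscover it.
echo 1 > /sys/block/sdX/device/delete            # make the block device disappear
# ...wait for ZFS to notice the missing vdev...
echo "- - -" > /sys/class/scsi_host/hostN/scan   # rescan so the device reappears
```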
Include any warning/errors/backtraces from the system logs
pool: zpool0
state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
scan: scrub repaired 106M in 10:44:56 with 0 errors on Mon May 3 16:18:15 2021
config:
NAME STATE READ WRITE CKSUM
zpool0 DEGRADED 0 0 0
mirror-0 DEGRADED 14 19 0
hd2018e2 ONLINE 7 23 0
hd2018b2 FAULTED 3 0 0 too many errors
errors: List of errors unavailable: pool I/O is currently suspended
May 06 10:01:02 [kernel] zio pool=zpool0 vdev=/dev/mapper/hd2018e2 error=5 type=1 offset=4627439972352 size=4096 flags=180980
May 06 10:01:02 [kernel] zio pool=zpool0 vdev=/dev/mapper/hd2018e2 error=5 type=1 offset=270336 size=8192 flags=b08c1
May 06 10:01:02 [kernel] zio pool=zpool0 vdev=/dev/mapper/hd2018e2 error=5 type=1 offset=8893439180800 size=4096 flags=180980
May 06 10:01:02 [kernel] zio pool=zpool0 vdev=/dev/mapper/hd2018e2 error=5 type=1 offset=9999753682944 size=8192 flags=b08c1
May 06 10:01:02 [kernel] zio pool=zpool0 vdev=/dev/mapper/hd2018e2 error=5 type=1 offset=9999753945088 size=8192 flags=b08c1
May 06 10:01:02 [kernel] zio pool=zpool0 vdev=/dev/mapper/hd2018e2 error=5 type=1 offset=8835277295616 size=4096 flags=180980
May 06 10:01:02 [kernel] WARNING: Pool 'zpool0' has encountered an uncorrectable I/O failure and has been suspended.
1620241269 spa_history.c:295:spa_history_log_sync(): command: zfs destroy -d -r zpool0/dev_guix@zfs-auto-snap_hourly-2021-05-04-1901
1620241271 spa_history.c:328:spa_history_log_sync(): ioctl destroy_snaps
1620241361 spa.c:8187:spa_async_request(): spa=zpool0 async request task=4
...
1620284388 spa.c:8187:spa_async_request(): spa=zpool0 async request task=4
1620284394 vdev.c:128:vdev_dbgmsg(): disk vdev '/dev/mapper/hd2018e2': failed probe
1620284394 zio.c:3509:zio_dva_allocate(): zpool0: metaslab allocation failure: zio ffff9e8052bfe4e0, size 20480, error 28
1620284394 zio.c:3509:zio_dva_allocate(): zpool0: metaslab allocation failure: zio ffff9e80e073ca00, size 28672, error 28
1620284394 zio.c:3509:zio_dva_allocate(): zpool0: metaslab allocation failure: zio ffff9e824c2f8000, size 2048, error 28
> Note: 0.8.6 worked okay

Can you confirm there was NO kernel update with the ZFS update?
If the bus has hiccups and data flow to both devices gets disturbed, it's quite normal that your pool gets suspended; as we can see, there are errors for both devices.
I think it would make sense to have a look at the kernel dmesg to decide whether this is a ZFS issue at all.
> Can you confirm there was NO kernel update with the ZFS update?

I only upgraded to ZFS 2.0.4 because my hardware upgrade required me to upgrade Linux from 4.19 to 5.10.

> If the bus has hiccups and data flow to both devices gets disturbed, it's quite normal that your pool gets suspended; as we can see, there are errors for both devices.

The issue is not that the pool got suspended, but that ZFS completely locked up and could not recover once access to the disk(s) was restored.
Intel USB is, in my experience, unavoidably flaky, but it is almost always recoverable one way or another without a reboot. A filesystem (mere software) shouldn't impose additional flakiness by requiring an unclean reboot after the hardware side is addressed. (And indeed, ZFS 0.8.4 usually recovered fine with a few zpool clears.)
So, if you did a kernel upgrade from 4.19 to 5.10 at the same time, it's hard to decide whether your problem was introduced by OpenZFS itself.
Anyhow, you are right: I think ZFS should never completely lock up in an uncontrolled fashion or get into some completely weird state (requiring a reboot) because of storage hiccups.
I would at least test, after re-attaching the sticks, whether access to them is fine, i.e. I would do a read test via "dd if=/dev/usbblockdev of=/dev/null bs=1024k" and have a look with iostat.
> So, if you did a kernel upgrade from 4.19 to 5.10 at the same time, it's hard to decide whether your problem was introduced by OpenZFS itself.

I wonder how practical it would be to minimally patch 0.8.4 to work with Linux 5.10... But considering only OpenZFS is affected, it seems unlikely to be a Linux-side issue?

> I would at least test, after re-attaching the sticks, whether access to them is fine, i.e. I would do a read test via "dd if=/dev/usbblockdev of=/dev/null bs=1024k" and have a look with iostat.

I was able to hexdump them fine.
Are you running zfs atop LUKS or the other way around?
ZFS shouldn't have tried to access the faulted device if it was indeed in such a state when you tried to remove the snapshot. Don't think your logs are in chronological order though.
ZFS atop LUKS. (Not sure what the other way around would look like?)
The snapshots are automatic. I do wonder whether, if I caught it in time, zpool clear would recover the pool before any snapshot operations are attempted. (I have my dmesg on a second monitor, so hopefully I will notice quicker next time...)
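Something like the following might catch it automatically (a hypothetical sketch I haven't tried; it assumes the pool name from above and that zpool status still responds early on):

```sh
# Hypothetical watcher: poll the pool and try a clear as soon as it suspends,
# hopefully before the next automatic snapshot fires.
# timeout(1) guards against the zpool commands themselves hanging.
while sleep 120; do
    if timeout 30 zpool status zpool0 | grep -q "state: SUSPENDED"; then
        logger -t zpool-watch "zpool0 suspended, attempting zpool clear"
        timeout 60 zpool clear zpool0
    fi
done
```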
You can do LUKS on zvols; however, since you're doing it the other way around, ZFS has no idea what the state of the hardware is.
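For illustration, LUKS on a zvol (the other way around) would look roughly like this; the zvol name and size are made up:

```sh
# Hypothetical layout with LUKS on top of a zvol instead of ZFS on top of LUKS.
zfs create -V 100G zpool0/cryptvol                # create a zvol inside the pool
cryptsetup luksFormat /dev/zvol/zpool0/cryptvol   # put LUKS on the zvol
cryptsetup open /dev/zvol/zpool0/cryptvol cryptvol
mkfs.ext4 /dev/mapper/cryptvol                    # any filesystem on top of the mapping
```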
It happened again, and this time it corrupted the zpool so that it wouldn't import after rebooting. I had to use -T <old txg> on the secondary drive (manually offlined at the time) to import at all. It seems irreparably damaged. :/
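For the record, the import that eventually succeeded was along these lines (the txg value is not shown, and -d /dev/mapper assumes the LUKS mappings were already open):

```sh
# Sketch of the rewind import; the transaction group number was found by trial
# and error and is left out here.
zpool import -d /dev/mapper -T "$OLD_TXG" zpool0
```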
Losing the entire pool simply because the drive drops off the bus seems entirely unreasonable. Any other filesystem would need an fsck or journal recovery and then move on fine. All it takes is not leaving the disk in a state where it can't recover... :(
Related: https://github.com/openzfs/zfs/issues/2878
+1 for a fix. I ejected a USB flash drive before running zpool export, which left ZFS unusable until a hard reboot (power button); commands such as zpool status would just hang forever.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
@behlendorf: Bad Bot.
I believe I might have encountered this issue with zfs-dkms 2.2.0-2 on Arch Linux.
I am experiencing very similar symptoms with a USB-connected external drive. I suspected that the drive would spin down and that this interfered with ZFS when it tried to communicate with the spun-down disk, creating this broken state.
zpool export -F, zpool clear -F, and just about every other command and set of options I have seen suggested hang indefinitely. Rebooting the system seems to be the only way to fix the issue and start using the pool again.
EDIT: I don't think this specific issue is related to the problem I had, but I will share my solution in case anyone else finds this.
I was able to create a workaround / fix for my problem. Basically, a script periodically reads some bytes from the drive in the background to keep it from going to sleep. It works quite well.
Command:
dd if=/dev/sda bs=4096 count=1 of=/dev/null iflag=direct
Crontab entry:
*/10 * * * * /home/brooksvb/scripts/ex-drive-keep-alive.sh
I expanded the script a little bit with ChatGPT to avoid running the command when the specific drive was not present. It's on my other laptop right now. If I remember later, I will share the full script.
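In the meantime, a rough sketch of what the expanded script does (not the actual script; the device path and log tag are placeholders):

```sh
#!/bin/sh
# Keep an external USB drive from spinning down by reading one block from it,
# but only if the drive is actually present. /dev/sda is a placeholder.
DEV=/dev/sda
[ -b "$DEV" ] || exit 0   # drive not attached right now, do nothing
dd if="$DEV" bs=4096 count=1 of=/dev/null iflag=direct 2>/dev/null \
    || logger -t ex-drive-keep-alive "read from $DEV failed"
```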
Got the same issue