zpool labelclear -f refuses to clear zfs labels

Open avoiceofreason opened this issue 1 month ago • 2 comments

Ubuntu 25.04
Linux 6.14.0-36-generic
Intel x64
ZFS version: zfs-2.3.1-1ubuntu2, zfs-kmod-2.3.1-1ubuntu2

Situation:

The PC crashed badly and left one disk in a 4-disk mirror/stripe pool degraded.

I'm not sure what happened to the disk. There were no read/write errors, but the crash seems to have corrupted one or more of the labels, because zpool status was identifying the degraded disk as "sda" rather than by its by-id name.

Before doing anything else, I decided to offline the disk so that I could check it out, clear any formatting on it, and then try to replace it (with the same disk). The disk checked out OK, and I used sfdisk --delete, wipefs -a, and sgdisk --zap-all to clean it off.

Using:

sudo zpool replace -f tank sdc /dev/disk/by-id/ata-WDC_123456789

or

sudo zpool replace -f tank 4331676103293860827 /dev/disk/by-id/ata-WDC_123456789

or even

sudo zpool replace -f tank ata-WDC_123456789

all refused to replace the disk with blah blah ...... disk busy (it wasn't busy).

Some reading later I came to the conclusion that despite trying to clear the disk down it still had 1 or more ZFS labels intact and zpool was refusing to replace even though I was trying to force.

I found the zpool labelclear command and tried that:

sudo zpool labelclear -f /dev/disk/by-id/ata-WDC_123456789

but I always get "failed to clear label for /dev/disk/by-id/ata-WDC_123456789".

So now in a doom loop. Can't replace disk which might have a label, can't clear labels.

More reading and eventually came up with some dd commands which "should" overwrite the label areas (BTW can't find any clear ZFS documentation on where or how to find the labels on a disk)

sudo blockdev --getsize64 /dev/disk/by-id/ata-WDC_123456789
960197124096

sudo dd if=/dev/zero of=/dev/disk/by-id/ata-WDC_123456789 bs=1M count=34 oflag=direct
sudo dd if=/dev/zero of=/dev/disk/by-id/ata-WDC_123456789 bs=1M count=34 seek=$((960197124096 / 1048576 - 34)) oflag=direct
sudo dd if=/dev/zero of=/dev/disk/by-id/ata-WDC_123456789-part1 bs=1M count=34 oflag=direct
sudo dd if=/dev/zero of=/dev/disk/by-id/ata-WDC_123456789-part1 bs=1M count=34 seek=$((960197124096 / 1048576 - 34)) oflag=direct

This seemed to work, and I was then able to replace the disk successfully. (I read that some people solved the problem by zeroing the whole disk, but I didn't have 12 hours.)
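On the question of where the labels live: as far as the ZFS on-disk format is generally described, each vdev carries four 256 KiB labels, two at the start of the device and two at the end, with the end pair placed relative to the device size rounded down to a 256 KiB boundary. When ZFS on Linux is given a whole disk it normally places the vdev in the first partition, so these offsets would apply to the -part1 device rather than the raw disk. The following is a minimal sketch of that arithmetic, not an authoritative statement of the layout; the function names zfs_label_offsets and dd_tail_gap are made up for illustration, and the result is worth checking against zdb -l on a real device.

```shell
#!/bin/sh
# Sketch: print the assumed byte offsets of the four ZFS vdev labels
# for a device of the given size in bytes.
LABEL_SIZE=262144   # 256 KiB per label (assumption taken from the on-disk format)

zfs_label_offsets() {
    size=$1
    # The end labels are placed relative to the size rounded down
    # to a multiple of the label size.
    aligned=$(( size / LABEL_SIZE * LABEL_SIZE ))
    echo "L0 0"
    echo "L1 $LABEL_SIZE"
    echo "L2 $(( aligned - 2 * LABEL_SIZE ))"
    echo "L3 $(( aligned - LABEL_SIZE ))"
}

# Bytes at the very end of the device that are left unwritten by
#   dd bs=1M count=N seek=$((size / 1048576 - N))
# because the integer division truncates the final partial MiB.
dd_tail_gap() {
    echo $(( $1 % 1048576 ))
}
```

Usage would look like zfs_label_offsets "$(sudo blockdev --getsize64 /dev/disk/by-id/ata-WDC_123456789-part1)". For the 960197124096-byte size reported above, dd_tail_gap gives 352256, so the tail dd commands stop 344 KiB short of the end of the device. If the layout assumption holds, that gap is large enough to contain the last label, so rounding the seek up or writing the tail with a smaller block size may be safer.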

Please note that I have tried to replicate this issue with a test pool on other disks, but have not been able to corrupt a disk in the same way that my disk was corrupted, so replacing a disk seems to work under normal conditions. However, there must be a scenario where corruption of a single disk causes the problem above.

However if you setup say a two disk mirror pool and offline a disk then the zpool labelclear -f always comes back with "failed to clear".

There needs to be a nuclear "-F" option to just force zpool to clear the labels on a disk even if it appears to be a pool member.

Many thanks.

P.S. If nothing else, this issue post might help a relatively low-skilled ZFS user like me in the future.

P.P.S. I lost no data, so ZFS has again proved to be as resilient as I hoped.

avoiceofreason, Dec 08 '25 15:12

However if you setup say a two disk mirror pool and offline a disk then the zpool labelclear -f always comes back with "failed to clear".

Note: it will print "failed to clear label" if the labels are already wiped:

$ sudo ./zpool labelclear `pwd`/file
use '-f' to override the following error:
/home/hutter/zfs/file is a member of exported pool "tank"

# clear label
$ sudo ./zpool labelclear -f `pwd`/file

# clear label again - it's already cleared
$ sudo ./zpool labelclear -f `pwd`/file
failed to clear label for /home/hutter/zfs/file

I admit it's a pretty bad error message. It should say "label already cleared" or something like that.
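One way to tell "nothing left to clear" apart from a real failure is to ask zdb whether it can still read a label before running labelclear. This is a minimal sketch, assuming zdb -l exits non-zero when no label can be unpacked (worth confirming on your ZFS version). The helper names has_zfs_label and describe_label_state and the ZDB_CMD override are hypothetical and exist only to keep the snippet self-contained and testable.

```shell
#!/bin/sh
# Sketch: succeed if zdb can still read a ZFS label from the given device or file.
# ZDB_CMD is a hypothetical override for the zdb binary; it defaults to zdb.
# Assumption: "zdb -l" returns a non-zero status when no label is readable.
has_zfs_label() {
    "${ZDB_CMD:-zdb}" -l "$1" >/dev/null 2>&1
}

# Print a one-line summary of what labelclear would be facing.
describe_label_state() {
    if has_zfs_label "$1"; then
        echo "$1: label present"
    else
        echo "$1: no readable label"
    fi
}
```

Run as root, describe_label_state /dev/disk/by-id/ata-WDC_123456789-part1 would then indicate whether a subsequent "failed to clear label" probably just means the label is already gone.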

tonyhutter, Dec 10 '25 01:12

OK, think I've found my confusion:

When using zpool labelclear do NOT use the full device reference.

This doesn't work:

sudo zpool labelclear -f /dev/disk/by-id/ata-WDC_123456666
failed to clear label for /dev/disk/by-id/ata-WDC_123456666

This does work:

sudo zpool labelclear -f ata-WDC_123456666

However in my original situation the reference to the disk had changed from ata-WDC_123456666 to sdc in zpool status (due to the crash corruption??), so not sure even the second option would have worked.

How does zpool labelclear map ata-WDC_123456666 to an actual device? What if ata-WDC_123456666 is no longer in the pool metadata?
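The udev side of that mapping can at least be inspected: entries under /dev/disk/by-id are symlinks to kernel device nodes, so readlink shows what a given name currently points at. This sketch only reports the symlink target and says nothing about how zpool itself resolves a bare name; resolve_dev is a made-up helper name, and readlink -f is assumed to be the GNU coreutils version.

```shell
#!/bin/sh
# Sketch: print the device node that a by-id (or any other) symlink resolves to.
resolve_dev() {
    readlink -f -- "$1"
}
```

Comparing resolve_dev /dev/disk/by-id/ata-WDC_123456666 with resolve_dev /dev/disk/by-id/ata-WDC_123456666-part1 shows whether a name refers to the whole disk or to the partition.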

I think I originally ran zdb, and path had changed to '/dev/sdc' and devid had changed to 'sdc'.

Maybe it would make sense to have a -dev option like:

sudo zpool labelclear -dev /dev/disk/by-id/ata-WDC_123456666

Which explicitly targets a physical drive device.

avoiceofreason, Dec 10 '25 15:12