
catch-22: ZFS tooling appears unable to handle duplicate UUIDs, no way to change UUIDs without importing

Open deliciouslytyped opened this issue 4 years ago • 7 comments

System information

Distribution Name | NixOS
Distribution Version | N/A
Linux Kernel | Linux nixos 5.3.14 #1-NixOS SMP Fri Nov 29 09:08:31 UTC 2019 x86_64 GNU/Linux
ZFS Version | 0.8.2-1
SPL Version | 0.8.2-1

Describe the problem you're observing

I couldn't find any relevant issues but maybe I'm using the wrong keywords.

I have the following (simplified) layout:

$ lsblk
NAME          RO TYPE  MOUNTPOINT
sda            0 disk
└─sda3         0 part
  └─root       0 crypt
sdc            0 disk
└─sdc3         0 part
  └─tempmnt    0 crypt
$ blkid
/dev/mapper/root: LABEL="thepool" UUID="123" UUID_SUB="456" TYPE="zfs_member"
/dev/mapper/tempmnt: LABEL="thepool" UUID="123" UUID_SUB="456" TYPE="zfs_member"

tempmnt appears to be an old backup/mirror of root that was created with dd, so all the UUIDs were copied as well. (keyword for issue searches: copied)

zpool import cannot see the device:

$ sudo zpool import
no pools available to import
$ sudo zpool import -d /dev/mapper/tempmnt
no pools available to import

zdb, assuming I'm using it properly, cannot deal with it either:

$ sudo zdb -d -e -V -p /dev/mapper/tempmnt thepool
zdb: can't open 'thepool': File exists

zpool reguid appears to be the correct way to fix the UUIDs, but it can only be run on a pool that is already imported...

Is there any other approach I can take on this machine, without:

  • starting a VM to edit the duplicate disk
  • messing with the root pool that my machine is running on?

Describe how to reproduce the problem

I have not yet attempted to build a second, isolated reproducer.

deliciouslytyped avatar Dec 26 '20 13:12 deliciouslytyped

Thanks for reporting this. One workaround which should work is to remove the /dev/mapper/ links for one of the pools and then run zpool import -d /dev/mapper thepool. That way the scan will only detect one of the pools, allowing it to be imported so zpool reguid can be run. Afterwards the /dev/mapper links for the other pool can be recreated. Internally, ZFS doesn't allow multiple pools with the same GUID to be imported at the same time.
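A sketch of that workaround as a command sequence, assuming the layout from the report (pool "thepool", original behind /dev/mapper/root, dd clone behind /dev/mapper/tempmnt); the temporary pool name "thepool-clone" is illustrative:

```shell
# 1. Hide the original pool's device node so the import scan of
#    /dev/mapper only sees the clone's labels. The already-imported
#    pool keeps its open handle on the underlying dm device.
sudo rm /dev/mapper/root

# 2. Scan only /dev/mapper and import the clone under a new name to
#    avoid clashing with the active pool called "thepool".
sudo zpool import -d /dev/mapper thepool thepool-clone

# 3. Give the clone a fresh GUID so the two copies no longer collide.
sudo zpool reguid thepool-clone

# 4. Recreate the device node that was removed in step 1.
sudo dmsetup mknodes
```

This is untested against a live duplicate-GUID setup; in particular, removing a node under /dev/mapper for an imported pool is the part behlendorf describes, not an officially supported procedure.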

behlendorf avatar Dec 28 '20 20:12 behlendorf

We've run into this before when working to fix zpools on RBD volumes with snapshots. It gets fun when you're importing the same pool several times. We have a script somewhere to do this, but the basic premise is to make a subdirectory for each instance of the pool, symlink the mounted RBD snapshot into that subdirectory, and import the pools directory by directory with reguids (the underlying snapshots are reverted when it's all said and done, so they get their original GUID back). @behlendorf: are there any logical blockers to having a -t-like flag for zpool import which would assign a temporary GUID for the instance? Or is the GUID written into the txgs somehow, which would mess up future operations under the real ID?
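The per-directory premise described above might look roughly like this; the snapshot device paths, directory layout, and pool names are all illustrative assumptions, not the actual script:

```shell
# For each clone of the pool, build a private scan directory containing
# a symlink to exactly one device, so "zpool import -d" never sees two
# devices carrying the same pool/vdev GUIDs at once.
for i in 1 2 3; do
    mkdir -p /run/zfs-dup/$i
    # Hypothetical mapped RBD snapshot device for clone $i.
    ln -s /dev/rbd/mypool/snap$i /run/zfs-dup/$i/vdev

    # Import from this directory only, renaming to avoid a pool-name clash.
    zpool import -d /run/zfs-dup/$i thepool thepool-$i

    # Assign a fresh GUID so the next clone no longer collides with this one.
    zpool reguid thepool-$i
done
```

The reverted snapshots at the end of the workflow are what restore the original GUID, since the reguid is only written to the snapshot's copy of the labels.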

sempervictus avatar Dec 29 '20 15:12 sempervictus

@sempervictus that would probably be possible, as long as some care is taken to keep writing the original GUID to the labels whenever they're updated. However, there's another problem: zpool import relies on the pool GUID in each label to determine which block devices are associated with a given pool. When duplicate devices with the same pool and vdev GUIDs are detected, there's no way to tell which devices belong together. We could definitely generate a better error message and show both pools, but I think only the system owner really knows what belongs with what.

behlendorf avatar Dec 29 '20 17:12 behlendorf

Thanks for the replies, I breathed a small sigh of relief that it's not just me being a dummy and that this could get taken seriously. :)

I will try the workaround in the following days.

I don't know anything about the internals of ZFS, but since it doesn't seem to have been designed to handle duplicate UUIDs (citation needed / the evidence is empirical), my naive view is that the safest thing would be to deny a colliding import but provide a mechanism that lets you reguid an unimported pool. That way the issue could be solved offline?

The first issue that comes to mind is "crossing the streams" in some manner; would writes get mixed up between devices or something?

deliciouslytyped avatar Dec 29 '20 18:12 deliciouslytyped

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Dec 29 '21 23:12 stale[bot]

Bump, I need to restore from a dd backup of a ZFS pool and am running into this issue. I'm unable to import the backup drive because it has the same UUID (I also can't unmount/detach the original pool because it's on a live production server that I don't want to take down).

/dev/sda1: LABEL="ssd-mango" UUID="4444263869676569426" UUID_SUB="13155975866721406338" BLOCK_SIZE="512" TYPE="zfs_member" PARTLABEL="zfs-41dd05c063d4a550" PARTUUID="42ca30f7-a87b-
/dev/sdb1: LABEL="ssd-mango" UUID="4444263869676569426" UUID_SUB="13155975866721406338" BLOCK_SIZE="512" TYPE="zfs_member" PARTLABEL="zfs-41dd05c063d4a550" PARTUUID="42ca30f7-a87b-ce4c-ae85-6692c9ee1cb5"

Is there a workaround to change the UUID of the clone that doesn't involve unmounting the original pool?

pirate avatar Jun 12 '24 01:06 pirate

Mount it on another machine? If you can't remove it physically, iSCSI shenanigans?

deliciouslytyped avatar Jun 14 '24 00:06 deliciouslytyped