
get_Partition_Count may not be all that useful on Solaris and Derivatives using ZFS

Open szaydel opened this issue 2 years ago • 8 comments

Problem

This function looks at the /etc/mnttab file for information about mounted partitions. But on systems where ZFS is the dominant filesystem, which is likely the majority of Solaris, illumos and derivatives, there will commonly be nothing in mnttab referring to individual drives or partitions; instead there will be ZFS mountpoints. A given drive may be in a pool and in use, yet never appear in mnttab.
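
To illustrate (a minimal sketch, not opensea-transport code, and the printed fields are only for demonstration), this is roughly how mnttab is walked on Solaris/illumos with getmntent(3C); for ZFS mounts the mnt_special field is a dataset name like rpool/ROOT/..., not a /dev/dsk path, so nothing there ties back to the underlying drives:

#include <stdio.h>
#include <sys/mnttab.h>

/*
 * Rough sketch (Solaris/illumos): walk /etc/mnttab the way a partition-count
 * check would.  For ZFS mounts, mnt_special holds a dataset name such as
 * "rpool/ROOT/solaris", not a /dev/dsk/... device path, so matching entries
 * against a drive's device node finds nothing even when the drive is an
 * active pool member.
 */
int main(void)
{
    FILE *fp = fopen(MNTTAB, "r"); /* MNTTAB is "/etc/mnttab" */
    struct mnttab mt;

    if (fp == NULL)
        return 1;
    while (getmntent(fp, &mt) == 0)
        printf("%-30s %-25s %s\n", mt.mnt_special, mt.mnt_mountp, mt.mnt_fstype);
    fclose(fp);
    return 0;
}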

Expected behavior

I think, if this functionality is to exist on Solaris and friends, it might be necessary to enhance it with checks for ZFS labels. I am not sure about the best approach here and have not had time to think it through; I just wanted to raise this in case it was not already considered.

How to reproduce

NA

Deployment information

Anything with ZFS, which at this point would include Linux, BSD, Solaris, illumos, etc.

Additional information

No response

szaydel avatar Mar 31 '22 12:03 szaydel

I knew someone would ask about ZFS eventually 😄

The issue this code was added to solve was erasing a drive that was still mounted. On Linux, one of our labs observed that a drive formatted with ext4 and then erased could not be reformatted through GNOME Disks. It would dump an obscure error, and gparted misbehaved as well. I think we had to reboot, run the erase again, then reboot once more to clear it, which is far more complicated than any user should have to go through. One of the cached files that tracks mounted disks was the problem; at least whatever blkid was reading seemed to keep caching the stale information. When the drive was formatted with exFAT or FAT32 the error did not occur, so it seems to affect only some file systems.

While reproducing the issue I found that unmounting the partition before erasing made the error go away, and it was easy to reformat with a new partition again. Checking the mount table file (which varies slightly between Linux, FreeBSD, and Solaris) seemed like the easiest way to catch most possible cases, but I did know ZFS would be missed.
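
Roughly, the mount-table scan amounts to something like this sketch (Linux flavor using glibc's getmntent; illustrative only, not the exact code in opensea-transport, and the library may read a different path than /proc/mounts):

#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch (Linux, glibc): count mount-table entries whose source device
 * starts with the given block device path, e.g. "/dev/sdb" matches
 * "/dev/sdb1" and "/dev/sdb2".  Returns -1 if the mount table cannot
 * be opened.
 */
static int count_mounted_partitions(const char *blockDevice)
{
    int count = 0;
    struct mntent *entry = NULL;
    FILE *mtab = setmntent("/proc/mounts", "r");

    if (mtab == NULL)
        return -1;
    while ((entry = getmntent(mtab)) != NULL)
    {
        if (strncmp(entry->mnt_fsname, blockDevice, strlen(blockDevice)) == 0)
            ++count;
    }
    endmntent(mtab);
    return count;
}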

I do not know whether this same kind of error will show up with ZFS or not. I tested a few other file system types but only really saw the original error with ext4, and at this point I do not recall whether ZFS was one of the ones I tried.

I think it would be valuable to check for ZFS, at least to detect whether a drive has a file system and whether it is the boot device. These checks are currently only used by SeaTools and by some limited checks around erase operations in SeaChest, so missing them should not prevent the tools from running, but a similar error could occur when erasing a ZFS disk. I will have to research the best way to check for ZFS.

vonericsen avatar Mar 31 '22 21:03 vonericsen

I guess one of the assumptions being made is that the drive being operated on is not in use. :) I am not sure there will be any issues like the ones you described, though I suppose we cannot be absolutely sure. Thanks for all the context, it helps. More generally, detecting whether a drive has a filesystem, or is part of one such as a ZFS pool, would be valuable, I think.

szaydel avatar Mar 31 '22 21:03 szaydel

I was thinking a bit more about this, and maybe it is worthwhile making a tiny tweak to what the programs output, so that instead of Partition count for it reads Active partition count for, or uses some other word in place of active. It seems worth clarifying that this is not an actual partition count, because the current wording could mislead you into believing the device has at least one partition on it.

szaydel avatar Apr 01 '22 19:04 szaydel

Some operations should not be run while a partition is mounted, while for others it is fine; it depends a lot on what is being done.

I like the idea of something like "active" or maybe "detected". I will think about the wording a bit to see if there is a better way to describe it to make sure it informs without misleading anyone.

vonericsen avatar Apr 04 '22 17:04 vonericsen

Yeah, terminology here is key. Right now I think the language is quite misleading and not really helpful. Thanks for giving this some thought.

szaydel avatar Apr 04 '22 18:04 szaydel

I have been continuing to think about this issue to figure out a solution. I pushed a change to rename the variable to be slightly more clear. I decided that "active" made a lot of sense to use instead.

For ZFS, I did some reading on how it works, its config files, and the various zfs and zpool commands to get an idea of what we can do. It looks like the file /etc/zfs/zpool.cache could be parsed to check which drives contain a ZFS pool. I will need to do more research before we try this, as well as figure out how to properly parse the file, but I think it could be a solution for systems using ZFS file systems.
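
If my understanding is right that zpool.cache is just a packed nvlist keyed by pool name (still to be verified across platforms), a rough sketch with libnvpair might look like this; none of it is final:

#include <libnvpair.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Sketch: unpack /etc/zfs/zpool.cache (a packed nvlist keyed by pool name)
 * and list the pools it describes.  Walking each pool's "vdev_tree" nvlist
 * for "path" entries would then yield the member devices.  Link with
 * -lnvpair.  The cache file is optional, so its absence does not prove
 * the system has no pools.
 */
int main(void)
{
    FILE *f = fopen("/etc/zfs/zpool.cache", "rb");
    long size = 0;
    char *buf = NULL;
    nvlist_t *pools = NULL;

    if (f == NULL)
        return 1;
    fseek(f, 0, SEEK_END);
    size = ftell(f);
    rewind(f);
    buf = malloc((size_t)size);
    if (buf == NULL || fread(buf, 1, (size_t)size, f) != (size_t)size)
    {
        fclose(f);
        free(buf);
        return 1;
    }
    fclose(f);

    if (nvlist_unpack(buf, (size_t)size, &pools, 0) == 0)
    {
        for (nvpair_t *p = nvlist_next_nvpair(pools, NULL); p != NULL;
             p = nvlist_next_nvpair(pools, p))
        {
            printf("pool in cache: %s\n", nvpair_name(p));
        }
        nvlist_free(pools);
    }
    free(buf);
    return 0;
}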

vonericsen avatar Jun 22 '22 03:06 vonericsen

@vonericsen, it is possible to parse that file, yes, but it is not guaranteed to exist; it is optional, although it is present by default. What might be more sensible is to figure out whether a given drive has a ZFS label, which would look something like this:

root@bsr-595529b8:~# zdb -l /dev/rdsk/c2t1d0s0
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'p01'
    state: 0
    txg: 107675
    pool_guid: 3642064540761792299
    errata: 0
    hostid: 1211643264
    hostname: 'bsr-595529b8'
    top_guid: 12419761296629501954
    guid: 2040682562661304656
    vdev_children: 1
    vdev_tree:
        type: 'mirror'
        id: 0
        guid: 12419761296629501954
        metaslab_array: 68
        metaslab_shift: 29
        ashift: 9
        asize: 10724048896
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 2040682562661304656
            path: '/dev/dsk/c2t1d0s0'
            devid: 'id1,sd@n6000c29ca094599ccc159b2508fcfe08/a'
            phys_path: '/pci@0,0/pci15ad,1976@10/sd@1,0:a'
            whole_disk: 1
            DTL: 885
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 910833938474671540
            path: '/dev/dsk/c2t2d0s0'
            devid: 'id1,sd@n6000c2974567955b1151ff4984e0970b/a'
            phys_path: '/pci@0,0/pci15ad,1976@10/sd@2,0:a'
            whole_disk: 1
            DTL: 884
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1 2 3

From this information, it may then just be a matter of asking whether the given pool is imported. Or maybe it is not even necessary to go that far, and simply having a label is enough: if a label is present, I think it is safe to say the drive is a member of a pool, whether active or offline. At that point, checking whether the pool is imported could perhaps be done by inspecting /etc/mnttab.
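
As a very rough sketch of the kind of check I mean (the helper name is made up, it assumes zdb is on PATH, and since exit-status conventions for zdb differ between ZFS implementations it just scans the output for label fields):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `zdb -l` reports a ZFS vdev label
 * on the given raw device.  Exit-status conventions for zdb vary between
 * ZFS implementations, so this scans the output for label fields instead
 * of relying on the return code.  Assumes zdb is on PATH.
 */
static bool device_has_zfs_label(const char *rawDevicePath)
{
    char cmd[512];
    char line[1024];
    bool found = false;
    FILE *p = NULL;

    snprintf(cmd, sizeof(cmd), "zdb -l %s 2>/dev/null", rawDevicePath);
    p = popen(cmd, "r");
    if (p == NULL)
        return false;
    while (fgets(line, sizeof(line), p) != NULL)
    {
        if (strstr(line, "pool_guid:") != NULL || strstr(line, "txg:") != NULL)
        {
            found = true;
            break;
        }
    }
    pclose(p);
    return found;
}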

szaydel avatar Jun 22 '22 12:06 szaydel

I will look into the label as well to see how that detection could be added! Thanks for this idea too!

vonericsen avatar Jun 22 '22 14:06 vonericsen