
zpool not importable and ASSERT at zdb.c:3908:load_concrete_ms_allocatable_trees()


System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  20.04
Kernel Version        5.4.0-132-generic #148-Ubuntu SMP Mon Oct 17 16:02:06 UTC 2022
Architecture          x86_64
OpenZFS Version       zfs-0.8.3-1ubuntu12.14, zfs-kmod-0.8.3-1ubuntu12.14

Describe the problem you're observing

I have a zpool with raidz1 on three 5 TB Seagate disks that has been running happily on Ubuntu 20.04 for ~2 years and now can't be imported anymore.

SMART output looks unremarkable for all three devices, e.g.:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        58         -
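
(For completeness, the self-tests were started and read back roughly like this; /dev/sda stands in for each of the three disks:)

smartctl -t short /dev/sda     # start a short SMART self-test
smartctl -a /dev/sda           # full report, includes the self-test log shown above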

The troubled pool shows up in

zpool import -d /dev/disk/by-id
   pool: three
     id: 7585041214268964297
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

	three                       ONLINE
	  raidz1-0                  ONLINE
	    wwn-0x5000c500d3e2552b  ONLINE
	    wwn-0x5000c500d3f965c2  ONLINE
	    wwn-0x5000c500d3f92a29  ONLINE

but it cannot be imported:

zpool import three
cannot import 'three': I/O error
	Destroy and re-create the pool from
	a backup source.

The zdb output looks OK to me (I'm no expert here):

zdb  -e three

Configuration for import:
        vdev_children: 1
        version: 5000
        pool_guid: 7585041214268964297
        name: 'three'
        state: 0
        hostid: 2677716145
        hostname: 'kaestchen'
        vdev_tree:
            type: 'root'
            id: 0
            guid: 7585041214268964297
            children[0]:
                type: 'raidz'
                id: 0
                guid: 5612826130591374815
                nparity: 1
                metaslab_array: 69
                metaslab_shift: 34
                ashift: 9
                asize: 15002900103168
                is_log: 0
                create_txg: 4
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 4245064270772150798
                    whole_disk: 1
                    DTL: 373
                    create_txg: 4
                    path: '/dev/sda1'
                    devid: 'ata-ST5000LM000-2AN170_WCJ3V9AZ-part1'
                    phys_path: 'pci-0000:00:1f.2-ata-2'
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 3847351160452222273
                    whole_disk: 1
                    DTL: 391
                    create_txg: 4
                    path: '/dev/sdb1'
                    devid: 'ata-ST5000LM000-2AN170_WCJ3VQCA-part1'
                    phys_path: 'pci-0000:00:1f.2-ata-3'
                children[2]:
                    type: 'disk'
                    id: 2
                    guid: 8390181302815008501
                    whole_disk: 1
                    DTL: 390
                    create_txg: 4
                    path: '/dev/sdc1'
                    devid: 'ata-ST5000LM000-2AN170_WCJ3VV39-part1'
                    phys_path: 'pci-0000:00:1f.2-ata-4'
        load-policy:
            load-request-txg: 18446744073709551615
            load-rewind-policy: 2
zdb: can't open 'three': File exists

ZFS_DBGMSG(zdb) START:
ZFS_DBGMSG(zdb) END

but checking the datasets shows something is off:

zdb -d three
Dataset mos [META], ID 0, cr_txg 4, 174M, 965 objects
failed to own dataset 'three/video': Input/output error          <<<============
Dataset three/www [ZPL], ID 146, cr_txg 505, 967M, 15841 objects
Dataset three/EMUmirror [ZPL], ID 626, cr_txg 52188, 8.71G, 480 objects
Dataset three/home [ZPL], ID 140, cr_txg 179, 231G, 116891 objects
Dataset three/urbackup [ZPL], ID 134, cr_txg 99, 211G, 30484298 objects
Dataset three/mayan [ZPL], ID 962, cr_txg 765918, 345M, 2466 objects
Dataset three/bionic [ZPL], ID 956, cr_txg 124432, 15.6G, 3146 objects
Dataset three/photos [ZPL], ID 14, cr_txg 87, 71.9G, 17318 objects
Dataset three [ZPL], ID 54, cr_txg 1, 106M, 47 objects
MOS object 18 (DSL directory) leaked                                <<<=========
MOS object 19 (DSL props) leaked                                    <<<=========
MOS object 20 (DSL directory child map) leaked                      <<<=========
MOS object 21 (zap) leaked                                          <<<=========
MOS object 22 (DSL dataset snap map) leaked                         <<<=========
MOS object 23 (DSL deadlist map) leaked                             <<<=========
Verified large_blocks feature refcount of 0 is correct
Verified large_dnode feature refcount of 0 is correct
Verified sha512 feature refcount of 0 is correct
Verified skein feature refcount of 0 is correct
Verified edonr feature refcount of 0 is correct
userobj_accounting feature refcount mismatch: 8 datasets != 9 refcount   <<=====
Verified encryption feature refcount of 0 is correct
project_quota feature refcount mismatch: 8 datasets != 9 refcount   <<<=========

I ran a checksum verification, which triggered the assertion from the issue title:

zdb -mc three

Metaslabs:
	vdev          0   
	metaslabs   873   offset                spacemap          free        
	---------------   -------------------   ---------------   ------------
	metaslab      0   offset            0   spacemap     78   free    9.25G
space map object 78:
  smp_length = 0x49a68
  smp_alloc = 0x1afc81800
	metaslab      1   offset    400000000   spacemap     77   free    14.5G
space map object 77:
  smp_length = 0x1082f8
  smp_alloc = 0x5e023c00
	metaslab      2   offset    800000000   spacemap     76   free    10.6G
[...]
space map object 952:
  smp_length = 0x53b0
  smp_alloc = 0xe000
	metaslab    872   offset  da000000000   spacemap    951   free    16.0G
space map object 951:
  smp_length = 0x530
  smp_alloc = 0x20800

Traversing all blocks to verify metadata checksums and verify nothing leaked ...

loading concrete vdev 0, metaslab 172 of 873 ...space_map_load(msp->ms_sm, msp->ms_allocatable, maptype) == 0 (0x5 == 0)
ASSERT at zdb.c:3908:load_concrete_ms_allocatable_trees()

I have also tried, unsuccessfully, to import the pool inside an OpenMediaVault VM that has all three disks passed through, with zfs-0.8.5-pve1 / zfs-kmod-0.8.5-pve1.

zdb history ends with the last successful import:

zdb -h three | tail -4 
...
2022-11-12.19:08:19 zpool import -c /etc/zfs/zpool.cache -aN
2022-11-13.18:12:50 zpool import -c /etc/zfs/zpool.cache -aN
2022-11-16.20:11:03 zpool import -c /etc/zfs/zpool.cache -aN

=> Any ideas on how to recover the pool?

Yours, Steffen

sneumann, Nov 17 '22, 22:11

Forgot to include some dmesg snippets:

...
[  246.746417] WARNING: can't open objset 21, error 5
[  348.241978] WARNING: can't open objset 21, error 5
[  533.579410] WARNING: can't open objset 21, error 5

I tried to find the code with the assertion in the 0.8.3 tag, but couldn't pin down the exact line. The only thing nearby that I see is

https://github.com/openzfs/zfs/blob/9bb3d57b03e6916a2d38574420a2934b8827b3fb/cmd/zdb/zdb.c#L3883
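
For reference, grepping a source checkout for the expression printed in the assert should find the spot (just a sketch; the tag is an example, and the line numbers in my Ubuntu-built binary will differ from upstream):

git clone https://github.com/openzfs/zfs.git
cd zfs && git checkout zfs-0.8.3
grep -n "space_map_load(msp->ms_sm" cmd/zdb/zdb.c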

This assertion is also mentioned in https://github.com/openzfs/zfs/issues/7672, but I don't recall creating any snapshots, and I am sure dedup is not turned on. I also did not send the pool, and I have not mixed kernel and userland versions, since I am on distribution packages.

In https://github.com/openzfs/zfs/issues/8192 the same assertion is triggered in zdb. I also don't have any encryption. My pool was IIRC created under 0.8.x, but I haven't knowingly touched checkpoints. Should I try? If so, what exactly?
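
(From the zpool-import man page I gather a checkpoint rewind would be something like the line below, though since I never ran zpool checkpoint it presumably does nothing here:)

zpool import --rewind-to-checkpoint three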

I could try to import the pool after booting into Ubuntu 22.04, but I don't know whether that might make things worse.

Yours, Steffen

sneumann, Nov 18 '22, 09:11

Another update: the pool also will not import when booting Ubuntu 22.04, with the same error symptoms. In case it helps anyone, I also dumped the labels with zdb, and all the uberblocks are there.

zdb -ul -e three
------------------------------------
LABEL 0
------------------------------------
    version: 5000
    name: 'three'
    state: 0
    txg: 940018
    pool_guid: 7585041214268964297
    errata: 0
    hostid: 2677716145
    hostname: 'kaestchen'
    top_guid: 5612826130591374815
    guid: 8390181302815008501
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 5612826130591374815
        nparity: 1
        metaslab_array: 69
        metaslab_shift: 34
        ashift: 9
        asize: 15002900103168
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 4245064270772150798
            path: '/dev/sda1'
            whole_disk: 1
            DTL: 373
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 3847351160452222273
            path: '/dev/sdb1'
            whole_disk: 1
            DTL: 391
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 8390181302815008501
            path: '/dev/sdc1'
            whole_disk: 1
            DTL: 390
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
    labels = 0 1 2 3 
    Uberblock[0]
	magic = 0000000000bab10c
	version = 5000
	txg = 939904
	guid_sum = 11233720005190169068
	timestamp = 1668637792 UTC = Wed Nov 16 23:29:52 2022
	mmp_magic = 00000000a11cea11
	mmp_delay = 0
	mmp_valid = 0
	checkpoint_txg = 0
        labels = 0 1 2 3 
    Uberblock[1]
	magic = 0000000000bab10c
	version = 5000
	txg = 939905
	guid_sum = 11233720005190169068
	timestamp = 1668637457 UTC = Wed Nov 16 23:24:17 2022
	mmp_magic = 00000000a11cea11
	mmp_delay = 0
	mmp_valid = 0
	checkpoint_txg = 0
        labels = 0 1 2 3 
...
    Uberblock[126]
	magic = 0000000000bab10c
	version = 5000
	txg = 939902
	guid_sum = 11233720005190169068
	timestamp = 1668637449 UTC = Wed Nov 16 23:24:09 2022
	mmp_magic = 00000000a11cea11
	mmp_delay = 0
	mmp_valid = 0
	checkpoint_txg = 0
        labels = 0 1 2 3 
    Uberblock[127]
	magic = 0000000000bab10c
	version = 5000
	txg = 939903
	guid_sum = 11233720005190169068
	timestamp = 1668637450 UTC = Wed Nov 16 23:24:10 2022
	mmp_magic = 00000000a11cea11
	mmp_delay = 0
	mmp_valid = 0
	checkpoint_txg = 0
        labels = 0 1 2 3 

Yours, Steffen

sneumann, Nov 21 '22, 08:11

Hi, with some help I got further; here's what happened. These two completed without error:

zdb -m -e -U /dev/null three
zdb -mm -e -U /dev/null three

while this one failed:

zdb -mmm -e -U /dev/null three
[...]
space map object 211:
  smp_length = 0x9be0
  smp_alloc = 0xf565cc00
	metaslab    172   offset  2b000000000   spacemap    286   free    11.2G
metaslab_load(msp) == 0 (0x5 == 0)
ASSERT at zdb.c:1656:dump_metaslab()
Aborted (core dumped)

Importing the pool without mounting any filesystems also failed:

zpool import -f  -o readonly=on -N three
cannot import 'three': I/O error

Checking /proc/spl/kstat/zfs/dbgmsg after the failed import showed

label discarded as txg is too large (940018 > 940017)

so I got the suggestion to pass -T 940017 to pick an earlier transaction group to start from:

echo 0 > /sys/module/zfs/parameters/spa_load_verify_data 
echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata 
zpool import -f -o readonly=on -T 940017 -N three

The spa_load_verify_* settings are needed to avoid the very slow (a few MB/s) verification of all the data.

The next step will be to recover the data from the individual filesystems.
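
Rough plan for that, assuming the read-only import above holds and the default mountpoints are used (the target /backup is a placeholder):

zfs list -r three                                  # see which datasets are still readable
zfs mount three/home                               # mount the datasets that matter
rsync -aHAX --progress /three/home/ /backup/home/  # copy the data off, dataset by dataset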

sneumann, Dec 02 '22, 20:12