Unable to mount root filesystem located on an encrypted pool after upgrade to zfs 2.2.2 -> "blkptr at <ADDRESS> has invalid TYPE 95"

Open EvTheFuture opened this issue 2 years ago • 5 comments

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Alpine Linux |
| Distribution Version | 3.19 |
| Kernel Version | 6.6.5 |
| Architecture | x86_64 |
| OpenZFS Version | 2.2.2 |

Describe the problem you're observing

After upgrading from Alpine Linux 3.18 to 3.19, a PANIC occurs during the boot process with the following message: 22.1279031 PANIC: rpool: blkptr at ffffb35f54c13c00 has invalid TYPE 95 (see attached image).

After reverting to Alpine 3.18, the boot process completes successfully and everything on the encrypted pool is accessible.

image

Comparison between Alpine 3.18 and Alpine 3.19:

| Type | Alpine 3.18 | Alpine 3.19 |
| --- | --- | --- |
| ZFS | 2.1.14 | 2.2.2 |
| Kernel | 6.1.66 | 6.6.5 |

Description of the setup: Laptop with one SSD consisting of three partitions (1, 2, and 3)

Partition 1: EFI System Partition

Partition 2: ZFS pool. This partition contains the dataset used for /boot.

Partition 3: Encrypted ZFS pool. This partition contains an encrypted pool (rpool) with /root, /home, /var, etc. as datasets.

Describe how to reproduce the problem

Have an encrypted root filesystem and upgrade from Alpine Linux 3.18 to 3.19.

Include any warning/errors/backtraces from the system logs

Unable to access any logs due to PANIC...

EvTheFuture avatar Dec 12 '23 04:12 EvTheFuture

The headline might be a bit wrong since it seems like it actually mounts the root filesystem but fails shortly after.

EvTheFuture avatar Dec 12 '23 06:12 EvTheFuture

Update:

After building the kernel package (Linux kernel 6.1.69) and the package for the zfs kernel module (from zfs 2.1.14) from Alpine Linux 3.18, the system works as expected: it completes the boot process using the root fs from the encrypted zfs root pool, without the error message PANIC: rpool: blkptr at ffffb35f54c13c00 has invalid TYPE 95.

The following packages are now installed on Alpine Linux 3.19

$ apk list -I | grep -E "zfs|linux-lts-6"
linux-lts-6.1.69-r0 x86_64 {linux-lts} (GPL-2.0-only) [installed]
zfs-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-bash-completion-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-libs-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-lts-6.1.69-r1 x86_64 {zfs-lts} (CDDL-1.0) [installed]
zfs-openrc-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-udev-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]

EvTheFuture avatar Dec 29 '23 14:12 EvTheFuture
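The package list above suggests a 2.2.2 userland running alongside a kernel module built from 2.1.14. One way to confirm such a split on a live system is to compare the two lines printed by `zfs version`. The following is a minimal sketch under that assumption; the function names are hypothetical helpers, and the parsing assumes the usual `zfs-<version>` / `zfs-kmod-<version>` two-line output.

```shell
#!/bin/sh
# Sketch: compare the userland and kernel-module versions reported by `zfs version`.
# extract_version, versions_match and report_versions are hypothetical helper names.

# Print the version number from the line that starts with "<prefix>-<digit>".
# $1 = text in the two-line format of `zfs version`, $2 = "zfs" or "zfs-kmod"
extract_version() {
    printf '%s\n' "$1" | sed -n "s/^$2-\([0-9][0-9.]*\).*/\1/p" | head -n 1
}

# Succeed only if both versions were found and are identical.
versions_match() {
    user=$(extract_version "$1" zfs)
    kmod=$(extract_version "$1" zfs-kmod)
    [ -n "$user" ] && [ "$user" = "$kmod" ]
}

# Print a one-line summary of the comparison.
report_versions() {
    if versions_match "$1"; then
        echo "userland and kernel module agree: $(extract_version "$1" zfs)"
    else
        echo "mismatch: userland $(extract_version "$1" zfs), kernel module $(extract_version "$1" zfs-kmod)"
    fi
}

# On a live system (assumes the zfs tools are installed):
#   report_versions "$(zfs version)"
```

A mismatch is not necessarily an error, but it is worth recording in a report like this one, since it determines which code actually reads the pool.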

To resolve this issue I did the following:

Booted Alpine Linux with ZFS 2.1.14 and ran zpool scrub -w, which resulted in: "No known data errors" (image)

I then booted Alpine Linux with ZFS 2.2.2 from a USB drive and ran zpool scrub -w, which resulted in: "Permanent errors have been detected in the following files..." (image)

I then rebooted into Alpine Linux with ZFS 2.1.14 again and re-ran zpool scrub -w, which again resulted in "No known data errors".

I created a new dataset on the same pool and copied over the data from the dataset that ZFS 2.2.2 reports as having permanent errors.

When the data had been copied successfully, I moved the mount point from the old dataset to the new one and rebooted Alpine Linux with ZFS 2.2.2. This time it booted as expected.

Are there new checks in ZFS 2.2.2 that correctly detected the "permanent error", or might it be a bug?

EvTheFuture avatar Jan 11 '24 03:01 EvTheFuture
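The dataset-copy workaround described above could be sketched roughly as follows. The comment does not say which commands were used, so everything here is an assumption: the dataset names and mount points are hypothetical placeholders, `cp -a` stands in for whatever copy method was actually used, and setting `mountpoint=none` on the old dataset is just one way to take it out of service. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the dataset-copy workaround. All names passed in are hypothetical.
# With DRY_RUN=1 (the default) commands are printed instead of executed.

DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_dataset() {
    old=$1   # dataset reported with permanent errors
    new=$2   # replacement dataset on the same pool
    mnt=$3   # mount point currently used by the old dataset
    tmp=$4   # temporary mount point for the new dataset during the copy

    run zfs create -o mountpoint="$tmp" "$new"   # new dataset, mounted aside
    run cp -a "$mnt/." "$tmp/"                   # copy the data (assumed method)
    run zfs set mountpoint=none "$old"           # retire the old dataset
    run zfs set mountpoint="$mnt" "$new"         # new dataset takes over the path
}

# Example (prints the plan only):
#   migrate_dataset rpool/home rpool/home_new /home /mnt/home_new
```

For a dataset that is in use, such as the root filesystem, the steps would likely have to be run from a rescue environment rather than from the running system.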

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 26 '25 19:04 stale[bot]

This may have the same root cause as #16626. I'm currently testing a patch for that bug. The patch will prevent pools from getting corrupted in the first place, but it won't fix the corruption that you already have. You must rewrite the offending file.

asomers avatar May 22 '25 20:05 asomers
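As an illustration of what "rewrite the offending file" might look like in practice, here is a minimal sketch that copies a file's contents into a new file and renames it over the original. It is not taken from the thread: `rewrite_file` is a hypothetical helper, it assumes the file is still readable on the running system (as it was under ZFS 2.1.14 in this report), and it does not carry over ownership or permissions. `cat` is used so that the data is actually read and written again.

```shell
#!/bin/sh
# Sketch: rewrite a file by copying its contents and renaming the copy into place.
# rewrite_file is a hypothetical helper; ownership and mode are NOT preserved.

rewrite_file() {
    src=$1
    tmp="$src.rewrite.$$"   # temporary name next to the original

    # Read the whole file; if any read fails, clean up and leave the original alone.
    if ! cat "$src" > "$tmp"; then
        rm -f "$tmp"
        echo "could not read $src; leaving it untouched" >&2
        return 1
    fi

    # Replace the original with the freshly written copy.
    mv -f "$tmp" "$src"
}

# Example:
#   rewrite_file /path/to/offending/file
```

Any snapshot taken before the rewrite would still reference the old blocks, so a scrub may keep reporting the error until those snapshots are gone.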