Unable to mount root filesystem located on an encrypted pool after upgrade to zfs 2.2.2 -> "blkptr at <ADDRESS> has invalid TYPE 95"
System information
| Type | Version/Name |
|---|---|
| Distribution Name | Alpine Linux |
| Distribution Version | 3.19 |
| Kernel Version | 6.6.5 |
| Architecture | x86_64 |
| OpenZFS Version | 2.2.2 |
Describe the problem you're observing
After upgrading to Alpine Linux 3.19 (from 3.18), a PANIC occurs during the boot process with the following message:
22.1279031 PANIC: rpool: blkptr at ffffb35f54c13c00 has invalid TYPE 95 (see attached image).
After reverting to Alpine 3.18, the boot process completes successfully and everything on the encrypted pool is accessible.
Comparison between Alpine 3.18 and Alpine 3.19:
| Type | Alpine 3.18 | Alpine 3.19 |
|---|---|---|
| ZFS | 2.1.14 | 2.2.2 |
| Kernel | 6.1.66 | 6.6.5 |
Description of the setup: Laptop with one SSD consisting of three partitions (1, 2, and 3)
Partition 1: EFI System Partition
Partition 2: ZFS pool. This partition contains the dataset used for /boot.
Partition 3: Encrypted ZFS pool. This partition contains an encrypted pool (rpool) with /root, /home, /var, etc. as datasets.
Describe how to reproduce the problem
Have encrypted root and upgrade from Alpine Linux 3.18 to 3.19.
Include any warning/errors/backtraces from the system logs
Unable to access any logs due to PANIC...
The headline might be a bit wrong since it seems like it actually mounts the root filesystem but fails shortly after.
Update:
After building the kernel package (Linux kernel 6.1.69) and the ZFS kernel module package (from ZFS 2.1.14) from Alpine Linux 3.18, the system works as expected: it completes the boot process using the root filesystem from the encrypted ZFS root pool, without the error message PANIC: rpool: blkptr at ffffb35f54c13c00 has invalid TYPE 95.
The following packages are now installed on Alpine Linux 3.19
$ apk list -I | grep -E "zfs|linux-lts-6"
linux-lts-6.1.69-r0 x86_64 {linux-lts} (GPL-2.0-only) [installed]
zfs-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-bash-completion-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-libs-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-lts-6.1.69-r1 x86_64 {zfs-lts} (CDDL-1.0) [installed]
zfs-openrc-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
zfs-udev-2.2.2-r0 x86_64 {zfs} (CDDL-1.0) [installed]
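The list above mixes 2.2.2 userland packages with a kernel module package built from 2.1.14, so it can help to confirm which versions are actually in use at runtime. The following is only a sketch: `check_zfs_versions` is a hypothetical helper, and it assumes `zfs version` prints one `zfs-<version>` line for userland and one `zfs-kmod-<version>` line for the loaded module.

```shell
#!/bin/sh
# Hypothetical helper: reads `zfs version` output on stdin and reports
# whether the userland tools and the loaded kernel module share the same
# upstream version. Assumes the two-line "zfs-..." / "zfs-kmod-..." format.
check_zfs_versions() {
    user=""
    kmod=""
    while IFS= read -r line; do
        case "$line" in
            zfs-kmod-*) kmod="${line#zfs-kmod-}" ;;
            zfs-*)      user="${line#zfs-}" ;;
        esac
    done
    echo "userland=$user kmod=$kmod"
    # Compare only the part before the first '-' (the upstream version),
    # ignoring distribution release suffixes such as "-r0".
    [ "${user%%-*}" = "${kmod%%-*}" ]
}
```

Usage would be along the lines of `zfs version | check_zfs_versions || echo "userland and kernel module differ"`.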
To resolve this issue I did the following:
Booted up Alpine Linux with ZFS 2.1.14 and ran zpool scrub -w which resulted in: "No known data errors"
I then booted Alpine Linux with ZFS 2.2.2 from a USB drive and ran zpool scrub -w which resulted in: "Permanent error have been detected in the following files..."
I then rebooted into Alpine Linux with ZFS 2.1.14 again and re-ran zpool scrub -w which again resulted in "No known data errors"
I created a new dataset on the same pool and copied the data from the dataset ZFS 2.2.2 detects as having permanent errors.
When the data was successfully copied, I changed the mount point from the old dataset to the new one and rebooted Alpine Linux with ZFS 2.2.2, and this time it booted as expected.
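For anyone hitting the same problem, the copy-and-switch steps above could be sketched roughly as follows. This is not the exact sequence of commands I ran: `migrate_dataset`, `run`, `DRY_RUN`, and the dataset and mount point names in the usage note are hypothetical placeholders. The sketch assumes it is run from the ZFS version that can still read the affected dataset (2.1.14 here) and that the dataset is not the one the running system is booted from.

```shell
#!/bin/sh
# Sketch of the workaround: copy the affected dataset into a fresh one,
# then move the mount point over. Dry run by default: commands are only
# printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# migrate_dataset OLD_DATASET NEW_DATASET MOUNTPOINT
migrate_dataset() {
    old="$1"
    new="$2"
    mnt="$3"
    # Create the new dataset on a temporary mount point next to the old one.
    run zfs create -o mountpoint="${mnt}.new" "$new" &&
    # Copy everything, preserving ownership, permissions and timestamps.
    run cp -a "${mnt}/." "${mnt}.new/" &&
    # Detach the old dataset and give its mount point to the new one.
    run zfs set mountpoint=none "$old" &&
    run zfs set mountpoint="$mnt" "$new"
}
```

Calling `migrate_dataset rpool/home rpool/home_new /home` prints the four commands for review; setting `DRY_RUN=0` would execute them. The old dataset is left in place so it can be inspected or destroyed later.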
Are there new checks in ZFS 2.2.2 that correctly detected the "permanent error", or might this be a bug?
This may have the same root cause as #16626. I'm currently testing a patch for that bug. The patch will prevent pools from getting corrupted in the first place, but it won't fix the corruption that you already have. You must rewrite the offending file.
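One way to rewrite a file so that ZFS allocates fresh blocks for it is to copy it and move the copy over the original. A minimal sketch, assuming the file is still readable from the ZFS version you are running; `rewrite_file` is a hypothetical helper name, not a ZFS command:

```shell
#!/bin/sh
# Hypothetical helper: rewrite a file in place by copying it to a
# temporary name in the same directory and renaming the copy over the
# original, so the data ends up in newly written blocks.
rewrite_file() {
    f="$1"
    tmp="${f}.rewrite.$$"
    if cp -p -- "$f" "$tmp"; then
        mv -- "$tmp" "$f"
    else
        # The copy failed (for example on a read error): keep the
        # original and clean up the partial copy.
        rm -f -- "$tmp"
        return 1
    fi
}
```

Note that any existing snapshots will still reference the old blocks, so the error may continue to be reported until those snapshots are destroyed.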