
ZFS panic in ddt_object_remove(): VERIFY(...) failed during pool operation and boot import on FreeBSD 14.1 with dedup=on

Open · davidlinden02 opened this issue 6 months ago · 10 comments

System information

Type Version/Name
Distribution Name FreeBSD
Distribution Version 14.1-RELEASE-p7
Kernel Version 14.1-RELEASE-p7
Architecture amd64
OpenZFS Version zfs-2.2.4-FreeBSD_g256659204, zfs-kmod-2.2.4-FreeBSD_g256659204

Describe the problem you're observing

We are experiencing critical panics on two separate FreeBSD 14.1 systems using OpenZFS 2.2.4. The crashes occur both:

  • During normal runtime
  • And again during boot, likely when the system attempts to auto-import the pool

The panic message on both systems is identical:

panic: VERIFY(ddt_object_remove(ddt, otype, oclass, dde, tx) == 0) failed

All affected systems are using dedup=on. Once the crash occurs, the systems become unbootable without intervention (boot loop due to panic during pool import).

Additionally, similar symptoms are now emerging on other servers with the same configuration.

Describe how to reproduce the problem

The bug is currently only reproducible on the affected systems. General steps:

  1. System crashes with kernel panic in ddt_object_remove()
  2. After reboot, the system crashes again during early boot when importing the pool

How was the pool originally created?
zpool create zroot raidz /dev/ada0p3 /dev/ada1p3 /dev/ada2p3 /dev/ada3p3

Include any warning/errors/backtraces from the system logs

Boot log excerpt:

Starting file system checks:
panic: VERIFY(ddt_object_remove(ddt, otype, oclass, dde, tx) == 0) failed

Backtrace (identical across both servers):

panic: VERIFY(ddt_object_remove(...) == 0) failed
ddt_sync()
dsl_scan_sync()
spa_sync()
txg_sync_thread()
...
Stopped at kdb_enter+0x33: movq $0,0xa20612(%rip)

zpool import -o readonly=on -o failmode=continue -N -f zroot gives read-only access without a panic.

Additional notes

  • Systems affected: 2 (identical issue), early signs on additional hosts
  • All use dedup=on
  • zpool status showed no degraded or failed devices
  • Full logs are unavailable due to the immediate panic, but screenshots are attached

Attachments

4 screenshots showing:

  • Crash during runtime on both servers
  • Crash during boot on both servers

davidlinden02 · Jul 10 '25 16:07

Is there anything common between those two systems/pools? I suppose they are not clones of each other?

amotin · Jul 11 '25 02:07

Is there anything common between those two systems/pools? I suppose they are not clones of each other?

Both systems have the same configuration. They are not clones of each other.

davidlinden02 · Jul 11 '25 11:07

I wonder what error is returned there by ddt_object_remove(). I wish that instead of VERIFY(... == 0) there were VERIFY0(...), so that it would report the error value. If it is some EIO, then it could be some corruption, but that is unlikely on two unrelated systems. If it is ENOENT, then I wonder whether it is possible to have something deduped and then removed in the same TXG. I need to look deeper into the code, but I wonder: do you use block cloning on those systems, in case it may be a factor? Any idea what you might be deleting/overwriting when it happens?
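
For context, a hedged illustration of the difference (the exact panic strings vary between releases; this is a sketch, not verbatim macro output):

/* VERIFY(x == 0) discards the return value in the panic message: */
VERIFY(ddt_object_remove(ddt, otype, oclass, dde, tx) == 0);
/* panics as: VERIFY(ddt_object_remove(...) == 0) failed */

/* VERIFY0 is built on the VERIFY3 machinery and reports the value: */
VERIFY0(ddt_object_remove(ddt, otype, oclass, dde, tx));
/* panics as something like: VERIFY0(...) failed (0 == 97) */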

PS: Looking closer, my guess about creation and deletion in the same TXG might be wrong. If the entry was not read from disk, then it won't have a type different from DDT_TYPES, and so ddt_object_remove() won't be called. But if it was read from disk, then it is there and we should be able to delete it. Odd. I don't see how it could happen outside of a metadata I/O error.
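
For readers following along, here is a rough paraphrase of the failing call site in ddt_sync_entry() (module/zfs/ddt.c in the 2.2.x tree; simplified from memory, not a verbatim quote):

/*
 * otype/oclass describe where the entry currently lives on disk;
 * otype == DDT_TYPES means the entry was never read from disk.
 */
if (otype != DDT_TYPES &&
    (otype != ntype || oclass != nclass || total_refcnt == 0)) {
	/*
	 * The entry moved to a new type/class (or died), so the old
	 * ZAP record must be deleted. Any error here, including a
	 * checksum error while reading the ZAP, trips the VERIFY
	 * and panics the machine.
	 */
	VERIFY(ddt_object_remove(ddt, otype, oclass, dde, tx) == 0);
}

This matches the reasoning above: the remove path is only reached for entries that were loaded from disk, so a plain ENOENT is unlikely.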

amotin · Jul 11 '25 15:07

@robn It seems the DDT code could be more careful about handling ZAP read errors. In both the old and new versions, ddt_lookup() ignores ZAP read errors, which may result in something less recoverable later during table sync. We could probably just disable dedup for new writes and leak space on frees if we can't read the ZAP. Though that does not explain how it can happen on two separate systems.
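
A purely hypothetical sketch of that policy (the names below are invented for illustration and are not OpenZFS API):

#include <errno.h>

typedef enum { DDT_OP_WRITE, DDT_OP_FREE } ddt_op_t;

/*
 * Degrade gracefully when the DDT ZAP cannot be read: new writes
 * fall back to non-deduped allocation, and frees drop the reference
 * without touching the table (leaking the entry's space) instead of
 * tripping a VERIFY later in ddt_sync().
 */
static int
ddt_handle_zap_error(int zap_error, ddt_op_t op)
{
	if (zap_error == 0)
		return (0);		/* table readable: proceed normally */
	if (op == DDT_OP_WRITE)
		return (EAGAIN);	/* caller writes the block undeduped */
	return (0);			/* free path: leak the entry, stay up */
}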

amotin · Jul 11 '25 17:07

@amotin I'm attaching a screenshot of an import on FreeBSD CURRENT + OpenZFS 2.3 (FreeBSD version). We don't use explicit dataset clones (zfs clone), but the system actively uses snapshots. As for "Any idea what you might be deleting or overwriting when it happens?": unfortunately, I don't know; many users have access to the system.

[screenshot attached]

davidlinden02 · Jul 12 '25 18:07

97 means ECKSUM, so it really cannot read some record from the DDT ZAP. We definitely should improve the error handling there somehow, but that does not explain how we got into this situation. If you are still able to import the pool somehow, have you run a scrub on it? Does it report any errors too?
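
For reference, errno 97 on FreeBSD is EINTEGRITY, which OpenZFS uses for its internal ECKSUM on that platform. A sketch of the per-platform mapping (paraphrased from the compatibility headers; exact locations vary between releases):

/* ECKSUM is not a standard errno; OpenZFS maps it per platform. */
#if defined(__FreeBSD__)
#define	ECKSUM	EINTEGRITY	/* 97 on FreeBSD 12 and later */
#else
#define	ECKSUM	EBADE		/* Linux SPL mapping */
#endif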

amotin · Jul 12 '25 20:07

@amotin On FreeBSD 15 the system hits a kernel panic, while on 14.2 it just hangs; it has been stuck for 10 hours now, even though a scrub on this server usually finishes in around 3 hours.

[two screenshots attached]

davidlinden02 · Jul 13 '25 12:07

Scrub does not work on a read-only imported pool. See https://github.com/openzfs/zfs/issues/14481 and https://github.com/openzfs/zfs/issues/17527.

amotin · Jul 14 '25 22:07

The same panic seems to have happened to me a few months ago while I was using FreeBSD 14.1. Recreating the affected dataset helped, and the system is still usable without issues. I'm thinking about what to do if it appears again. Is there a way to mark files as broken (just to show them in zpool status -v) instead of panicking?

[screenshot attached]

avkarenow · Aug 01 '25 17:08

I've started to experience this panic more frequently now (on multiple installations...), so I've been using the following workarounds:

For ddt_object_update == 97: the import gives me a few seconds before the panic, enough to disable dedup (zfs set dedup=off dataset). After dedup is disabled, the import becomes possible.

For ddt_object_remove: removing the VERIFY0() wrapper from VERIFY0(ddt_object_remove(ddt, otype, oclass, ddk, tx)); (i.e., ignoring the return value) allows the import to succeed again.
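
In code terms, the workaround amounts to something like the following at that call site (a sketch, not a recommended fix; logging via zfs_dbgmsg() is my addition; if the ZAP record really is unreadable, the stale DDT entry and its space are silently leaked):

/* Instead of: VERIFY0(ddt_object_remove(ddt, otype, oclass, ddk, tx)); */
int err = ddt_object_remove(ddt, otype, oclass, ddk, tx);
if (err != 0)
	zfs_dbgmsg("ddt_object_remove failed: %d (entry leaked)", err);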

In both cases, a scrub doesn't find any errors and the pool works without issues.

I'm thinking about the best way to handle these cases:

  1. Disable DDT updates on checksum errors?
  2. Force-remove the DDT object even if the checksum is incorrect?

Any ideas would be much appreciated.

avkarenow · Dec 05 '25 16:12