`history_event-zfs-list-cacher.sh` not skipping receives on existing dataset
### System information
| Type | Version/Name |
|---|---|
| Distribution Name | Debian |
| Distribution Version | 12 / Bookworm |
| Kernel Version | 6.1.0-37-amd64 (/6.1.140-1) |
| Architecture | x86_64 |
| OpenZFS Version | 2.2.5 (from backports with a hold) |
### Describe the problem you're observing
When sending and receiving (a lot of) datasets to update snapshots, there are (a lot of) invocations of `history_event-zfs-list-cacher.sh`, leading to (a lot of) `zfs list ...` invocations started by that script. To me it seems these are not needed.

The purpose of the zed script is to rebuild the filesystem cache file, and it already tries hard to minimize the number of times it does something: it only continues on certain event types, and where possible it also filters on the event details (i.e. for `set` or `inherit` it checks whether the affected property is one which is used in the cache file). Furthermore, it rules out any events on snapshots, based on `ZEVENT_HISTORY_DSNAME` containing a `@`, i.e. the snapshot separator.
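The `set`/`inherit` filtering described above can be sketched like this (a simplified, assumed approximation of the zedlet's logic; the property list below is illustrative, not the real one):

```shell
# Simplified sketch (NOT the actual zedlet code) of the event filtering:
# for 'set'/'inherit' events, only rebuild the cache when the changed
# property is one the cache file records. Property names are illustrative.
should_act() {
    name=$1 str=$2
    case "$name" in
        set|inherit)
            case "${str%%=*}" in                         # property name before '='
                canmount|mountpoint|atime|relatime|readonly) return 0 ;;
                *) return 1 ;;                           # irrelevant property: skip
            esac ;;
        *) return 0 ;;                                   # other event types: proceed
    esac
}

should_act set "compression=lz4" || echo "skipped (property not cached)"
should_act set "mountpoint=/mnt" && echo "acted (cached property changed)"
```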
But my observation is that when receiving snapshots into an existing dataset, the event does not contain a snapshot name in `history_dsname`. This leads to the full "cacher" script being executed, including the `zfs list ...` call, which seems to be rather slow / CPU-heavy on my system.
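The snapshot check boils down to a pattern match on the dataset name, and the `%recv` name from such an event sails right past it. A sketch of the `@`-based filtering described above:

```shell
# Sketch of the '@'-based snapshot filter: names containing '@' are
# snapshots and get skipped; anything else triggers a cache rebuild.
is_snapshot() {
    case "$1" in
        *@*) return 0 ;;   # snapshot: zedlet exits early
        *)   return 1 ;;   # filesystem/volume: cache is regenerated
    esac
}

# The receive event's dsname contains no '@', so it is NOT filtered out:
is_snapshot "tank/backup/local/rpool/home/root/%recv" || echo "not skipped"
```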
As an example, this is a "finish receiving" event I'm seeing, which the "cacher" would act on:
```
Jun 10 2025 19:05:51.578452914 sysevent.fs.zfs.history_event
        version = 0x0
        class = "sysevent.fs.zfs.history_event"
        pool = "tank"
        pool_guid = 0x22115a8ac87b45e2
        pool_state = 0x0
        pool_context = 0x0
        history_hostname = "server"
        history_dsname = "tank/backup/local/rpool/home/root/%recv"
        history_internal_str = "snap=pyznap_2025-06-10_18:45:03_frequent"
        history_internal_name = "finish receiving"
        history_dsid = 0xbd9d
        history_txg = 0x16f1886
        history_time = 0x684865ef
        time = 0x684865ef 0x227a7db2
        eid = 0xd61e8
```
My proposal would thus be to stop execution of the "cacher" script when `history_dsname` (possibly combined with `history_internal_str`?) leads to the conclusion that this is a receive into an existing dataset.
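A minimal sketch of such a guard, assuming the `%recv` suffix seen in the event above reliably marks an in-progress receive (an assumption that would need verifying):

```shell
# Hypothetical additional guard (not existing zedlet code): also skip
# events whose dataset name is the temporary '%recv' clone that
# 'zfs receive' uses, in addition to the existing snapshot ('@') check.
skip_event() {
    case "$1" in
        *@*)     return 0 ;;   # snapshot: already skipped today
        */%recv) return 0 ;;   # in-progress receive: proposed new skip
        *)       return 1 ;;   # anything else: rebuild the cache
    esac
}

skip_event "tank/backup/local/rpool/home/root/%recv" && echo "skipped"
```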
### Describe how to reproduce the problem
- Open a process monitor and optionally filter on "zfs list"
- (Side by side) send a snapshot to an existing destination location
The expected result is that the process monitor does not show any executions of `zfs list`, i.e. the regeneration of the cache file is skipped.
### Include any warning/errors/backtraces from the system logs
-
Hmm. At the moment I'm wondering whether my proposal (and with it this issue) is correct. I presume the received snapshot could contain changed properties, which would then require the cache to be regenerated.
The Docker ZFS storage driver also triggers multiple serial invocations of this script when you pull new images or clean images up. I suspect that each layer change queues an event and each event is processed serially. Even if the suggestions in the original post can't be implemented, perhaps there is a way to have this script coalesce events when a flurry of changes happens?
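One possible shape for such coalescing, sketched with `flock` (purely illustrative; the real zedlet would have to integrate this with its existing lock on the output file):

```shell
# Hypothetical debounce: the first event of a burst takes the lock, waits
# a short settle window, then rebuilds once; events arriving while a
# rebuild is pending fail to get the lock and simply exit.
LOCK="${TMPDIR:-/tmp}/zfs-list-cacher.demo.lock"

(
    flock -n 9 || exit 0       # a rebuild is already pending: drop event
    sleep 2                    # settle window to let the burst finish
    echo "rebuild cache once"  # stand-in for the real 'zfs list' rebuild
) 9>"$LOCK"
```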
I concur that Docker with the zfs storage driver can lead to excessive runs of this zedlet. Since the script locks the output file, only one instance can be running `zfs list` at a time, which means it can take a long time to get through all the events.
I ran `docker image prune`, which reclaimed a measly 16 GB of space but generated 860 `sysevent.fs.zfs.history_event` entries over about 1 minute. After that, `zfs list` takes 6-10 seconds to produce its output, so I expect this to go on for 1.5-2.5 hours, overwriting the cache file with identical content 859 times.
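For what it's worth, those numbers roughly check out (simple shell arithmetic using the figures from this comment):

```shell
# 860 events, each rebuild taking 6-10 seconds of 'zfs list' time:
events=860
echo "$(( events * 6 / 60 ))-$(( events * 10 / 60 )) minutes of churn"
# ...which is consistent with the 1.5-2.5 hour estimate above.
```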
I also run sanoid to schedule snapshots, which generates on the order of 30 events once an hour, leading to similar, but shorter, spikes.
This is especially concerning because the machine where all this is happening is my laptop, and `zfs list` keeping one of the CPU cores constantly busy eats up a substantial chunk of the battery charge.