zfs Scrub recent data

Motivation and Context

When a disk goes offline and is later brought back online with zpool clear, the user may also want to trigger a scrub of the recently written data to ensure there is no corruption.

For user convenience, we provide a "recent" scrub option in zpool scrub.

Description

We use scn_max_txg and scn_min_txg, which are already part of the scrub mechanism, to implement this feature.

Since we (developers) don’t want to expose a configuration that specifies "the number of TXGs" (as discussed in #16301 ), we instead use a time-based definition. Users can set the time using one of three units (seconds, hours, or days) for convenience. A TXG database is used to map these time ranges to valid TXG numbers.

Using time instead of TXG introduces an interesting problem: if the pool has been offline for days or weeks, and the user runs zpool clear -s, it may do nothing, since no recent data has been written. To handle this case, we decided to take the last known TXG from the database and use that as the starting point for scrubbing.

How Has This Been Tested?

New tests have been added to the this feature.

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Performance enhancement (non-breaking change which improves efficiency)
[ ] Code cleanup (non-breaking change which makes code smaller or more readable)
[ ] Quality assurance (non-breaking change which makes the code more robust against bugs)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
[ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
[ ] Documentation (a change to man pages or other documentation)

Checklist:

[x] My code follows the OpenZFS code style requirements.
[x] I have updated the documentation accordingly.
[x] I have read the contributing document.
[x] I have added tests to cover my changes.
[ ] I have run the ZFS Test Suite with this change applied.
[x] All commit messages are properly formatted and contain Signed-off-by.

Sep 18 '25 10:09 oshogbo

Last push rebasing to master to get fresh test results.

Oct 20 '25 21:10 robn

I am not sure about this change motivation, and respectively adequate solution choice. If some device got lost, but it was redundant and did not cause pool suspension, then ZFS should already record TXGs when the device was offline and not properly updated to start resilver for that period when zpool clear restores it. If the pool was suspended, then I don't think it should be measured in terms of time, but instead in terms of last one or several(?) TXGs that could possibly be affected by the device cache loss. And I'd think it should not be optional, but done on any pool resume of that kind. Though if some writes were lost and the pool does not have redundancy, I am not sure how much good can this scrub make, other than notify that the pool may be (fatally) corrupted.

Oct 22 '25 17:10 amotin

I am not sure about this change motivation, and respectively adequate solution choice. If some device got lost, but it was redundant and did not cause pool suspension, then ZFS should already record TXGs when the device was offline and not properly updated to start resilver for that period when zpool clear restores it. If the pool was suspended, then I don't think it should be measured in terms of time, but instead in terms of last one or several(?) TXGs that could possibly be affected by the device cache loss. And I'd think it should not be optional, but done on any pool resume of that kind. Though if some writes were lost and the pool does not have redundancy, I am not sure how much good can this scrub make, other than notify that the pool may be (fatally) corrupted.

I think Mav has a point here that maybe for the zpool clear case, this new feature should not be optional. The reason the (maybe should have been a separate PR) zpool scrub -R takes a time, instead of number of TXGs, was requests from Brian to make it use time based instead.

Nov 19 '25 19:11 allanjude