zfs Cache dbuf_hash() calculation

Motivation and Context

We currently compute the same 64-bit hash three times, which consumes 0.8% of CPU time on ARC-eviction-heavy workloads.

This was done by Klara Systems and sponsored by Wasabi Technology, Inc.

Description

We compute the 64-bit hash once and cache it so that it can be reused instead of recalculated.
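
As a rough illustration of the idea (not the actual patch), the sketch below computes the hash once when a buffer is looked up, stores it in the buffer when it is created, and reuses the stored value on removal. All names here (buf_t, buf_hash, buf_find, buf_insert, buf_remove, buf_hold, hash_calls) are hypothetical stand-ins, and the mixing function is an arbitrary placeholder rather than the real dbuf_hash().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	NBUCKETS	64	/* power of two so a mask selects the bucket */

/* Hypothetical stand-in for a dbuf, identified by (object, blkid). */
typedef struct buf {
	uint64_t object;
	uint64_t blkid;
	uint64_t hash;		/* cached when the buffer is inserted */
	struct buf *next;
} buf_t;

static buf_t *table[NBUCKETS];
static unsigned hash_calls;	/* counts how often the hash is computed */

/* Placeholder mixer; an assumption, not the real dbuf_hash(). */
static uint64_t
buf_hash(uint64_t object, uint64_t blkid)
{
	uint64_t h = (object * 0x9E3779B97F4A7C15ULL) ^ blkid;

	hash_calls++;
	h ^= h >> 30;
	h *= 0xBF58476D1CE4E5B9ULL;
	h ^= h >> 27;
	return (h);
}

/* Look up a buffer using a hash the caller has already computed. */
static buf_t *
buf_find(uint64_t hv, uint64_t object, uint64_t blkid)
{
	for (buf_t *b = table[hv & (NBUCKETS - 1)]; b != NULL; b = b->next) {
		if (b->object == object && b->blkid == blkid)
			return (b);
	}
	return (NULL);
}

/* Insert a buffer, remembering the hash so it is never recomputed. */
static void
buf_insert(buf_t *b, uint64_t object, uint64_t blkid, uint64_t hv)
{
	b->object = object;
	b->blkid = blkid;
	b->hash = hv;
	b->next = table[hv & (NBUCKETS - 1)];
	table[hv & (NBUCKETS - 1)] = b;
}

/* Remove a buffer using the cached hash instead of rehashing. */
static void
buf_remove(buf_t *b)
{
	buf_t **pp = &table[b->hash & (NBUCKETS - 1)];

	while (*pp != b) {
		assert(*pp != NULL);	/* b must be on its chain */
		pp = &(*pp)->next;
	}
	*pp = b->next;
	b->next = NULL;
}

/* Find-or-create path: one hash computation serves both steps. */
static buf_t *
buf_hold(buf_t *storage, uint64_t object, uint64_t blkid)
{
	uint64_t hv = buf_hash(object, blkid);
	buf_t *b = buf_find(hv, object, blkid);

	if (b == NULL) {
		buf_insert(storage, object, blkid, hv);
		b = storage;
	}
	return (b);
}
```

In this sketch, a miss followed by a create and a later remove calls buf_hash() once, where a version that rehashes in each step would call it three times.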

How Has This Been Tested?

It has been build-tested and runtime-tested, but that was a couple of months ago on an older version of the code.

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[x] Performance enhancement (non-breaking change which improves efficiency)
[ ] Code cleanup (non-breaking change which makes code smaller or more readable)
[ ] Breaking change (fix or feature that would cause existing functionality to change)
[ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
[ ] Documentation (a change to man pages or other documentation)

Checklist:

[x] My code follows the OpenZFS code style requirements.
[ ] I have updated the documentation accordingly.
[x] I have read the contributing document.
[ ] I have added tests to cover my changes.
[ ] I have run the ZFS Test Suite with this change applied.
[x] All commit messages are properly formatted and contain Signed-off-by.

Dec 02 '22 19:12 ryao

I have a small worry, which is that this may be intentional (or 67% intentional...) to catch memory corruption in-place. But probably not.

Dec 02 '22 19:12 adamdmoss

> I have a small worry, which is that this may be intentional (or 67% intentional...) to catch memory corruption in-place. But probably not.

Recalculating the hash value for dbuf_create after dbuf_find already calculated it and failed to find it would not help to catch corruption in place.

Recalculating it following a bit flip when removing the dbuf would cause us to fail to remove the dbuf.

I do not believe that the recalculation hardens us against bad memory.
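
To illustrate the removal case, here is a minimal sketch with a toy hash table. It is not the dbuf code: buf_t, buf_hash, buf_insert, buf_unlink, buf_remove_recompute and buf_remove_cached are hypothetical names, and the hash is deliberately trivial so the effect of a single bit flip is easy to follow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	NBUCKETS	8	/* power of two so a mask selects the bucket */

/* Hypothetical stand-in for a dbuf, identified by (object, blkid). */
typedef struct buf {
	uint64_t object;
	uint64_t blkid;
	uint64_t hash;		/* hash cached at insert time */
	struct buf *next;
} buf_t;

static buf_t *table[NBUCKETS];

/* Toy hash, chosen only to keep the arithmetic easy to check by hand. */
static uint64_t
buf_hash(uint64_t object, uint64_t blkid)
{
	return (object * 31 + blkid);
}

static void
buf_insert(buf_t *b, uint64_t object, uint64_t blkid)
{
	b->object = object;
	b->blkid = blkid;
	b->hash = buf_hash(object, blkid);
	b->next = table[b->hash & (NBUCKETS - 1)];
	table[b->hash & (NBUCKETS - 1)] = b;
}

/* Unlink b from the chain selected by hv; false if b is not on it. */
static bool
buf_unlink(buf_t *b, uint64_t hv)
{
	buf_t **pp = &table[hv & (NBUCKETS - 1)];

	for (; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == b) {
			*pp = b->next;
			b->next = NULL;
			return (true);
		}
	}
	return (false);
}

/* Rehash from the (possibly corrupted) key fields before removing. */
static bool
buf_remove_recompute(buf_t *b)
{
	return (buf_unlink(b, buf_hash(b->object, b->blkid)));
}

/* Remove using the hash that was stored at insert time. */
static bool
buf_remove_cached(buf_t *b)
{
	return (buf_unlink(b, b->hash));
}
```

With object 1 and blkid 2 the toy hash is 33, which selects bucket 1. Flipping the low bit of blkid gives 34, which selects bucket 2, so buf_remove_recompute() searches the wrong chain and returns false while the buffer stays linked in bucket 1. In other words, the recomputation does not detect the corruption; it only makes the removal fail.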

Dec 02 '22 20:12 ryao

I plan to revise this sometime on Wednesday.

Dec 05 '22 20:12 ryao

Sorry for the delay. I was not feeling that great last week. Anyway, I addressed all of the comments. I just need to hear back from @ahrens on whether naming the pointer values hash_in and hash_out to designate the meaning is alright.

Dec 12 '22 19:12 ryao

> Sorry for the delay. I was not feeling that great last week. Anyway, I addressed all of the comments. I just need to hear back from @ahrens on whether naming the pointer values hash_in and hash_out to designate the meaning is alright.

On second thought, pulling out the hash calculation from dbuf_create() so we can unconditionally pass by value instead of optionally by pointer is better. I just did that and repushed.
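
The difference between the two calling conventions can be sketched as follows. This is only an illustration of the trade-off, not the code in the PR: buf_t, buf_hash, buf_create_ptr and buf_create_val are hypothetical, and only the hash_in parameter name is borrowed from the discussion above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dbuf. */
typedef struct buf {
	uint64_t object;
	uint64_t blkid;
	uint64_t hash;
} buf_t;

/* Toy hash, an assumption for the example only. */
static uint64_t
buf_hash(uint64_t object, uint64_t blkid)
{
	return (object * 31 + blkid);
}

/*
 * Variant A: the hash is optionally passed by pointer. NULL means the
 * callee must compute it, so the callee keeps a branch and a fallback.
 */
static void
buf_create_ptr(buf_t *b, uint64_t object, uint64_t blkid,
    const uint64_t *hash_in)
{
	b->object = object;
	b->blkid = blkid;
	b->hash = (hash_in != NULL) ? *hash_in : buf_hash(object, blkid);
}

/*
 * Variant B: the caller always computes the hash and passes it by
 * value, so the callee has no special case.
 */
static void
buf_create_val(buf_t *b, uint64_t object, uint64_t blkid, uint64_t hv)
{
	b->object = object;
	b->blkid = blkid;
	b->hash = hv;
}
```

Variant B moves the hash calculation into every caller, but in exchange the create function has a single code path and no pointer that may or may not be NULL.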

Dec 12 '22 19:12 ryao

I did another push to fix a cstyle.pl complaint that I had missed. The change is purely a formatting fix.

Dec 13 '22 03:12 ryao