Cache dbuf_hash() calculation
Motivation and Context
We currently compute a 64-bit hash three times, which consumes 0.8% of CPU time on ARC-eviction-heavy workloads.
This was done by Klara Systems and sponsored by Wasabi Technology, Inc.
Description
We cache the 64-bit hash.
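As a rough illustration of the idea (not the actual patch), the sketch below computes a hash once per hold, stores it in the buffer, and reuses the stored value on removal. Every name here (`toy_buf_t`, `toy_hash`, `toy_hold`, `toy_remove`, the `hash` field, the bucket count) is hypothetical and not taken from the OpenZFS source; the hash function is an arbitrary mixer standing in for the real one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the OpenZFS source. */
#define	TOY_BUCKETS	8u	/* power of two, so masking selects a bucket */

typedef struct toy_buf {
	uint64_t key;		/* stands in for the buffer's identity */
	uint64_t hash;		/* cached when the buffer is inserted */
	struct toy_buf *next;
} toy_buf_t;

static toy_buf_t *toy_table[TOY_BUCKETS];
static unsigned toy_hash_calls;	/* counts how often the hash is computed */

/* Arbitrary 64-bit mixer, used only so the sketch has something to cache. */
static uint64_t
toy_hash(uint64_t key)
{
	toy_hash_calls++;
	key ^= key >> 33;
	key *= 0xff51afd7ed558ccdULL;
	key ^= key >> 33;
	return (key);
}

/* Look up key in the bucket selected by an already-computed hash. */
static toy_buf_t *
toy_find(uint64_t key, uint64_t hv)
{
	toy_buf_t *b;

	for (b = toy_table[hv & (TOY_BUCKETS - 1)]; b != NULL; b = b->next) {
		if (b->key == key)
			return (b);
	}
	return (NULL);
}

/* Insert b, recording the hash the caller already paid for. */
static void
toy_insert(toy_buf_t *b, uint64_t key, uint64_t hv)
{
	b->key = key;
	b->hash = hv;
	b->next = toy_table[hv & (TOY_BUCKETS - 1)];
	toy_table[hv & (TOY_BUCKETS - 1)] = b;
}

/* Find-or-create: one hash computation covers both the lookup and the insert. */
static toy_buf_t *
toy_hold(toy_buf_t *storage, uint64_t key)
{
	uint64_t hv = toy_hash(key);
	toy_buf_t *b = toy_find(key, hv);

	if (b == NULL) {
		toy_insert(storage, key, hv);
		b = storage;
	}
	return (b);
}

/* Removal reuses the cached hash instead of hashing the key again. */
static void
toy_remove(toy_buf_t *b)
{
	toy_buf_t **pp = &toy_table[b->hash & (TOY_BUCKETS - 1)];

	while (*pp != b) {
		assert(*pp != NULL);	/* b must be in its bucket */
		pp = &(*pp)->next;
	}
	*pp = b->next;
	b->next = NULL;
}
```

In this toy version, a miss followed by an insert and a later removal costs one hash computation rather than three, at the price of eight extra bytes per buffer.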
How Has This Been Tested?
It was build-tested and runtime-tested a couple of months ago on an older version of the code.
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Performance enhancement (non-breaking change which improves efficiency)
- [ ] Code cleanup (non-breaking change which makes code smaller or more readable)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
- [ ] Documentation (a change to man pages or other documentation)
Checklist:
- [x] My code follows the OpenZFS code style requirements.
- [ ] I have updated the documentation accordingly.
- [x] I have read the contributing document.
- [ ] I have added tests to cover my changes.
- [ ] I have run the ZFS Test Suite with this change applied.
- [x] All commit messages are properly formatted and contain `Signed-off-by`.
I have a small worry, which is that this may be intentional (or 67% intentional...) to catch memory corruption in-place. But probably not.
Recalculating the hash value for dbuf_create after dbuf_find already calculated it and failed to find it would not help to catch corruption in place.
Recalculating it following a bit flip when removing the dbuf would cause us to fail to remove the dbuf.
I do not believe that the recalculation hardens us against bad memory.
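The removal case can be sketched with a toy chained hash table. This is only an illustration under stated assumptions: all names (`toy_node_t`, `toy_hash`, `toy_remove_recompute`, `toy_remove_cached`) are hypothetical, the hash is a simple odd-multiplier stand-in chosen so that flipping the low bit of the key is guaranteed to change the bucket, and a failed removal is reported as `false` rather than whatever the real code would do. It does not model a bit flip landing in the cached hash itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the OpenZFS source. */
#define	TOY_BUCKETS	8u	/* power of two, so masking selects a bucket */

typedef struct toy_node {
	uint64_t key;		/* identity the hash is derived from */
	uint64_t hash;		/* value computed once at insert time */
	struct toy_node *next;
} toy_node_t;

/* Odd multiplier, so the low bit of the result equals the low bit of key. */
static uint64_t
toy_hash(uint64_t key)
{
	return (key * 0x9e3779b97f4a7c15ULL);
}

static void
toy_insert(toy_node_t **table, toy_node_t *n, uint64_t key)
{
	n->key = key;
	n->hash = toy_hash(key);
	n->next = table[n->hash & (TOY_BUCKETS - 1)];
	table[n->hash & (TOY_BUCKETS - 1)] = n;
}

/* Unlink n from the bucket selected by hv; false if n is not in that bucket. */
static bool
toy_unlink(toy_node_t **table, toy_node_t *n, uint64_t hv)
{
	toy_node_t **pp = &table[hv & (TOY_BUCKETS - 1)];

	for (; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			return (true);
		}
	}
	return (false);
}

/* Removal that hashes the (possibly corrupted) key again. */
static bool
toy_remove_recompute(toy_node_t **table, toy_node_t *n)
{
	return (toy_unlink(table, n, toy_hash(n->key)));
}

/* Removal that trusts the hash stored at insert time. */
static bool
toy_remove_cached(toy_node_t **table, toy_node_t *n)
{
	return (toy_unlink(table, n, n->hash));
}
```

With a key whose low bit flips after insertion, the recomputing variant searches the wrong bucket and fails to unlink the node, while the cached variant still finds it. Neither variant detects the corruption, which is the point being made above.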
I plan to revise this some time on Wednesday.
Sorry for the delay. I was not feeling that great last week. Anyway, I addressed all of the comments. I just need to hear back from @ahrens on whether naming the pointer values `hash_in` and `hash_out` to designate the meaning is alright.
On second thought, pulling out the hash calculation from dbuf_create() so we can unconditionally pass by value instead of optionally by pointer is better. I just did that and repushed.
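For readers following along, the two interface shapes being compared look roughly like this. It is a minimal sketch, not the code in this PR: `toy_buf_t`, `toy_hash`, `toy_create_ptr`, and `toy_create_val` are hypothetical names, and only the parameter name `hash_in` is borrowed from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the OpenZFS source. */
typedef struct toy_buf {
	uint64_t key;
	uint64_t hash;
} toy_buf_t;

static uint64_t
toy_hash(uint64_t key)
{
	return (key * 0x9e3779b97f4a7c15ULL);
}

/* Variant A: optional pointer; NULL means "compute the hash in here". */
static void
toy_create_ptr(toy_buf_t *b, uint64_t key, const uint64_t *hash_in)
{
	assert(b != NULL);
	b->key = key;
	b->hash = (hash_in != NULL) ? *hash_in : toy_hash(key);
}

/* Variant B: the caller always computes the hash and passes it by value. */
static void
toy_create_val(toy_buf_t *b, uint64_t key, uint64_t hash)
{
	assert(b != NULL);
	b->key = key;
	b->hash = hash;
}
```

Variant B has no NULL branch and no question about who owns the calculation, at the cost of requiring every caller to have the hash in hand before calling.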
I did another push to fix a cstyle.pl complaint that I had missed. The change is purely a formatting fix.