Fix bugs in hash bucket algorithm implementation in collection classes
Descriptions of the changes in this PR:
Motivation
- Fix a bug in ConcurrentLongLongPairHashMap, ConcurrentLongPairSet and ConcurrentOpenHashMap where the hash bucket's storage index was incorrectly calculated.
- the impact of the bug has been that the implementations have been rather unoptimal because of of frequent collisions in storing the item to the underlying array. Essentially the hash bucket algorithm hasn't been properly used.
- Fix a bug in
signSafeModfunction which is duplicated in all classes. The buggy implementation didn't return correct results for negative input values.
Changes
- rename
buckettobucketIndexto indicate that it's the bucket index in the storage array that needs to be calculated. - calculate bucketIndex by multiplying the hash bucket by the item size (4 or 2).
- fix the
signSafeModfunction which wasn't sign safe at all. Negative input values produced invalid results. Currently the logic is duplicated in all classes.
Additional context
See Pulsar bugs https://github.com/apache/pulsar/issues/21106 and https://github.com/apache/pulsar/issues/21108 in similar classes that are duplicated in the Pulsar code base and the fix PR https://github.com/apache/pulsar/issues/21107.
btw. It might be the case that the previous code is fine. The problem with the old code is that it is cryptic. In the original code, signSafeMod function's name is misleading. In the cases where the item takes 2 or 4 elements, there's a bitwise shift to left to multiply by 2 (<< 1) or 4 (<< 2), directly in the signSafeMod function. It's bad from maintainability perspective to have functions with names that don't match the implementation. I doubt that using bitwise operations to optimize the speed of execution is worthwhile in Java code. It should be left to the compiler to do such optimizations.
If this is really a bug then the PR needs a test
It's not a bug, just bad method naming in the original code. I'll close this.