bookkeeper icon indicating copy to clipboard operation
bookkeeper copied to clipboard

Fix bugs in hash bucket algorithm implementation in collection classes

Open lhotari opened this issue 2 years ago • 1 comments

Descriptions of the changes in this PR:

Motivation

  • Fix a bug in ConcurrentLongLongPairHashMap, ConcurrentLongPairSet and ConcurrentOpenHashMap where the hash bucket's storage index was incorrectly calculated.
    • the impact of the bug has been that the implementations have been rather unoptimal because of of frequent collisions in storing the item to the underlying array. Essentially the hash bucket algorithm hasn't been properly used.
  • Fix a bug in signSafeMod function which is duplicated in all classes. The buggy implementation didn't return correct results for negative input values.

Changes

  • rename bucket to bucketIndex to indicate that it's the bucket index in the storage array that needs to be calculated.
  • calculate bucketIndex by multiplying the hash bucket by the item size (4 or 2).
  • fix the signSafeMod function which wasn't sign safe at all. Negative input values produced invalid results. Currently the logic is duplicated in all classes.

Additional context

See Pulsar bugs https://github.com/apache/pulsar/issues/21106 and https://github.com/apache/pulsar/issues/21108 in similar classes that are duplicated in the Pulsar code base and the fix PR https://github.com/apache/pulsar/issues/21107.

lhotari avatar Sep 01 '23 09:09 lhotari

btw. It might be the case that the previous code is fine. The problem with the old code is that it is cryptic. In the original code, signSafeMod function's name is misleading. In the cases where the item takes 2 or 4 elements, there's a bitwise shift to left to multiply by 2 (<< 1) or 4 (<< 2), directly in the signSafeMod function. It's bad from maintainability perspective to have functions with names that don't match the implementation. I doubt that using bitwise operations to optimize the speed of execution is worthwhile in Java code. It should be left to the compiler to do such optimizations.

lhotari avatar Sep 01 '23 15:09 lhotari

If this is really a bug then the PR needs a test

It's not a bug, just bad method naming in the original code. I'll close this.

lhotari avatar Apr 30 '24 18:04 lhotari