guava
guava copied to clipboard
A suggestion of code improvement for BloomFilter.
API(s)
com.google.common.hash.BloomFilter::optimalNumOfBits(long n, double p)
&
com.google.common.hash.BloomFilter::optimalNumOfHashFunctions(long n, long m)
How do you want it to be improved?
1. use static caculated value of log(2) and squared log(2) :
private static final double LOG_TWO = Math.log(2);
private static final double SQUARED_LOG_TWO = Math.pow(LOG_TWO,2);
2. calculate optimalNumOfBits by static values:
static long optimalNumOfBits(long n, double p) {
if (p == 0) {
p = Double.MIN_VALUE;
}
return (long) (-n * Math.log(p) / SQUARED_LOG_TWO);
}
3. caculate optimalNumOfHashFunction by false positive rate(p) directly and using LOG_TWO :
static int optimalNumOfHashFunctions(double p){
return Math.max(1, (int) Math.round( - Math.log(p) / LOG_TWO));
}
Why do we need it to be improved?
- Slightly reduce the method's execution time.
- Refine the calculation of the optimalNumOfHashFunctions method to be more concise and efficient. The current code may lead people to mistakenly believe that the result is related to the number of samples(n) and the number of bits(m), when in fact it is only related to the false positive rate(p).
Example
as the given codes in “How do you want it to be improved”
Current Behavior
as the source codes in “How do you want it to be improved”
Desired Behavior
as the given codes in com.google.common.hash.BloomFilter::optimalNumOfBits(long n, double p) & com.google.common.hash.BloomFilter::optimalNumOfHashFunctions(long n, long m)
Concrete Use Cases
as "How do you want it to be improved"
Checklist
-
[X] I agree to follow the code of conduct.
-
[X] I have read and understood the contribution guidelines.
-
[X] I have read and understood Guava's philosophy, and I strongly believe that this proposal aligns with it.