Make BloomFilter.bitSize() public

Open MartinHaeusler opened this issue 1 year ago • 2 comments

1. What are you trying to do?

I am using Guava's bloom filters as part of a persistent file format, i.e. the raw byte array of the bloom filter lives somewhere in the file. It would be more efficient to know the length of the byte array produced by the bloom filter in advance (i.e. without actually serializing it).

There is already a method called bitSize in BloomFilter, but unfortunately it is not public. It also does not account for the two bytes written for the strategy and the number of hash functions, nor for the integer that stores the length of the bits.data array.
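As a rough sketch of the arithmetic, assuming the serialized form consists of exactly the pieces listed above (one byte for the strategy, one byte for the number of hash functions, a 4-byte int for the array length, then the array itself as 8-byte longs), the byte length could be derived from the bit size alone. The class and method names here are hypothetical and not part of Guava:

```java
// Hypothetical helper, not part of Guava. It assumes the serialized layout is:
// 1 byte (strategy) + 1 byte (hash function count) + 4 bytes (array length)
// followed by the bit array written as 8-byte longs.
final class SerializedSizeSketch {

    // Fixed-size header under the assumed layout: 1 + 1 + 4 = 6 bytes.
    private static final int HEADER_BYTES = 1 + 1 + Integer.BYTES;

    private SerializedSizeSketch() {}

    // bitSize is the number of bits in the filter's backing array.
    static long serializedSizeInBytes(long bitSize) {
        if (bitSize < 0) {
            throw new IllegalArgumentException("bitSize must be non-negative");
        }
        // Round up to whole 64-bit words.
        long numLongs = (bitSize + Long.SIZE - 1) / Long.SIZE;
        return HEADER_BYTES + numLongs * Long.BYTES;
    }
}
```

For a filter backed by two longs (128 bits), this gives 6 + 16 = 22 bytes. If the layout ever changes, the constant header size above would no longer hold, which is one reason a method on BloomFilter itself would be preferable.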

2. What's the best code you can write to accomplish that without the new feature?

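public int getByteSizeOf(BloomFilter<?> bloomFilter) throws IOException {
    // Serializes the whole filter just to measure its length.
    return serialize(bloomFilter).length;
}

public byte[] serialize(BloomFilter<?> bloomFilter) throws IOException {
    try (var baos = new ByteArrayOutputStream()) {
        bloomFilter.writeTo(baos); // writeTo is declared to throw IOException
        return baos.toByteArray();
    }
}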

The method getByteSizeOf is very inefficient because it actually serializes the bloom filter just to get its size. It would be nice if we could do it without the serialization.
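One partial mitigation, sketched here under the assumption that only the byte count is needed, is to pass writeTo a stream that counts bytes and discards them. This still walks the entire filter, so it does not remove the serialization cost, but it avoids allocating a second copy of the data. ByteCountingSink is a hypothetical name, not a Guava class:

```java
import java.io.OutputStream;
import java.util.Objects;

// Hypothetical helper: an OutputStream that discards everything written to it
// and only remembers how many bytes it has seen.
final class ByteCountingSink extends OutputStream {
    private long count;

    @Override
    public void write(int b) {
        count++; // a single byte was written
    }

    @Override
    public void write(byte[] b, int off, int len) {
        Objects.checkFromIndexSize(off, len, b.length); // requires Java 9+
        count += len;
    }

    long count() {
        return count;
    }
}

// Intended usage (requires Guava on the classpath):
//   ByteCountingSink sink = new ByteCountingSink();
//   bloomFilter.writeTo(sink);
//   long sizeInBytes = sink.count();
```

The cost remains proportional to the filter size, so this is a stopgap rather than a replacement for the requested method.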

3. What would that same code look like if we added your feature?

BloomFilter<?> bloom = ...;
var size = bloom.getSizeInBytes();

(Optional) What would the method signatures for your feature look like?

public class BloomFilter<T> {

    public int getSizeInBytes();

}

Concrete Use Cases

Serialization of the bloom filter as a building block for more complex formats.

Packages

com.google.common.hash

MartinHaeusler avatar Dec 09 '23 20:12 MartinHaeusler

Exposed the method as public for both Android & JRE flavours

ssrijan avatar Aug 31 '24 13:08 ssrijan

Hi @MartinHaeusler, can I work on this issue?

HKMANOJ avatar Nov 17 '24 12:11 HKMANOJ