dense encoding for UHD

Open UnixJunkie opened this issue 5 years ago • 2 comments

Maybe a counting Bloom filter, with k and N parameters.

UnixJunkie, Jun 11 '20 07:06

Done, but it still needs testing to see whether it is useful in practice. E.g.:

molenc_dense --bloom 5,50 -n 3214 -i data/x_std_01.txt > test_bloom.csv
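
For readers unfamiliar with the idea, here is a minimal sketch of how a counting Bloom filter could fold a sparse counted fingerprint into a short dense vector. It is not the molenc_dense implementation. It assumes that `--bloom 5,50` means k = 5 hashed positions per feature and N = 50 output counters; that reading is a guess based on the first comment. The function name, the hashing scheme and the sample fingerprint are all hypothetical.

```python
import hashlib


def counting_bloom_encode(features, k, n):
    """Fold a sparse {feature_index: count} map into n dense counters.

    Hypothetical scheme: each feature is hashed with k different seeds,
    and its count is added to the counter at each resulting position.
    """
    dense = [0] * n
    for feat, count in features.items():
        for seed in range(k):
            # Seeded, deterministic hash of the feature index
            digest = hashlib.blake2b(
                f"{seed}:{feat}".encode(), digest_size=8
            ).digest()
            pos = int.from_bytes(digest, "big") % n
            dense[pos] += count
    return dense


# Made-up sparse fingerprint: feature index -> occurrence count
sparse_fp = {12: 1, 873: 2, 3101: 1}
dense_fp = counting_bloom_encode(sparse_fp, k=5, n=50)
print(dense_fp)
```

In this sketch the output length is always N, whatever the size of the input feature space, and every occurrence of a feature contributes to k counters, so the counters sum to k times the total feature count.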

UnixJunkie, Jun 12 '20 07:06

Investigate Bloom filter, MinHash or LSH for the current millennial FPs.
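
As a point of reference for the MinHash option, here is a minimal sketch of a MinHash signature over the set of feature indices of a fingerprint. It is an illustration under assumptions, not code from this project: the helper names and the seeded-hash scheme are hypothetical, feature counts are ignored (plain MinHash works on sets), and the feature set must be non-empty.

```python
import hashlib


def _seeded_hash(seed, feat):
    """Deterministic 64-bit hash of a feature index under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{feat}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(features, num_hashes):
    """Keep, for each seeded hash function, the minimum hash over the set."""
    return [
        min(_seeded_hash(seed, f) for f in features)
        for seed in range(num_hashes)
    ]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature slots estimates the Jaccard index."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Made-up feature sets sharing 3 of 5 distinct indices
fp_a = {3, 17, 256, 1024}
fp_b = {3, 17, 256, 2048}
sig_a = minhash_signature(fp_a, num_hashes=128)
sig_b = minhash_signature(fp_b, num_hashes=128)
print(estimate_jaccard(sig_a, sig_b))
```

The signature has a fixed length (`num_hashes`), and the estimate gets closer to the true Jaccard index as that length grows.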

UnixJunkie, May 02 '24 02:05

Does not outperform 2048-bit ECFP4 in a regression benchmark.

UnixJunkie, Jul 17 '24 08:07