Jianshu_Zhao

Results 231 comments of Jianshu_Zhao

The most important one for MinHash is #4 mentioned above, which reduces the complexity from O(dk) to O(d+k), where d is the number of nonzero elements in the set and k is the number...

Dear Andrii, It seems the recent breakthrough in densification for one-permutation MinHash has reached the theoretical limit. In addition to the optimal densification method mentioned above,...
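
For readers unfamiliar with the idea, here is a minimal Python sketch of one-permutation MinHash with a simple densification step. It is an illustration under stated assumptions, not the implementation discussed in the thread: the names `oph_sketch` and `jaccard_estimate`, the BLAKE2b-based hash, and the rule for filling empty bins are all hypothetical choices. Each element is hashed once, which gives the O(d) part; the fill loop is a simplification and can need more attempts when the sketch is very sparse.

```python
import hashlib


def _h(*parts):
    """64-bit hash of the given parts (illustrative choice of hash function)."""
    data = ":".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def oph_sketch(items, k, seed=0):
    """One-permutation MinHash: hash each item once, keep the minimum per bin."""
    bins = [None] * k
    for x in items:
        v = _h(seed, x)
        b, r = v % k, v // k  # bin index and rank within the bin
        if bins[b] is None or r < bins[b]:
            bins[b] = r
    if all(v is None for v in bins):
        raise ValueError("cannot sketch an empty set")

    # Densification (assumed rule): each empty bin probes pseudo-random bins,
    # chosen from (bin index, attempt), until it finds a non-empty one to copy.
    dense = list(bins)
    for i in range(k):
        attempt = 0
        while dense[i] is None:
            attempt += 1
            j = _h(seed, "fill", i, attempt) % k
            if bins[j] is not None:
                dense[i] = bins[j]
    return dense


def jaccard_estimate(sketch_a, sketch_b):
    """Fraction of matching positions between two sketches of equal length."""
    if len(sketch_a) != len(sketch_b):
        raise ValueError("sketches must have the same length")
    return sum(x == y for x, y in zip(sketch_a, sketch_b)) / len(sketch_a)
```

Two sets sketched with the same `k` and `seed` can then be compared with `jaccard_estimate(oph_sketch(a, 256), oph_sketch(b, 256))`.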

Hello Aaron, Thanks for the response. I was asking because the ape package in R (https://search.r-project.org/CRAN/refmans/ape/html/cophenetic.phylo.html) is very slow when calculating phylogenetic distance among taxa like 16S amplicon (there are...

Hello All, This is also my concern when using the default kraken2 database. I would be very interested to know how this will affect the final results. And many thanks...

I am also curious about this. It should work with minor changes to the parameters. Jianshu

Hi @LucaCappelletti94, Thanks for letting me know! Did you benchmark against the MLE and improved estimator in Ertl's paper, as implemented in SourMash (https://docs.rs/sourmash/0.15.0/sourmash/sketch/hyperloglog/estimators/index.html)? This is a direct import of...

And also this one, hypertwobits (https://github.com/axiomhq/hypertwobits/tree/main), a very new one; the paper was published just last month. I am a little bit concerned about it since the benchmarks showed very large variation...

I would also be interested in UltraLogLog (https://dl.acm.org/doi/abs/10.14778/3654621.3654632) and ExaLogLog (https://arxiv.org/abs/2402.13726). But no implementation is available in Rust. All of Ertl's implementations are in the hash4j Java library, which I do...

Hello Team, I am also wondering whether a distance such as the (normalized) Hamming distance can be provided between two vectors, in addition to the L2 norm. Thanks, Jianshu
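
To make the request concrete, here is a small Python sketch of a normalized Hamming distance between two equal-length vectors. The function name `normalized_hamming` is hypothetical and does not correspond to any existing API in the library being discussed.

```python
def normalized_hamming(a, b):
    """Fraction of positions at which two equal-length vectors differ (0.0 to 1.0)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if len(a) == 0:
        raise ValueError("vectors must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

For example, `normalized_hamming([1, 0, 1, 1], [1, 1, 1, 0])` differs in two of four positions and returns 0.5.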

I will do that before you figure it out. I was asking because in the FastANI paper, Mash distance could only approximate ANI/FastANI when the sketch size is larger than 10^4...
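
As background for the comparison, the Mash distance is derived from a Jaccard estimate j and k-mer length k as D = -(1/k) ln(2j / (1 + j)), and ANI is then approximated as 1 - D. The Python sketch below only illustrates that formula; the function name `mash_distance` and the default `k=21` are assumptions made for the example, and the sketch size that determines how precise j is does not appear here.

```python
import math


def mash_distance(jaccard, k=21):
    """Mash distance from a Jaccard estimate and k-mer length k."""
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("jaccard must be in [0, 1]")
    if jaccard == 0.0:
        return 1.0  # no shared k-mers: distance is capped at 1 by convention
    return -math.log(2.0 * jaccard / (1.0 + jaccard)) / k


def approx_ani(jaccard, k=21):
    """Approximate ANI as 1 - D, the approximation used when comparing Mash to ANI."""
    return 1.0 - mash_distance(jaccard, k)
```

Because j is estimated from matching sketch entries, small sketches resolve j only coarsely, which is one reason larger sketch sizes matter for closely related genomes.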