Using BinDash-rs for MinHash pre-clustering
Hi @wwood,
I implemented BinDash 1/2 in Rust (https://github.com/jianshu93/bindash-rs). It comes with theoretical guarantees and is about 1000 times faster than Mash and 10-100 times faster than Dashing. The original paper for BinDash 2 is here: https://www.biorxiv.org/content/10.1101/2024.03.13.584875v1.abstract. I have not made it modular yet, but it should work fine if a list of genomes is provided. Let me know if this could be added to galah as another MinHash-like option for pre-clustering.
Best, Jianshu
Sounds nice - does it have a conda package? Might be easiest just to use the command line interface?
How does it compare to skani?
Hi Ben, I think this is for fast initial clustering, which skani cannot do. It is much faster than any ANI calculator because it uses MinHash alone. Combined with FastANI, it gives us genome clustering that is both fast and accurate. Jianshu
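
To make the two-stage idea concrete, here is a minimal Python sketch of the pre-clustering step only. It uses a plain bottom-k MinHash (the style popularized by Mash), which is an assumption for illustration and not the scheme bindash-rs implements. The function names (`sketch`, `jaccard_estimate`, `precluster`), the k-mer size, the sketch size, and the threshold are all hypothetical and are not part of bindash-rs or galah.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (illustrative choice of hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, size=64):
    """Bottom-k sketch: the `size` smallest distinct k-mer hashes, sorted."""
    return sorted(hash64(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(a, b, size=64):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    sa, sb = set(a), set(b)
    # The smallest hashes of the merged sketches approximate a sample
    # of the union of both k-mer sets.
    union = sorted(sa | sb)[:size]
    if not union:
        return 0.0
    shared = sum(1 for h in union if h in sa and h in sb)
    return shared / len(union)


def precluster(genomes, threshold, k=5, size=64):
    """Single-linkage pre-clusters: link genomes whose estimate >= threshold."""
    names = list(genomes)
    sketches = {n: sketch(genomes[n], k, size) for n in names}
    parent = {n: n for n in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard_estimate(sketches[a], sketches[b], size) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())
```

In a pipeline like the one discussed here, each pre-cluster returned by `precluster` would then be passed to the slower, more accurate ANI tool (FastANI or skani), so that exact comparisons only run within pre-clusters rather than across all genome pairs.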