dashing
unique exact matches hll
Hi,...
I have a question about unique exact matches. Can I use ./dashing hll for something other than unique exact matches? I need the total number of matches, not just the unique ones. For assessing sensitivity it is important to check the total number of matches; the unique exact matches alone are not really useful in my experiment.
Thanks!
Cheers Ahmad
Hi Ahmad,
I'm happy to help, but I'm not quite sure exactly what you're looking for.
Are you looking for multiset similarity, where multiple instances of the same k-mer are counted multiple times? You can do this exactly with dashing <dist/sketch> --wj-exact [input files], or approximately, using a count-min sketch, via dashing <cmd> --wj. See the Streaming Weighted Jaccard portion of the usage.
On the other hand, you might be looking for exact k-mer counts/matches; in that case, you can replace the HLL with sorted hash sets via the --use-full-khash-sets option.
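To make the distinction between the two cases concrete, here is a minimal Python sketch, not dashing's implementation. It compares a set-based Jaccard index (each distinct k-mer counted once) with a weighted, multiset Jaccard index (repeated k-mers counted by multiplicity). The function names and the toy sequences are made up for illustration, and the sketch assumes plain forward-strand k-mers with no canonicalization or hashing.

```python
from collections import Counter


def kmers(seq, k):
    """Return every k-mer of seq, in order, including repeats."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def set_jaccard(a, b, k):
    """Jaccard index over distinct k-mers: |A & B| / |A | B|."""
    sa, sb = set(kmers(a, k)), set(kmers(b, k))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def weighted_jaccard(a, b, k):
    """Multiset Jaccard: sum of min counts / sum of max counts."""
    ca, cb = Counter(kmers(a, k)), Counter(kmers(b, k))
    keys = ca.keys() | cb.keys()
    num = sum(min(ca[x], cb[x]) for x in keys)
    den = sum(max(ca[x], cb[x]) for x in keys)
    return num / den if den else 0.0


if __name__ == "__main__":
    a, b = "ACGACGACG", "ACGTTT"  # hypothetical toy sequences
    print(set_jaccard(a, b, 3))       # repeats of ACG in `a` are ignored
    print(weighted_jaccard(a, b, 3))  # repeats of ACG in `a` lower the score
```

With these toy inputs the set-based value is 1/6 and the weighted value is 1/10, because the repeated k-mers in the first sequence enlarge the denominator only in the multiset version.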
Thanks for asking, and I'm happy to help further.
Best,
Daniel