sourmash

searching database for any duplicate genomes

Open • SAMtoBAM opened this issue 3 months ago • 3 comments

Hi there

I have received genomes from numerous sources. Some were previously public and some were not, but I don't know which. So I have a set of genomes and want to see whether any of them are identical to genomes in the larger, complete public set.

First, do you think sourmash would be a suitable and fast option for determining this? Second, are there appropriately stringent k-mer size and scaled settings for building the signature databases?

Thanks a lot

SAMtoBAM · Mar 21 '24 19:03