Mash
Update RefSeq?
Hi, I'm using Mash to detect contamination in de novo genome assemblies, together with other tools that work on the latest release of the RefSeq database. Is it possible to build a sketch file for the genomes in the latest release using a PC with 16 GB of RAM?
If it is, could you share the workflow necessary to do it? If it is not, is someone willing to do the work and share the file?
Any help will be greatly appreciated.
Yes. Is refseq.genomes.k21.s1000.msh the latest version?
No, it is quite old. I would advise creating a new sketch. NCBI RefSeq now has 330,648 genome reference assemblies, while the sketch has 91,282. Sometimes I hit deprecated accession numbers that have been removed from the new metadata file, assembly_file_manifest.txt.
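For anyone who wants to try building a new sketch, here is a rough, untested sketch of one possible workflow in Python. It assumes the genome FASTA files have already been downloaded locally, and it only plans and runs `mash sketch` in batches, then merges the batch sketches with `mash paste`. The `-k 21` and `-s 1000` values are taken from the name of the old file (refseq.genomes.k21.s1000.msh). The batch size, the working directory, the output prefix `refseq.genomes.new`, and the function names are all hypothetical. Batching is just a guess at keeping each run manageable; I have not verified how much memory this needs on a 16 GB machine.

```python
import subprocess
from pathlib import Path


def plan_sketch_commands(fasta_paths, workdir="sketches", batch_size=5000,
                         kmer=21, sketch_size=1000, threads=4,
                         out_prefix="refseq.genomes.new"):
    """Plan batched `mash sketch` runs plus one final `mash paste`.

    Nothing is executed here. Returns (batches, paste_command), where each
    batch is (list_file, member_paths, sketch_command).
    """
    work = Path(workdir)
    paths = [str(p) for p in fasta_paths]
    batches = []
    for start in range(0, len(paths), batch_size):
        index = start // batch_size
        prefix = str(work / f"batch_{index:04d}")
        list_file = Path(prefix + ".txt")
        # -l: the input is a file listing one sequence file per line
        # -o: output prefix (mash appends ".msh")
        command = ["mash", "sketch",
                   "-k", str(kmer), "-s", str(sketch_size),
                   "-p", str(threads),
                   "-l", str(list_file), "-o", prefix]
        batches.append((list_file, paths[start:start + batch_size], command))
    # mash paste <out_prefix> <sketch> [<sketch> ...]
    paste = ["mash", "paste", str(work / out_prefix)]
    paste += [command[-1] + ".msh" for _, _, command in batches]
    return batches, paste


def run_plan(batches, paste):
    """Write each list file, sketch each batch, then merge the sketches."""
    for list_file, members, command in batches:
        list_file.parent.mkdir(parents=True, exist_ok=True)
        list_file.write_text("\n".join(members) + "\n")
        subprocess.run(command, check=True)
    subprocess.run(paste, check=True)
```

To use it, you would collect the FASTA paths (for example with `Path("genomes").rglob("*.fna.gz")`, where `genomes` is a hypothetical download directory), pass them to `plan_sketch_commands`, inspect the planned commands, and then call `run_plan`. The merged sketch can then be used as the reference in `mash dist` or `mash screen`.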