
Own FASTA for makedb with taxonomy

Open Sxxnwei opened this issue 3 years ago • 4 comments

Hi,

I used DIAMOND before by downloading all bacterial and archaeal non-redundant protein sequences from RefSeq for makedb, but with the current version I find that there is no taxonomy output in the BLAST tabular format anymore.

I looked up the online manual here, but the makedb options --taxonmap, --taxonnodes, and --taxonnames are not clear to me.

I would like to know whether I can use my own FASTA with --taxonmap, --taxonnodes, or --taxonnames, or whether these options need specific files.

Thanks, Sean

Sxxnwei avatar Aug 03 '22 11:08 Sxxnwei

You can use your own files, but they need to have the same format as the NCBI files.
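
As an illustration of "the same format", here is a minimal Python sketch that writes a custom accession-to-taxid mapping file. It assumes the two-column, tab-separated layout of NCBI's prot.accession2taxid.FULL file (a header line `accession.version<TAB>taxid`, then one row per sequence). The accessions and the output file name are hypothetical, and the accessions would have to match the sequence IDs in your own FASTA headers.

```python
import gzip
import os
import tempfile


def write_taxonmap(mapping, path):
    """Write a gzip-compressed, tab-separated accession-to-taxid table.

    Assumed layout (mirrors NCBI's prot.accession2taxid.FULL):
    a header line, then "accession.version<TAB>taxid" per sequence.
    """
    with gzip.open(path, "wt", newline="\n") as fh:
        fh.write("accession.version\ttaxid\n")
        for accession, taxid in mapping.items():
            fh.write(f"{accession}\t{int(taxid)}\n")


# Hypothetical accessions; the taxids are NCBI taxonomy IDs
# (562 = Escherichia coli, 1423 = Bacillus subtilis).
example = {"MYPROT_0001.1": 562, "MYPROT_0002.1": 1423}

# Hypothetical output location for the custom mapping file.
out_path = os.path.join(tempfile.gettempdir(), "custom.accession2taxid.gz")
write_taxonmap(example, out_path)
```

The resulting file could then be passed to --taxonmap, while --taxonnodes and --taxonnames would still point at nodes.dmp and names.dmp from the NCBI taxdmp archive.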

bbuchfink avatar Aug 03 '22 11:08 bbuchfink

Thanks for the prompt reply.

What is the format of the NCBI files? I cannot open the FTP links, such as ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.FULL.gz and ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdmp.zip.

thanks, Sean

Sxxnwei avatar Aug 03 '22 12:08 Sxxnwei

These links work fine. You can try using wget on the command line or try to use https:// instead of ftp://.
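
If wget is not at hand, the https:// suggestion can also be scripted. The following is a rough Python sketch that rewrites the two links from this thread to https:// and fetches them with the standard library; the helper names `to_https` and `download_all` are made up for this example, and the mapping file is a large download.

```python
import os
from urllib.parse import urlsplit
from urllib.request import urlretrieve

# The two links quoted earlier in this thread.
NCBI_LINKS = [
    "ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.FULL.gz",
    "ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdmp.zip",
]


def to_https(url):
    """Return the same URL with its scheme replaced by https."""
    return urlsplit(url)._replace(scheme="https").geturl()


def download_all(dest_dir="."):
    """Download every link over https into dest_dir (not called automatically)."""
    for url in NCBI_LINKS:
        filename = os.path.basename(urlsplit(url).path)
        urlretrieve(to_https(url), os.path.join(dest_dir, filename))


# Show the rewritten links without downloading anything.
for link in NCBI_LINKS:
    print(to_https(link))
```

Calling `download_all()` would save both files to the current directory; taxdmp.zip then needs to be unpacked to get nodes.dmp and names.dmp.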

bbuchfink avatar Aug 05 '22 09:08 bbuchfink

I could only download the files via wget, but I successfully got the NCBI taxonomy IDs in the output, using a database made from the RefSeq non-redundant proteins.
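
For anyone following along, the workflow described here could be sketched roughly as below. This is only an outline under assumptions: every file name is a placeholder, and the output columns are one possible choice that includes the `staxids` field. The script only runs DIAMOND if it is found on the PATH.

```python
import shutil
import subprocess


def makedb_cmd(fasta, db, taxonmap, nodes, names):
    """Build a 'diamond makedb' call that attaches the taxonomy files."""
    return ["diamond", "makedb", "--in", fasta, "--db", db,
            "--taxonmap", taxonmap,
            "--taxonnodes", nodes,
            "--taxonnames", names]


def blastp_cmd(db, query, out):
    """Build a 'diamond blastp' call with tabular output including staxids."""
    return ["diamond", "blastp", "--db", db, "--query", query, "--out", out,
            "--outfmt", "6", "qseqid", "sseqid", "pident", "evalue", "staxids"]


def run(cmd):
    """Run the command if the executable is installed, otherwise just print it."""
    if shutil.which(cmd[0]) is None:
        print("diamond not found; would run:", " ".join(cmd))
        return
    subprocess.run(cmd, check=True)


# Placeholder file names; replace them with your own paths.
build = makedb_cmd("refseq_nr.faa", "refseq_nr",
                   "prot.accession2taxid.FULL.gz", "nodes.dmp", "names.dmp")
search = blastp_cmd("refseq_nr", "query.faa", "hits.tsv")
print(" ".join(build))
print(" ".join(search))
```

Passing `build` and then `search` to `run()` would execute the two steps in order.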

thanks again! Sean

Sxxnwei avatar Aug 08 '22 08:08 Sxxnwei