diamond
blast tab output: stitle should not include the sseqid
Currently, the 'stitle' field in the blast tab output also includes the subject id. To match the BLAST output, the subject id should be trimmed from the start of the subject title:
$ blastx -query seq1.fna -db nr -outfmt '6 qseqid sseqid pident stitle'
seq1 gb|KRM34755.1| 90.000 hypothetical protein FC44_GL000800 [Lactobacillus intestinalis DSM 6629]
seq1 ref|WP_057808743.1| 90.000 hypothetical protein [Lactobacillus intestinalis]
seq1 ref|WP_135960259.1| 90.000 hypothetical protein [Lactobacillus intestinalis]
seq1 ref|WP_154881438.1| 90.000 hypothetical protein [Lactobacillaceae bacterium]
$ diamond blastx -q seq1.fna -d nr --outfmt 6 qseqid sseqid pident stitle
seq1 WP_154881438.1 90.0 WP_154881438.1 hypothetical protein [Lactobacillaceae bacterium]
seq1 WP_057808743.1 90.0 WP_057808743.1 hypothetical protein [Lactobacillus intestinalis]
seq1 WP_135960259.1 90.0 WP_135960259.1 hypothetical protein [Lactobacillus intestinalis] >TGY16829.1 hypothetical protein E5351_02205 [Lactobacillus intestinalis]
seq1 KRM34755.1 90.0 KRM34755.1 hypothetical protein FC44_GL000800 [Lactobacillus intestinalis DSM 6629]
Ok, I will provide an option to trim the title accordingly.
Awesome, I have been using diamond for years, and I just recently realized there was an stitle option (I had been using blastdbcmd to get the stitle, and merging into the diamond output). Thanks for the great tool : )
Should the default behavior be to show only the stitle, without requiring a separate flag/argument?
If a user wants both the sseqid (a second time) and the stitle, they would use:
--outfmt 6 qseqid sseqid pident sseqid stitle
Yes, it probably should. But I don't like to break compatibility between versions, which is why adding an option came to mind.
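As a workaround that does not depend on any DIAMOND option, the duplicated id can be stripped in post-processing. The following is only a minimal sketch, not part of DIAMOND: the function name `trim_stitle` and its parameters are hypothetical, and it assumes tab-delimited output with the columns in the order used above (qseqid, sseqid, pident, stitle).

```python
def trim_stitle(line, sseqid_col=1, stitle_col=3):
    """Remove a leading sseqid from the stitle field of one tabular line.

    Assumes tab-delimited fields; the column indices are 0-based and
    default to the layout 'qseqid sseqid pident stitle'.
    """
    fields = line.rstrip("\n").split("\t")
    prefix = fields[sseqid_col] + " "
    # Only trim when the title really starts with the subject id,
    # so lines that are already clean are left untouched.
    if fields[stitle_col].startswith(prefix):
        fields[stitle_col] = fields[stitle_col][len(prefix):]
    return "\t".join(fields)
```

Applied to each line of the output file, this leaves titles that do not start with the sseqid unchanged. It only removes the first id: a title listing several entries joined by " >" (as in the WP_135960259.1 hit above) keeps the ids of the later entries.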