Unable to Find Taxon_ID of the Top-hit Species in Kaiju Output File Despite Being the Highest-Reads Species in Kaiju2table Output

Open SenyuanGu opened this issue 1 year ago • 4 comments

Hi, I ran kaiju and kaiju2table. Why can't I find the taxon_id of the species with the highest number of reads in the Kaiju output file, even though it is the top hit in the kaiju2table output?

SenyuanGu avatar Jun 29 '23 07:06 SenyuanGu

That should normally not be the case. Did you run kaiju2table at rank species? What about the other species?

pmenzel avatar Jun 29 '23 17:06 pmenzel

Yes, I ran kaiju2table at the species level. The other species can be found in the Kaiju output txt file. Could this result be due to the fact that I annotated contigs with Kaiju?

SenyuanGu avatar Jun 30 '23 13:06 SenyuanGu

Which database did you use? Sometimes there is a mismatch between the taxonomy info in the source database and the names.dmp / nodes.dmp files.

You can find out the correct taxon ID by searching for the species name in names.dmp.
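For example, a minimal Python sketch of that lookup could look like the one below. It assumes the standard NCBI names.dmp layout, where fields are separated by a tab, a pipe, and a tab, and each line ends with a tab and a pipe. The function name `find_taxon_ids` and the sample records are made up for illustration and are not part of Kaiju.

```python
import io


def find_taxon_ids(names_file, species_name):
    """Return (taxon_id, name_class) pairs whose name equals species_name.

    names_file is any iterable of lines in names.dmp format (assumed layout:
    tax_id, name_txt, unique name, name class).
    """
    matches = []
    for line in names_file:
        line = line.rstrip("\n")
        # Drop the trailing field terminator before splitting on the separator.
        if line.endswith("\t|"):
            line = line[:-2]
        fields = line.split("\t|\t")
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        tax_id, name_txt, _unique_name, name_class = fields[:4]
        if name_txt == species_name:
            matches.append((int(tax_id), name_class))
    return matches


# Made-up records for illustration only; real IDs come from your own names.dmp.
sample = (
    "111\t|\tExamplea speciosa\t|\t\t|\tscientific name\t|\n"
    "111\t|\tExamplea speciosus\t|\t\t|\tsynonym\t|\n"
    "222\t|\tExamplea speciosa strain X1\t|\t\t|\tscientific name\t|\n"
)
hits = find_taxon_ids(io.StringIO(sample), "Examplea speciosa")
```

On a real database you would pass an open file handle instead, e.g. `with open("names.dmp") as fh: find_taxon_ids(fh, "Fragilariopsis cylindrus")`, and then search for the returned ID in the third column of the Kaiju output file.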

pmenzel avatar Jul 02 '23 20:07 pmenzel

Thank you for your help! It works! The ID and species name are indeed mismatched: the txid for Fragilariopsis cylindrus is 635003, but the ID reported by kaiju2table is 186039.

SenyuanGu avatar Jul 03 '23 02:07 SenyuanGu