Krona chart only shows 100% bacteria

Open ghost opened this issue 7 years ago • 7 comments

Hello all,

I have a blastn file containing 16S blastn results, and almost half of them are classified by Krona as "unassigned", although it is clear they should be assigned. The file contains several different samples, say from "site1", "site2", etc., and I wanted to focus on each sample individually. So I extracted the BLAST output for "site1" only, for example, and now Krona shows just a full circle, "100% bacteria", with no other taxonomic assignment. However, the BLAST results have well-defined taxonomic assignments. I ran the two updates, of accession and taxonomy.

I am not sure I understand Krona's behaviour. What am I missing? Thanks a lot.

ghost avatar May 09 '17 21:05 ghost

Hi @aderzelle, the Krona BLAST classification algorithm is based on LCA (Lowest Common Ancestor), so if your reads have good hits to a diversity of taxa, that would explain the lack of resolution in the output. You can control what constitutes "good" hits with -t, with -t 0 meaning that only exact bit score ties will defer to LCA, which may be appropriate given that 16S hits will be relatively short and similar.

ondovb avatar May 09 '17 22:05 ondovb
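
To make the LCA idea above concrete, here is a minimal Python sketch. It is not Krona's code: the toy taxonomy, the function names, and the reading of the threshold as an absolute bit-score distance from the top hit are all assumptions made for illustration.

```python
# Hypothetical toy taxonomy (child -> parent); names are illustrative only.
PARENT = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Firmicutes": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Bacillus": "Firmicutes",
    "uncultured bacterium": "Bacteria",
}


def lineage(taxon, parent):
    """Return the path from the root down to `taxon`."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]  # root first


def lowest_common_ancestor(taxa, parent):
    """Deepest node shared by the lineages of all `taxa`."""
    paths = [lineage(t, parent) for t in taxa]
    common = None
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        common = level[0]
    return common


def classify(hits, parent, threshold):
    """hits: list of (taxon, bit_score) pairs for one query.

    Assumption: a hit counts as "good" when its bit score is within
    `threshold` of the best score; the query is assigned the LCA of those hits.
    """
    best = max(score for _, score in hits)
    kept = [taxon for taxon, score in hits if best - score <= threshold]
    return lowest_common_ancestor(kept, parent)
```

With hits `[("Escherichia", 2571), ("Salmonella", 2570), ("Bacillus", 2549)]`, a threshold of 0 keeps only the top hit, a threshold of 3 pulls in the second hit and backs off to their shared parent, and a threshold of 30 includes all three and backs off to Bacteria. A top hit that sits directly under Bacteria in the taxonomy stays at Bacteria whatever the threshold.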

Thanks a lot. Setting -t 0 did not help, but in case it helps: the hits in the BLAST output all have e-values of 0.0 and at least 99% identity.

ghost avatar May 09 '17 23:05 ghost

It's actually the bit scores that matter rather than e-values, at least by default in recent versions. I'm guessing there are exact ties. You can see what is tying by adding -r to distribute randomly among the ties, but these results shouldn't be considered as real classifications.

ondovb avatar May 10 '17 00:05 ondovb

Sorry, I should have mentioned that I did try the -r option, with no change. The bit scores are not all strictly identical but are highly similar, and yes, there are exact ties. I guess, then, that Krona doesn't give an answer because there is no clear answer to be given from the BLAST results.

Thanks a lot for your prompt answer and helpful comments.

ghost avatar May 10 '17 00:05 ghost

That makes sense, although it's still odd that -r made no difference...it may be worth it to check out some of the tax IDs that are tied to see if the lowest common ancestor is really Bacteria, or else there could be a database issue. If you'd like to post a few lines of the Blast output here I can look into it.

ondovb avatar May 10 '17 00:05 ondovb

Sure, I could even send the whole file, it's no big deal. I have picked some of the tax IDs at random and they are indeed all bacteria, ranging from "uncultured bacteria"

`LYSAT_2_16S KX509289.1 99.857 1399 1 1 1 1398 1447 49 0.0 2571
LYSAT_2_16S JQ977227.1 99.571 1398 6 0 1 1398 1398 1 0.0 2549
LYSAT_2_16S KR233780.1 99.500 1401 4 3 1 1398 1437 37 0.0 2545
LYSAT_2_16S KX036607.1 99.499 1398 7 0 1 1398 1405 8 0.0 2543
LYSAT_2_16S KT949415.1 99.500 1399 6 1 1 1398 1432 34 0.0 2543
LYSAT_2_16S CP014517.1 99.500 1399 6 1 1 1398 422991 424389 0.0 2543
`

ghost avatar May 10 '17 00:05 ghost

Yes, and in this case, at least, the "uncultured bacteria" hit happens to be the best one, which is why changing the threshold didn't help. I'm not sure what to do about this, other than blasting against a cleaner DB, e.g. RefSeq Genomic maybe. In the future, the classifier could try to ignore these taxonomic oddities. You could also try feeding your BLAST results into a more comprehensive classifier, like MEGAN, and then using ktImportTaxonomy to visualize its output.

ondovb avatar May 10 '17 19:05 ondovb
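
For anyone who wants to check which hits fall inside a given bit-score window before importing into Krona, here is a rough Python sketch using the lines posted above. It assumes the standard 12-column tabular BLAST layout (query ID first, subject accession second, bit score last); the function name and the threshold semantics are illustrative and not taken from Krona.

```python
from collections import defaultdict

# The six hits posted in this thread, assumed to be 12-column tabular BLAST output.
BLAST_LINES = """\
LYSAT_2_16S KX509289.1 99.857 1399 1 1 1 1398 1447 49 0.0 2571
LYSAT_2_16S JQ977227.1 99.571 1398 6 0 1 1398 1398 1 0.0 2549
LYSAT_2_16S KR233780.1 99.500 1401 4 3 1 1398 1437 37 0.0 2545
LYSAT_2_16S KX036607.1 99.499 1398 7 0 1 1398 1405 8 0.0 2543
LYSAT_2_16S KT949415.1 99.500 1399 6 1 1 1398 1432 34 0.0 2543
LYSAT_2_16S CP014517.1 99.500 1399 6 1 1 1398 422991 424389 0.0 2543
"""


def hits_within_threshold(text, threshold=0.0):
    """Group hits by query and keep those within `threshold` of the top bit score."""
    by_query = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, accession, bit_score = fields[0], fields[1], float(fields[11])
        by_query[query].append((accession, bit_score))

    kept = {}
    for query, hits in by_query.items():
        top = max(score for _, score in hits)
        kept[query] = [(acc, s) for acc, s in hits if top - s <= threshold]
    return kept


if __name__ == "__main__":
    for t in (0, 3, 30):
        for query, hits in hits_within_threshold(BLAST_LINES, t).items():
            print(t, query, [acc for acc, _ in hits])
```

For this query the top bit score (2571) belongs to a single hit and the next best is 22 lower, so a small threshold leaves only that one hit, and whatever taxon it carries becomes the classification.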