Peter Menzel comments

Results: 80 comments by Peter Menzel

how to use multiple databases and generate krona.html and bubbles.svg?

Just extract the various tgz files that you downloaded into separate folders. Note that kaiju can only be run with one database at a time for a given input FASTQ file....
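
As a rough sketch of the one-database-at-a-time workflow, the Python snippet below builds one kaiju command line per extracted database folder; it does not run anything. The folder layout (one `nodes.dmp` and exactly one `.fmi` file per folder), the helper name `build_kaiju_commands`, and the output naming scheme are assumptions made for illustration. The `-t`, `-f`, `-i` and `-o` options are the ones that also appear in the kaiju command quoted further down this page.

```python
from pathlib import Path


def build_kaiju_commands(db_dirs, fastq, out_dir="."):
    """Return one kaiju argument list per database folder (hypothetical helper)."""
    commands = []
    for db_dir in map(Path, db_dirs):
        # Assumption: each extracted folder holds exactly one .fmi index file.
        fmi_files = sorted(db_dir.glob("*.fmi"))
        if len(fmi_files) != 1:
            raise ValueError(f"expected exactly one .fmi file in {db_dir}")
        # Assumed naming scheme: <input name>.<database folder name>.out
        out_file = Path(out_dir) / f"{Path(fastq).stem}.{db_dir.name}.out"
        commands.append([
            "kaiju",
            "-t", str(db_dir / "nodes.dmp"),
            "-f", str(fmi_files[0]),
            "-i", str(fastq),
            "-o", str(out_file),
        ])
    return commands
```

Each returned list could then be handed to `subprocess.run`, giving one output file per database that can be summarised separately.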

Taxonomic Annotation from orfs

When you want to make a custom database for kaiju, you need to follow the format described here: https://github.com/bioinformatics-centre/kaiju#custom-database The FASTA headers need to contain NCBI taxonomy IDs that are also...
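
A quick sanity check of the headers before building the index might look like the sketch below. It assumes that each header is either a bare numeric taxon ID or an arbitrary prefix followed by an underscore and the numeric taxon ID (for example `>1_1358`); please verify this against the README section linked above. The function name is hypothetical, and only the header syntax is checked.

```python
import re

# Assumed header layout: ">1358" or ">prefix_1358" (trailing number = NCBI taxon ID).
HEADER_RE = re.compile(r"^>(?:\S*_)?(\d+)\s*$")


def find_bad_headers(fasta_lines):
    """Return (line_number, header) pairs for headers lacking a trailing taxon ID."""
    bad = []
    for number, line in enumerate(fasta_lines, start=1):
        # Only header lines are inspected; sequence lines are ignored.
        if line.startswith(">") and not HEADER_RE.match(line):
            bad.append((number, line.rstrip("\n")))
    return bad
```

Passing an open file handle, as in `find_bad_headers(open("proteins.faa"))`, works because the function only iterates over lines; the file name here is a placeholder.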

770904 killed ./kaiju -z 64 -t nodes.dmp -f kaiju_db_nr.fmi -i -j -o kaiju_1.out

Check that you have enough RAM and CPUs available. See the table in the README for RAM requirements.

Custom db question

1+2: You probably get a lot of false-positive matches. Try lowering the E-value threshold and/or raising the required score in Greedy mode. 3: I don't understand what you mean exactly....

kaiju2table -- Meaning of "cannot be assigned to a (non-viral) X"

> How would I get detailed taxonomy information for reads that have partial classification?

You could try kaiju2krona, as kaiju2table requires setting a rank and everything classified above that...

kaiju2table -- Meaning of "cannot be assigned to a (non-viral) X"

> @pmenzel Can you write me a brief overview of how `kaiju2table` works? So essentially it reads in the results file, counts the frequency of each NCBI Taxa_ID, then maps...

kaiju2table -- Meaning of "cannot be assigned to a (non-viral) X"

@LeeBergstrand I added option `-l` for specifying the ranks shown in the output to `kaiju2krona` in the latest commit.

kaiju2table -- Meaning of "cannot be assigned to a (non-viral) X"

When using this option, there might be lines with identical taxon paths in the output, depending on the chosen ranks. So it might need some post-processing depending on the downstream...
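
One possible post-processing step is to sum the counts of lines that share the same taxon path. The sketch below assumes a text layout with the count in the first column followed by the tab-separated path fields; the function name is hypothetical, and the right merging strategy depends on what the downstream tool expects.

```python
def merge_identical_paths(lines):
    """Sum the counts of lines whose tab-separated taxon path is identical."""
    totals = {}  # path -> summed count; dicts keep first-seen order
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Assumption: first column is the count, the rest is the taxon path.
        count, _, path = line.partition("\t")
        totals[path] = totals.get(path, 0) + int(count)
    return [f"{count}\t{path}" if path else str(count)
            for path, count in totals.items()]
```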

kaiju2table -- Meaning of "cannot be assigned to a (non-viral) X"

That's more or less what kaiju2table and kaiju2krona are doing. Yes, column 3 is the one with the taxon ID. Just be aware that taxon IDs change over time, so...
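
For a quick look at the raw counts, the sketch below tallies column 3 of a kaiju output file. It assumes tab-separated lines whose first column is `C` (classified) or `U` (unclassified) and skips everything that is not classified; the function name is hypothetical. This is only the counting step, not a reimplementation of kaiju2table.

```python
from collections import Counter


def count_taxon_ids(lines):
    """Count how often each taxon ID (column 3) occurs among classified reads."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumption: column 1 is "C" for classified reads, column 3 the taxon ID.
        if len(fields) < 3 or fields[0] != "C":
            continue
        counts[int(fields[2])] += 1
    return counts
```

Calling `count_taxon_ids(open("kaiju.out")).most_common(10)` would list the ten most frequent taxon IDs; the file name is a placeholder.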

validity (QC) of prediction files based on a custom reference database

Check out what your "other NCBI taxa IDs" are. Note how kaiju computes the LCA of taxa with equally good matches.
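
To illustrate how an LCA can produce a taxon ID that is not among the IDs of the matching sequences, the sketch below computes a lowest common ancestor from a child-to-parent map such as the one stored in `nodes.dmp`. It is a generic illustration and not kaiju's actual implementation; the toy taxonomy and its IDs are made up.

```python
def lineage(taxon_id, parent):
    """Return the path from taxon_id up to the root (the root is its own parent)."""
    path = [taxon_id]
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path


def lowest_common_ancestor(taxon_ids, parent):
    """Return the deepest taxon shared by the lineages of all taxon_ids."""
    taxon_ids = list(taxon_ids)
    common = set(lineage(taxon_ids[0], parent))
    for taxon_id in taxon_ids[1:]:
        common &= set(lineage(taxon_id, parent))
    # Walking up from the first taxon, the first shared node is the LCA.
    for node in lineage(taxon_ids[0], parent):
        if node in common:
            return node


# Toy taxonomy (made-up IDs): 1 is the root, 10 a genus, 11 and 12 its species.
parent = {1: 1, 10: 1, 11: 10, 12: 10, 20: 1}
```

With this toy map, two equally good matches to taxa 11 and 12 would be reported as their shared parent 10, and matches to 11 and 20 would end up at the root.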