Peter Menzel
Just extract each of the tgz files that you downloaded into its own folder. Note that kaiju can only be run with one database at a time for a given input FASTQ file....
If you want to make a custom database for kaiju, you need to follow the format described here: https://github.com/bioinformatics-centre/kaiju#custom-database The FASTA headers need to contain NCBI taxonomy IDs, which are also...
Check that you have enough RAM and CPUs available. See the table in the README for RAM requirements.
1+2: You are probably getting a lot of false positive matches. Try reducing the E-value threshold and/or increasing the minimum required score in Greedy mode. 3: I don't understand what you mean exactly....
> How would I get detailed taxonomy information for reads that have partial classification?

You could try kaiju2krona, as kaiju2table requires setting a rank, and everything classified above that...
> @pmenzel Can you write me a brief overview of how `kaiju2table` works? So essentially it reads in the results file, counts the frequency of each NCBI Taxa_ID, then maps...
@LeeBergstrand I added option `-l` for specifying the ranks shown in the output to `kaiju2krona` in the latest commit.
When using this option, there might be lines with identical taxon paths in the output, depending on the chosen ranks. So it might need some post-processing depending on the downstream...
That's more or less what kaiju2table and kaiju2krona are doing. Yes, column 3 is the one with the taxon ID. Just be aware that taxon IDs change over time, so...
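For illustration, here is a minimal Python sketch of the counting step only, not the actual kaiju2table implementation. It assumes the tab-separated kaiju output layout in which column 1 is `C` or `U` (classified or unclassified), column 2 is the read name, and column 3 is the NCBI taxon ID. The function name `count_taxon_ids` and the example reads are made up for this sketch.

```python
from collections import Counter
import io


def count_taxon_ids(lines):
    """Count reads per taxon ID (column 3) in kaiju-style output lines.

    Hypothetical helper for illustration; not part of kaiju.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip empty or malformed lines
        status, taxon_id = fields[0], fields[2]
        if status == "C":
            counts[taxon_id] += 1
        else:
            # reads marked "U" are tallied under one label
            counts["unclassified"] += 1
    return counts


# Made-up example input; in practice pass an open kaiju output file instead.
example = io.StringIO(
    "C\tread1\t562\n"
    "C\tread2\t562\n"
    "U\tread3\t0\n"
    "C\tread4\t1280\n"
)
counts = count_taxon_ids(example)
for taxon_id, n in counts.most_common():
    print(taxon_id, n)
```

Mapping the counted IDs to names or ranks would additionally need the taxonomy dump files matching the database, which is what the kaiju tools handle for you.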
Check out what your "other NCBI taxa IDs" are. Note how kaiju computes the LCA of taxa with equally good matches.
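To see why an LCA can produce taxon IDs at a higher rank than you expected, here is a generic lowest-common-ancestor sketch in Python. It is not kaiju's code: the `parent` dictionary is a heavily simplified toy taxonomy and the function names are hypothetical. A real run would use the parent links from the NCBI taxonomy files instead.

```python
def path_to_root(taxon, parent):
    """Return the list of nodes from taxon up to the root (inclusive)."""
    path = [taxon]
    while parent[taxon] != taxon:  # the root is its own parent in this sketch
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Return the deepest node that lies on every taxon's path to the root."""
    taxa = list(taxa)
    common = set(path_to_root(taxa[0], parent))
    for taxon in taxa[1:]:
        common &= set(path_to_root(taxon, parent))
    # Walk up from the first taxon; the first shared node is the LCA.
    for node in path_to_root(taxa[0], parent):
        if node in common:
            return node


# Toy taxonomy (child -> parent), simplified for illustration only.
parent = {
    "root": "root",
    "Bacteria": "root",
    "Enterobacteriaceae": "Bacteria",
    "Escherichia": "Enterobacteriaceae",
    "Escherichia coli": "Escherichia",
    "Salmonella": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
}

lca = lowest_common_ancestor(["Escherichia coli", "Salmonella enterica"], parent)
print(lca)
```

In this toy case a read matching both species equally well ends up at the family level, so it is counted under neither species.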