kraken2

Custom database with GTDB taxonomy

Open lingrongjin opened this issue 2 months ago • 6 comments

I want to build a custom database from my own genomes (MAGs) using GTDB taxonomy. My understanding of the workflow for creating a custom database is:

1) install a taxonomy (--download-taxonomy)
2) install one of the reference libraries (--download-library)
3) add other genomes with --add-to-library

However, I have a few questions about this workflow:

1. I saw in #884 that a GTDB database was published recently. Is there a way to add my own genomes to the prebuilt GTDB database? Do we need to run --download-taxonomy/--download-library first, or is --add-to-library --db GTDB enough?
2. What is the required format for the sequences of my own genomes? Do the headers need to be formatted as ">sequence16|kraken:taxid|32630 Adapter sequence", as suggested in the manual?
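To make the second question concrete, here is a minimal Python sketch of the header rewriting I have in mind. It only follows the header pattern quoted from the manual; the function names are hypothetical, and the taxid passed in is an assumption (it would presumably have to exist in whatever taxonomy the database uses, which is part of what I am asking).

```python
def tag_fasta_header(header: str, taxid: int) -> str:
    """Insert a kraken:taxid tag after the sequence ID of one FASTA header.

    Hypothetical helper: follows the pattern quoted from the manual,
    ">sequence16|kraken:taxid|32630 Adapter sequence".
    """
    if not header.startswith(">"):
        raise ValueError("not a FASTA header line")
    # Split the sequence ID from the optional free-text description.
    seq_id, _, description = header[1:].strip().partition(" ")
    tagged = f">{seq_id}|kraken:taxid|{taxid}"
    return f"{tagged} {description}" if description else tagged


def tag_fasta_lines(lines, taxid: int):
    """Yield FASTA lines with every header tagged; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        yield tag_fasta_header(line, taxid) if line.startswith(">") else line
```

For example, `tag_fasta_header(">sequence16 Adapter sequence", 32630)` reproduces the header from the manual, and every contig of one MAG would get the same taxid.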

Thanks for your help!

lingrongjin, Dec 19 '24 03:12