
missing species in GTDB 207 gte50comp-lt5cont.nwk tree

Open chloelulu opened this issue 5 months ago • 2 comments

Hello Developer,

I am currently using the database available at http://ftp.tue.mpg.de/ebio/projects/struo2/GTDB_release207/kraken2/ for my analysis with Kraken2, and I've found it quite straightforward to use. According to the classification results, my samples show a significant abundance of Acetatifactor sp003612485 and 1XD42-69 sp003612565. Consequently, I plan to use the phylogenetic tree from http://ftp.tue.mpg.de/ebio/projects/struo2/GTDB_release207/phylogeny/gte50comp-lt5cont.nwk as a reference as well.

However, I noticed that these two species are not present in the tree tips. May I know why they are missing and how I can obtain a complete tree that includes all the species listed in names.dmp?

Here are the commands I used to search for the species:

$ grep -E 'Acetatifactor sp003612485|1XD42-69 sp003612565' taxonomy/names.dmp
406383408	|	1XD42-69 sp003612565	|		|	scientific name	|
611307305	|	Acetatifactor sp003612485	|		|	scientific name	|
$ grep -E 'Acetatifactor sp003612485|1XD42-69 sp003612565' gte50comp-lt5cont.nwk
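
To check every name at once instead of grepping for two species, a minimal Python sketch along these lines could list the scientific names from names.dmp that have no matching tip in the tree. The function names are hypothetical, and the sketch assumes that tips are labeled with the same spelling as the names in names.dmp (with underscores standing in for spaces in unquoted Newick labels). If the tree labels its tips by something else, such as genome accessions, every name would be reported as missing and an extra mapping step would be needed.

```python
import re


def scientific_names(names_dmp_lines):
    """Collect names flagged 'scientific name' from NCBI-style names.dmp lines."""
    names = set()
    for line in names_dmp_lines:
        # Columns are separated by '|' and padded with tabs:
        # tax_id | name | unique name | name class |
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 4 and fields[3] == "scientific name":
            names.add(fields[1])
    return names


# A tip label directly follows '(' or ','. Labels that follow ')' are
# internal-node labels and are deliberately not matched.
TIP = re.compile(r"[(,]\s*('(?:[^']|'')*'|[^(),:;\s][^(),:;]*)")


def tip_labels(newick):
    """Extract tip labels from a Newick string (assumption: simple, comment-free Newick)."""
    labels = set()
    for match in TIP.finditer(newick):
        label = match.group(1).strip()
        if label.startswith("'"):
            # Quoted label: drop the quotes, unescape doubled quotes.
            label = label[1:-1].replace("''", "'")
        else:
            # Unquoted label: underscores conventionally stand for spaces.
            label = label.replace("_", " ")
        labels.add(label)
    return labels


def missing_from_tree(names_dmp_lines, newick):
    """Return the sorted scientific names that do not appear as tip labels."""
    return sorted(scientific_names(names_dmp_lines) - tip_labels(newick))


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the real database.
    toy_names = [
        "1\t|\tSpecies A\t|\t\t|\tscientific name\t|",
        "2\t|\tSpecies B\t|\t\t|\tscientific name\t|",
        "3\t|\tSpecies D\t|\t\t|\tscientific name\t|",
    ]
    toy_tree = "((Species_A:0.1,'Species B':0.2)n1:0.3,Species_C:0.4);"
    print(missing_from_tree(toy_names, toy_tree))
```

On the real files, the same functions could be called with `open("taxonomy/names.dmp")` as the line source and the contents of gte50comp-lt5cont.nwk read into a string. Note that names.dmp also contains higher ranks (genus, family, and so on), which would be listed as missing unless filtered out first.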

Your suggestions are much appreciated!

chloelulu, Sep 03 '24 20:09