RefineM
Inconsistent nomenclature of organisms in the r95 and r80 databases
Dear Parks, Thanks for providing RefineM, an awesome tool for improving MAG quality! When assigning taxonomy by protein search against r95 (r95.faa and r95.taxonomy) and by 16S search against r80 (r80.ssu and r80.taxonomy), we found that the same organism received inconsistent names. It seems like r95 is in the GTDB style while r80 is in the NCBI style? Maybe I should not mix different versions of the protein and 16S databases? This may be a stupid question :). Thanks for your patience! Yuan
Hi. Both files use the GTDB taxonomy, but different versions (R80 vs R95). There have been a number of improvements to the GTDB since R80, including organizing genomes into ANI-based species clusters. This has resulted in a number of reclassifications, which is why the same organism can carry different names in the two files. At the moment, we don't have a systematic way to create updated 16S databases for RefineM, which is why only the older version is available. The R80 16S database should certainly be used with care. I appreciate this is far from an ideal situation, but it reflects the limited resources we have available at the moment.
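In case it helps others who hit the same mismatch, here is a minimal sketch for spotting where a protein-based (R95) call and a 16S-based (R80) call start to diverge. It is not part of RefineM: the function names are hypothetical, it assumes both calls are semicolon-separated GTDB-style strings with the usual `d__` to `s__` rank prefixes, and the two example strings are made up for illustration rather than taken from the r80 or r95 files.

```python
# Rank prefixes used in GTDB-style taxonomy strings, from domain to species.
RANK_PREFIXES = ["d__", "p__", "c__", "o__", "f__", "g__", "s__"]


def parse_taxonomy(tax_string):
    """Split a GTDB-style taxonomy string into a {prefix: name} dict."""
    ranks = {}
    for field in tax_string.split(";"):
        field = field.strip()
        if field[:3] in RANK_PREFIXES:
            ranks[field[:3]] = field[3:]
    return ranks


def first_disagreement(tax_a, tax_b):
    """Return the first rank prefix where two calls differ, or None.

    Comparison stops at the first rank that either call leaves
    unclassified, since there is nothing to compare below that point.
    """
    a, b = parse_taxonomy(tax_a), parse_taxonomy(tax_b)
    for prefix in RANK_PREFIXES:
        name_a, name_b = a.get(prefix, ""), b.get(prefix, "")
        if not name_a or not name_b:
            return None
        if name_a != name_b:
            return prefix
    return None


# Illustrative strings only (hypothetical, not real RefineM output).
protein_call = "d__Bacteria;p__Firmicutes_A;c__Clostridia;o__Oscillospirales"
ssu_call = "d__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales"
print(first_disagreement(protein_call, ssu_call))
```

A disagreement flagged this way does not by itself mean the contig is contaminated; between R80 and R95 it may simply reflect a reclassification, so flagged cases are best checked by hand.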