GTDBTk
Using classify_wf with low completeness genomes
I have a couple hundred single amplified genomes from single-cell assemblies, which have low completeness (<50%). I saw that using classify_wf on low-completeness genomes is not recommended, because “results can be impacted by a lack of marker genes or contamination”. However, my project goal is to study phage-host associations in single cells, so I want to capture as much host diversity as possible. Is there a way to get reliable taxonomic classifications for low-completeness genomes? For example, could I include all of them in the initial classify_wf run and then manually curate the reliable classifications based on the presence/absence of marker genes? (As for contamination, I will only use genomes with <10% contamination.) Thanks for the help!
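To make the curation idea concrete, here is a minimal sketch of the kind of post-filtering I have in mind. It assumes the classify_wf summary table is tab-separated and has `user_genome` and `msa_percent` columns (the latter being the percentage of the marker alignment covered by the genome); the function name and the 30% cutoff are hypothetical placeholders, not recommended values.

```python
import csv
import io


def curate_classifications(tsv_text, min_msa_percent=30.0):
    """Split genomes into (kept, flagged) by marker alignment coverage.

    Assumes columns named `user_genome` and `msa_percent`; the default
    cutoff is an arbitrary placeholder, not a validated threshold.
    """
    kept, flagged = [], []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        try:
            pct = float(row["msa_percent"])
        except (KeyError, TypeError, ValueError):
            # Missing or non-numeric coverage: flag for manual review.
            flagged.append(row["user_genome"])
            continue
        if pct >= min_msa_percent:
            kept.append(row["user_genome"])
        else:
            flagged.append(row["user_genome"])
    return kept, flagged


# Made-up rows for illustration only (not real results).
example = (
    "user_genome\tclassification\tmsa_percent\n"
    "SAG_001\td__Bacteria;p__Proteobacteria\t42.5\n"
    "SAG_002\td__Bacteria\t8.1\n"
    "SAG_003\tUnclassified\tN/A\n"
)
kept, flagged = curate_classifications(example)
print("kept:", kept)
print("flagged:", flagged)
```

Genomes in `flagged` would be the ones I inspect manually (or drop) before using their classifications.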