Using classify_wf with low completeness genomes

Open lingrongjin opened this issue 6 months ago • 1 comment

I have a couple hundred single amplified genomes (SAGs) from single-cell assemblies, which have low completeness (<50%). I saw that it is not recommended to use classify_wf for genomes with low completeness, as “results can be impacted by a lack of marker genes or contamination”. However, the goal of my project is to study phage-host associations in single cells, so I want to capture as much host diversity as possible. Is there a way to get reliable taxonomic classifications for low-completeness genomes? For example, would it be possible to include all of the low-completeness genomes in the initial classify_wf run and then manually curate the reliable classifications based on the presence/absence of marker genes? (As for contamination, I will only use genomes with <10% contamination.) Thanks for the help!

lingrongjin, Aug 12 '24 07:08