charcoal
Results of a full evaluation of GTDB 25k release89 genomes
Per #121 and using #122, I ran a (DNA-focused) contamination evaluation on all 25k genomes in the fastani genome collection from release 89.
Only 302 genomes had any suspected contamination at all. I attach that list as a .csv.txt file.
Note that I turned off LCA-style evaluation here, so the only reason for contig removal is reason 1, gather-based.
The parameters are a bit too stringent, I think, so I'm working on that. But this is a first pass.
Updated! Only ~240 genomes with any cross-kingdom contamination.
So that's, what, about 1% of genomes with some detectable contamination.
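For reference, a quick sketch of the arithmetic behind that ~1% figure. It assumes "25k" means exactly 25,000 genomes (the thread only gives the rounded number), and the helper name `percent_flagged` is made up for illustration; it is not part of charcoal.

```python
def percent_flagged(flagged: int, total: int) -> float:
    """Return the share of flagged genomes as a percentage of the total."""
    return 100.0 * flagged / total


# Assumption: "25k" is taken as exactly 25,000 genomes.
total_genomes = 25_000

# Counts reported in this thread: 302 in the first pass, ~240 after the update.
for label, flagged in [("first pass", 302), ("updated", 240)]:
    print(f"{label}: {percent_flagged(flagged, total_genomes):.2f}%")
```

Under that assumption the first pass works out to about 1.2% and the updated count to just under 1%.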