notes for documentation - bigger databases => better?, impact of lateral gene transfer/phage

Open ctb opened this issue 5 years ago • 2 comments

In theory, as we sequence more and more microbial genomes, charcoal should keep getting better (offset a bit by growing database size and the potential need to dereplicate through species clusters).

It's not clear to me that Reason 2 is a great idea, given the challenges of lateral gene transfer and phage. I guess at the least it will highlight places where people should check their genomes?

ctb, May 23 '20 14:05

Although note that Reasons 2 and 3 look at the majority lineage, so the entire contig has to be questionable. Hmm.

ctb, May 23 '20 14:05

Ah, interesting note about majority lineage. This would (or should) still cause problems with plasmids and with small contigs that are dominated by phage or HGT.

I like the idea of saying "check your genomes." I sort of view the *dirty.fa.gz file as containing either 1) clear contaminants, or 2) contigs that need curation by the user, and clear evidence, before being re-added to the genome.

taylorreiter, May 26 '20 15:05