charcoal
Remove contaminated contigs from genomes using k-mers and taxonomies.
Could be nice if it was organized like: `report/index.html`, `report/index.ipynb`, `report/stage2.html`, `report/stage2.ipynb`, `report/alignments/*.align.html`, `report/alignments/*.align.ipynb`, `report/taxonomy/*.fig.html`, `report/taxonomy/*.fig.ipynb`. That way, with large collections of genomes, the summary reports will be immediately available,...
In text files `stage2/*postprocess.txt`, there are sections that postprocess mashmap alignments, parsing them to determine percent identity for each contig against each contaminant genome. ex: ``` removing 9kb with 5kb...
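To make the idea of "percent identity for each contig against each contaminant genome" concrete, here is a minimal sketch in Python. It is not charcoal's actual postprocessing code. It assumes the classic space-delimited MashMap output (query name, query length, query start, query end, strand, target name, target length, target start, target end, identity); newer MashMap releases may emit a different format. The function name `summarize_identity`, the example records, and the choice of a length-weighted mean are all hypothetical, as is treating the target name as the genome identifier.

```python
from collections import defaultdict


def summarize_identity(lines):
    """Return {(contig, target): length-weighted mean percent identity}.

    Assumes each line has at least 10 whitespace-separated fields in the
    classic MashMap layout; this layout is an assumption, not something
    the charcoal output above confirms.
    """
    weighted = defaultdict(float)  # sum of identity * aligned query span
    aligned = defaultdict(int)     # total aligned query span per pair

    for line in lines:
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or malformed lines

        contig = fields[0]
        q_start, q_end = int(fields[2]), int(fields[3])
        target = fields[5]
        identity = float(fields[9])

        span = q_end - q_start  # approximate aligned length on the contig
        if span <= 0:
            continue

        weighted[(contig, target)] += identity * span
        aligned[(contig, target)] += span

    return {pair: weighted[pair] / aligned[pair] for pair in aligned}


# Made-up records for illustration only.
example = [
    "contigA 10000 0 5000 + genomeX 200000 100 5100 95.0",
    "contigA 10000 5000 10000 + genomeX 200000 7000 12000 85.0",
    "contigB 4000 0 4000 - genomeY 300000 0 4000 99.0",
]

result = summarize_identity(example)
for (contig, target), pid in sorted(result.items()):
    print(f"{contig}\t{target}\t{pid:.1f}")
```

Weighting by aligned span keeps a short, high-identity segment from dominating the summary for a long contig; a plain mean or a maximum would be equally reasonable choices depending on what the report is meant to show.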
https://www.biorxiv.org/content/10.1101/2022.01.24.477489v1
When running charcoal to clean contigs, I noticed: ``` python -m charcoal run inputs/charcoal-conf.yml all_clean_contigs ``` ``` filter rank is none; not doing any cleaning. wrote 2171793 clean bp to...
With the merge of https://github.com/sourmash-bio/sourmash/pull/1543 and https://github.com/sourmash-bio/sourmash/pull/1651, among others, we have a lot of ~duplicated functionality between charcoal and sourmash. We should change charcoal over to using sourmash functions where...
This PR updates snakemake to 6.10.0. There are a few other misc changes:
- limit Python to < 3.10 for now
- add a print statement that makes genome...
https://www.nature.com/articles/ismej2015100
Hello charcoal developers, I am trying to use charcoal to decontaminate an Illumina assembly of a eukaryotic genome. Charcoal needs me to provide **Eukaryotic lineages** (I tried using tara-delmont-provided-lineages.csv with this message...
(this is kind of a cross-repo issue, but I'm putting it here for now because code is evolving faster in charcoal than in sourmash :) it would be valuable to...