Host removal on assembly contigs

Open prototaxites opened this issue 1 year ago • 1 comments

Description of feature

When running the pipeline with host removal, I often find that a reasonable proportion of the resulting bins are still classified by CAT as the host (in my case, Quercus robur), suggesting that some host reads in the FASTQ files are not being removed. This might depend on the overall quality of the host genome assembly provided, as well as on the specific bowtie2 tuning parameters that can optionally be set.

It might be useful to add an optional host removal step post-assembly that finds and removes contigs aligning to the host genome, to catch sequence from host reads that passed the first host filter. The filtered contigs could then be passed to the binning stage as required.
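
To make the idea concrete, here is a rough sketch of what such a filter could look like. It is not existing pipeline code, and it assumes the contigs have already been aligned to the host genome with an aligner that writes PAF (minimap2, for example; the pipeline itself uses bowtie2 for read-level removal). The function names and the `max_host_fraction` threshold of 0.5 are hypothetical choices for illustration only.

```python
from collections import defaultdict


def host_covered_fraction(paf_lines, min_mapq=0):
    """Return {contig: fraction of contig length covered by host alignments}.

    Uses only standard PAF columns: 1 = query name, 2 = query length,
    3 = query start, 4 = query end, 12 = mapping quality.
    """
    intervals = defaultdict(list)
    lengths = {}
    for line in paf_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[0]
        qlen, qstart, qend = int(fields[1]), int(fields[2]), int(fields[3])
        if int(fields[11]) < min_mapq:
            continue
        lengths[name] = qlen
        intervals[name].append((qstart, qend))

    fractions = {}
    for name, spans in intervals.items():
        # Merge overlapping alignments so shared bases are counted once.
        covered = 0
        cur_start, cur_end = None, None
        for start, end in sorted(spans):
            if cur_end is None or start > cur_end:
                if cur_end is not None:
                    covered += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        covered += cur_end - cur_start
        fractions[name] = covered / lengths[name]
    return fractions


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(seq)
            name, seq = line[1:].split()[0], []
        elif line:
            seq.append(line)
    if name is not None:
        yield name, "".join(seq)


def remove_host_contigs(fasta_lines, paf_lines, max_host_fraction=0.5):
    """Keep contigs whose host-covered fraction is at most the threshold.

    max_host_fraction is a hypothetical cutoff, not a pipeline parameter.
    """
    fractions = host_covered_fraction(paf_lines)
    return [
        (name, seq)
        for name, seq in read_fasta(fasta_lines)
        if fractions.get(name, 0.0) <= max_host_fraction
    ]
```

Filtering on the covered fraction rather than on any hit at all is deliberate: a short conserved region shared with the host would otherwise be enough to discard a long microbial contig. The right cutoff would probably need to be user-configurable.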

prototaxites, May 09 '23 12:05