Add dereplication with dRep

Open erikrikarddaniel opened this issue 2 years ago • 5 comments
Description of feature

dRep takes a set of genomes, together with CheckM data, and dereplicates them to produce a set of non-overlapping genomes at a specified ANI. Since this basically just marks certain MAGs as the representatives of their clusters, the output could be summarised by a single column in the bin summary table: dRep. If the column holds the ANI used (e.g. 95), one could potentially run dRep multiple times at different thresholds. There is no existing nf-core module.
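To make the column idea concrete, here is a minimal sketch of one possible reading of it. It does not call dRep and is not taken from the pipeline: the `bin` key, the bin names, and the function `add_drep_column` are all hypothetical, and the sets of representatives are assumed to have been collected from each dRep run beforehand.

```python
def add_drep_column(bin_rows, representatives_by_ani):
    """Return copies of the bin summary rows with an added 'dRep' column.

    bin_rows: list of dicts, one per bin; each needs a 'bin' key
        (hypothetical column name, not the real bin summary header).
    representatives_by_ani: dict mapping an ANI threshold (e.g. 95) to the
        set of bin names chosen as cluster representatives in that run.
    The 'dRep' value lists every ANI at which the bin is a representative,
    comma-separated, and is empty if it never is.
    """
    annotated = []
    for row in bin_rows:
        hits = [
            str(ani)
            for ani in sorted(representatives_by_ani)
            if row["bin"] in representatives_by_ani[ani]
        ]
        new_row = dict(row)  # leave the input rows untouched
        new_row["dRep"] = ",".join(hits)
        annotated.append(new_row)
    return annotated


# Made-up bins and two hypothetical runs, at 95 and 99 ANI.
bins = [
    {"bin": "sample1.bin.1"},
    {"bin": "sample1.bin.2"},
    {"bin": "sample2.bin.1"},
]
runs = {
    95: {"sample1.bin.1"},
    99: {"sample1.bin.1", "sample2.bin.1"},
}
annotated = add_drep_column(bins, runs)
for row in annotated:
    print(row["bin"], row["dRep"], sep="\t")
```

Listing all thresholds in one column keeps the table narrow; one column per ANI would work just as well if that is easier to filter on.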

erikrikarddaniel avatar Mar 27 '23 11:03 erikrikarddaniel

I also know of Galah (https://github.com/wwood/galah), which might do something similar.

jfy133 avatar Mar 27 '23 12:03 jfy133

I have a module for Galah that I wrote for a personal pipeline that processes the output of mag. I also have a process that takes the busco_summary.tsv file and converts it to the format Galah requires in order to use the completeness/contamination information.

I had planned to add them to mag at some point, but haven't found time yet.

prototaxites avatar Mar 31 '23 13:03 prototaxites

:tada: awesome!

jfy133 avatar Apr 03 '23 08:04 jfy133

https://github.com/nf-core/modules/pull/3666

prototaxites avatar Jul 24 '23 20:07 prototaxites

Requires:

  • https://github.com/nf-core/modules/issues/5591
  • https://github.com/nf-core/modules/issues/5590

jfy133 avatar May 10 '24 14:05 jfy133