
multi-sample calling support in M2 WDL

Open byoo opened this issue 6 years ago • 1 comments

Feature request

Tool(s) or class(es) involved

mutect2.wdl and related scripts

Description

It would be useful to have multi-sample calling support (#5560) in M2 WDL.

byoo, Jun 27 '19 19:06

Hi @davidbenjamin ,

this ticket has been open for a while, but I haven't found a sensible multi-sample WDL yet, so I've spent some time putting together a multi-sample calling workflow and tested it extensively: https://github.com/phylyc/gatk4-somatic-snvs-indels

It's somewhat optimized for resource needs, at least much more so than the published GATK Mutect2 WDL. One option I added is to run the realignment filter only on a subset of the called variants, which is especially helpful for reducing costs in tumor-only calling if we hard-filter variants afterwards based on some thresholds anyway (who doesn't?). One remaining todo is to choose the best normal sample for the contamination model, namely the one with the highest sequencing depth, as you mentioned in an old GATK forum post. Happy to have some contributions there :) I also brushed up the PoN WDL, which is tested as well.
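For that todo, a minimal sketch of the selection step could look like the following. It is plain Python rather than WDL, it is not taken from the linked repository, and the function name, sample names, and depth values are all made up for illustration. How the per-sample depths are obtained is left out.

```python
from typing import Mapping


def pick_deepest_normal(mean_depth_by_normal: Mapping[str, float]) -> str:
    """Return the name of the normal sample with the highest mean depth.

    `mean_depth_by_normal` is a hypothetical mapping from normal sample
    name to its mean sequencing depth, computed elsewhere.
    """
    if not mean_depth_by_normal:
        raise ValueError("no normal samples given")
    # Iterate over sorted names so that ties are resolved reproducibly
    # (max() keeps the first maximal element it encounters).
    return max(
        sorted(mean_depth_by_normal),
        key=lambda name: mean_depth_by_normal[name],
    )


# Illustrative values only.
depths = {"normal_A": 31.5, "normal_B": 62.0, "normal_C": 48.2}
best_normal = pick_deepest_normal(depths)
print(best_normal)
```

The chosen name could then be used to pick which normal's pileup summaries go into the contamination model.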

How important is CRAM input support? I took that part out, but it's easy to plug it back in, I suppose.

phylyc, Apr 29 '22 22:04