java.lang.IllegalStateException in JointGermlineCNVSegmentation
Bug Report
Affected tool(s) or class(es)
JointGermlineCNVSegmentation
Affected version(s)
- [x] Latest public release version [v4.3.0.0]
- [ ] Latest master branch as of [date of test?]
Description
I get the following exception when running JointGermlineCNVSegmentation on an exome trio dataset:
[January 19, 2023 at 6:59:29 AM CET] org.broadinstitute.hellbender.tools.walkers.sv.JointGermlineCNVSegmentation done. Elapsed time: 0.82 minutes.
Runtime.totalMemory()=300941312
java.lang.IllegalStateException: Encountered genotype with ploidy 0 but 1 alleles.
at org.broadinstitute.hellbender.utils.Utils.validate(Utils.java:814)
at org.broadinstitute.hellbender.tools.walkers.sv.JointGermlineCNVSegmentation.correctGenotypePloidy(JointGermlineCNVSegmentation.java:701)
at org.broadinstitute.hellbender.tools.walkers.sv.JointGermlineCNVSegmentation.prepareGenotype(JointGermlineCNVSegmentation.java:682)
at org.broadinstitute.hellbender.tools.walkers.sv.JointGermlineCNVSegmentation.lambda$createDepthOnlyFromGCNVWithOriginalGenotypes$4(JointGermlineCNVSegmentation.java:666)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.ArrayList$Itr.forEachRemaining(ArrayList.java:1033)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
at org.broadinstitute.hellbender.tools.walkers.sv.JointGermlineCNVSegmentation.createDepthOnlyFromGCNVWithOriginalGenotypes(JointGermlineCNVSegmentation.java:667)
at org.broadinstitute.hellbender.tools.walkers.sv.JointGermlineCNVSegmentation.apply(JointGermlineCNVSegmentation.java:280)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.apply(MultiVariantWalkerGroupedOnStart.java:133)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.afterTraverse(MultiVariantWalkerGroupedOnStart.java:193)
at org.broadinstitute.hellbender.engine.MultiVariantWalkerGroupedOnStart.traverse(MultiVariantWalkerGroupedOnStart.java:166)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Steps to reproduce
gatk JointGermlineCNVSegmentation --reference hs37d5.fa --variant index.vcf.gz --variant father.vcf.gz --variant mother.vcf.gz --model-call-intervals gcnv_preprocess_intervals.Agilent_SureSelect_Human_All_Exon_V6.interval_list --pedigree family.ped --output out.vcf.gz
The input VCF lines look as follows:
## index
Y 2654827 CNV_Y_2654827_24461230 N . 3076.53 . END=24461230 GT:CN:NP:QA:QS:QSE:QSS .:0:220:94:3077:472:1358
## father
Y 2654827 CNV_Y_2654827_24461230 N . 3076.53 . END=24461230 GT:CN:NP:QA:QS:QSE:QSS 0:1:220:58:3077:105:376
## mother
Y 2654827 CNV_Y_2654827_24461230 N <DEL> 3076.53 . END=24461230 GT:CN:NP:QA:QS:QSE:QSS 1:0:220:29:3077:357:640
The call itself looks like an artifact in the BAM alignments. However, the contig ploidy calls are notable: the mother is assigned ploidy 1 on Y, with a very low PLOIDY_GQ.
## index (sex assigned at birth: female)
CONTIG PLOIDY PLOIDY_GQ
X 2 123.51003746478007
Y 0 9.176757618621913
## father (sex assigned at birth: male)
CONTIG PLOIDY PLOIDY_GQ
X 1 123.5100374633715
Y 1 17.498503426830368
## mother (sex assigned at birth: female)
CONTIG PLOIDY PLOIDY_GQ
X 2 123.51003745758246
Y 1 0.09888866060944837
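To make the mismatch concrete, here is a minimal sketch of the kind of consistency check that appears to fail, assuming the tool compares the number of alleles in each GT field against an expected per-sample ploidy on the contig. This report does not establish which ploidy the tool uses or how it treats a no-call GT (`.`), so the sketch tries two hypothetical readings: ploidy derived from the sex in the pedigree file (assumed Y ploidy 0 for female, 1 for male) with no-calls skipped, and the gCNV contig ploidy calls listed above with no-calls counted as one allele. All function and variable names are illustrative and are not taken from the GATK source.

```python
import re


def gt_allele_count(gt: str) -> int:
    """Number of allele tokens in a VCF GT string ('.', '0', '0/1', '1|0', ...)."""
    return len(re.split(r"[/|]", gt))


def is_no_call(gt: str) -> bool:
    """True if every allele token in the GT string is '.'."""
    return all(token == "." for token in re.split(r"[/|]", gt))


def find_ploidy_mismatches(genotypes, expected_ploidy, skip_no_calls=True):
    """Return (sample, ploidy, n_alleles) for each sample whose GT allele
    count differs from the expected ploidy."""
    mismatches = []
    for sample, gt in genotypes.items():
        if skip_no_calls and is_no_call(gt):
            continue
        n_alleles = gt_allele_count(gt)
        ploidy = expected_ploidy[sample]
        if n_alleles != ploidy:
            mismatches.append((sample, ploidy, n_alleles))
    return mismatches


# GT fields of the chrY records quoted in this report.
genotypes = {"index": ".", "father": "0", "mother": "1"}

# Reading A (assumption): ploidy derived from the sex in the pedigree file.
ploidy_from_sex = {"index": 0, "father": 1, "mother": 0}

# Reading B (assumption): the gCNV contig ploidy calls for Y listed above.
ploidy_from_gcnv = {"index": 0, "father": 1, "mother": 1}

print(find_ploidy_mismatches(genotypes, ploidy_from_sex))
print(find_ploidy_mismatches(genotypes, ploidy_from_gcnv, skip_no_calls=False))
```

Under reading A the mother is flagged, under reading B the index is flagged; in both cases the mismatch is "ploidy 0 but 1 alleles", which matches the exception message but does not tell the two readings apart.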
The mother's sample has a slightly higher fraction of chrY reads than other female samples, but that fraction is far below what male samples sequenced with the same kit show.
The variance of the alternate allele balance at heterozygous sites is also increased in this sample. I assume that this sample has been contaminated with male DNA.
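For reference, a chrY read fraction like the one described here can be computed from per-contig mapped read counts. The following is a minimal sketch, assuming the counts come from `samtools idxstats` output (tab-separated columns: contig name, contig length, mapped reads, unmapped reads). The report does not state how the fraction was actually measured; the function name is hypothetical and the example counts are made up for illustration, not taken from the samples discussed here.

```python
def chr_y_fraction(idxstats_text: str, y_names=("Y", "chrY")) -> float:
    """Fraction of mapped reads on chrY, given `samtools idxstats` output text."""
    total_mapped = 0
    y_mapped = 0
    for line in idxstats_text.splitlines():
        if not line.strip():
            continue
        # Columns: contig name, contig length, mapped reads, unmapped reads.
        name, _length, mapped, _unmapped = line.split("\t")
        total_mapped += int(mapped)
        if name in y_names:
            y_mapped += int(mapped)
    if total_mapped == 0:
        raise ValueError("no mapped reads in idxstats input")
    return y_mapped / total_mapped


# Made-up example counts (NOT from this report), in idxstats column layout.
example = "1\t1000\t900\t0\nX\t500\t90\t0\nY\t100\t10\t0\n*\t0\t0\t5\n"
print(chr_y_fraction(example))
```

Comparing this value across female samples, male samples, and the suspect sample from the same capture kit gives the kind of comparison described above.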
Expected behavior
I would like to be able to deactivate the hard error on the command line and replace it with a warning in the output logs.
Actual behavior
There is a hard crash that cannot be circumvented.