ReblockGVCF fails with an exception: No shortest ALT at 464564654 across alleles: [*]
Bug Report
Affected tool(s) or class(es)
GATK ReblockGVCF
Affected version(s)
- 4.2.5.0 and 4.2.6.1
Description
I am running ReblockGVCF on GVCFs that were generated with HaplotypeCaller version 4.0.1.4. About 1 in 500 samples crashes with the following error:
No shortest ALT at 464564654 across alleles: [*]
Complete error message:
org.broadinstitute.hellbender.exceptions.GATKException: Exception thrown at chr4::464564654[VC /bug.g.vcf.gz @
redacted
] filters=
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:145)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.traverse(MultiVariantWalker.java:136)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1085)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: org.broadinstitute.hellbender.exceptions.GATKException: No shortest ALT at 464564654 across alleles: [*]
at org.broadinstitute.hellbender.tools.walkers.variantutils.ReblockGVCF.addRefBlockIfNecessary(ReblockGVCF.java:632)
at org.broadinstitute.hellbender.tools.walkers.variantutils.ReblockGVCF.cleanUpHighQualityVariant(ReblockGVCF.java:596)
at org.broadinstitute.hellbender.tools.walkers.variantutils.ReblockGVCF.regenotypeVC(ReblockGVCF.java:347)
at org.broadinstitute.hellbender.tools.walkers.variantutils.ReblockGVCF.apply(ReblockGVCF.java:273)
at org.broadinstitute.hellbender.engine.MultiVariantWalker.lambda$traverse$1(MultiVariantWalker.java:139)
... 20 more
Steps to reproduce
gatk ReblockGVCF -R /hs38DH.fa -V bug.g.vcf.gz -O bug.rb.vcf.gz
(I generated a minimal example that reproduces the problem, but I am not sure I am allowed to publish this data. I can send it over privately; it is only 21 KB.)
Expected behavior
A complete reblocked GVCF file.
Actual behavior
GATK crashed
@maarten-k Can you please check whether there's a <NON_REF> allele present at the locus it's complaining about (464564654), in addition to the * allele?
Also, could you try re-generating your GVCFs with a more recent version of HaplotypeCaller? 4.0.1.4 is quite old at this point...
Yes. 43 bases before this position there is a record with ALT alleles C,<NON_REF> whose REF allele is about 270 base pairs long.
Also, could you try re-generating your GVCFs with a more recent version of HaplotypeCaller? 4.0.1.4 is quite old at this point...
Yes, I know this is an old version, but I am close to finalising a callset of 15,000+ WGS samples, so switching versions is not an easy option for me. However, I will also test this with the newest version.
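The relationship described above (a record 43 bases before the failing position with a REF allele of roughly 270 bp) can be checked with simple coordinate arithmetic. The following is a minimal sketch, not part of GATK: the function name `record_spans` and the placeholder REF string are made up for illustration, and it assumes standard 1-based VCF coordinates.

```python
def record_spans(record_pos: int, ref: str, query_pos: int) -> bool:
    """Return True if a VCF record covers query_pos.

    A record starting at record_pos (1-based) with REF allele `ref`
    covers the positions record_pos .. record_pos + len(ref) - 1.
    """
    end = record_pos + len(ref) - 1
    return record_pos <= query_pos <= end


# Numbers shaped like the report; the REF sequence is a placeholder.
failing_pos = 464564654
upstream_pos = failing_pos - 43
placeholder_ref = "A" * 270

print(record_spans(upstream_pos, placeholder_ref, failing_pos))
```

Under these assumptions the upstream record ends 226 bases past the failing position, so it overlaps the site where the `*` allele is reported.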
I can confirm that the problem no longer occurs with GATK 4.2.6.1. A minor correction from my side: the GATK version where the issue occurred should be 4.1.4.0.
Can you advise on a workaround for this? I can remove the problematic lines from the files with some basic command-line tools, but if there is a more sophisticated way, please let me know.
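As a starting point for locating candidate lines before deciding how to handle them, the sketch below lists records whose ALT field contains only `*` once `<NON_REF>` is ignored. That these are exactly the records that trigger the exception is an assumption based on the `[*]` in the error message, not something confirmed in this thread, and the function names are hypothetical. Removing records from a GVCF may leave gaps in the reference blocks, so this only reports positions and does not modify the file.

```python
import gzip

NON_REF = "<NON_REF>"


def is_star_only(alt_field: str) -> bool:
    """True if, ignoring <NON_REF>, every remaining ALT allele is '*'."""
    alts = [a for a in alt_field.split(",") if a != NON_REF]
    return len(alts) > 0 and all(a == "*" for a in alts)


def find_star_only_records(lines):
    """Yield (chrom, pos, alt) for VCF data lines whose ALT is star-only."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed data line
        if is_star_only(fields[4]):
            yield fields[0], int(fields[1]), fields[4]


def scan_gvcf(path: str):
    """Return star-only records from a plain or gzip/bgzip-compressed GVCF."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return list(find_star_only_records(handle))
```

For example, `scan_gvcf("bug.g.vcf.gz")` would return the chromosome, position, and ALT field of each candidate record in the file from the report.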