Reference isn't being parsed/detected correctly when using CalibrateDragstrModel in parallel mode
Bug Report
Affected tool(s) or class(es)
GATK CalibrateDragstrModel
Affected version(s)
- [x] Latest public release version [4.3.0.0]
- [ ] Latest master branch as of [date of test?]
Description
When running CalibrateDragstrModel in parallel mode, the supplied reference isn't detected correctly, which causes the following error and stack trace:
Using GATK jar /usr/local/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx72g -jar /usr/local/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar CalibrateDragstrModel --input input.cram --output input.txt --reference hg38.fa --str-table-path hg38.zip --threads 12 --intervals fasta_bed.bed --tmp-dir .
10:24:21.117 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
10:24:21.289 INFO CalibrateDragstrModel - ------------------------------------------------------------
10:24:21.289 INFO CalibrateDragstrModel - The Genome Analysis Toolkit (GATK) v4.3.0.0
10:24:21.289 INFO CalibrateDragstrModel - For support and documentation go to https://software.broadinstitute.org/gatk/
10:24:21.289 INFO CalibrateDragstrModel - Executing as nvnieuwk on Linux v4.18.0-372.36.1.el8_6.x86_64 amd64
10:24:21.289 INFO CalibrateDragstrModel - Java runtime: OpenJDK 64-Bit Server VM v11.0.15-internal+0-adhoc..src
10:24:21.289 INFO CalibrateDragstrModel - Start Date/Time: January 2, 2023 at 10:24:21 AM GMT
10:24:21.289 INFO CalibrateDragstrModel - ------------------------------------------------------------
10:24:21.289 INFO CalibrateDragstrModel - ------------------------------------------------------------
10:24:21.290 INFO CalibrateDragstrModel - HTSJDK Version: 3.0.1
10:24:21.290 INFO CalibrateDragstrModel - Picard Version: 2.27.5
10:24:21.290 INFO CalibrateDragstrModel - Built for Spark Version: 2.4.5
10:24:21.290 INFO CalibrateDragstrModel - HTSJDK Defaults.COMPRESSION_LEVEL : 2
10:24:21.290 INFO CalibrateDragstrModel - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
10:24:21.290 INFO CalibrateDragstrModel - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
10:24:21.290 INFO CalibrateDragstrModel - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
10:24:21.290 INFO CalibrateDragstrModel - Deflater: IntelDeflater
10:24:21.290 INFO CalibrateDragstrModel - Inflater: IntelInflater
10:24:21.291 INFO CalibrateDragstrModel - GCS max retries/reopens: 20
10:24:21.291 INFO CalibrateDragstrModel - Requester pays: disabled
10:24:21.291 INFO CalibrateDragstrModel - Initializing engine
10:24:21.937 INFO FeatureManager - Using codec BEDCodec to read file file:///kyukon/scratch/gent/vo/000/gvo00082/vsc44804/nxf.4BNO5qL7DM/fasta_bed.bed
10:24:21.962 INFO IntervalArgumentCollection - Processing 3217346917 bp from intervals
10:24:22.008 INFO CalibrateDragstrModel - Running in parallel using the requested number of threads: 12
10:24:22.008 INFO CalibrateDragstrModel - Done initializing engine
10:24:22.008 INFO ProgressMeter - Starting traversal
10:24:22.008 INFO ProgressMeter - Current Locus Elapsed Minutes Records Processed Records/Minute
10:24:32.859 INFO ProgressMeter - chr1:26000000 0.2 59038 326477.4
10:24:42.867 INFO ProgressMeter - chr1:83000000 0.3 184245 529998.1
10:24:52.965 INFO ProgressMeter - chr1:137193529 0.5 306766 594565.4
10:25:03.307 INFO ProgressMeter - chr1:193193529 0.7 428759 622924.6
10:25:13.318 INFO ProgressMeter - chr2:3237107 0.9 564835 660497.0
10:25:23.358 INFO ProgressMeter - chr2:57237107 1.0 681209 666219.1
10:25:33.392 INFO ProgressMeter - chr2:109237107 1.2 799610 672091.8
10:25:44.527 INFO ProgressMeter - chr2:177512416 1.4 930822 676805.6
10:25:54.821 INFO ProgressMeter - chr2:237512416 1.5 1069999 691712.8
10:26:04.863 INFO ProgressMeter - chr3:54999378 1.7 1197525 698570.8
10:26:09.642 INFO CalibrateDragstrModel - Shutting down engine
[January 2, 2023 at 10:26:09 AM GMT] org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel done. Elapsed time: 1.81 minutes.
Runtime.totalMemory()=47647293440
java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Requested start 8613 is beyond the sequence length HLA-DRB1*04:03:01
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:600)
at java.base/java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1006)
at org.broadinstitute.hellbender.utils.Utils.runInParallel(Utils.java:1479)
at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.collectCaseStatsParallel(CalibrateDragstrModel.java:551)
at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.traverse(CalibrateDragstrModel.java:202)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1095)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Requested start 8613 is beyond the sequence length HLA-DRB1*04:03:01
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:600)
at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:678)
at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:737)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:558)
at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.lambda$collectCaseStatsParallel$14(CalibrateDragstrModel.java:568)
at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1448)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.lang.IllegalArgumentException: Requested start 8613 is beyond the sequence length HLA-DRB1*04:03:01
at htsjdk.samtools.cram.ref.ReferenceSource.getReferenceBasesByRegion(ReferenceSource.java:207)
at htsjdk.samtools.cram.build.CRAMReferenceRegion.fetchReferenceBasesByRegion(CRAMReferenceRegion.java:169)
at htsjdk.samtools.cram.structure.Slice.normalizeCRAMRecords(Slice.java:502)
at htsjdk.samtools.cram.structure.Container.getSAMRecords(Container.java:322)
at htsjdk.samtools.CRAMIterator.nextContainer(CRAMIterator.java:112)
at htsjdk.samtools.CRAMIterator.hasNext(CRAMIterator.java:204)
at htsjdk.samtools.CRAMFileReader$CRAMIntervalIteratorBase.getNextRecord(CRAMFileReader.java:589)
at htsjdk.samtools.CRAMFileReader$CRAMIntervalIteratorBase.initializeIterator(CRAMFileReader.java:562)
at htsjdk.samtools.CRAMFileReader$CRAMIntervalIterator.<init>(CRAMFileReader.java:620)
at htsjdk.samtools.CRAMFileReader$CRAMIntervalIterator.<init>(CRAMFileReader.java:615)
at htsjdk.samtools.CRAMFileReader.query(CRAMFileReader.java:487)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:550)
at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.queryOverlapping(SamReader.java:417)
at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.loadNextIterator(SamReaderQueryingIterator.java:130)
at org.broadinstitute.hellbender.utils.iterators.SamReaderQueryingIterator.<init>(SamReaderQueryingIterator.java:69)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.prepareIteratorsForTraversal(ReadsPathDataSource.java:412)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.prepareIteratorsForTraversal(ReadsPathDataSource.java:389)
at org.broadinstitute.hellbender.engine.ReadsPathDataSource.query(ReadsPathDataSource.java:352)
at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.readStream(CalibrateDragstrModel.java:915)
at org.broadinstitute.hellbender.tools.dragstr.CalibrateDragstrModel.lambda$null$11(CalibrateDragstrModel.java:556)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at org.broadinstitute.hellbender.tools.dragstr.InterleavingListSpliterator.forEachRemaining(InterleavingListSpliterator.java:87)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
... 5 more
However, the tool does work when run single-threaded with exactly the same options.
Steps to reproduce
I've sadly been unable to create a reproducible example. I've only encountered this with non-public data which I can't share here. I'd be happy to run tests for you though.
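Since the data can't be shared, one check that could be run locally is whether the intervals really fit inside the reference contigs, to rule out a genuine mismatch before suspecting the parallel code path. The following is only a minimal sketch: it assumes a samtools-style index (`hg38.fa.fai`, an assumed file name) exists next to the reference, and the function names `read_fai_lengths` and `find_bad_intervals` are made up for illustration. It says nothing about the threading problem itself.

```python
import sys
from typing import Dict, Iterable, List, Tuple


def read_fai_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Map contig name -> length, taken from the first two .fai columns."""
    lengths = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths


def find_bad_intervals(
    bed_lines: Iterable[str], lengths: Dict[str, int]
) -> List[Tuple[str, int, int, str]]:
    """Return BED intervals that name an unknown contig or run past its end."""
    problems = []
    for line in bed_lines:
        # Skip blank lines and the usual BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        contig, start, end = line.rstrip("\n").split("\t")[:3]
        start, end = int(start), int(end)
        if contig not in lengths:
            problems.append((contig, start, end, "contig not in reference index"))
        elif end > lengths[contig]:
            # BED ends are exclusive, so end == contig length is still valid.
            problems.append((contig, start, end, "end beyond contig length"))
    return problems


# Hypothetical usage: python check_intervals.py hg38.fa.fai fasta_bed.bed
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fai, open(sys.argv[2]) as bed:
        for problem in find_bad_intervals(bed, read_fai_lengths(fai)):
            print(*problem, sep="\t")
```

If this prints nothing, the intervals and the reference index agree, which would be consistent with the single-threaded run succeeding on the same inputs.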