GenomicsDBImport Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Open hougeng opened this issue 2 years ago • 0 comments

I have 348 samples whose variants I want to analyse. I have read several tutorials on using GATK to produce a population VCF. At first, I tried CombineGVCFs to merge the GVCFs, planning to then use SelectVariants to extract the SNPs.

CombineGVCFs failed with the error "Exception in thread "main" java.lang.OutOfMemoryError", so I switched to GenomicsDBImport for this job. It still doesn't work.

The first error was: read_one_line_fully && "Buffer did not have space to hold a line fully - increase buffer size". After I added "--genomicsdb-vcf-buffer-size 16384000", it failed with a different error: "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space".
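
For scale, here is a rough back-of-the-envelope check of what that buffer size could add up to across all inputs. It is only a sketch: it assumes one buffer of the requested size is allocated per sample (my reading of the option, not something the log confirms), and the default of 16384 bytes is what I recall from the GATK documentation. The sample count, buffer size, and Runtime.totalMemory() value come from the command and log in this report.

```python
# Values taken from the command and log in this report.
SAMPLES = 348
VCF_BUFFER_SIZE = 16_384_000             # --genomicsdb-vcf-buffer-size, in bytes
REPORTED_TOTAL_MEMORY = 12_916_359_168   # Runtime.totalMemory() printed in the log

# ASSUMPTION: default buffer size as recalled from the GATK docs, in bytes.
DEFAULT_BUFFER_SIZE = 16_384

GIB = 1024 ** 3
MIB = 1024 ** 2


def total_buffer_bytes(samples: int, buffer_size: int) -> int:
    """Total bytes if one buffer is allocated per sample.

    ASSUMPTION: one buffer per sample; the real allocation pattern
    inside GenomicsDB may differ.
    """
    return samples * buffer_size


requested = total_buffer_bytes(SAMPLES, VCF_BUFFER_SIZE)
default = total_buffer_bytes(SAMPLES, DEFAULT_BUFFER_SIZE)

print(f"requested buffers: {requested} bytes ({requested / GIB:.2f} GiB)")
print(f"default buffers:   {default} bytes ({default / MIB:.2f} MiB)")
print(f"JVM totalMemory:   {REPORTED_TOTAL_MEMORY / GIB:.2f} GiB")
```

Under that assumption the buffers alone would need several GiB of heap, which is a thousand times more than with the assumed default.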

My command and log are below. My Java version is openjdk version "1.8.0_152-release", OpenJDK Runtime Environment (build 1.8.0_152-release-1056-b12).

GATK is very helpful in my research, and I really need some help to get it to work.

gatk --java-options "-Xmx48g -Xms48G" GenomicsDBImport -V C1_sentieon_gvcf.gz .......... -V SCAU-106.gvcf.gz -V SCAU-107.gvcf.gz -V SCAU-108.gvcf.gz -V SCAU-128.gvcf.gz --genomicsdb-workspace-path my_database.chr01 -R IRGSP-1.0_genome.fasta --genomicsdb-vcf-buffer-size 16384000 --intervals chr01

11:48:08.245 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/ayu/anaconda3/share/gatk4-4.0.5.1-0/gatk-package-4.0.5.1-local.jar!/com/intel/gkl/native/libgkl_compression.so
11:48:09.327 INFO GenomicsDBImport - ------------------------------------------------------------
11:48:09.327 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.0.5.1
11:48:09.327 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
11:48:09.327 INFO GenomicsDBImport - Executing as ayu@ayu on Linux v5.15.90.1-microsoft-standard-WSL2 amd64
11:48:09.327 INFO GenomicsDBImport - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_152-release-1056-b12
11:48:09.327 INFO GenomicsDBImport - Start Date/Time: November 26, 2023 11:48:08 AM CST
11:48:09.327 INFO GenomicsDBImport - ------------------------------------------------------------
11:48:09.327 INFO GenomicsDBImport - ------------------------------------------------------------
11:48:09.327 INFO GenomicsDBImport - HTSJDK Version: 2.15.1
11:48:09.327 INFO GenomicsDBImport - Picard Version: 2.18.2
11:48:09.327 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
11:48:09.327 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
11:48:09.327 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
11:48:09.327 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
11:48:09.327 INFO GenomicsDBImport - Deflater: IntelDeflater
11:48:09.327 INFO GenomicsDBImport - Inflater: IntelInflater
11:48:09.327 INFO GenomicsDBImport - GCS max retries/reopens: 20
11:48:09.327 INFO GenomicsDBImport - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
11:48:09.328 INFO GenomicsDBImport - Initializing engine
11:48:14.819 INFO IntervalArgumentCollection - Processing 43270923 bp from intervals
11:48:14.846 INFO GenomicsDBImport - Done initializing engine
Created workspace /mnt/g/ubuntushare/sequence/C271_sentieon_gvcf/my_database.chr01
11:48:14.919 INFO GenomicsDBImport - Vid Map JSON file will be written to my_database.chr01/vidmap.json
11:48:14.919 INFO GenomicsDBImport - Callset Map JSON file will be written to my_database.chr01/callset.json
11:48:14.919 INFO GenomicsDBImport - Complete VCF Header will be written to my_database.chr01/vcfheader.vcf
11:48:14.919 INFO GenomicsDBImport - Importing to array - my_database.chr01/genomicsdb_array
11:48:14.924 INFO ProgressMeter - Starting traversal
11:48:14.924 INFO ProgressMeter - Current Locus Elapsed Minutes Batches Processed Batches/Minute
11:48:19.709 INFO GenomicsDBImport - Importing batch 1 with 348 samples
11:48:24.549 INFO GenomicsDBImport - Shutting down engine
[November 26, 2023 11:48:24 AM CST] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.27 minutes.
Runtime.totalMemory()=12916359168
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at com.intel.genomicsdb.SilentByteBufferStream.<init>(SilentByteBufferStream.java:55)
    at com.intel.genomicsdb.GenomicsDBImporterStreamWrapper.<init>(GenomicsDBImporterStreamWrapper.java:74)
    at com.intel.genomicsdb.GenomicsDBImporter.addBufferStream(GenomicsDBImporter.java:1289)
    at com.intel.genomicsdb.GenomicsDBImporter.addSortedVariantContextIterator(GenomicsDBImporter.java:1212)
    at com.intel.genomicsdb.GenomicsDBImporter.<init>(GenomicsDBImporter.java:597)
    at com.intel.genomicsdb.GenomicsDBImporter.<init>(GenomicsDBImporter.java:512)
    at com.intel.genomicsdb.GenomicsDBImporter.<init>(GenomicsDBImporter.java:472)
    at com.intel.genomicsdb.GenomicsDBImporter.<init>(GenomicsDBImporter.java:358)
    at org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport.traverse(GenomicsDBImport.java:502)
    at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:994)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:135)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:180)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:199)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
    at org.broadinstitute.hellbender.Main.main(Main.java:289)

hougeng, Nov 26 '23 06:11