sarek
error in test by haplotypecaller
Description of the bug
I cannot complete the haplotypecaller test with the Singularity container, and I get the same error when I run my own germline samples. The pipeline fails at the last step of variant calling. When I use strelka instead, the test pipeline finishes correctly.
Command used and terminal output
nextflow run 3_4_0/main.nf -profile test,singularity --tools haplotypecaller --outdir ./results
[15/8868cb] process > NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:VCF_VARIANT_FILTERING_GATK:FILTERVA... [100%] 1 of 1, failed: 1 ✘
[- ] process > NFCORE_SAREK:SAREK:VCF_QC_BCFTOOLS_VCFTOOLS:BCFTOOLS_STATS -
[- ] process > NFCORE_SAREK:SAREK:VCF_QC_BCFTOOLS_VCFTOOLS:VCFTOOLS_TSTV_COUNT -
[- ] process > NFCORE_SAREK:SAREK:VCF_QC_BCFTOOLS_VCFTOOLS:VCFTOOLS_TSTV_QUAL -
[- ] process > NFCORE_SAREK:SAREK:VCF_QC_BCFTOOLS_VCFTOOLS:VCFTOOLS_SUMMARY -
[- ] process > NFCORE_SAREK:SAREK:CUSTOM_DUMPSOFTWAREVERSIONS -
[- ] process > NFCORE_SAREK:SAREK:MULTIQC -
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/sarek] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:VCF_VARIANT_FILTERING_GATK:FILTERVARIANTTRANCHES (test)'
Caused by:
Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:VCF_VARIANT_FILTERING_GATK:FILTERVARIANTTRANCHES (test)` terminated with an error exit status (2)
Command executed:
gatk --java-options "-Xmx5324M -XX:-UsePerfData" \
FilterVariantTranches \
--variant test.cnn.vcf.gz \
--resource dbsnp_146.hg38.vcf.gz --resource mills_and_1000G.indels.vcf.gz \
--output test.haplotypecaller.filtered.vcf.gz \
--tmp-dir . \
--info-key CNN_1D
cat <<-END_VERSIONS > versions.yml
"NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:VCF_VARIANT_FILTERING_GATK:FILTERVARIANTTRANCHES":
gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
END_VERSIONS
Command exit status:
2
Command output:
(empty)
Command error:
Using GATK jar /usr/local/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx5324M -XX:-UsePerfData -jar /usr/local/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar FilterVariantTranches --variant test.cnn.vcf.gz --resource dbsnp_146.hg38.vcf.gz --resource mills_and_1000G.indels.vcf.gz --output test.haplotypecaller.filtered.vcf.gz --tmp-dir . --info-key CNN_1D
16:07:18.011 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
16:07:18.048 INFO FilterVariantTranches - ------------------------------------------------------------
16:07:18.052 INFO FilterVariantTranches - The Genome Analysis Toolkit (GATK) v4.4.0.0
16:07:18.052 INFO FilterVariantTranches - For support and documentation go to https://software.broadinstitute.org/gatk/
16:07:18.052 INFO FilterVariantTranches - Executing as bague@cbp10055 on Linux v5.15.0-92-generic amd64
16:07:18.052 INFO FilterVariantTranches - Java runtime: OpenJDK 64-Bit Server VM v17.0.3-internal+0-adhoc..src
16:07:18.052 INFO FilterVariantTranches - Start Date/Time: February 7, 2024 at 4:07:17 PM GMT
16:07:18.052 INFO FilterVariantTranches - ------------------------------------------------------------
16:07:18.052 INFO FilterVariantTranches - ------------------------------------------------------------
16:07:18.053 INFO FilterVariantTranches - HTSJDK Version: 3.0.5
16:07:18.053 INFO FilterVariantTranches - Picard Version: 3.0.0
16:07:18.053 INFO FilterVariantTranches - Built for Spark Version: 3.3.1
16:07:18.053 INFO FilterVariantTranches - HTSJDK Defaults.COMPRESSION_LEVEL : 2
16:07:18.054 INFO FilterVariantTranches - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
16:07:18.054 INFO FilterVariantTranches - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
16:07:18.054 INFO FilterVariantTranches - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
16:07:18.054 INFO FilterVariantTranches - Deflater: IntelDeflater
16:07:18.054 INFO FilterVariantTranches - Inflater: IntelInflater
16:07:18.054 INFO FilterVariantTranches - GCS max retries/reopens: 20
16:07:18.054 INFO FilterVariantTranches - Requester pays: disabled
16:07:18.055 INFO FilterVariantTranches - Initializing engine
16:07:18.129 INFO FeatureManager - Using codec VCFCodec to read file file://dbsnp_146.hg38.vcf.gz
16:07:18.132 WARN IntelInflater - Zero Bytes Written : 0
16:07:18.139 INFO FeatureManager - Using codec VCFCodec to read file file://mills_and_1000G.indels.vcf.gz
16:07:18.140 WARN IntelInflater - Zero Bytes Written : 0
16:07:18.146 INFO FeatureManager - Using codec VCFCodec to read file file://test.cnn.vcf.gz
16:07:18.147 WARN IntelInflater - Zero Bytes Written : 0
16:07:18.148 WARN IntelInflater - Zero Bytes Written : 0
16:07:18.152 INFO FilterVariantTranches - Done initializing engine
16:07:18.168 INFO ProgressMeter - Starting traversal
16:07:18.169 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
16:07:18.169 INFO FilterVariantTranches - Starting pass 0 through the variants
16:07:18.170 WARN IntelInflater - Zero Bytes Written : 0
16:07:18.171 INFO FilterVariantTranches - Finished pass 0 through the variants
16:07:18.171 INFO FilterVariantTranches - Found 0 SNPs and 0 indels with INFO score key:CNN_1D.
16:07:18.171 INFO FilterVariantTranches - Found 0 SNPs and 0 indels in the resources.
16:07:18.171 INFO FilterVariantTranches - Filtered 0 SNPs out of 0 and filtered 0 indels out of 0 with INFO score: CNN_1D.
16:07:18.173 INFO FilterVariantTranches - Shutting down engine
[February 7, 2024 at 4:07:18 PM GMT] org.broadinstitute.hellbender.tools.walkers.vqsr.FilterVariantTranches done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=125829120
***********************************************************************
A USER ERROR has occurred: Bad input: VCF contains no variants or no variants with INFO score key "CNN_1D"
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Work dir:
/media/bague/D_2/descriptiu_marato_tv3/prova_nf_3_4_0/work/15/8868cbf90b18654a33c937f88a5bc9
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details
Relevant files
No response
System information
Nextflow version 23.04.0
Hardware Desktop
Executor local
Container engine: Singularity
Os: Ubuntu 20.0.04
Version of nf-core/sarek 3.4.0
I tried both GATK genomes (GRCh37 and GRCh38), but I cannot get past the error. I am not sure why haplotypecaller fails when the whole pipeline runs fine with the strelka caller. It looks like the error begins when the pipeline needs the GATK Singularity container...
I also reported that error: https://github.com/nf-core/sarek/issues/1146
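The GATK message above covers two different situations: the VCF has no records at all, or it has records but none carry the CNN_1D annotation. A minimal sketch to tell them apart, assuming the file passed to FilterVariantTranches (test.cnn.vcf.gz in the command above) is still available in the work directory. The function name count_info_key is hypothetical and not part of Sarek or GATK; it only reads the VCF text.

```python
import gzip


def count_info_key(vcf_path, key="CNN_1D"):
    """Count VCF records in total and those whose INFO column contains `key`.

    Hypothetical helper for diagnosis only. Plain-text and gzip/bgzip
    compressed VCFs are both supported (bgzip files are valid multi-member
    gzip streams).
    """
    counts = {"total": 0, "with_key": 0}
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines, the column header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # not a complete VCF record
            counts["total"] += 1
            # INFO is the 8th column: semicolon-separated KEY=VALUE or flag entries.
            info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
            if key in info_keys:
                counts["with_key"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed path): python count_info_key.py test.cnn.vcf.gz
    if len(sys.argv) > 1:
        print(count_info_key(sys.argv[1]))
```

If total is 0, the caller produced no variants for that input; if total is above 0 but with_key is 0, the records reached the filter without CNN_1D scores.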
Could you try running the pipeline with the option --skip_tools haplotyper_filter?
https://nf-co.re/sarek/3.4.0/parameters#skip_tools
I have just tried, but the pipeline still finishes with errors...
Come on over on nf-core/sarek slack, and let's take a look at those errors
Sorry, I have now used the option --skip_tools haplotypecaller_filter and it avoids the error. The pipeline finished correctly. However, I have a question: if we add this parameter to bypass the error, we are also removing the filtering steps, so how should we analyze the results downstream? Is it recommended to apply an extra filter directly to the resulting VCF?
I would run Sarek with the haplotypecaller-filter activated and only deactivate it for the odd sample with "no variants or no variants with INFO score key CNN_1D" - perhaps @maxulysse, @FriederikeHanssen or @tdanhorn has more to say on this?
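To find the odd samples mentioned here, one option is to read the FilterVariantTranches log of each failed task and look at the "Found N SNPs and M indels with INFO score key" line shown in the terminal output. A minimal sketch, assuming the log text of each task has already been collected per sample (for example from the task's work directory). The function names and the sample-to-log mapping are hypothetical and not part of Sarek.

```python
import re

# Matches the GATK log line, e.g.
# "Found 0 SNPs and 0 indels with INFO score key:CNN_1D."
FOUND_LINE = re.compile(
    r"Found (\d+) SNPs and (\d+) indels with INFO score key:\s*([A-Za-z0-9_]+)"
)


def scored_variant_counts(log_text):
    """Return (snps, indels, key) from the first matching log line, or None."""
    for line in log_text.splitlines():
        match = FOUND_LINE.search(line)
        if match:
            return int(match.group(1)), int(match.group(2)), match.group(3)
    return None


def samples_without_scores(logs_by_sample):
    """List samples whose log reports zero scored SNPs and zero scored indels.

    `logs_by_sample` maps a sample name to its FilterVariantTranches log text.
    Samples whose log has no matching line are left out, since nothing can be
    concluded about them.
    """
    flagged = []
    for sample, text in logs_by_sample.items():
        counts = scored_variant_counts(text)
        if counts is not None and counts[0] == 0 and counts[1] == 0:
            flagged.append(sample)
    return sorted(flagged)
```

Only the samples returned by samples_without_scores would then be rerun with --skip_tools haplotypecaller_filter, while the rest keep the filter active.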