sarek
Providing a user FASTA file while keeping all default file paths
Description of the bug
I provided my own FASTA file for running Mutect2 and got this error:
A USER ERROR has occurred: Fasta index file file://GRCh38_latest_genomic.fna.fai for reference file://GRCh38_latest_genomic.fna does not exist. Please see https://gatk.broadinstitute.org/hc/articles/360035531652-FASTA-Reference-genome-format for help creating it.
The error comes from GATK during the Mutect2 step.
But my FASTA file is there, and it exists.
Command used and terminal output
`nextflow run nf-core/sarek -r dev -profile singularity -c custom.config -params-file nf-params.json`
nf-params.json:
{
"input": "sample.csv",
"outdir": "results",
"wes": "true",
"fasta": "/gpfs/home/plgouttebel/home/exomic/data/ref/GRCh38_latest_genomic.fna",
"aligner": "bwa-mem2",
"tools": "mutect2",
"skip_tools": "baserecalibrator,markduplicates"
}
custom.config:
singularity.cacheDir = '/scratch/plgouttebel/data_Singula/nf-core-sarek_dev/singularity-images'
Output from the log file:
May-07 15:15:33.560 [Task submitter] DEBUG n.executor.local.LocalTaskHandler - Launch cmd line: /bin/bash -ue .command.run
May-07 15:15:33.560 [Task submitter] INFO nextflow.Session - [48/601cc5] Submitted process > NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:MUTECT2 (BR666F)
May-07 15:15:33.655 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
task: name=NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F); work-dir=/scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/a0/84752509ea76ccd51c89f3b8af9c20
error [nextflow.exception.ProcessFailedException]: Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F)` terminated with an error exit status (2)
May-07 15:15:33.763 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F)'
Caused by:
Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES (BR666F)` terminated with an error exit status (2)
Command executed:
gatk --java-options "-Xmx9830M -XX:-UsePerfData" \
GetPileupSummaries \
--input BR666F.sorted.cram \
--variant af-only-gnomad.hg38.vcf.gz \
--output BR666F.mutect2.chr2_16146120-32867130.pileups.table \
--reference GRCh38_latest_genomic.fna \
--intervals chr2_16146120-32867130.bed \
--tmp-dir . \
cat <<-END_VERSIONS > versions.yml
"NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_TUMOR_ONLY_ALL:BAM_VARIANT_CALLING_TUMOR_ONLY_MUTECT2:GETPILEUPSUMMARIES":
gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
END_VERSIONS
Command exit status:
2
Command output:
(empty)
Command error:
Using GATK jar /usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx9830M -XX:-UsePerfData -jar /usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar GetPileupSummaries --input BR666F.sorted.cram --variant af-only-gnomad.hg38.vcf.gz --output BR666F.mutect2.chr2_16146120-32867130.pileups.table --reference GRCh38_latest_genomic.fna --intervals chr2_16146120-32867130.bed --tmp-dir .
13:15:32.947 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/usr/local/share/gatk4-4.5.0.0-0/gatk-package-4.5.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
13:15:33.307 INFO GetPileupSummaries - ------------------------------------------------------------
13:15:33.313 INFO GetPileupSummaries - The Genome Analysis Toolkit (GATK) v4.5.0.0
13:15:33.314 INFO GetPileupSummaries - For support and documentation go to https://software.broadinstitute.org/gatk/
13:15:33.314 INFO GetPileupSummaries - Executing as plgouttebel@n064 on Linux v3.10.0-1160.el7.x86_64 amd64
13:15:33.314 INFO GetPileupSummaries - Java runtime: OpenJDK 64-Bit Server VM v17.0.10-internal+0-adhoc..src
13:15:33.314 INFO GetPileupSummaries - Start Date/Time: May 7, 2024 at 1:15:32 PM GMT
13:15:33.314 INFO GetPileupSummaries - ------------------------------------------------------------
13:15:33.315 INFO GetPileupSummaries - ------------------------------------------------------------
13:15:33.316 INFO GetPileupSummaries - HTSJDK Version: 4.1.0
13:15:33.316 INFO GetPileupSummaries - Picard Version: 3.1.1
13:15:33.316 INFO GetPileupSummaries - Built for Spark Version: 3.5.0
13:15:33.317 INFO GetPileupSummaries - HTSJDK Defaults.COMPRESSION_LEVEL : 2
13:15:33.317 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
13:15:33.317 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
13:15:33.317 INFO GetPileupSummaries - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
13:15:33.318 INFO GetPileupSummaries - Deflater: IntelDeflater
13:15:33.318 INFO GetPileupSummaries - Inflater: IntelInflater
13:15:33.318 INFO GetPileupSummaries - GCS max retries/reopens: 20
13:15:33.318 INFO GetPileupSummaries - Requester pays: disabled
13:15:33.319 INFO GetPileupSummaries - Initializing engine
13:15:33.322 INFO GetPileupSummaries - Shutting down engine
[May 7, 2024 at 1:15:33 PM GMT] org.broadinstitute.hellbender.tools.walkers.contamination.GetPileupSummaries done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=167772160
***********************************************************************
A USER ERROR has occurred: Fasta index file file://GRCh38_latest_genomic.fna.fai for reference file://GRCh38_latest_genomic.fna does not exist. Please see https://gatk.broadinstitute.org/hc/articles/360035531652-FASTA-Reference-genome-format for help creating it.
***********************************************************************
Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
Work dir:
/scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/a0/84752509ea76ccd51c89f3b8af9c20
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
May-07 15:15:33.769 [Task monitor] INFO nextflow.Session - Execution cancelled -- Finishing pending tasks before exit
May-07 15:15:33.795 [main] DEBUG nextflow.Session - Session await > all processes finished
Relevant files
[plgouttebel@login01 nf-core-sarek_dev]$ ls -l /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/a0/84752509ea76ccd51c89f3b8af9c20
total 4
lrwxrwxrwx 1 plgouttebel ubx2 160 May 7 15:15 af-only-gnomad.hg38.vcf.gz -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/3c/e686ef595583a185a5b7f2480f6f94/af-only-gnomad.hg38.vcf.gz
lrwxrwxrwx 1 plgouttebel ubx2 164 May 7 15:15 af-only-gnomad.hg38.vcf.gz.tbi -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/e9/bc174e86314d14b42fab79c5283b02/af-only-gnomad.hg38.vcf.gz.tbi
lrwxrwxrwx 1 plgouttebel ubx2 109 May 7 15:15 BR666F.sorted.cram -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/1a/5ad95b654c06311dc198df39b7a33d/BR666F.sorted.cram
lrwxrwxrwx 1 plgouttebel ubx2 114 May 7 15:15 BR666F.sorted.cram.crai -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/1a/5ad95b654c06311dc198df39b7a33d/BR666F.sorted.cram.crai
lrwxrwxrwx 1 plgouttebel ubx2 117 May 7 15:15 chr2_16146120-32867130.bed -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/b0/fead63adc11db1d9353e4e666e6bf9/chr2_16146120-32867130.bed
lrwxrwxrwx 1 plgouttebel ubx2 69 May 7 15:15 GRCh38_latest_genomic.fna -> /gpfs/home/plgouttebel/home/exomic/data/ref/GRCh38_latest_genomic.fna
lrwxrwxrwx 1 plgouttebel ubx2 162 May 7 15:15 Homo_sapiens_assembly38.dict -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/0f/674a437a17df7ac9f50ac6d50c930c/Homo_sapiens_assembly38.dict
lrwxrwxrwx 1 plgouttebel ubx2 167 May 7 15:15 Homo_sapiens_assembly38.fasta.fai -> /scratch/plgouttebel/data_Singula/nf-core-sarek_dev/work/stage-afe03af1-e05d-4a93-af32-09b63a751b4a/01/63bf12053a02deb319a2f6ac4dbe47/Homo_sapiens_assembly38.fasta.fai
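The listing above shows the custom reference staged next to an index and dictionary named after `Homo_sapiens_assembly38`, so GATK cannot find `GRCh38_latest_genomic.fna.fai`. The following is a minimal sketch of that check, not part of sarek or GATK. The `.fai` naming is taken from the error message. The `<basename>.dict` naming is an assumption based on the usual GATK convention, and the function names are hypothetical.

```python
from pathlib import PurePosixPath


def expected_index_names(fasta_name: str) -> dict[str, str]:
    """Return the companion file names expected next to a reference FASTA."""
    fasta = PurePosixPath(fasta_name)
    return {
        # "<fasta>.fai": this is the name GATK reports in the error message.
        "fai": fasta.name + ".fai",
        # Assumption: the sequence dictionary follows the "<basename>.dict" convention.
        "dict": fasta.with_suffix(".dict").name,
    }


def missing_companions(staged_names, fasta_name):
    """List expected companion files that are absent from the staged work dir."""
    staged = set(staged_names)
    return [
        name
        for name in expected_index_names(fasta_name).values()
        if name not in staged
    ]


# File names copied from the `ls -l` output of the failing work dir.
staged = [
    "af-only-gnomad.hg38.vcf.gz",
    "af-only-gnomad.hg38.vcf.gz.tbi",
    "BR666F.sorted.cram",
    "BR666F.sorted.cram.crai",
    "chr2_16146120-32867130.bed",
    "GRCh38_latest_genomic.fna",
    "Homo_sapiens_assembly38.dict",
    "Homo_sapiens_assembly38.fasta.fai",
]

print(missing_companions(staged, "GRCh38_latest_genomic.fna"))
```

Run against the staged names, the check reports both the `.fai` and the `.dict` for `GRCh38_latest_genomic.fna` as missing, which matches the GATK error.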
System information
HPC: Curta cluster at MCIA (Mésocentre de calcul intensif aquitain); sarek downloaded locally.
So from what I can see, the issue is that `null` should have been assigned to `genome`.
But in my opinion, sarek should either have failed early, or printed a prominent warning and recomputed the basic indices from the FASTA file: fai, dict, plus any other index the run needs.
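As an illustration of the "recompute the basic index" idea, here is a rough sketch that only plans the commands and does not run them. It is not how sarek handles this. `plan_index_commands` is a hypothetical helper, and the `<basename>.dict` output name is assumed from the usual GATK convention. `samtools faidx` and `gatk CreateSequenceDictionary -R` are the standard tools for building these two files.

```python
import os


def plan_index_commands(fasta_path: str, exists=os.path.exists) -> list[list[str]]:
    """Return the commands needed to build missing index files for a FASTA.

    `exists` is injectable so the planning logic can be exercised without
    touching the filesystem.
    """
    root, _ext = os.path.splitext(fasta_path)
    commands = []
    # "<fasta>.fai" is the index GATK complained about in this issue.
    if not exists(fasta_path + ".fai"):
        commands.append(["samtools", "faidx", fasta_path])
    # Assumption: the sequence dictionary is expected at "<basename>.dict".
    if not exists(root + ".dict"):
        commands.append(["gatk", "CreateSequenceDictionary", "-R", fasta_path])
    return commands


# Demo with a stand-in `exists` that pretends nothing is on disk yet.
for command in plan_index_commands("GRCh38_latest_genomic.fna", exists=lambda p: False):
    print(" ".join(command))
```

Returning the commands as lists keeps the sketch free of side effects; a caller could print them as a warning or pass each one to `subprocess.run`.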
This is related to #1253 . Can we track there or is there an additional issue you found?
Yeah, that sounds similar to me. Let's close this one in favor of the older issue.