terminated for an unknown reason -- Likely it has been terminated by the external system
Description of the bug
I do not know why these errors occur.
Command used and terminal output
Caused by:
Process `NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_HAPLOTYPECALLER:GATK4_HAPLOTYPECALLER (AK58_C_1)` terminated for an unknown reason -- Likely it has been terminated by the external system
Command executed:
gatk --java-options "-Xmx163840M -XX:-UsePerfData" \
HaplotypeCaller \
--input AK58_C_1.md.cram \
--output AK58_C_1.haplotypecaller.chr2A_part1_1-384157900.g.vcf.gz \
--reference wheat_AK58v4MP.genome_part.fa \
\
--intervals chr2A_part1_1-384157900.bed \
\
\
--tmp-dir . \
-ERC GVCF
cat <<-END_VERSIONS > versions.yml
"NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_VARIANT_CALLING_HAPLOTYPECALLER:GATK4_HAPLOTYPECALLER":
gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
END_VERSIONS
Command exit status:
-
Command output:
(empty)
Work dir:
/public/home/fanrong/work_lei/01_htt/01_240927_well_bse_bsr/01_bsr/02_sarek/02_vcf/work/bd/a8da3f055c7df258c4271504de32d7
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
Relevant files
No response
System information
No response
This happens when the scheduler kills your job, typically because it ran out of memory or exceeded its time limit. To investigate, can you take a look at the `.command.log` file of the failing task? You can find it in the work directory: /public/home/fanrong/work_lei/01_htt/01_240927_well_bse_bsr/01_bsr/02_sarek/02_vcf/work/bd/a8da3f055c7df258c4271504de32d7
There is no `.command.log` in the work directory, only `.command.run` and `.command.sh`.
@fan040, I've had this recently, with only `.command.run` and `.command.sh` existing. It means that your scheduler tried to schedule the job, but something made it fail before it ever started running. I'd suggest you just retry. If it happens a lot, talk to your cluster sysadmins.
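The check described above can be scripted. The following is a minimal sketch, not part of the pipeline: `nf_task_state` is a hypothetical helper name, and it assumes the usual Nextflow task files, where `.command.begin` is written once the wrapper script starts and `.exitcode` once it finishes. If your executor behaves differently, the classification may not hold.

```shell
#!/usr/bin/env bash
# Classify a Nextflow task work directory by which hidden files exist.
# Assumption: .command.begin appears when the task wrapper starts running,
# and .exitcode appears when it finishes.
nf_task_state() {
    local workdir="$1"
    if [ ! -d "$workdir" ]; then
        echo "missing-workdir"
    elif [ -f "$workdir/.exitcode" ]; then
        # The task ran to the point of reporting an exit status
        echo "finished:$(cat "$workdir/.exitcode")"
    elif [ -f "$workdir/.command.begin" ] || [ -f "$workdir/.command.log" ]; then
        # The task started but was killed before writing an exit status
        echo "started-no-exit-code"
    elif [ -f "$workdir/.command.run" ]; then
        # Only the staged scripts exist: the job never began executing
        echo "never-started"
    else
        echo "unknown"
    fi
}
```

For example, `nf_task_state /path/to/work/bd/a8da3f...` on a directory that holds only `.command.run` and `.command.sh` reports `never-started`, which matches the situation in this thread.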
I've been seeing this on a virtual Slurm cluster created by AWS ParallelCluster. In that case, I think it happens when the worker node is not started properly (maybe because there were no nodes of that type available).
One complication is that no exit code is returned to Nextflow, so the standard retry strategy, which checks for a certain subset of exit codes and automatically retries only those, doesn't work. I ended up just setting tasks to always retry (`errorStrategy = 'retry'`).
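One way to apply the always-retry setting is through a small custom config passed with `-c`. This is a sketch under assumptions: the file name `retry.config`, the helper name `write_retry_config`, and the retry limit of 3 are illustrative and do not come from the thread. `errorStrategy` and `maxRetries` are standard Nextflow process directives.

```shell
#!/usr/bin/env bash
# Write a custom Nextflow config that retries every failed task a bounded
# number of times. File name and retry count are illustrative choices.
write_retry_config() {
    local out="${1:-retry.config}"
    cat > "$out" <<'EOF'
process {
    // Retry regardless of exit status, because a task killed before it
    // starts reports no exit code for a status-based strategy to match
    errorStrategy = 'retry'
    // Cap the attempts so a genuinely broken task does not loop forever
    maxRetries    = 3
}
EOF
}

write_retry_config retry.config
# Then pass the file to the pipeline, for example:
#   nextflow run nf-core/sarek -c retry.config -resume <your other options>
```

Note that an unconditional retry also retries tasks that fail for real reasons, such as running out of memory, so it is worth keeping `maxRetries` small.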
Thank you for the response :)