"Bus error" -- issue with snakemake or conda?
Hi Charlie,
Sorry that you're hearing from me so soon. Hit another snag. I think this is related to which conda environment snakemake is trying to use?
Not a reprex, so apologies for all the text:
(sunbeam4.0.0) [litichev@node157 sunbeam]$ sunbeam run all_qc --profile lsf --configfile sunbeam_config.yml
Running: snakemake --snakefile /home/litichev/sunbeam_v4/workflow/Snakefile --conda-prefix /home/litichev/sunbeam_v4/.snakemake all_qc --profile lsf --configfile sunbeam_config.yml
Using profile lsf for setting default command line arguments.
Collecting host/contaminant genomes... done.
Building DAG of jobs...
Using shell: /bin/bash
Provided cluster nodes: 500
Job stats:
job                         count
------------------------  -------
adapter_removal_unpaired        3
all_qc                          1
fastqc                          3
fastqc_report                   1
find_low_complexity             3
qc_final                        3
remove_low_complexity           3
sample_intake                   3
trimmomatic_unpaired            3
total                          23
Select jobs to execute...
[Thu Aug 24 22:44:01 2023]
rule sample_intake:
input: /project/thaisslab/2023-07_tim_metatranscriptomics/fastq/pgp3_S12_R1_001.fastq.gz
output: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/pgp3_1.fastq.gz
log: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/logs/sample_intake_pgp3_1.log
jobid: 15
reason: Missing output files: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/pgp3_1.fastq.gz
wildcards: sample=pgp3, rp=1
resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=<TBD>
Submitted job 15 with external jobid '78781830 logs/cluster/sample_intake/sample=pgp3.rp=1/jobid15_f848f478-04d9-458c-9a20-b46573f7b903.out'.
[Thu Aug 24 22:44:01 2023]
rule sample_intake:
input: /project/thaisslab/2023-07_tim_metatranscriptomics/fastq/pga2_S8_R1_001.fastq.gz
output: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/pga2_1.fastq.gz
log: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/logs/sample_intake_pga2_1.log
jobid: 9
reason: Missing output files: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/pga2_1.fastq.gz
wildcards: sample=pga2, rp=1
resources: mem_mb=1853, mem_mib=1768, disk_mb=1853, disk_mib=1768, tmpdir=<TBD>
Submitted job 9 with external jobid '78781831 logs/cluster/sample_intake/sample=pga2.rp=1/jobid9_cfae02f6-1da9-46a4-95d8-d8f4f7b0093c.out'.
[Thu Aug 24 22:44:01 2023]
rule sample_intake:
input: /project/thaisslab/2023-07_tim_metatranscriptomics/fastq/714R_S14_R1_001.fastq.gz
output: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/714R_1.fastq.gz
log: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/logs/sample_intake_714R_1.log
jobid: 3
reason: Code has changed since last execution
wildcards: sample=714R, rp=1
resources: mem_mb=3149, mem_mib=3004, disk_mb=3149, disk_mib=3004, tmpdir=<TBD>
Submitted job 3 with external jobid '78781832 logs/cluster/sample_intake/sample=714R.rp=1/jobid3_e990c596-adfc-4ee1-8f36-9381fb040a1d.out'.
[Thu Aug 24 22:44:21 2023]
Error in rule sample_intake:
jobid: 15
input: /project/thaisslab/2023-07_tim_metatranscriptomics/fastq/pgp3_S12_R1_001.fastq.gz
output: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/pgp3_1.fastq.gz
log: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/logs/sample_intake_pgp3_1.log (check log file(s) for error details)
cluster_jobid: 78781830 logs/cluster/sample_intake/sample=pgp3.rp=1/jobid15_f848f478-04d9-458c-9a20-b46573f7b903.out
Error executing rule sample_intake on cluster (jobid: 15, external: 78781830 logs/cluster/sample_intake/sample=pgp3.rp=1/jobid15_f848f478-04d9-458c-9a20-b46573f7b903.out, jobscript: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/.snakemake/tmp.g5m52_gf/snakejob.sample_intake.15.sh). For error details see the cluster log and the log files of the involved rule(s).
[Thu Aug 24 22:44:32 2023]
Finished job 3.
1 of 23 steps (4%) done
[Thu Aug 24 22:44:42 2023]
Error in rule sample_intake:
jobid: 9
input: /project/thaisslab/2023-07_tim_metatranscriptomics/fastq/pga2_S8_R1_001.fastq.gz
output: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/qc/00_samples/pga2_1.fastq.gz
log: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_output/logs/sample_intake_pga2_1.log (check log file(s) for error details)
cluster_jobid: 78781831 logs/cluster/sample_intake/sample=pga2.rp=1/jobid9_cfae02f6-1da9-46a4-95d8-d8f4f7b0093c.out
Error executing rule sample_intake on cluster (jobid: 9, external: 78781831 logs/cluster/sample_intake/sample=pga2.rp=1/jobid9_cfae02f6-1da9-46a4-95d8-d8f4f7b0093c.out, jobscript: /project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/.snakemake/tmp.g5m52_gf/snakejob.sample_intake.9.sh). For error details see the cluster log and the log files of the involved rule(s).
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Sunbeam failed with error.
Warnings: (0) []
Errors: (7) [56, 60, 63, 68, 72, 75, 77]
No benchmark files found
Complete log: .snakemake/log/2023-08-24T224335.666318.snakemake.log
(sunbeam4.0.0) [litichev@node157 sunbeam]$ cat logs/cluster/sample_intake/sample=pga2.rp=1/jobid9_cfae02f6-1da9-46a4-95d8-d8f4f7b0093c.err
/project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/.snakemake/tmp.g5m52_gf/snakejob.sample_intake.9.sh: line 3: 246369 Bus error (core dumped) /home/litichev/mambaforge/envs/sunbeam4.0.0/bin/python3.11 -m snakemake --snakefile '/home/litichev/sunbeam_v4/workflow/Snakefile' --target-jobs 'sample_intake:sample=pga2,rp=1' --allowed-rules 'sample_intake' --cores 'all' --attempt 1 --force-use-threads --resources 'mem_mb=1853' 'mem_mib=1768' 'disk_mb=1853' 'disk_mib=1768' --wait-for-files '/project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/.snakemake/tmp.g5m52_gf' '/project/thaisslab/2023-07_tim_metatranscriptomics/fastq/pga2_S8_R1_001.fastq.gz' --force --keep-target-files --keep-remote --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --rerun-triggers 'software-env' 'params' 'mtime' 'code' 'input' --skip-script-cleanup --use-conda --conda-frontend 'mamba' --conda-prefix '/home/litichev/sunbeam_v4/.snakemake' --conda-base-path '/home/litichev/mambaforge' --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' --configfiles '/project/thaisslab/2023-07_tim_metatranscriptomics/sunbeam/sunbeam_config.yml' --printshellcmds --latency-wait 30 --scheduler 'ilp' --scheduler-solver-path '/home/litichev/mambaforge/envs/sunbeam4.0.0/bin' --default-resources 'mem_mb=max(2*input.size_mb, 1000)' 'disk_mb=max(2*input.size_mb, 1000)' 'tmpdir=system_tmpdir' --mode 2
I also tried remaking my lsf profile. Here's how it currently looks:
(sunbeam4.0.0) [litichev@node157 sunbeam]$ cat ~/.config/snakemake/lsf/CookieCutter.py
class CookieCutter:
    """
    Cookie Cutter wrapper
    """

    @staticmethod
    def get_default_mem_mb() -> int:
        return int("2048")

    @staticmethod
    def get_log_dir() -> str:
        return "logs/cluster"

    @staticmethod
    def get_default_queue() -> str:
        return ""

    @staticmethod
    def get_default_project() -> str:
        return ""

    @staticmethod
    def get_lsf_unit_for_limits() -> str:
        return "MB"

    @staticmethod
    def get_unknwn_behaviour() -> str:
        return "wait"

    @staticmethod
    def get_zombi_behaviour() -> str:
        return "ignore"

    @staticmethod
    def get_latency_wait() -> float:
        return float("30")

    @staticmethod
    def get_wait_between_tries() -> float:
        return float("0.001")

    @staticmethod
    def get_max_status_checks() -> int:
        return int("1")

    @staticmethod
    def jobscript_timeout() -> int:
        return int("10")
(sunbeam4.0.0) [litichev@node157 sunbeam]$ cat ~/.config/snakemake/lsf/config.yaml
latency-wait: "30"
jobscript: "lsf_jobscript.sh"
use-conda: "True"
use-singularity: "False"
printshellcmds: "True"
restart-times: "0"
jobs: "500"
cluster: "lsf_submit.py"
cluster-status: "lsf_status.py"
cluster-cancel: "lsf_cancel.py"
max-jobs-per-second: "10"
max-status-checks-per-second: "10"
Thanks again for your help. Please let me know if I can provide more information.
-Lev
Hi Lev,
I'm not sure what's going on here, but my best guess is that it has something to do with access to log files. In sunbeam 4 we set a LOG_FP variable in the main Snakefile that each rule then uses as the base path for its logs. I wonder if that's somehow conflicting with get_log_dir() from your CookieCutter file.
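One rough way to test that guess is to compare the two log locations directly. The snippet below is only a sketch: the helper name compare_log_dirs is made up for illustration, and the two example paths are taken from the output above (the get_log_dir() value and the sunbeam_output/logs directory in the rule logs), assuming it is run from the project directory. It reports whether the directories are the same, whether one sits inside the other, and whether each is writable by the current user.

```python
import os
from pathlib import Path


def compare_log_dirs(cluster_log_dir: str, rule_log_dir: str) -> dict:
    """Report whether two log directories overlap and whether each is writable.

    Hypothetical helper for illustration; not part of sunbeam or the LSF profile.
    """
    cluster = Path(cluster_log_dir).resolve()
    rule = Path(rule_log_dir).resolve()
    return {
        # True if both settings point at the very same directory
        "same_dir": cluster == rule,
        # True if one directory is located inside the other
        "nested": cluster in rule.parents or rule in cluster.parents,
        # os.access returns False for a missing or non-writable directory
        "cluster_writable": os.access(cluster, os.W_OK),
        "rule_writable": os.access(rule, os.W_OK),
    }


if __name__ == "__main__":
    # Assumed paths, relative to the directory sunbeam is run from
    print(compare_log_dirs("logs/cluster", "sunbeam_output/logs"))
```

If both directories turn out to be distinct and writable, the log paths are probably not the culprit and the bus error is coming from somewhere else.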
Hi Lev, there have been a lot of updates to snakemake with regard to cluster execution (https://github.com/snakemake/snakemake/releases/tag/v8.0.0), which sunbeam >=4.3.7 should incorporate. You may still have to do some work to reconfigure your setup, but most of the work of interacting with the executor should now be handled by this plugin (https://github.com/BEFH/snakemake-executor-plugin-lsf). Let me know if you want any help setting this up (although I haven't worked with this plugin in particular).
Hi Charlie, thanks for following up. I tried using this LSF executor on our cluster. I hit an issue related to this line. I was able to work around it, but ended up reverting to an older version of Snakemake and the old LSF profile anyway.
Alrighty. Unfortunately I think, as you found, it would take a bit of work to accommodate all the recent changes if you already have a working setup built around sunbeam. Let me know if you ever want help with updating.