test failed

Open boaty opened this issue 3 years ago • 3 comments

Hi,

I ran the sunbeam tests after installing, but they failed. The log mentions something about memory, but I do not know exactly what the bug is.

Thanks

---------------------------error line---------------------------

[Tue Mar 16 09:37:09 2021]
rule megahit_paired:
    input: /tmp/tmp.vxjoert6aS/sunbeam_output/qc/decontam/dummyecoli_1.fastq.gz, /tmp/tmp.vxjoert6aS/sunbeam_output/qc/decontam/dummyecoli_2.fastq.gz
    output: /tmp/tmp.vxjoert6aS/sunbeam_output/assembly/megahit/dummyecoli_asm/final.contigs.fa
    jobid: 61
    wildcards: sample=dummyecoli

    ## turn off bash strict mode
    set +o pipefail

    ## sometimes the error is due to lack of memory
    exitcode=0
    megahit -t 1 -1 /tmp/tmp.vxjoert6aS/sunbeam_output/qc/decontam/dummyecoli_1.fastq.gz -2 /tmp/tmp.vxjoert6aS/sunbeam_output/qc/decontam/dummyecoli_2.fastq.gz -o /tmp/tmp.vxjoert6aS/sunbeam_output/assembly/megahit/dummyecoli_asm --continue || exitcode=$?

    if [ $exitcode -eq 255 ]
    then
        echo "Empty contigs"
        touch /tmp/tmp.vxjoert6aS/sunbeam_output/assembly/megahit/dummyecoli_asm/final.contigs.fa
    elif [ $exitcode -gt 1 ]
    then
        echo "Check your memory"
    fi
    

Cannot find /tmp/tmp.vxjoert6aS/sunbeam_output/assembly/megahit/dummyecoli_asm/opts.txt
Please check whether the output directory is correctly set by "-o"
Now switching to normal mode.
1006.548Gb memory in total.
Using: 905.893Gb.
MEGAHIT v1.1.3
--- [Tue Mar 16 09:37:09 2021] Start assembly. Number of CPU threads 1 ---
--- [Tue Mar 16 09:37:09 2021] Available memory: 1080772259840, used: 972695033856
--- [Tue Mar 16 09:37:09 2021] Converting reads to binaries ---
b' [read_lib_functions-inl.h : 209] Lib 0 (/tmp/tmp.vxjoert6aS/sunbeam_output/qc/decontam/dummyecoli_1.fastq.gz,/tmp/tmp.vxjoert6aS/sunbeam_output/qc/decontam/dummyecoli_2.fastq.gz): pe, 1722 reads, 250 max length'
b' [utils.h : 126] Real: 0.0147 user: 0.0083 sys: 0.0041 maxrss: 8272'
--- [Tue Mar 16 09:37:09 2021] k list: 21,29,39,59,79,99,119,141 ---
--- [Tue Mar 16 09:37:09 2021] Extracting solid (k+1)-mers for k = 21 ---
--- [Tue Mar 16 09:37:10 2021] Building graph for k = 21 ---
--- [Tue Mar 16 09:37:10 2021] Assembling contigs from SdBG for k = 21 ---
Error occurs when assembling contigs for k = 21, please refer to /tmp/tmp.vxjoert6aS/sunbeam_output/assembly/megahit/dummyecoli_asm/log for detail
[Exit code -11]
Waiting at most 5 seconds for missing files.
MissingOutputException in line 14 of /home/Desktop/sunbeamTest/sunbeam-stable/rules/assembly/assembly.rules:
Missing files after 5 seconds:
/tmp/tmp.vxjoert6aS/sunbeam_output/assembly/megahit/dummyecoli_asm/final.contigs.fa
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/Desktop/sunbeamTest/sunbeam-stable/.snakemake/log/2021-03-16T093633.097858.snakemake.log
x (log: /tmp/tmp.vxjoert6aS/test_all.[out/err])
-- TESTS FAILED --

boaty • Mar 16 '21 01:03

This looks to be the same issue (with Exit code -11) as a few people have reported in #277. @boaty are you able to try the dev branch instead of stable and see if that fixes your problem? We've made a number of improvements there that should become the stable version soon (as version 3.0).

ressy • Apr 01 '21 21:04

I can reproduce this with the stable branch (after working around some unrelated package dependency problems that we've already fixed in the dev branch), including exit code -11. Apparently that code implies a segmentation fault (signal 11 is SIGSEGV), and when I take the exact command the megahit wrapper runs from the log and run it myself, I can see the crash:

megahit_asm_core assemble -s testdir/sunbeam_output/assembly/megahit/dummybfragilis_asm/tmp/k21/21 -o testdir/sunbeam_output/assembly/megahit/dummybfragilis_asm/intermediate_contigs/k21 -t 1 --min_standalone 300.0 --prune_level 2 --merge_len 20 --merge_similar 0.95 --low_local_ratio 0.2 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble
    [assembler.cpp             : 148]     Loading succinct de Bruijn graph: testdir/sunbeam_output/assembly/megahit/dummybfragilis_asm/tmp/k21/21 Done. Time elapsed: 0.076102
    [assembler.cpp             : 152]     Number of Edges: 18538; K value: 21
    [assembler.cpp             : 162]     Number of CPU threads: 1
    [assembly_algorithms.cpp   : 162]     Removing tips with length less than 2; Accumulated tips removed: 0; time elapsed: 0.0034
    [assembly_algorithms.cpp   : 162]     Removing tips with length less than 4; Accumulated tips removed: 0; time elapsed: 0.0012
    [assembly_algorithms.cpp   : 162]     Removing tips with length less than 8; Accumulated tips removed: 0; time elapsed: 0.0023
    [assembly_algorithms.cpp   : 162]     Removing tips with length less than 16; Accumulated tips removed: 0; time elapsed: 0.0013
    [assembly_algorithms.cpp   : 162]     Removing tips with length less than 32; Accumulated tips removed: 0; time elapsed: 0.0018
    [assembly_algorithms.cpp   : 170]     Removing tips with length less than 42; Accumulated tips removed: 0; time elapsed: 0.0014
    [assembler.cpp             : 179]     Tips removal done! Time elapsed(sec): 0.011268
    [assembler.cpp             : 188]     unitig graph size: 5, time for building: 0.008339
    [assembler.cpp             : 211]     Number of bubbles removed: 0, Time elapsed(sec): 0.000029
    [assembler.cpp             : 225]     Number of complex bubbles removed: 0, Time elapsed(sec): 0.000005
Segmentation fault (core dumped)

Strange that it does seem OK in the dev branch despite having the same megahit version and input files, though.
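
For reference, the reading of "-11" as a segmentation fault can be checked with a few lines of Python. This is only a sketch, and it assumes the megahit wrapper reports the child's return code the way Python's `subprocess` module does, where a negative value -N means the child was killed by signal N on POSIX systems. The helper name `describe_returncode` is made up for this illustration.

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Explain a subprocess-style return code in words (hypothetical helper)."""
    if returncode < 0:
        # subprocess convention on POSIX: -N means "killed by signal N".
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


print(describe_returncode(-11))
```

On Linux, signal 11 is SIGSEGV, which matches the "Segmentation fault (core dumped)" message printed when the command is run by hand.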

ressy • Apr 01 '21 21:04

I think I see it. The very last few calls megahit_asm_core makes (according to ltrace) before crashing are:

GOMP_loop_end_nowait(2, 0x7fdea4fef050, 48, 0)                                                                            = 0
GOMP_parallel_end(2, 0x7fdea4fef050, 48, 0)                                                                               = 0
GOMP_parallel_start(0x581020, 0x7ffe3bd53880, 0, 0)                                                                       = 0
omp_get_num_threads(0x7ffe3bd53880, 0x1d965c0, 0x7fdea507afe0, 5)                                                         = 1
omp_get_thread_num(2, 0x7fdea4fef050, 0x1d8a300, 0x7fdea4fee720)                                                          = 0
omp_unset_lock(0x1d72720, 0, 0, 0x7fdea4fee720 <no return ...>

which I think implies it dies in omp_unset_lock (?). I can run the exact same command with the binary installed via my dev branch, on exactly the same input files, with no problem.

If I look in my two environments for which package metadata mentions libgomp.so, I see different packages:

/home/jesse/miniconda3/envs/sunbeam-dev-test/conda-meta/libgomp-9.3.0-h2828fa1_18.json:    "lib/libgomp.so",
/home/jesse/miniconda3/envs/sunbeam-dev-test/conda-meta/libgomp-9.3.0-h2828fa1_18.json:    "lib/libgomp.so.1.0.0",
/home/jesse/miniconda3/envs/sunbeam-dev-test/conda-meta/libgomp-9.3.0-h2828fa1_18.json:        "_path": "lib/libgomp.so",
/home/jesse/miniconda3/envs/sunbeam-dev-test/conda-meta/libgomp-9.3.0-h2828fa1_18.json:        "_path": "lib/libgomp.so.1.0.0",
/home/jesse/miniconda3/envs/sunbeam-dev-test/conda-meta/_openmp_mutex-4.5-1_gnu.json:    "lib/libgomp.so.1"
/home/jesse/miniconda3/envs/sunbeam-dev-test/conda-meta/_openmp_mutex-4.5-1_gnu.json:        "_path": "lib/libgomp.so.1",
/home/jesse/miniconda3/envs/sunbeam-stable-test/conda-meta/_openmp_mutex-4.5-1_llvm.json:    "lib/libgomp.so.1"
/home/jesse/miniconda3/envs/sunbeam-stable-test/conda-meta/_openmp_mutex-4.5-1_llvm.json:        "_path": "lib/libgomp.so.1",

I'm thinking the _openmp_mutex-4.5-1_llvm provided with a stable branch install leads to the crash and the _openmp_mutex-4.5-1_gnu one from a dev branch install doesn't.
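
The same lookup can be done without grep. The sketch below assumes each record under an environment's `conda-meta` directory is a JSON file with a `files` list, which is what the listing above suggests. The function name `packages_listing` and its parameters are invented for this example.

```python
import json
from pathlib import Path


def packages_listing(env_prefix, needle="libgomp.so"):
    """Map each conda-meta record in one environment to its matching file paths."""
    hits = {}
    for record in sorted(Path(env_prefix, "conda-meta").glob("*.json")):
        try:
            meta = json.loads(record.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed records
        if not isinstance(meta, dict):
            continue  # not a package record
        # "files" lists the paths the package installed, relative to the prefix.
        matches = [path for path in meta.get("files", []) if needle in path]
        if matches:
            hits[record.stem] = matches
    return hits
```

Calling it once per environment prefix (for example the sunbeam-dev-test and sunbeam-stable-test directories shown above) and comparing the returned keys should surface the same `_gnu` versus `_llvm` difference.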

So yeah long story short, dev branch should do the trick.

ressy • Apr 01 '21 22:04

Closing this as it should work in sunbeam3. If you run into this problem again please open a new issue.

Ulthran • Sep 29 '22 15:09