Segmentation fault

Open husamia opened this issue 1 year ago • 9 comments

I built the Docker container, downloaded the provided references, and ran the mode with the lowest memory requirement. The hardware is an Apple M Mac with 64 GB of RAM. The allowable RAM is 50 GB.

docker warning: WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

bwa-meme_mode1 version
Looking to launch executable "/opt/conda/bin/bwa-meme_mode1.sse42", simd = _mode1.sse42
Launching executable "/opt/conda/bin/bwa-meme_mode1.sse42"
Identical to BWA-MEM2 2.2
BWA-MEME v1.0.4
MEME mode 1: uses 38GB for index size in runtime

bwa-meme_mode1 mem -7 -Y -t 1 Homo_sapiens_assembly38.fasta 20A0012672-20A0012672_57977-WGS_R1_001.fastq.gz 20A0012672-20A0012672_57977-WGS_R2_001.fastq.gz -o 20A0012672_bwa-meme.sam

Looking to launch executable "/opt/conda/bin/bwa-meme_mode1.sse42", simd = _mode1.sse42
Launching executable "/opt/conda/bin/bwa-meme_mode1.sse42"

Executing in SSE4.2 mode!!

* SA compression enabled with xfactor: 8
* Ref file: Homo_sapiens_assembly38.fasta
* Entering FMI_search
Reading other elements of the index from files Homo_sapiens_assembly38.fasta
* Index prefix: Homo_sapiens_assembly38.fasta
* Read 0 ALT contigs
* Reading reference genome..
* Binary seq file = Homo_sapiens_assembly38.fasta.0123
* Reference genome size: 4354060288 bp
* Done reading reference genome !!

1. Memory pre-allocation for Chaining: 142.0078 MB
2. Memory pre-allocation for BSW: 239.6170 MB
[M::memoryAllocLearned::MEME] Reading Learned-index models into memory
[Learned-Config] MODE:1 SEARCH_METHOD: 1 MEM_TRADEOFF:0 EXPONENTIAL_SMEMSEARCH: 1 DEBUG_MODE:0 Num 2nd Models:102087850 PWL Bits Used:1
[M::memoryAllocLearned::MEME] Loading RMI model and Pac reference file took 32.630 sec
[M::memoryAllocLearned::MEME] Reading suffix array into memory
[M::memoryAllocLearned::MEME] Loading-index took 42.408 sec
3. Memory pre-allocation for BWT: 13528.0310 MB

* Threads used (compute): 1
* No. of pipeline threads: 2

[0000] read_chunk: 10000000, work_chunk_size: 10000024, nseq: 68388
[0000][ M::kt_pipeline] read 68388 sequences (10000024 bp)...
[0000] Reallocating initial memory allocations!!
[0000] Calling mem_process_seqs.., task: 0
[0000] 1. Calling kt_for - worker_bwt
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

real 0m50.682s
user 0m32.737s
sys 0m17.065s

Should I compile it for the Apple M chip? If so, please give me some instructions for doing that.

husamia avatar Aug 18 '22 18:08 husamia

Hi husamia,

Based on your log, the index file you downloaded seems to be incomplete!

Your log reports a reference genome size of 4354060288 bp, but it should be 6434693834 bp for Homo_sapiens_assembly38.fasta.

-----------------------------
Executing in SSE4.2 mode!!
-----------------------------
* SA compression enabled with xfactor: 8
* Ref file: /ssd2/human_ref/hg38/Homo_sapiens_assembly38.fasta
* Entering FMI_search
Reading other elements of the index from files /ssd2/human_ref/hg38/Homo_sapiens_assembly38.fasta
* Index prefix: /ssd2/human_ref/hg38/Homo_sapiens_assembly38.fasta
* Read 0 ALT contigs
* Reading reference genome..
* Binary seq file = /ssd2/human_ref/hg38/Homo_sapiens_assembly38.fasta.0123
* Reference genome size: 6434693834 bp
* Done reading reference genome !!

Can you re-download the index files (Homo_sapiens_assembly38.fasta.0123 in particular) and retry?

Thank you!

quito418 avatar Aug 19 '22 04:08 quito418

I am having trouble downloading the full files because the connection is slow and keeps timing out. Do you have any suggestions for making the download faster and more robust?

husamia avatar Aug 19 '22 17:08 husamia

If you are using the Linux command line, you can use the script below to download large files.

  • The -c option resumes the download from where it stopped; -T sets the timeout in seconds.
#!/bin/bash
# Retry until wget succeeds; pass the URL as the first argument.
while true; do wget -T 15 -c "$1" && break; done
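
The loop above retries forever, which can hang if the URL is wrong or the server is down for good. A minimal sketch of a bounded variant follows; the `retry` helper, its argument order, and the attempt count and delay in the usage line are assumptions for illustration, not part of BWA-MEME or of the script above.

```shell
#!/bin/bash
# Hypothetical helper: run a command until it succeeds, at most max_tries times,
# sleeping delay seconds between attempts.
# Usage: retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry() {
    local max_tries=$1 delay=$2
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example call (the URL is passed as the script's first argument):
# retry 20 5 wget -T 15 -c "$1"
```

Because `wget -c` resumes a partial file, each retry continues from the bytes already on disk rather than starting over.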

quito418 avatar Aug 20 '22 00:08 quito418

Can you post the md5sums for the large files?

husamia avatar Aug 20 '22 15:08 husamia

That's a good suggestion. I have also added the md5sums for all files to the webpage web.inalab.net/~bwa-meme/. Thanks.

Homo_sapiens_assembly38.fasta 7ff134953dcca8c8997453bbb80b6b5e  
Homo_sapiens_assembly38.fasta.0123 da7b1691d6284a69c465217f1d60ca82  
Homo_sapiens_assembly38.fasta.amb e4dc4fdb7358198e0847106599520aa9  
Homo_sapiens_assembly38.fasta.ann af611ed0bb9487fb1ba4aa1a7e7ad21c  
Homo_sapiens_assembly38.fasta.pac 178862a79b043a2f974ef10e3877ef86  
Homo_sapiens_assembly38.fasta.pos_packed 08e0042ca5b78d02c65d3780824f6bc9  
Homo_sapiens_assembly38.fasta.suffixarray_uint64_L0_PARAMETERS 6c0d6dc7e733a7f373aa7b2730621aa4  
Homo_sapiens_assembly38.fasta.suffixarray_uint64_L1_PARAMETERS 632d74f323b4af7ab54be0127d33d2d9  
Homo_sapiens_assembly38.fasta.suffixarray_uint64_L2_PARAMETERS 2e36a41ac2e8fdda4a3de26fbd2c8e11  
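
A downloaded file can be compared against the checksums listed above before running the aligner. This is a minimal sketch that assumes GNU `md5sum` is available (as it typically is inside a Linux container); the `verify_md5` helper is a hypothetical name introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file is missing.
verify_md5() {
    local file=$1 expected=$2
    local actual
    if [ ! -f "$file" ]; then
        echo "missing: $file" >&2
        return 2
    fi
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual, expected $expected)" >&2
        return 1
    fi
}

# Example call, using a digest from the list above:
# verify_md5 Homo_sapiens_assembly38.fasta.0123 da7b1691d6284a69c465217f1d60ca82
```

A mismatch on any file suggests a truncated or corrupted download, in which case resuming or repeating the download for that file is the next step.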

quito418 avatar Aug 21 '22 02:08 quito418

Thanks. The URL for Homo_sapiens_assembly38.fasta.suffixarray_uint64_L0_PARAMETERS isn't listed in the downloads, but I downloaded it anyway.

I am having a lot of trouble getting Docker to run on the Apple M Studio. I want the best performance, so do you suggest I run BWA-MEME natively? Can you provide instructions on how to install/compile it for the M chip?

husamia avatar Aug 22 '22 16:08 husamia

Are you still getting a seg fault even though you now have valid index files? I would expect that to have resolved the issue.

Based on what I have tried before, there are some differences in compilation on Apple devices. I am currently on a business trip for 2 weeks; I will try it and let you know as soon as I get back.

quito418 avatar Aug 22 '22 20:08 quito418

I am running BWA-MEME inside a docker container on the Apple M which is causing lots of problems due to docker issues.

husamia avatar Aug 23 '22 00:08 husamia

I am no longer getting the segmentation fault. You're right, the files were incomplete. But now I am trying to figure out why the process is getting killed.


# time bwa-meme mem -Y -7 -t 4 -o WGS40x_bwa-meme.BAM Homo_sapiens_assembly38.fasta WGS40X.cram
Looking to launch executable "/opt/conda/bin/bwa-meme_mode3.sse42", simd = _mode3.sse42
Launching executable "/opt/conda/bin/bwa-meme_mode3.sse42"
-----------------------------
Executing in SSE4.2 mode!!
-----------------------------
* SA compression enabled with xfactor: 8
* Ref file: Homo_sapiens_assembly38.fasta
* Entering FMI_search
Reading other elements of the index from files Homo_sapiens_assembly38.fasta
* Index prefix: Homo_sapiens_assembly38.fasta
* Read 0 ALT contigs
* Reading reference genome..
* Binary seq file = Homo_sapiens_assembly38.fasta.0123
* Reference genome size: 6434693834 bp
* Done reading reference genome !!

------------------------------------------
1. Memory pre-allocation for Chaining: 567.9670 MB
2. Memory pre-allocation for BSW: 958.4681 MB
[M::memoryAllocLearned::MEME] Reading Learned-index models into memory
[Learned-Config] MODE:3 SEARCH_METHOD: 1 MEM_TRADEOFF:1 EXPONENTIAL_SMEMSEARCH: 1 DEBUG_MODE:0 Num 2nd Models:268435456 PWL Bits Used:28
[M::memoryAllocLearned::MEME] Loading RMI model and Pac reference file took 203.104 sec
[M::memoryAllocLearned::MEME] Reading suffix array into memory
[M::memoryAllocLearned::MEME] Loading pos_packed file took 1039.431 sec
[M::memoryAllocLearned::MEME] Generating SA, 64-bit Suffix and ISA in memory
Killed

real	19m25.643s
user	0m48.035s
sys	2m22.667s

I will try bwa-meme_mode1
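
A bare "Killed" message often means the kernel's out-of-memory killer stopped the process, although the thread does not confirm that this is the cause here. One quick check, sketched below under that assumption, is to compare the memory visible inside the container with what the chosen mode needs. The `has_memory_gb` helper is hypothetical; the 38 GB figure in the usage line is the mode 1 index size reported in the version output earlier in the thread.

```shell
#!/bin/bash
# Hypothetical helper: print MemTotal (in kB) from a meminfo-style file.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# Hypothetical helper: succeed if the visible total memory is at least need_gb GiB.
# The optional second argument overrides the meminfo path.
has_memory_gb() {
    local need_gb=$1 meminfo=${2:-/proc/meminfo}
    local total_kb
    total_kb=$(mem_total_kb "$meminfo")
    [ -n "$total_kb" ] && [ "$total_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example call inside the container:
# has_memory_gb 38 || echo "less than 38 GB visible; mode 1 may be killed" >&2
```

Note that `/proc/meminfo` reports the memory of the machine or VM the container runs on, so a stricter per-container limit set through Docker would not show up in this check.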

husamia avatar Aug 30 '22 16:08 husamia