seg fault

Open bwlang opened this issue 8 years ago • 7 comments

I just tried out minimap on a bunch of PacBio reads like this:

#!/bin/sh
PATH="/mnt/home/langhorst/src/miniasm:/mnt/home/langhorst/src/minimap:$PATH"
if ! [ -f deer_1123.mmi ]; then
  minimap -d deer_1123.mmi deer_1123.fasta
fi
minimap -Sw5 -L100 -m0 -t14 -l deer_1123.mmi  deer_1123.fasta | pigz -p 8 > deer_1123_minimap.paf.gz

The mmi build seems to have worked... but minimap failed after writing 39G of data to deer_1123_minimap.paf.gz.

Not sure if this is very helpful... I'll try to rebuild and re-run to get you a stack trace.

bwlang avatar Jan 06 '16 20:01 bwlang

Could you try without generating the .mmi first? This segfault is probably caused by the different "-w" in indexing (by default 10) and in mapping (you are using 5). Although I have considered this scenario, I may have overlooked some corner cases.

lh3 avatar Jan 06 '16 21:01 lh3

Or, if you want to index first, run minimap -w 5 -d deer_1123.mmi deer_1123.fasta, i.e., add -w5 to the indexing command line. It is better, though, to do the mapping without a separate indexing step if possible.
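
A minimal sketch of that suggestion, assuming the mismatch in -w is indeed the cause: keep the window size in one variable so the indexing and mapping commands cannot drift apart. The variable and function names (W, READS, INDEX, index_cmd, map_cmd) are illustrative only; the flags are the ones already quoted in this thread. The functions only print the commands, so nothing is run until you choose to.

```sh
#!/bin/sh
# Sketch only: one shared window size for both the index and the mapping step.
# W, READS, INDEX and the function names are hypothetical, not part of minimap.
W=5
READS=deer_1123.fasta
INDEX=deer_1123.mmi

# Print the indexing command with the shared window size.
index_cmd() {
  printf 'minimap -w %s -d %s %s\n' "$W" "$INDEX" "$READS"
}

# Print the mapping command with the same window size.
map_cmd() {
  printf 'minimap -Sw%s -L100 -m0 -t14 -l %s %s\n' "$W" "$INDEX" "$READS"
}

# Dry run: show both commands.
index_cmd
map_cmd
```

To actually run a step, the printed line can be piped to the shell, e.g. index_cmd | sh.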

lh3 avatar Jan 06 '16 21:01 lh3

I thought this worked at first... but it seems not.

[M::mm_idx_gen::236.783*1.53] collected minimizers
[M::mm_idx_gen::290.892*1.91] sorted minimizers
[M::main::290.894*1.91] loaded/built the index for 1677906 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 155
[M::mm_idx_gen::18268.699*13.22] collected minimizers
[M::mm_idx_gen::18520.546*13.06] sorted minimizers
[M::main::18520.574*13.06] loaded/built the index for 792499 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 149
[M::mm_idx_gen::34833.068*13.22] collected minimizers
[M::mm_idx_gen::35093.172*13.13] sorted minimizers
[M::main::35093.226*13.13] loaded/built the index for 658146 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 144
[M::mm_idx_gen::51264.032*13.20] collected minimizers
[M::mm_idx_gen::51493.789*13.14] sorted minimizers
[M::main::51493.822*13.14] loaded/built the index for 378037 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 132
[M::mm_idx_gen::65902.963*13.18] collected minimizers
[M::mm_idx_gen::66136.304*13.14] sorted minimizers
[M::main::66136.369*13.14] loaded/built the index for 375892 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 138
[M::mm_idx_gen::81003.887*13.16] collected minimizers
[M::mm_idx_gen::81243.894*13.13] sorted minimizers
[M::main::81243.957*13.13] loaded/built the index for 398680 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 154
[M::mm_idx_gen::96979.714*13.14] collected minimizers
[M::mm_idx_gen::97218.345*13.11] sorted minimizers
[M::main::97218.391*13.11] loaded/built the index for 396935 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 154
[M::mm_idx_gen::113011.294*13.13] collected minimizers
[M::mm_idx_gen::113263.256*13.10] sorted minimizers
[M::main::113263.274*13.10] loaded/built the index for 411812 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 151
[M::mm_idx_gen::129364.684*13.12] collected minimizers
[M::mm_idx_gen::129624.695*13.10] sorted minimizers
[M::main::129624.719*13.10] loaded/built the index for 438142 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 145
[M::mm_idx_gen::145573.009*13.13] collected minimizers
[M::mm_idx_gen::145843.364*13.10] sorted minimizers
[M::main::145843.389*13.10] loaded/built the index for 447196 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 140
[M::mm_idx_gen::162013.910*13.12] collected minimizers
[M::mm_idx_gen::162247.691*13.10] sorted minimizers
[M::main::162247.714*13.10] loaded/built the index for 533167 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 130
[M::mm_idx_gen::178735.329*13.12] collected minimizers
[M::mm_idx_gen::178965.181*13.10] sorted minimizers
[M::main::178965.235*13.10] loaded/built the index for 558892 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 122
[M::mm_idx_gen::195444.349*13.12] collected minimizers
[M::mm_idx_gen::195677.588*13.10] sorted minimizers
[M::main::195677.628*13.10] loaded/built the index for 461246 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 119
[M::mm_idx_gen::211691.807*13.11] collected minimizers
[M::mm_idx_gen::211924.132*13.10] sorted minimizers
[M::main::211924.151*13.10] loaded/built the index for 496410 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 136
[M::mm_idx_gen::228844.379*13.11] collected minimizers
[M::mm_idx_gen::229052.034*13.10] sorted minimizers
[M::main::229052.070*13.10] loaded/built the index for 536872 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 145
[M::mm_idx_gen::246832.665*13.11] collected minimizers
[M::mm_idx_gen::247081.024*13.10] sorted minimizers
[M::main::247081.092*13.10] loaded/built the index for 455520 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 141
[M::mm_idx_gen::264160.757*13.10] collected minimizers
[M::mm_idx_gen::264420.048*13.09] sorted minimizers
[M::main::264420.071*13.09] loaded/built the index for 455449 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 140
[M::mm_idx_gen::281620.361*13.10] collected minimizers
[M::mm_idx_gen::281845.758*13.09] sorted minimizers
[M::main::281845.798*13.09] loaded/built the index for 537658 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 129
[M::mm_idx_gen::299335.656*13.10] collected minimizers
[M::mm_idx_gen::299578.665*13.09] sorted minimizers
[M::main::299578.709*13.09] loaded/built the index for 530301 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 133
[M::mm_idx_gen::317145.245*13.11] collected minimizers
[M::mm_idx_gen::317377.033*13.10] sorted minimizers
[M::main::317377.053*13.10] loaded/built the index for 463508 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 144
[M::mm_idx_gen::335430.541*13.11] collected minimizers
[M::mm_idx_gen::335681.535*13.10] sorted minimizers
[M::main::335681.596*13.10] loaded/built the index for 538322 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 146
[M::mm_idx_gen::354210.692*13.11] collected minimizers
[M::mm_idx_gen::354467.899*13.10] sorted minimizers
[M::main::354467.934*13.10] loaded/built the index for 567736 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 148
[M::mm_idx_gen::373364.683*13.11] collected minimizers
[M::mm_idx_gen::373621.886*13.10] sorted minimizers
[M::main::373621.934*13.10] loaded/built the index for 468995 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 149
[M::mm_idx_gen::392301.976*13.11] collected minimizers
[M::mm_idx_gen::392544.632*13.10] sorted minimizers
[M::main::392544.695*13.10] loaded/built the index for 502773 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 147
[M::mm_idx_gen::411395.713*13.11] collected minimizers
[M::mm_idx_gen::411636.693*13.10] sorted minimizers
[M::main::411636.744*13.10] loaded/built the index for 540353 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 144
[M::mm_idx_gen::430795.966*13.10] collected minimizers
[M::mm_idx_gen::431053.123*13.10] sorted minimizers
[M::main::431053.238*13.10] loaded/built the index for 519773 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 133
[M::mm_idx_gen::450007.196*13.10] collected minimizers
[M::mm_idx_gen::450253.556*13.09] sorted minimizers
[M::main::450253.667*13.09] loaded/built the index for 525965 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 139
[M::mm_idx_gen::469533.060*13.10] collected minimizers
[M::mm_idx_gen::469753.954*13.09] sorted minimizers
[M::main::469753.990*13.09] loaded/built the index for 619660 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 140
[M::mm_idx_gen::489883.118*13.10] collected minimizers
[M::mm_idx_gen::490122.770*13.09] sorted minimizers
[M::main::490122.803*13.09] loaded/built the index for 584892 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 154
[M::mm_idx_gen::510672.568*13.09] collected minimizers
[M::mm_idx_gen::510932.268*13.09] sorted minimizers
[M::main::510932.281*13.09] loaded/built the index for 546498 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 136
[M::mm_idx_gen::530533.039*13.09] collected minimizers
[M::mm_idx_gen::530778.364*13.09] sorted minimizers
[M::main::530778.385*13.09] loaded/built the index for 516080 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 130
[M::mm_idx_gen::550814.328*13.08] collected minimizers
[M::mm_idx_gen::551077.668*13.07] sorted minimizers
[M::main::551077.669*13.07] loaded/built the index for 514490 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 149
[M::mm_idx_gen::570993.418*13.09] collected minimizers
[M::mm_idx_gen::571229.684*13.08] sorted minimizers
[M::main::571229.685*13.08] loaded/built the index for 420940 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 146
[M::mm_idx_gen::590802.766*13.10] collected minimizers
[M::mm_idx_gen::590831.154*13.10] sorted minimizers
[M::main::590831.154*13.10] loaded/built the index for 462720 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 120
[M::main] Version: 0.2-r124-dirty
[M::main] CMD: minimap -Sw5 -L100 -m0 -t14 deer_1123.fasta deer_1123.fasta
[M::main] Real time: 606601.471 sec; CPU: 7956945.192 sec
[M::main] ===> Step 1: reading read mappings <===
Segmentation fault (core dumped)

bwlang avatar Jan 15 '16 20:01 bwlang

This segfault is probably caused by miniasm due to insufficient memory. The minimap alignment is done.
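
To see which stage actually died, one option is to run minimap and miniasm as separate steps and look at each exit status. The sketch below assumes a Linux shell, where a process killed by signal N is reported as status 128+N (so 139 for SIGSEGV and 137 for SIGKILL). The helper names describe_status and run_stage are made up for this example.

```sh
#!/bin/sh
# Map a shell exit status to a short description.
describe_status() {
  case "$1" in
    0)   echo "ok" ;;
    139) echo "killed by SIGSEGV (segmentation fault)" ;;
    137) echo "killed by SIGKILL (possibly an out-of-memory kill)" ;;
    *)   echo "exit status $1" ;;
  esac
}

# run_stage NAME COMMAND...: run one stage and report how it ended.
run_stage() {
  name=$1
  shift
  status=0
  "$@" || status=$?
  echo "$name: $(describe_status "$status")"
  return "$status"
}
```

For example, run_stage miniasm sh -c 'miniasm -f reads.fa mapping.paf > asm.gfa' would print one line saying how the miniasm step ended, while the redirect inside sh -c keeps the assembly output out of that report.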

lh3 avatar Jan 15 '16 20:01 lh3

I have the same issue.

Here is the error log --

[M::mm_idx_gen::109.703*1.87] collected minimizers
[M::mm_idx_gen::164.478*2.63] sorted minimizers
[M::main::164.479*2.63] loaded/built the index for 356437 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 233
[M::mm_idx_gen::6563.931*5.52] collected minimizers
[M::mm_idx_gen::6580.290*5.52] sorted minimizers
[M::main::6580.290*5.52] loaded/built the index for 495373 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 313
[M::mm_idx_gen::12517.230*5.57] collected minimizers
[M::mm_idx_gen::12538.276*5.57] sorted minimizers
[M::main::12538.277*5.57] loaded/built the index for 494809 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 275
[M::mm_idx_gen::18615.455*5.59] collected minimizers
[M::mm_idx_gen::18638.838*5.59] sorted minimizers
[M::main::18638.838*5.59] loaded/built the index for 502021 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 281
[M::mm_idx_gen::23849.899*5.64] collected minimizers
[M::mm_idx_gen::23868.514*5.64] sorted minimizers
[M::main::23868.514*5.64] loaded/built the index for 539089 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 304
[M::mm_idx_gen::28625.201*5.66] collected minimizers
[M::mm_idx_gen::28653.488*5.66] sorted minimizers
[M::main::28653.488*5.66] loaded/built the index for 452500 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 279
[M::mm_idx_gen::34400.000*5.67] collected minimizers
[M::mm_idx_gen::34413.762*5.67] sorted minimizers
[M::main::34413.762*5.67] loaded/built the index for 335275 target sequence(s)
[M::main] max occurrences of a minimizer to consider: 250
[M::main] Version: 0.2-r124-dirty
[M::main] CMD: ../minimap/minimap -Sw5 -L100 -m0 -t24 reads_S2Drd_065.fq reads_S2Drd_065.fq
[M::main] Real time: 40519.943 sec; CPU: 229370.769 sec
[M::main] ===> Step 1: reading read mappings <===
>/var/spool/torque/mom_priv/jobs/20160277.hpc-pbs2.hpcc.edu.SC: line 15: 45953 Segmentation fault      ../miniasm/miniasm -f reads_S2Drd_065.fq reads_S2Drd_065.paf.gz > asm_S2Drd_065.gfa

I had provided 576 GB of memory over 8 nodes. Earlier I tried with 516 GB of memory over 4 nodes (24 cores), and the minimap step finished successfully.

vsuryaw avatar May 25 '16 18:05 vsuryaw

Minimap finished, but miniasm segfaulted. How large is your dataset? Note that miniasm has not been optimized for huge genomes. It may need a huge amount of RAM. If you want to give it a try anyway, use:

miniasm -Rc2 -f reads.fa mapping.paf

-R triggers two-pass processing, which may save some RAM.
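
To judge how much RAM a run really needs, it can help to record the peak memory of a run that succeeds (for example on a smaller dataset) and compare it with what a single node offers. A rough sketch, assuming GNU time is installed as /usr/bin/time and that its -v report contains a "Maximum resident set size (kbytes)" line; the helper name peak_rss_kb and the log file name are made up for this example.

```sh
#!/bin/sh
# peak_rss_kb LOGFILE: print the peak resident set size (in kbytes)
# recorded in a report written by GNU time's -v option.
peak_rss_kb() {
  awk -F': ' '/Maximum resident set size/ { print $2 }' "$1"
}
```

A possible use: /usr/bin/time -v -o miniasm.time miniasm -f reads.fa mapping.paf > asm.gfa, followed by peak_rss_kb miniasm.time.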

lh3 avatar May 25 '16 18:05 lh3

Thanks for your prompt response. I used the -Rc2 option, but according to the error log miniasm didn't recognize -R.

I increased the RAM up to 1.2TB spread over 8 nodes, but I still get the segmentation fault. My genome is expected to be about 750 Mbp in size, and the repeat fraction could be 50 to 60%.

What confounds me is that these are genomes from 3 closely related species of the same genus (two wild and one domesticated). Earlier (and now again) I have run miniasm successfully on one of these (the domesticated species) with 512GB memory. That dataset was only slightly smaller (17Gb of gzipped fastq reads from the domesticated species, as opposed to 20-23Gb now for the wild species). Even after doubling the RAM, I still get a segmentation fault. Is it possible that something else is going wrong here? Many thanks!

vsuryaw avatar May 26 '16 16:05 vsuryaw