Sorted BAM causes core dump

Open adamlabadorf opened this issue 8 years ago • 5 comments

Hi, I am encountering a problem where MMR core dumps when processing a BAM file that is correctly sorted by read ID:

$ uname -a
Linux scc4 2.6.32-573.18.1.el6.x86_64 #1 SMP Tue Feb 9 22:46:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ mmr -o sorted_mmr.bam -b -p sorted.bam
The given input file is most probably not sorted by read-ID! Bailing out.
mmr: OnlineData.cpp:323: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
Aborted (core dumped)

However, I did sort the input file with samtools sort -n. The BAM file contains both single-end and paired-end reads, but that may not be the cause: a different sample with the same mix of reads seems to work fine.
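One quick sanity check is to look at the sort order that samtools recorded in the BAM header. The sketch below is only an illustration: the function name sam_sort_order is hypothetical, and it reports what the @HD header line claims, not what MMR's own sortedness check looks at.

```shell
# Print the SO: (sort order) value from the @HD line of a SAM header.
# Reads header text on stdin; prints "unknown" if no SO tag is found.
sam_sort_order() {
    awk -F '\t' '
        /^@HD/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
            exit                      # only the @HD line matters
        }
        END { if (!found) print "unknown" }
    '
}

# Hypothetical usage (sorted.bam is the file from the report above):
#   samtools view -H sorted.bam | sam_sort_order
```

A file written by samtools sort -n should report queryname here. If it does and MMR still bails out, the header is not the problem.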

Here is a link to the input BAM: http://expirebox.com/download/62e9ee1f597c5f2516609b51d3324cb4.html

Just in case it's useful, here's one of the core dumps as well: http://expirebox.com/download/32e76753de95f1f8c4dc0604af41d041.html

Note these links expire in two days (on 2016-07-29), so if you need me to upload them again let me know.

Thanks.

adamlabadorf avatar Jul 27 '16 16:07 adamlabadorf

Did you ever resolve this? Having a similar issue.

mforde84 avatar Feb 01 '17 21:02 mforde84

Same here. Neither samtools sort -n nor sambamba sort -n works on my BAM files. :(

fcgportal avatar Sep 13 '18 14:09 fcgportal

The problem seems to occur when the default prefiltering alignment mode is used on input that contains orphaned mate-pair reads. I couldn't figure out how to fix it, but I did learn that turning off the prefiltering with the -f command-line flag allows the program to run.
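As a rough sketch of that workaround, the original command can be re-run with -f added. The wrapper name run_mmr_no_prefilter and the file names are placeholders. The -b and -p options are simply copied from the first report and may not apply to other setups.

```shell
# Re-run MMR with prefiltering switched off via -f (the flag named above).
# $1 = name-sorted input BAM, $2 = output BAM.
run_mmr_no_prefilter() {
    in_bam=$1
    out_bam=$2
    # Same options as the original report, plus -f.
    mmr -o "$out_bam" -b -p -f "$in_bam"
}

# Hypothetical usage:
#   run_mmr_no_prefilter sorted.bam sorted_mmr.bam
```

This avoids the failing assertion but also changes which alignments MMR considers, so results may differ from a run with prefiltering on.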

adamlabadorf avatar May 23 '20 12:05 adamlabadorf

I encountered the same problem. It works for me when the BAM file comes from a 35-bp read alignment, but not when it comes from a 100-bp read alignment. I think it might be because there are far fewer multi-mappers in the first N lines of the BAM file when the read length is 100.

My error message is as follows:

Parsing segment boundaries from annotation file: /home/hl84w/work/mccb/genome/Homo_sapiens/human_gencode_v34/gencode.v34.primary_assembly.annotation.gtf
The given input file is most probably not sorted by read-ID! Bailing out.
The given input file is most probably not sorted by read-ID! Bailing out.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
The given input file is most probably not sorted by read-ID! Bailing out.
The given input file is most probably not sorted by read-ID! Bailing out.
The given input file is most probably not sorted by read-ID! Bailing out.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
/home/hl84w/.lsbatch/1603981021.829206.1.shell: line 31: 33187 Aborted mmr -o $out/${names[$i]}.mmr.bam --threads 8 --verbose --best-only --annotation $gtf ${bams[$i]}

haibol2016 avatar Oct 29 '20 14:10 haibol2016

@haibol2016 If your reads are paired-end, I suspect it is because 100-bp reads are more likely to have orphaned ends. If you filter your BAM file so that all alignments are properly paired, the problem might go away.
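A minimal sketch of such a filter, assuming "properly paired" means bit 0x2 is set in the SAM FLAG field. The function name keep_proper_pairs and the file names are placeholders. Note that this also drops single-end reads, since they never carry that bit.

```shell
# Keep SAM header lines and alignments whose FLAG has bit 0x2
# ("properly paired") set. Reads SAM text on stdin, writes SAM to stdout.
# Record order is unchanged, so a name-sorted input stays name-sorted.
keep_proper_pairs() {
    awk -F '\t' '
        /^@/ { print; next }              # pass header lines through
        int($2 / 2) % 2 == 1 { print }    # bit 0x2 set in the FLAG column
    '
}

# Hypothetical usage:
#   samtools view -h sorted.bam | keep_proper_pairs | samtools view -b -o proper.bam -
# samtools can also apply the same flag test directly:
#   samtools view -b -f 2 -o proper.bam sorted.bam
```

The awk version is spelled out only to make the flag test visible; in practice the single samtools view -f 2 call is simpler.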

adamlabadorf avatar Oct 29 '20 21:10 adamlabadorf