mmr
Sorted BAM causes core dump
Hi, I am encountering a problem where MMR core dumps when processing a BAM file that is correctly sorted by read ID:
$ uname -a
Linux scc4 2.6.32-573.18.1.el6.x86_64 #1 SMP Tue Feb 9 22:46:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ mmr -o sorted_mmr.bam -b -p sorted.bam
The given input file is most probably not sorted by read-ID! Bailing out.
mmr: OnlineData.cpp:323: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
Aborted (core dumped)
However, I did run samtools sort -n on the input file beforehand. This BAM file contains both single-end and paired-end reads, but that may not be the cause, because a different sample with the same mix of reads seems to work fine.
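To rule out a genuine sorting problem before blaming MMR, it may help to confirm that all alignments of each read sit next to each other in the file. The following is only a sketch: check_grouped_by_name is a hypothetical helper, not part of mmr or samtools, and it tests contiguity of read names rather than any particular sort order. It keeps every read name in memory, so it is best tried on a subset of a large file.

```shell
# Hypothetical helper (not part of mmr or samtools): reads SAM text on stdin
# and reports the first read name that reappears after a different name,
# i.e. a read whose alignments are not contiguous in the file.
check_grouped_by_name() {
  awk -F'\t' '
    /^@/ { next }                       # skip SAM header lines
    $1 != prev {                        # a new block of alignments starts here
      if ($1 in seen) { print "not grouped: " $1; bad = 1; exit }
      seen[$1] = 1
      prev = $1
    }
    END { if (!bad) print "grouped"; exit bad }
  '
}
```

Assuming samtools is installed, it could be fed with: samtools view sorted.bam | check_grouped_by_name. If this prints "grouped" and MMR still bails out, the sort order itself is probably not what triggers the assertion.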
Here is a link to the input BAM: http://expirebox.com/download/62e9ee1f597c5f2516609b51d3324cb4.html
Just in case it's useful, here's one of the core dumps as well: http://expirebox.com/download/32e76753de95f1f8c4dc0604af41d041.html
Note these links expire in two days (on 2016-07-29), so if you need me to upload them again let me know.
Thanks.
Did you ever resolve this? Having a similar issue.
Same here, neither samtools sort -n nor sambamba sort -n works on my BAM files. :(
The problem seems to occur in the default prefiltering alignment mode when there are orphaned mate-pair reads. I couldn't figure out how to fix it, but I did learn that turning off prefiltering with the -f command-line flag allows the program to run.
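To see whether a given file contains orphaned ends at all, something like the following could be used. It is a rough sketch under stated assumptions: count_orphaned_ends is a hypothetical helper, the input is name-grouped SAM text, and an "orphaned" read is taken to mean a read flagged as paired (0x1) for which mapped alignments of only one end (0x40 or 0x80, but not both) are present. Whether this matches the condition that actually trips the assertion in MMR is a guess.

```shell
# Hypothetical helper: reads name-grouped SAM text on stdin and prints the
# number of paired reads for which only one end has mapped alignments.
count_orphaned_ends() {
  awk -F'\t' '
    function end_block() {
      # a paired read counts as orphaned when exactly one end was seen
      if (paired && (first != second)) orphans++
      paired = first = second = 0
    }
    /^@/ { next }                          # skip SAM header lines
    int($2 / 4) % 2 == 1 { next }          # skip unmapped records (0x4)
    $1 != prev { end_block(); prev = $1 }  # read name changed: close the block
    {
      if ($2 % 2 == 1)           paired = 1   # 0x1  read is paired
      if (int($2 / 64) % 2 == 1)  first = 1   # 0x40 first in pair
      if (int($2 / 128) % 2 == 1) second = 1  # 0x80 second in pair
    }
    END { end_block(); print orphans + 0 }
  '
}
```

Assuming samtools is available, usage would be along the lines of: samtools view sorted.bam | count_orphaned_ends. A non-zero count would be consistent with the orphaned-mate explanation, though it does not prove it.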
I encountered the same problem. It works for me when the BAM file comes from a 35-bp read alignment, but not when it comes from a 100-bp read alignment. I think it might be because there are far fewer multi-mappers in the first N lines of the BAM file when the read length is 100.
My error message is as follows:
Parsing segment boundaries from annotation file: /home/hl84w/work/mccb/genome/Homo_sapiens/human_gencode_v34/gencode.v34.primary_assembly.annotation.gtf
The given input file is most probably not sorted by read-ID! Bailing out.
The given input file is most probably not sorted by read-ID! Bailing out.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
The given input file is most probably not sorted by read-ID! Bailing out.
The given input file is most probably not sorted by read-ID! Bailing out.
The given input file is most probably not sorted by read-ID! Bailing out.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
mmr: OnlineData.cpp:236: void OnlineData::process_data_online(GeneralData*): Assertion `best_found' failed.
/home/hl84w/.lsbatch/1603981021.829206.1.shell: line 31: 33187 Aborted mmr -o $out/${names[$i]}.mmr.bam --threads 8 --verbose --best-only --annotation $gtf ${bams[$i]}
@haibol2016 If your reads are paired-end, I suspect it is because 100-bp reads are more likely to have orphaned ends. If you filter your BAM file so that all alignments are properly paired, the problem might go away.
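For illustration, here is a minimal sketch of such a filter working on SAM text. keep_proper_pairs is a hypothetical helper name; it keeps header lines plus every alignment whose FLAG has the "properly paired" bit (0x2) set. Note that single-end reads never carry that bit, so this would also drop them from a mixed file.

```shell
# Hypothetical helper: reads SAM text on stdin, writes SAM text on stdout,
# keeping header lines and alignments flagged as properly paired (0x2).
keep_proper_pairs() {
  awk -F'\t' '
    /^@/ { print; next }            # pass SAM header lines through unchanged
    int($2 / 2) % 2 == 1 { print }  # FLAG bit 0x2: read mapped in proper pair
  '
}
```

With samtools installed, the same selection should be obtainable directly with samtools view -b -f 2 on the input BAM, followed by samtools sort -n on the result before handing it to mmr. Whether this avoids the assertion is untested here.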