
Segfault on the cliveome

Open mp15 opened this issue 2 years ago • 6 comments

Whilst running modkit on the Cliveome (aligned with Dorado, mapped with minimap2 and sorted with samtools), I ran into the following segfault. Any chance you could put out a version of your binary compiled with full debug symbols, or should I compile from scratch and try to reproduce?

#0  __memmove_avx_unaligned () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:222
#1  0x00005555555dd498 in cram_to_bam.isra ()
#2  0x00005555555e97da in cram_get_bam_seq ()
#3  0x00005555555d4436 in sam_read1 ()
#4  0x00005555555d5548 in sam_readrec ()
#5  0x00005555555c111c in hts_itr_next ()
#6  0x00005555555d9a51 in bam_plp64_auto ()
#7  0x00005555555d9ad1 in bam_plp_auto ()
#8  0x000055555552b03a in <mod_kit::mod_pileup::PileupIter as core::iter::traits::iterator::Iterator>::next ()
#9  0x0000555555528d96 in _ZN7mod_kit8commands12ModBamPileup3run28_$u7b$$u7b$closure$u7d$$u7d$28_$u7b$$u7b$closure$u7d$$u7d$28_$u7b$$u7b$closure$u7d$$u7d$28_$u7b$$u7b$closure$u7d$$u7d$17hb169f6e83b14f0b3E.llvm.16552777208159181741 ()
#10 0x00005555555277fe in rayon::iter::plumbing::Folder::consume_iter ()
#11 0x00005555554f3501 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#12 0x000055555550d9a3 in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#13 0x00005555555118e4 in rayon_core::registry::in_worker ()
#14 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#15 0x000055555553d629 in _ZN83_$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$7execute17hcb306c6da184137aE.llvm.14917117514892184753 ()
#16 0x00005555554bcaa3 in rayon_core::registry::WorkerThread::wait_until_cold ()
#17 0x000055555550da5d in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#18 0x00005555555118e4 in rayon_core::registry::in_worker ()
#19 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#20 0x000055555553d629 in _ZN83_$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$7execute17hcb306c6da184137aE.llvm.14917117514892184753 ()
#21 0x00005555554bcaa3 in rayon_core::registry::WorkerThread::wait_until_cold ()
#22 0x000055555550da5d in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#23 0x00005555555118e4 in rayon_core::registry::in_worker ()
#24 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#25 0x000055555550d9a3 in _ZN10rayon_core4join12join_context28_$u7b$$u7b$closure$u7d$$u7d$17h64310596cc7337a5E.llvm.316613399766671356 ()
#26 0x00005555555118e4 in rayon_core::registry::in_worker ()
#27 0x00005555554f35f8 in rayon::iter::plumbing::bridge_producer_consumer::helper ()
#28 0x000055555553d629 in _ZN83_$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$7execute17hcb306c6da184137aE.llvm.14917117514892184753 ()
#29 0x00005555554bcaa3 in rayon_core::registry::WorkerThread::wait_until_cold ()
#30 0x00005555558ca9a8 in rayon_core::registry::ThreadBuilder::run ()
#31 0x00005555558d12ea in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#32 0x00005555558ce431 in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
#33 0x0000555555943893 in alloc::boxed::{impl#45}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:1987
#34 alloc::boxed::{impl#45}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> () at library/alloc/src/boxed.rs:1987
#35 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:108
#36 0x00007ffff7d0bb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#37 0x00007ffff7d9da00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

mp15 avatar May 03 '23 08:05 mp15

I've now managed to replicate this by converting that CRAM to a BAM with samtools proper, so I'm raising this with the samtools team.

mp15 avatar May 03 '23 09:05 mp15

Hello @mp15,

Could you give me some more details on what's happening? If I'm understanding correctly the steps you've performed are:

  1. dorado basecall
  2. align with minimap2
  3. sort and index with samtools
  4. attempt pileup with modkit

But you get the above error at (4). Then you were able to get a similar error when using samtools to convert your CRAM file to a BAM file, correct? Do you think this is because the file is corrupted, or because there is an incompatibility with your system? It would be great if modkit could catch this kind of thing instead of crashing with this error.

ArtRand avatar May 03 '23 13:05 ArtRand

It's a funny one, and kinda low level so you probably can't catch it.

I got James Bonfield to have a look at the CRAM and he found the problem: I had actually broken the file format. I had used --emit-moves in dorado in anticipation of doing duplex calling. When this is combined with minimap2, the large mv tags are replicated onto each secondary mapping. Secondary mappings don't have a SEQ, having a * instead, which means they don't count towards the 5 Mbases of sequence we keep in each CRAM container block. So when I merged the 3 PromethION lanes of the Cliveome, a huge number of secondary mappings with their mv tags ended up in one container block, making it larger than 2 GB and overflowing the 32-bit signed int used to store the size.
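
To make the overflow concrete, here is a minimal Rust sketch. It is not htslib's code: the function names and the per-record sizes are made up for illustration. It only shows what happens when many record sizes are summed into a 32-bit signed total, whose maximum is 2,147,483,647, and how checked arithmetic turns the silent wrap into a detectable failure.

```rust
/// Sum per-record byte sizes into a 32-bit signed total, the way a size
/// field of that width would hold it. Returns None as soon as the total
/// (or a single record) no longer fits in an i32.
/// Hypothetical helper for illustration; not part of htslib or modkit.
fn container_size_checked(record_sizes: &[u32]) -> Option<i32> {
    let mut total: i32 = 0;
    for &size in record_sizes {
        let size = i32::try_from(size).ok()?;
        total = total.checked_add(size)?;
    }
    Some(total)
}

/// The same sum with wrapping arithmetic, mimicking what an unchecked
/// 32-bit signed counter does on overflow: the total goes negative.
fn container_size_wrapping(record_sizes: &[u32]) -> i32 {
    record_sizes
        .iter()
        .fold(0i32, |acc, &s| acc.wrapping_add(s as i32))
}
```

With an assumed 3000 records of 1,000,000 bytes each (3 GB in total), `container_size_checked` returns `None`, while `container_size_wrapping` returns a negative size. A negative or truncated length handed to a later copy is the kind of thing that ends in a crash inside memmove, as in the trace above.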

James is going to make a fix to htslib; I'll let you know when it lands. You may wish to upgrade htslib when we do. In the meantime I'm going to tweak my workflow so I don't produce such a silly file.
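
One possible workflow tweak, sketched in Rust, is to drop the mv tag from secondary alignments before the files are merged. This is an assumption about what such a tweak could look like, not necessarily what was done here. The function name is hypothetical, and it works on plain SAM text lines, relying only on the SAM conventions that the flag is the second column, that 0x100 marks a secondary alignment, and that optional tags start at the twelfth column.

```rust
/// SAM flag bit marking a secondary alignment.
const SECONDARY: u16 = 0x100;

/// Return the SAM line with any `mv` tag removed if the record is a
/// secondary alignment; header lines and all other records are returned
/// unchanged. Hypothetical helper for illustration only.
fn strip_mv_from_secondary(line: &str) -> String {
    if line.starts_with('@') {
        return line.to_string(); // header line
    }
    let fields: Vec<&str> = line.split('\t').collect();
    let is_secondary = fields
        .get(1)
        .and_then(|flag| flag.parse::<u16>().ok())
        .map_or(false, |flag| flag & SECONDARY != 0);
    if !is_secondary {
        return line.to_string();
    }
    fields
        .iter()
        .enumerate()
        // The first 11 columns are mandatory; only optional tags are filtered.
        .filter(|(i, f)| *i < 11 || !f.starts_with("mv:"))
        .map(|(_, f)| *f)
        .collect::<Vec<_>>()
        .join("\t")
}
```

Primary alignments keep their mv tags, so the move table is still available where it is needed, while the secondaries stop carrying a large duplicate of it.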

mp15 avatar May 03 '23 14:05 mp15

Interesting, thanks, keep me posted.

ArtRand avatar May 03 '23 14:05 ArtRand

Bug fix for htslib is in: samtools/htslib#1613. I'm going to ask Rob if we can make a release soon.

mp15 avatar May 04 '23 18:05 mp15

Great. It'll have to make it into rust-htslib also, but I can probably push on it once it's in mainline. Thanks for the update.

ArtRand avatar May 05 '23 15:05 ArtRand