
stringtie halts at a bundle that has a large number of alignments and junctions

Open husensofteng opened this issue 1 year ago • 4 comments

I am running the latest release (2.2.3) on an ONT dataset that was mapped to GRCm39 using minimap2. stringtie halts at the following bundle regardless of how much memory I allocate. Do you have any recommendations on how to fix this?

bundle chr5:48485294-49946003 [821583 alignments (376236 distinct), 13558 junctions, 18 guides] begins processing...

Thanks a lot!

husensofteng avatar Oct 06 '24 12:10 husensofteng

@gpertea do you have any suggestions on how to solve this issue? For the time being I am thinking of excluding the reads in this specific region to make it work, but I hope there will be a better solution in a future release.

husensofteng avatar Oct 08 '24 07:10 husensofteng

Wow, this seems to be the Kcnip4 locus, which is notoriously complex and also full of repeat elements, which compromises even the accuracy of the alignments in the region.

Sorry, at the moment I don't have any good solutions for this kind of super-dense, complex long-read cluster, and these days I cannot find the time to dig deeper into such interesting cases (I am still focused on human short reads for my pressing work projects). It is indeed an important issue that should be studied and addressed in the future.

Indeed, the workaround I can imagine now would be to exclude this particular region from the "routine" assembly process (unfortunately StringTie doesn't make that easy, so you'll have to script your way around it). Then, if you can afford it, take the region aside and assemble it separately, using much more aggressive alignment filtering plus more stringent assembly attempts that limit the number of alignments (and splice sites) to consider (increase the -j value quite a bit, try the -R option first, etc.). This might involve quite a bit of experimentation, which I would be glad to look into but cannot afford at the moment.

gpertea avatar Oct 08 '24 12:10 gpertea
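
The workaround suggested above (set the dense region aside, assemble the rest routinely, then retry the region with stricter settings) could be scripted roughly as follows. This is only a sketch under stated assumptions, not a tested recipe: it assumes `samtools` and `stringtie` are on the PATH, the file names (`aln.bam`, `guides.gtf`, and so on) are hypothetical, and the `-j 5` value is an arbitrary starting point for experimentation rather than a recommendation from the maintainer. The script only builds and prints the commands so they can be inspected before anything is run.

```python
import shlex

# Coordinates copied from the bundle line in the log (1-based, inclusive).
CHROM, START, END = "chr5", 48485294, 49946003


def bed_line(chrom, start, end):
    """Convert a 1-based inclusive interval to a 0-based half-open BED line."""
    return f"{chrom}\t{start - 1}\t{end}\n"


def split_cmd(in_bam, bed, region_bam, rest_bam):
    """samtools view: -L keeps alignments overlapping the BED intervals,
    -U writes every alignment that was not selected to a second file."""
    return ["samtools", "view", "-b", "-L", bed,
            "-U", rest_bam, "-o", region_bam, in_bam]


def stringtie_cmd(bam, out_gtf, guide_gtf=None, min_junction_cov=None,
                  collapse_only=False):
    """Build a long-read StringTie call (-L) with optional stricter settings."""
    cmd = ["stringtie", "-L"]
    if guide_gtf is not None:
        cmd += ["-G", guide_gtf]
    if min_junction_cov is not None:
        # -j: minimum junction coverage; raising it drops weakly supported junctions.
        cmd += ["-j", str(min_junction_cov)]
    if collapse_only:
        # -R: the option mentioned in the thread (clean/collapse long reads first).
        cmd.append("-R")
    return cmd + ["-o", out_gtf, bam]


def plan(in_bam="aln.bam", guide_gtf="guides.gtf"):
    """Return the commands in the order they would be run (file names are hypothetical)."""
    return [
        split_cmd(in_bam, "dense_region.bed", "region.bam", "rest.bam"),
        stringtie_cmd("rest.bam", "rest.gtf", guide_gtf=guide_gtf),
        stringtie_cmd("region.bam", "region.gtf", guide_gtf=guide_gtf,
                      min_junction_cov=5),
    ]


if __name__ == "__main__":
    # Print the BED record and the commands instead of executing them.
    print(bed_line(CHROM, START, END), end="")
    for cmd in plan():
        print(shlex.join(cmd))
```

Before running the printed commands, the BED record would need to be saved as `dense_region.bed`. Note that alignments overlapping the region boundary end up in `region.bam`, so genes straddling the boundary are assembled with the stricter settings.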

Yes, I understand, and thank you for the suggestions.

husensofteng avatar Oct 10 '24 10:10 husensofteng

I am using StringTie through the nf-core/rnaseq pipeline and am encountering an issue with one sample. The error message says “***buffer overflow detected *** stringtie terminated.” I’ve tried increasing the memory, but that hasn’t resolved the problem. Suggestions from the nf-core Slack channel indicate it could be a bug. Please let me know if there is a workaround. Here is the log file: nextflow (2).log

drimran87 avatar Nov 10 '24 21:11 drimran87