
Mistakes on small introns

Open sunriseTM opened this issue 1 year ago • 3 comments

Hi, Magic-BLAST reports small introns (lengths peak at 23 bp) in my genome as deletions, and I cannot find any way to fix it. Do you have any suggestions?

sunriseTM avatar Mar 03 '23 06:03 sunriseTM

Hi @sunriseTM, I am sorry you ran into trouble. Can you post an example? It is difficult to say anything without looking at the data.

In general, Magic-BLAST is conservative about intron detection. Any gap shorter than 10 bases is always reported as a deletion rather than an intron. Also, common splice signals must be present for an intron to be reported.
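
To make the two conditions above concrete, here is a minimal Python sketch of such a rule. It is an illustration only and is not taken from Magic-BLAST's source. The 10-base threshold comes from the comment above; the function name `classify_gap`, the particular set of donor/acceptor pairs, and the plus-strand-only check are assumptions made for the example.

```python
# Illustrative only: NOT Magic-BLAST's actual implementation.
MIN_INTRON_LEN = 10  # threshold quoted in the comment above

# Assumed donor/acceptor pairs (plus strand only); the real set may differ.
SPLICE_SIGNALS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def classify_gap(ref, start, length):
    """Label the reference gap ref[start:start+length] as 'intron' or 'deletion'."""
    if length < MIN_INTRON_LEN:
        return "deletion"  # too short to be called an intron
    gap = ref[start:start + length].upper()
    donor, acceptor = gap[:2], gap[-2:]
    # An intron is reported only if the gap starts and ends with a known signal.
    return "intron" if (donor, acceptor) in SPLICE_SIGNALS else "deletion"


# Toy reference: 8 exon bases, a 23 bp GT...AG gap, 8 exon bases.
ref = "ACGTACGT" + "GTAAGTTTTCTAATTTTTTGCAG" + "ACGTACGT"
print(classify_gap(ref, 8, 23))  # gap boundaries on the GT...AG signal
print(classify_gap(ref, 9, 23))  # same length, boundaries off by one base
```

The second call shows why an alignment that is off by even one base at the gap boundary can lose the splice signal and fall back to a deletion.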

boratyng avatar Mar 03 '23 14:03 boratyng

Hi, I took a screenshot from IGV, shown below: [image] The upper track was made by aligning PacBio HiFi mRNA reads (Iso-Seq) with Magic-BLAST, and the lower track shows Illumina mRNA reads aligned with HISAT2. As you can see, the intron defined by the Illumina reads was missed by Magic-BLAST, which I have also encountered with minimap2 before. Besides reporting a deletion instead of an intron at the correct location, it can also shift the deletion relative to the intron, which puts it at the wrong location, as shown below: [image] I think it is a common difficulty for current software to align long reads to a genome and recognize the correct intron structure.

sunriseTM avatar Mar 04 '23 06:03 sunriseTM

Thank you for the example. Yes, intron detection is generally more difficult with long reads because of their higher error rate. It looks like Magic-BLAST overextended the alignment on one side here and then could not find the splice signals. Unfortunately, I cannot offer you any parameter values that fix this right now. All I can offer is that we will fix this problem in a future release. Would you be able to share the reads that aligned in this region and the genome?
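
Until a fix is available, one possible post-processing check for the shifted-deletion case is sketched below. It is a hypothetical helper, not a Magic-BLAST feature: it slides a reported deletion to every position that yields exactly the same aligned sequence and tests whether any of those placements has GT...AG boundaries. The function names and the GT/AG-only signal set are assumptions, and the check only covers shifts that are exactly equivalent; it will not recover cases where overextension introduced mismatches.

```python
def equivalent_gap_starts(ref, start, length):
    """All start positions where deleting `length` bases gives the same sequence."""
    starts = [start]
    s = start
    # Shifting left by one is equivalent when the base entering the gap
    # equals the base leaving it on the right.
    while s > 0 and ref[s - 1] == ref[s - 1 + length]:
        s -= 1
        starts.append(s)
    s = start
    # Same idea for shifting right.
    while s + length < len(ref) and ref[s] == ref[s + length]:
        s += 1
        starts.append(s)
    return sorted(starts)


def splice_consistent_starts(ref, start, length, signals=(("GT", "AG"),)):
    """Equivalent placements whose boundaries match an assumed splice signal."""
    return [
        s for s in equivalent_gap_starts(ref, start, length)
        if (ref[s:s + 2], ref[s + length - 2:s + length]) in signals
    ]


# Toy example: a 23 bp GT...AG intron at position 8, reported 2 bases too far right.
ref = "ACGTACCA" + "GTAAGTTTTCTAATTTTTTGCAG" + "GTTCACGT"
print(equivalent_gap_starts(ref, 10, 23))     # placements with identical alignment
print(splice_consistent_starts(ref, 10, 23))  # placements with GT...AG boundaries
```

To apply this to real data, the deletion coordinates would have to be read from the `D` operations in the alignment's CIGAR string and the reference sequence loaded from the genome FASTA; both steps are omitted here.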

boratyng avatar Mar 07 '23 21:03 boratyng