
The issue of matching between the expanded GTF and transcript types

Open biochristmas opened this issue 1 year ago • 2 comments

Hi, I aligned the transcript to the reference genome, and based on the coverage of reads, I found two alternative splicing events. However, the GTF file expanded by IsoQuant shows only one transcript when viewed through IGV. Thank you!

[Image: igv]

biochristmas avatar May 27 '24 13:05 biochristmas

Dear @biochristmas

These two isoforms are quite hard to distinguish, since the second one is a substring of the first one. IsoQuant takes into account that some reads can be truncated, and thus considers reads from a shorter isoform simply as truncated versions of the longer isoform. If the difference were at the 3' end and the two isoforms had distinct polyA sites, there would be a higher chance of detecting both of them.
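To illustrate the ambiguity described above, here is a minimal Python sketch. It is not IsoQuant's actual code or algorithm: the function names, the isoform labels, and all coordinates are hypothetical, and the only assumption is that a read is judged by whether its intron chain appears as a contiguous part of an isoform's intron chain.

```python
def intron_chain(exons):
    """Return the introns implied by a sorted list of (start, end) exons."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def is_contiguous_subchain(sub, full):
    """True if `sub` occurs as a contiguous run inside `full`."""
    n = len(sub)
    if n == 0:
        return True
    return any(full[i:i + n] == sub for i in range(len(full) - n + 1))


def compatible_isoforms(read_exons, isoforms):
    """Names of isoforms whose intron chain contains the read's intron chain."""
    read_chain = intron_chain(read_exons)
    return sorted(
        name
        for name, exons in isoforms.items()
        if is_contiguous_subchain(read_chain, intron_chain(exons))
    )


# Hypothetical isoforms: "short" is a substring of "long" (it lacks the first exon).
isoforms = {
    "long": [(100, 200), (300, 400), (500, 600), (700, 800)],
    "short": [(300, 400), (500, 600), (700, 800)],
}

# A read that truly comes from "short" also looks like a truncated "long" read.
read_from_short = [(310, 400), (500, 600), (700, 790)]
print(compatible_isoforms(read_from_short, isoforms))   # ['long', 'short']

# A read spanning the first intron can only come from "long".
full_length_long = [(105, 200), (300, 400), (500, 600), (700, 795)]
print(compatible_isoforms(full_length_long, isoforms))  # ['long']
```

In this toy model, splice junctions alone never single out the shorter isoform; only independent evidence about the read ends (such as a distinct polyA site at the 3' end) could separate the two.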

We are working on improving the algorithms for correctly detecting 5' and 3' ends, but this case seems quite non-trivial. Treating such reads as evidence of separate isoforms may lead to a high number of false positives in other cases.

You may try using the --fl_data option, but I don't think it will make a difference in this case.

Best, Andrey

andrewprzh avatar Jun 06 '24 15:06 andrewprzh

Thank you for your reply. I also tried the '--fl_data' parameter today, and the number of transcripts in the GTF file is the same as when not using '--fl_data'. There is indeed no difference.

biochristmas avatar Jun 06 '24 15:06 biochristmas