gffcompare

Should I filter the transcript with class code "s"?

Open JD12138 opened this issue 1 year ago • 1 comments

Hi, I am trying to detect novel transcripts from HiFi reads using StringTie2 (guided mode). I then compared the result to the GENCODE (v44) annotation, but almost half of the novel transcripts are marked as "s". Should I discard these kinds of transcripts? And which class codes, other than "u", can be considered novel transcripts? I aligned the reads to GRCh38 with minimap2 (newest version, recommended parameters). Thanks!

JD12138, Jun 10 '24 10:06

Is there a reason to discard novel transcripts? In my case, I need to build cell-type-specific transcriptome GTF annotations for alternative splicing analysis, so I would not discard any novel transcripts. From my understanding, there isn't a single code (e.g., "s" or "u") that exclusively indicates novel transcripts—code "j" could also represent novel transcripts.

Interpret the Results: In the .tmap file, each transcript is assigned a class code to indicate its relationship to the reference. Some of the class codes you might encounter include:

=: Exact match with a reference transcript.
j: Novel isoform with at least one splice junction shared with a reference transcript.
u: Intergenic transcript (no overlap with any reference transcript).
x: Exonic overlap with a reference transcript on the opposite strand.
i: Fully contained within a reference intron (potential novel transcript).
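
To decide which codes to keep, it can help to first tally the class codes in the .tmap file and then pull out the query transcript IDs for the codes of interest. Below is a minimal Python sketch of that idea, not an official gffcompare tool. It assumes the .tmap file is tab-separated with a header row containing `class_code` and `qry_id` columns; the function names and the `CANDIDATE_NOVEL` set are hypothetical and only for illustration. Note that "s" is deliberately left out of that set: as far as I recall, the gffcompare documentation describes "s" as an intron match on the opposite strand and flags it as a likely mapping error, so those transcripts probably deserve a separate look before being kept.

```python
import csv
from collections import Counter

# Hypothetical choice of class codes to treat as candidate novel transcripts.
# Adjust this set to fit your own analysis.
CANDIDATE_NOVEL = {"j", "u", "x", "i"}


def read_tmap(lines):
    """Parse .tmap content (an iterable of text lines) into a list of dicts.

    Assumes a tab-separated layout with a header row.
    """
    return list(csv.DictReader(lines, delimiter="\t"))


def count_class_codes(rows):
    """Count how many query transcripts fall under each class code."""
    return Counter(row["class_code"] for row in rows)


def select_by_class(rows, codes):
    """Return query transcript IDs whose class code is in `codes`."""
    return [row["qry_id"] for row in rows if row["class_code"] in codes]
```

To use it on a real file, open the .tmap with `open(path, newline="")`, pass the file handle to `read_tmap`, print `count_class_codes(rows)` to see how the codes are distributed, and then call `select_by_class(rows, CANDIDATE_NOVEL)` to get the IDs to carry forward.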

santataRU, Oct 04 '24 12:10