One gene outputs multiple TPM values

Open HengkuanLi opened this issue 3 years ago • 3 comments

When I quantify with stringtie, a single gene produces multiple output lines. Looking at the GTF file, I found that these lines correspond to different transcripts of the gene. How do I solve this?

ENSSSCG00000035639 TERB1 6 - 27450632 27503893 45.755924 12.578593 26.445555
ENSSSCG00000035639 TERB1 6 - 27511466 27540118 30.928654 13.596822 28.586306
ENSSSCG00000003981 ZFP69B 6 - 170660884 170675879 5.751518 1.388807 2.926265
ENSSSCG00000003981 ZFP69B 6 - 170687418 170701255 35.762066 8.635393 18.195074

HengkuanLi avatar Oct 25 '22 15:10 HengkuanLi

I have the same problem.

LOC_Os10g31460 - Chr10 - 16494293 16495238 0.850951 0.184613 0.368143
LOC_Os10g31460 - Chr10 - 16486322 16492007 14.433779 6.385049 12.732656

Wangchangsh avatar Apr 15 '23 12:04 Wangchangsh

me too

zpliu1126 avatar Jan 17 '24 12:01 zpliu1126

This happens when your annotation file contains genes with non-overlapping transcripts. We consider this to be an annotation error. In this case we suggest completing your downstream analysis by treating the two gene locations as distinct, i.e. you could label ENSSSCG00000035639 TERB1 6 - 27450632 27503893 45.755924 12.578593 26.445555 as ENSSSCG00000035639_1 and ENSSSCG00000035639 TERB1 6 - 27511466 27540118 30.928654 13.596822 28.586306 as ENSSSCG00000035639_2. Alternatively (not preferred), you could just remove the location with the smaller TPM.

mpertea avatar Jan 17 '24 18:01 mpertea
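
For anyone who wants to script the suggestion above, here is a minimal Python sketch of both options. It is not part of stringtie. The function names `relabel_split_genes` and `keep_highest_tpm` are hypothetical. The sketch assumes each row has already been split into fields with the gene ID in the first column and TPM in the ninth (last) column, as in the rows quoted in this thread. Suffixes are assigned in order of appearance in the file.

```python
from collections import Counter, defaultdict

def relabel_split_genes(rows, id_col=0):
    """Append _1, _2, ... to gene IDs that occur on more than one row."""
    counts = Counter(row[id_col] for row in rows)
    seen = defaultdict(int)
    out = []
    for row in rows:
        gene_id = row[id_col]
        new_row = list(row)  # copy so the input rows are left untouched
        if counts[gene_id] > 1:
            seen[gene_id] += 1
            new_row[id_col] = f"{gene_id}_{seen[gene_id]}"
        out.append(new_row)
    return out

def keep_highest_tpm(rows, id_col=0, tpm_col=8):
    """Alternative (not preferred): keep only the row with the largest TPM per gene."""
    best = {}
    for row in rows:
        gene_id = row[id_col]
        if gene_id not in best or float(row[tpm_col]) > float(best[gene_id][tpm_col]):
            best[gene_id] = row
    return list(best.values())

# Rows copied from this issue; the column layout is an assumption (TPM last).
lines = [
    "ENSSSCG00000035639 TERB1 6 - 27450632 27503893 45.755924 12.578593 26.445555",
    "ENSSSCG00000035639 TERB1 6 - 27511466 27540118 30.928654 13.596822 28.586306",
    "ENSSSCG00000003981 ZFP69B 6 - 170660884 170675879 5.751518 1.388807 2.926265",
    "ENSSSCG00000003981 ZFP69B 6 - 170687418 170701255 35.762066 8.635393 18.195074",
]
rows = [line.split() for line in lines]

relabeled = relabel_split_genes(rows)
kept = keep_highest_tpm(rows)

for row in relabeled:
    print("\t".join(row))
```

Relabeling keeps every measured location in the table, whereas `keep_highest_tpm` discards data, which is why the second option is the less preferred one. A real abundance file is tab-separated and may begin with a header line, and a gene name could contain spaces, so splitting on tabs and skipping the header would be safer than the whitespace split used here.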