
Stringtie skips pseudo genes from reference gtf

Open m-waqas opened this issue 3 years ago • 0 comments

I have mapped my data using STAR and am now trying to generate an assembly using StringTie. The reference annotation GTF file contains 38464 genes and 47387 transcripts. I tried to assemble only the known genes (38464) and transcripts (47387) using the following command:

stringtie -p 8 -e -B -G Genome_annotation/data.gtf -o Path_to_Assembly/GFP2/GFP2.gtf Path_to_Mapped_files/GFP2/GFP2Aligned.sortedByCoord.out.bam

StringTie assigns gene IDs to only about 23139 genes, instead of the 38464 genes and 47387 transcripts in the reference. I then checked the gene type of the roughly 15000 genes that StringTie missed, and most of them (14977) have the gene type PSEUDO. Is it possible to quantify the expression of all 38464 genes? In particular, how can I quantify the pseudogenes as well?
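
For anyone who wants to reproduce this kind of check, here is a minimal sketch of how the missing genes can be tallied by gene type. It is not StringTie functionality, just a plain comparison of two GTF files. It assumes that the `gene_id` values in the StringTie output match those in the reference (which is what I see when running with `-e`), and that the reference stores the gene type in an attribute called `gene_biotype`. That attribute name is an assumption and may be `gene_type` or something else in your annotation. The function names are hypothetical helpers.

```python
import re
from collections import Counter

# Matches GTF attribute pairs of the form: key "value"
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(field):
    """Return the key "value" pairs of a GTF attribute column as a dict."""
    return dict(ATTR_RE.findall(field))


def gene_types(gtf_lines, type_key="gene_biotype"):
    """Map each gene_id to its type attribute (None if the key is absent).

    type_key is an assumption: use whatever attribute your annotation
    stores the gene type in.
    """
    genes = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF record
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        # Keep the first non-empty type seen for this gene.
        if genes.get(gene_id) is None:
            genes[gene_id] = attrs.get(type_key)
    return genes


def missing_gene_types(ref_lines, out_lines, type_key="gene_biotype"):
    """Count, per gene type, the reference genes absent from the output."""
    ref = gene_types(ref_lines, type_key)
    out = gene_types(out_lines, type_key)
    missing = set(ref) - set(out)
    return Counter(ref[gene_id] for gene_id in missing)
```

Both arguments can be open file handles, for example the reference `Genome_annotation/data.gtf` and the output `Path_to_Assembly/GFP2/GFP2.gtf` from the command above. The returned counter then shows how many of the missing genes fall under each gene type.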

Any help will be highly appreciated.

m-waqas, Aug 09 '22 12:08