stringtie
Exon duplication is considered an insertion
Hello!
I have a question about how StringTie interprets Nanopore data. I generated my BAM files with minimap2 and used them as input to StringTie, without a guide annotation file, using the following command:
stringtie \
-L \
-o <my output file> \
<my bam file>
In one sample we expect to observe an exon duplication. In IGV we do see this duplication:

With the following coordinates:

However, StringTie treats the duplication as an insertion and merges it with the exon annotated as exon 7:
17 StringTie exon 41196305 41197819 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "1"; cov "57.179531";
17 StringTie exon 41199660 41199720 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "2"; cov "67.615112";
17 StringTie exon 41201138 41201211 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "3"; cov "68.242302";
17 StringTie exon 41203080 41203134 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "4"; cov "67.747482";
17 StringTie exon 41209069 41209152 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "5"; cov "69.329041";
17 StringTie exon 41215350 41215390 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "6"; cov "68.615166";
17 StringTie exon 41215891 41215968 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "7"; cov "68.967773";
17 StringTie exon 41219625 41219712 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "8"; cov "67.044159";
17 StringTie exon 41222945 41223255 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "9"; cov "77.162979";
17 StringTie exon 41226348 41226538 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "10"; cov "74.733871";
17 StringTie exon 41228505 41228631 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "11"; cov "71.012573";
17 StringTie exon 41234421 41234592 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "12"; cov "75.215317";
17 StringTie exon 41242961 41243049 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "13"; cov "74.153717";
17 StringTie exon 41246761 41246877 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "14"; cov "46.779739";
17 StringTie exon 41251697 41251897 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "15"; cov "82.633339";
17 StringTie exon 41256139 41256278 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "16"; cov "47.550884";
17 StringTie exon 41256885 41256973 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "17"; cov "47.054935";
17 StringTie exon 41258473 41258550 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "18"; cov "44.780148";
17 StringTie exon 41267743 41267796 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "19"; cov "45.616928";
17 StringTie exon 41276034 41276132 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "20"; cov "44.381065";
17 StringTie exon 41277288 41277540 1000 - . gene_id "STRG.9542"; transcript_id "STRG.9542.13"; exon_number "21"; cov "16.723976";
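To make it easier to compare these exon coordinates with what IGV shows, here is a minimal sketch that reads one exon line of the output above and reports its span. It assumes whitespace-separated fields as pasted here (a real GTF is tab-separated, which `split` also handles), and `parse_exon_line` is just a hypothetical helper name, not part of StringTie.

```python
import re

def parse_exon_line(line):
    """Parse one GTF exon line into a small dict; return None for other lines."""
    # The first 8 GTF columns never contain spaces; the 9th holds the attributes.
    fields = line.split(None, 8)
    if len(fields) < 9 or fields[2] != "exon":
        return None
    start, end = int(fields[3]), int(fields[4])
    # Attributes look like: key "value"; key "value"; ...
    attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
    return {
        "transcript_id": attrs.get("transcript_id"),
        "exon_number": int(attrs["exon_number"]),
        "start": start,
        "end": end,
        # GTF coordinates are 1-based and inclusive on both ends.
        "length": end - start + 1,
    }

# Exon 7 line copied from the StringTie output above.
example = (
    '17 StringTie exon 41215891 41215968 1000 - . gene_id "STRG.9542"; '
    'transcript_id "STRG.9542.13"; exon_number "7"; cov "68.967773";'
)
exon = parse_exon_line(example)
print(exon["exon_number"], exon["start"], exon["end"], exon["length"])
```

Running it over every line of the output file gives a table of exon spans for the transcript that can be checked against the coordinates seen in IGV.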
Is this normal behavior? I would expect the transcript annotation to show a duplication of the coordinates around exon 7.
Thanks for your help!