Error: could not locate transcript ENST00000472017.1
Hello StringTie team,
I'm running into an issue where certain transcripts present in my BAM files are not appearing in the GTF output from StringTie. This causes errors when I try to generate transcript count matrices using prepDE.py, as it cannot locate these missing transcripts in some samples.
My setup: I’m using StringTie to assemble transcripts and quantify expression from sorted BAM files, with the -G option pointing to a comprehensive GTF annotation file (from Gencode). For transcript quantification, I run StringTie with the following parameters:
stringtie -e $SORTED_BAM_FILE -o ${SAMPLE_NAME}.gtf -p $NUM_THREADS -G $GTF_FILE -A abundances.tab -C cov_refs.gtf -B
Error: could not locate transcript ENST00000697250.1 entry for sample OPL_B ## error from a different run
Error: could not locate transcript ENST00000607096.1 entry for sample CEXP_B ## error from a different run
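To see which reference transcripts are absent from a given sample's output, a quick comparison of transcript IDs may help. The following is only a minimal sketch: it assumes both files are standard tab-separated GTFs with `transcript` feature lines carrying a `transcript_id "..."` attribute, and the function names and file names are hypothetical, not part of StringTie or prepDE.py.

```python
import re

# Matches the transcript_id attribute in the 9th GTF column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_ids(gtf_lines):
    """Collect transcript_id values from 'transcript' feature lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # ignore exons and malformed lines
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def missing_from_sample(reference_lines, sample_lines):
    """Return reference transcript IDs that do not appear in the sample GTF."""
    return sorted(transcript_ids(reference_lines) - transcript_ids(sample_lines))


# Hypothetical usage (file names are placeholders):
# with open("annotation.gtf") as ref, open("sample.gtf") as smp:
#     print(missing_from_sample(ref, smp))
```

If the returned list contains the transcript named in the prepDE.py error, the transcript was dropped from that sample's GTF rather than being mangled by a later step.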
Are there specific StringTie parameters that would help ensure more consistent detection of transcripts across samples? Is there a recommended approach for cases where transcripts appear in BAM files but are missing in StringTie’s GTF output, especially for downstream differential expression analysis with prepDE.py?
Any insights or suggested settings would be much appreciated, as I’m aiming to produce a comprehensive transcript count matrix compatible with DESeq2.
Thank you!
I have a similar error.
Have you solved it?
I started using the Ballgown pipeline instead. Thanks.
Thank You With Regards
Anil Upreti, PhD Research Fellow Schepens Eye Research Institute, MEEI Harvard Medical School
Thank you~
It seems StringTie 3.0.0 generates GTF files with differing numbers of rows across samples, even when a common merged guide GTF is used for quantification, so some transcripts exist only in certain samples. After I downgraded to version 2.1.1, the error went away.
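One way to confirm this kind of inconsistency is to list, for each transcript, the samples whose GTF lacks it. This is a rough sketch only: it assumes the transcript IDs have already been collected into one set per sample (for example by scanning the `transcript_id` attributes of each output GTF), and the function name, sample names, and IDs below are hypothetical placeholders.

```python
def samples_missing_each_transcript(ids_by_sample):
    """Map each transcript ID to the sorted list of samples that lack it.

    ids_by_sample: dict of sample name -> set of transcript IDs found in
    that sample's GTF. Transcripts present in every sample are omitted.
    """
    all_ids = set().union(*ids_by_sample.values())
    report = {}
    for tid in sorted(all_ids):
        absent = sorted(s for s, ids in ids_by_sample.items() if tid not in ids)
        if absent:
            report[tid] = absent
    return report


# Placeholder data, not taken from any real run.
example = {
    "sample_a": {"T1", "T2"},
    "sample_b": {"T1", "T3"},
}
print(samples_missing_each_transcript(example))
# prints {'T2': ['sample_b'], 'T3': ['sample_a']}
```

An empty result would mean every sample reports the same set of transcripts, in which case the row-count differences would have to come from somewhere else (exon lines, for instance).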
I am having the same error in v3.0.1:
Error: could not locate transcript ENSMUST00000192299 entry for sample 254
Traceback (most recent call last):
File "/scratch/XX/XX/Workspace/XX/08_StringTie/quantification/prepDE.py3", line 282, in
I checked t_data.ctab for this sample, and the cov and FPKM values for this transcript are both "0.000000".
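To check whether the other failing transcripts follow the same pattern, the zero-coverage entries in t_data.ctab can be listed and compared against the IDs named in the errors. This is a sketch under assumptions: it expects the tab-separated Ballgown table written by -B to have a header row containing `t_name`, `cov`, and `FPKM` columns, and the function name is hypothetical.

```python
import csv


def zero_coverage_transcripts(ctab_lines):
    """Return t_name values whose cov and FPKM are both zero.

    ctab_lines: an iterable of lines from t_data.ctab, header included
    (an open file object works).
    """
    reader = csv.DictReader(ctab_lines, delimiter="\t")
    return sorted(
        row["t_name"]
        for row in reader
        if float(row["cov"]) == 0.0 and float(row["FPKM"]) == 0.0
    )


# Hypothetical usage (the path is a placeholder):
# with open("t_data.ctab") as handle:
#     print(zero_coverage_transcripts(handle))
```

If every transcript reported by prepDE.py appears in this list, that would be consistent with unexpressed transcripts being left out of the sample's GTF, though this check alone does not prove that is the cause.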