gffcompare
Exon-level sensitivity is not 100%, but there are zero missed exons.
Here is my .stats file.
# gffcompare v0.11.7 | Command line was:
#gffcompare-0.11.7.Linux_x86_64/gffcompare -r gencode.v19.annotation.gtf -o gffcompare_stringtie_merged stringtie_merged.gtf
#
#= Summary for dataset: stringtie_merged.gtf
# Query mRNAs : 612833 in 89388 loci (566229 multi-exon transcripts)
# (29760 multi-transcript loci, ~6.9 transcripts per locus)
# Reference mRNAs : 194187 in 54800 loci (169307 multi-exon)
# Super-loci w/ reference transcripts: 46595
#-----------------| Sensitivity | Precision |
Base level: 100.0 | 27.6 |
Exon level: 84.2 | 46.6 |
Intron level: 99.8 | 51.1 |
Intron chain level: 99.9 | 29.9 |
Transcript level: 98.2 | 31.1 |
Locus level: 95.6 | 50.9 |
Matching intron chains: 169199
Matching transcripts: 190667
Matching loci: 52414
Missed exons: 0/559962 ( 0.0%)
Novel exons: 275572/1115730 ( 24.7%)
Missed introns: 639/343915 ( 0.2%)
Novel introns: 137224/671588 ( 20.4%)
Missed loci: 0/54800 ( 0.0%)
Novel loci: 42793/89388 ( 47.9%)
Total union super-loci across all input datasets: 89388
612833 out of 612833 consensus transcripts written in gffcompare_stringtie_merged.annotated.gtf (0 discarded as redundant)
I see several inconsistencies here.
- The number of missed exons is zero, but the exon-level sensitivity is only 84.2%, not 100%.
- Among the 1115730 query exons, only 24.7% are novel, but the exon-level precision is 46.6%, not 75.3% (100% - 24.7%).
- Among the 671588 query introns, only 20.4% are novel, but the intron-level precision is 51.1%, not 79.6% (100% - 20.4%).
- The number of missed loci is zero, but the locus-level sensitivity is 95.6%, not 100%.
- Among the 89388 query loci, only 47.9% are novel, but the locus-level precision is 50.9%, not 52.1% (100% - 47.9%).
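To make the arithmetic behind these points explicit, here is a small sketch that recomputes the values I expected from the counts in the .stats file. It rests on my own assumption, which may well be the thing I am getting wrong, that sensitivity should equal 100% minus the missed fraction and precision should equal 100% minus the novel fraction. The dictionary layout and function names are only illustrative and are not part of gffcompare.

```python
# Counts and percentages copied from the .stats output quoted above.
# The dictionary layout and all names are illustrative, not gffcompare's.
STATS = {
    "exon": {"sens": 84.2, "prec": 46.6,
             "missed": (0, 559962), "novel": (275572, 1115730)},
    "intron": {"sens": 99.8, "prec": 51.1,
               "missed": (639, 343915), "novel": (137224, 671588)},
    "locus": {"sens": 95.6, "prec": 50.9,
              "missed": (0, 54800), "novel": (42793, 89388)},
}


def complement_pct(count, total):
    """Return 100% minus count/total, rounded to one decimal like the report."""
    return round(100.0 * (1.0 - count / total), 1)


def expected_vs_reported(level):
    """Pair each reported value with the one implied by the ASSUMED relation
    (sensitivity = 100% - missed%, precision = 100% - novel%)."""
    row = STATS[level]
    return {
        "expected_sens": complement_pct(*row["missed"]),
        "reported_sens": row["sens"],
        "expected_prec": complement_pct(*row["novel"]),
        "reported_prec": row["prec"],
    }


for level in STATS:
    r = expected_vs_reported(level)
    print(f"{level:>6}: sensitivity {r['reported_sens']} reported vs "
          f"{r['expected_sens']} expected, precision {r['reported_prec']} "
          f"reported vs {r['expected_prec']} expected")
```

Under that assumption, only the intron-level sensitivity agrees with the report (99.8% both ways); the other five values differ.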
Am I misunderstanding something here?