readTranscriptFeatures with GTF

Open igordot opened this issue 4 years ago • 6 comments

readTranscriptFeatures() reads a BED file. Gene features are commonly stored as a GTF file. Is there a way to import a GTF file in the proper format for annotateWithGeneParts()? There is a convenient gffToGRanges() function, but you still need to convert the resulting GRanges object to a GRangesList. Is there already a function for that?

igordot, Jun 10 '20 22:06

Hi @igordot, yes, you are right: readTranscriptFeatures() doesn't support GFF files. You can either convert your GFF file to a BED file or, probably more useful here, first read it with gffToGRanges(), then manipulate the resulting GRanges object a bit to get the features you are interested in (promoters, exons, introns, etc.), and then use annotateWithFeatures(), e.g.:

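library(genomation)
data(cage)  # example CAGE data shipped with genomation
gff.file = system.file('extdata/chr21.refseq.hg19.gtf', package='genomation')
gr21 = gffToGRanges(gff.file)
# here you can manipulate the GRanges object, e.g. group the ranges by feature type:
grl21 = as(split(gr21, gr21$type), "GRangesList")
# annotate the CAGE ranges with each feature type; this prints the summary below
annotateWithFeatures(cage, grl21)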
summary of target set annotation with feature annotation:
Rows in target set: 2326
----------------------------
percentage of target elements overlapping with features:
       exon  stop_codon         CDS start_codon 
      29.45        0.47       13.54        1.46 

percentage of feature elements overlapping with target:
       exon  stop_codon         CDS start_codon 
      18.55        4.41       14.31       12.76 

Hope it helps, Kasia

katwre, Jun 15 '20 08:06

My concern with that approach is that it gives you different results than the BED file. I assume "chr21.refseq.hg19.bed" and "chr21.refseq.hg19.gtf" should be comparable.

igordot, Jun 15 '20 15:06
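
For reference when comparing the two inputs, here is a minimal sketch of what the BED route produces. It assumes the chr21.refseq.hg19.bed file named in the comment above ships under extdata in genomation, like the GTF does. The flank arguments are written out only for clarity and are meant to match the package defaults; `bed.file` and `bed.feats` are illustrative names, not part of the thread.

```r
library(genomation)
library(GenomicRanges)

# BED file assumed to ship with genomation (the one named in the comment above)
bed.file <- system.file("extdata/chr21.refseq.hg19.bed", package = "genomation")

# readTranscriptFeatures() returns a GRangesList of transcript-derived features;
# up.flank/down.flank set the promoter flank size around each TSS
bed.feats <- readTranscriptFeatures(bed.file, up.flank = 1000, down.flank = 1000)

names(bed.feats)    # which feature sets the list contains
lengths(bed.feats)  # number of ranges in each set
```

Splitting the gffToGRanges() output by `type` yields whatever record types the GTF happens to list (exon, CDS, start_codon, stop_codon in the summary above), so the two lists are not expected to have the same elements, which would explain the differing annotation results.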

I am not sure what exactly you are asking. Are you asking why a GTF and a BED file look different? They are two different file formats, but they should be comparable; I don't know the details. If you are interested in annotating your regions of interest with exons, introns, and promoters (the output regions of readTranscriptFeatures()) and you are not sure how to get their coordinates from a GTF file, then check out e.g. GenomicFeatures::makeTxDbFromGFF, GenomicFeatures::exonsBy and GenomicFeatures::intronsByTranscript; promoters are just 1 kb (by default in genomation) flanking regions around the TSS.

katwre, Jun 15 '20 17:06

I understand that the BED and GTF files are different. I wanted to see if there was a way to achieve the same annotation results from both.

It sounds like it is possible, but requires a few extra steps, such as GenomicFeatures::exonsBy and GenomicFeatures::intronsByTranscript.

igordot, Jun 15 '20 17:06

I think following Kasia's suggestion would work. I personally do not have the code to do that, but the idea is that you need to re-create the GRangesList object that readTranscriptFeatures() returns from a BED file. You need to parse the GTF with functions from the rtracklayer and/or GenomicFeatures packages, extract the exon, intron and promoter coordinates, and arrange them in a GRangesList object similar to the one returned by our own function.

Best, Altuna

al2na, Jun 15 '20 18:06
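
The steps described in the comment above could be sketched roughly as follows. This is an untested illustration, not code from the genomation authors. It assumes the list should carry the element names exons, introns, promoters and TSSes, and it approximates the promoters as 1 kb on each side of the TSS. The names `make_txdb`, `strip_mcols` and `gtf.feats` are hypothetical helpers introduced here. On recent Bioconductor releases makeTxDbFromGFF() is provided by the txdbmaker package rather than GenomicFeatures, so the sketch picks whichever is installed.

```r
library(genomation)
library(GenomicRanges)
library(GenomicFeatures)

gtf.file <- system.file("extdata/chr21.refseq.hg19.gtf", package = "genomation")

# makeTxDbFromGFF() moved to txdbmaker in recent Bioconductor; use it if present
make_txdb <- if (requireNamespace("txdbmaker", quietly = TRUE)) {
  txdbmaker::makeTxDbFromGFF
} else {
  GenomicFeatures::makeTxDbFromGFF
}
txdb <- make_txdb(gtf.file, format = "gtf")

# flatten per-transcript exons and introns into plain GRanges
exons   <- unlist(exonsBy(txdb, by = "tx"), use.names = FALSE)
introns <- unlist(intronsByTranscript(txdb), use.names = FALSE)

# strand-aware 1-bp TSS per transcript, and promoters around it (assumed 1 kb flanks)
tx    <- transcripts(txdb)
tss   <- resize(tx, width = 1, fix = "start")
proms <- unique(promoters(tx, upstream = 1000, downstream = 1000))

# drop metadata columns so all four sets share the same (empty) columns
strip_mcols <- function(gr) {
  mcols(gr) <- NULL
  gr
}

gtf.feats <- GRangesList(
  exons     = strip_mcols(exons),
  introns   = strip_mcols(introns),
  promoters = strip_mcols(proms),
  TSSes     = strip_mcols(tss)
)

data(cage)
annotateWithFeatures(cage, gtf.feats)
```

The list is passed to annotateWithFeatures() because that function was already shown in this thread to accept an arbitrary named GRangesList. Whether annotateWithGeneParts() accepts it unchanged is not verified here; it may rely on metadata columns that the BED reader attaches, in which case those would need to be added back.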

Thanks for clarifying. I was hoping all or some parts were already included in the package. It would be a nice feature to have.

igordot, Jun 15 '20 18:06