IsoformSwitchAnalyzeR

Get annotation from TxDb?

Open lvclark opened this issue 3 years ago • 9 comments

I work for a biotech center where we get a lot of honeybee projects, so we have our own custom GTF and transcriptome for Apis mellifera that includes several common viral RNAs. (Very frequently we find that more than half the RNA from a bee sample is just virus! Somehow the bee is ok with that.) I kept getting errors when trying to import our GTF with importRdata. I tried debugging it by going into your package code, and I think part of the problem is that CDS and exon entries don't always have a value in the transcript_id column, but instead just point back to the transcript through the Parent column. Working around that still didn't fix it though, and eventually I gave up and used the GFF from NCBI. (There aren't annotated alternative isoforms for the viral RNAs anyway.)
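To make the transcript_id / Parent situation concrete, here is a minimal base R sketch of the kind of fallback described above. The table is a made-up toy, not the actual honeybee annotation, and Parent is simplified to a plain character column (rtracklayer::import usually returns it as a list-like column). As noted, a fallback of this sort was not enough on its own to fix the import, so treat it as an illustration of the symptom rather than a solution.

```r
# Toy annotation table with invented values. Column names follow the usual
# GFF3 attributes; Parent is simplified to a character vector (assumption).
features <- data.frame(
  type          = c("mRNA", "exon", "exon", "CDS"),
  ID            = c("rna1", "exon1", "exon2", "cds1"),
  Parent        = c("gene1", "rna1", "rna1", "rna1"),
  transcript_id = c("rna1", NA, "rna1", NA),
  stringsAsFactors = FALSE
)

# Exon and CDS rows that lack a transcript_id point at their transcript
# through Parent, so copy Parent into transcript_id for those rows only.
is_child   <- features$type %in% c("exon", "CDS")
needs_fill <- is_child & is.na(features$transcript_id)
features$transcript_id[needs_fill] <- features$Parent[needs_fill]
```

The mRNA row is deliberately left alone, because its Parent is the gene rather than a transcript.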

I'm wondering if, instead of using rtracklayer::import to import a GTF to GRanges, importRdata or importGTF could use the TxDb class from the GenomicFeatures package. Functions from that package like transcriptsBy, exonsBy, and cdsBy could simplify a lot of the complex code that you have written. Import from GTF to TxDb is pretty straightforward with makeTxDbFromGFF. Moreover, those of us who use Bioconductor a lot might already have our annotations stored as TxDb, and it would be convenient to pass that object directly to importRdata rather than having to read the file again.
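As a rough sketch of what this could look like: the helper name annotation_from_txdb and the variable gff_path are hypothetical, and nothing here is existing IsoformSwitchAnalyzeR code. It only uses the GenomicFeatures accessors named above, and assumes the transcripts in the TxDb carry usable names.

```r
# Hypothetical helper (not part of IsoformSwitchAnalyzeR): pull the pieces
# an importer would need out of an existing TxDb object.
annotation_from_txdb <- function(txdb) {
  list(
    # Exons grouped per transcript, list elements named by transcript name
    exons       = GenomicFeatures::exonsBy(txdb, by = "tx", use.names = TRUE),
    # Coding regions grouped per transcript (non-coding transcripts are absent)
    cds         = GenomicFeatures::cdsBy(txdb, by = "tx", use.names = TRUE),
    # Transcripts grouped per gene, giving the isoform-to-gene mapping
    transcripts = GenomicFeatures::transcriptsBy(txdb, by = "gene")
  )
}

# Usage, not run here because it needs a real annotation file.
# gff_path is a placeholder; in recent Bioconductor releases makeTxDbFromGFF
# is provided by the txdbmaker package rather than GenomicFeatures.
# txdb  <- GenomicFeatures::makeTxDbFromGFF(gff_path, format = "gff3")
# annot <- annotation_from_txdb(txdb)
```

Users who already hold a TxDb could pass it straight to such a helper and skip re-reading the file.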

lvclark, Jul 09 '20 15:07