
[feature] Add support for GTF in load_annotation()

Open shouldsee opened this issue 5 years ago • 1 comments

GTF is very widely used, and support should be easy to add with BCBio.GFF. A rough sketch:

import Bio.SeqRecord  # import the submodule explicitly so Bio.SeqRecord.SeqRecord resolves

from BCBio.GFF import GFFExaminer
import BCBio.GFF as bgff

import pymisca.util as pyutil
import pymisca.vis_util as pyvis
plt = pyvis.plt

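def read_gtf(in_file, cache=1):
    '''Parse a GTF/GFF file (path or open handle) with BCBio.GFF.parse().
    Returns a list of SeqRecords if cache is true, otherwise a lazy iterator.
    '''
    if isinstance(in_file, str):
        in_handle = open(in_file)
    else:
        in_handle = in_file
    it = bgff.parse(in_handle)
    if cache:
        # Read everything up front so the handle can be closed right away.
        res = list(it)
        in_handle.close()
    else:
        # Lazy mode: the handle must stay open until the iterator is consumed.
        res = it
    return res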



def gene2transcript(g, force=0):
    '''Convert a SeqRecord (or a single feature) from BCBio.GFF.parse()
    to a dictionary-like object.
    '''
    if isinstance(g, Bio.SeqRecord.SeqRecord):
        if not force:
            # Expect exactly one top-level feature unless force is set.
            assert len(g.features) == 1
        g = g.features[0]

    feats = g.sub_features
    d = {'parent': g}
    for f in feats:
        # Keep only codon features; a later feature of the same type overwrites an earlier one.
        if f.type in ['start_codon', 'stop_codon']:
            d[f.type] = f
    return pyutil.util_obj(**d)

shouldsee avatar Dec 25 '18 17:12 shouldsee

Sorry! This took a while, as I was very busy. GTF/GFF is indeed widely used. However, there are many dialects that are subtly different, especially for GFF. But I'll put it on the list of future enhancements.

simonvh avatar Feb 06 '19 08:02 simonvh
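
To make the dialect point above concrete: one well-known difference is the attributes column (column 9), which GFF3 writes as `key=value` pairs and GTF writes as `key "value";` pairs. The sketch below is a minimal, standard-library-only illustration of handling both styles. The helper name `parse_attributes` is hypothetical and is not part of fluff or BCBio.GFF, and the splitting is deliberately naive.

```python
def parse_attributes(column):
    """Parse a GFF3- or GTF-style attributes column into a dict of lists.

    Hypothetical helper for illustration only. Assumes no ';' occurs inside
    quoted values and does not decode GFF3 percent-escapes.
    """
    attrs = {}
    for field in column.strip().split(";"):
        field = field.strip()
        if not field:
            continue
        key, sep, value = field.partition("=")
        if sep and " " not in key:
            # GFF3 style: key=value1,value2
            values = value.split(",")
        else:
            # GTF style: key "value"
            key, _, value = field.partition(" ")
            values = [value.strip().strip('"')]
        attrs.setdefault(key, []).extend(values)
    return attrs


gtf = parse_attributes('gene_id "G1"; transcript_id "T1";')
gff3 = parse_attributes("ID=T1;Parent=G1,G2")
```

Here `gtf` maps `gene_id` and `transcript_id` to single-element lists, while `gff3` keeps both parents under `Parent`. Real files vary beyond this (feature hierarchy, quoting, escaping), which is why a dedicated parser is usually the safer choice.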