fluff
[feature] Add support for gtf in load_annotation()
GTF is very widely used. Support should be easy to add with BCBio.GFF:
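import Bio.SeqRecord  # import the submodule explicitly so Bio.SeqRecord.SeqRecord resolves
from BCBio.GFF import GFFExaminer
import BCBio.GFF as bgff
import pymisca.util as pyutil
import pymisca.vis_util as pyvis
plt = pyvis.plt

def read_gtf(in_file, cache=1):
    '''Parse a GTF/GFF file given as a path or an open handle.
    With cache set, return a list of records and close the handle;
    otherwise return the lazy iterator and leave the handle open.
    '''
    if isinstance(in_file, str):
        in_handle = open(in_file)
    else:
        in_handle = in_file
    it = bgff.parse(in_handle)
    if cache:
        res = list(it)
        in_handle.close()
    else:
        res = it
    return res

def gene2transcript(g, force=0):
    '''Convert a SeqRecord from BCBio.GFF.parse() to a dictionary-like object
    '''
    if isinstance(g, Bio.SeqRecord.SeqRecord):
        # Unless forced, require exactly one top-level feature per record
        if not force:
            assert len(g.features) == 1
        g = g.features[0]
    feats = g.sub_features
    d = {'parent': g}
    # Keep only the start and stop codon sub-features, keyed by type
    for f in feats:
        if f.type in ['start_codon', 'stop_codon']:
            d[f.type] = f
    return pyutil.util_obj(**d)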
Sorry, this took a while, as I was very busy. GTF/GFF is indeed widely used. However, there are many dialects that are subtly different, especially for GFF. I'll put it on the list of future enhancements.
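To illustrate one of those dialect differences: the ninth (attributes) column is usually written as `key "value";` pairs in GTF but as `key=value;` pairs in GFF3. Below is a minimal sketch of handling both. The helper name `parse_attributes` and the detection heuristic are assumptions made for illustration. They are not part of fluff or BCBio.GFF, and the sketch only handles quoted GTF values.

    import re

    # GTF-style pair: key, whitespace, double-quoted value, optional semicolon
    _GTF_ATTR = re.compile(r'\s*(\S+)\s+"([^"]*)"\s*;?')

    def parse_attributes(column9):
        """Parse a GTF or GFF3 attributes column into a dict (hypothetical helper)."""
        column9 = column9.strip()
        # Heuristic (an assumption): '=' without any quotes means GFF3 syntax
        if "=" in column9 and '"' not in column9:
            pairs = (item.split("=", 1) for item in column9.split(";") if item.strip())
            return {key.strip(): value.strip() for key, value in pairs}
        # Otherwise treat the column as GTF-style quoted pairs
        return {key: value for key, value in _GTF_ATTR.findall(column9)}

    gtf_attrs = parse_attributes('gene_id "ENSG01"; transcript_id "ENST01";')
    gff3_attrs = parse_attributes('ID=gene01;Name=abc')

Real files add further variations (unquoted numeric values, repeated keys, URL-escaped characters), which is part of why a dedicated parser such as BCBio.GFF is attractive.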