gffutils

db.parents(id) stops at level 2?

Open dariober opened this issue 2 years ago • 0 comments

I'm applying db.parents(id) to features that have up to 3 levels of ancestors, e.g. the nesting is id -> protein_match -> mRNA -> gene. It appears that the 3rd level, gene, is not returned. Am I missing something? Here's an example:

import gffutils

txt="""\
chr1 AUGUSTUS gene 68330 73621 1 - . ID=g1903;
chr1 AUGUSTUS mRNA 68330 73621 1 - . ID=g1903.t1;Parent=g1903;
chr1 Pfam protein_match 73372 73618 1 - . ID=g1903.t1.d1;Parent=g1903.t1;
chr1 Pfam protein_hmm_match 73372 73618 1 - . ID=g1903.t1.d1.1;Parent=g1903.t1.d1;
"""

# Convert the spaces to tabs so the text is valid GFF, then build an in-memory database
db = gffutils.create_db(txt.replace(' ', '\t'), ':memory:', from_string=True)

Show the features:

for x in db.all_features():
    print(x)

chr1 AUGUSTUS gene              68330 73621 1 - . ID=g1903;
chr1 AUGUSTUS mRNA              68330 73621 1 - . ID=g1903.t1;Parent=g1903;
chr1 Pfam     protein_match     73372 73618 1 - . ID=g1903.t1.d1;Parent=g1903.t1;
chr1 Pfam     protein_hmm_match 73372 73618 1 - . ID=g1903.t1.d1.1;Parent=g1903.t1.d1;

Now, db.parents('g1903.t1.d1.1') returns the protein_match and the mRNA as parents, but not the gene:

pp = db.parents('g1903.t1.d1.1')

for p in pp:
    print(p)

chr1	AUGUSTUS	mRNA	68330	73621	1	-	.	ID=g1903.t1;Parent=g1903;
chr1	Pfam	protein_match	73372	73618	1	-	.	ID=g1903.t1.d1;Parent=g1903.t1;

Shouldn't gene also be retrieved as a parent? Thanks!

This is with gffutils 0.11.1
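
In case it helps anyone hitting the same thing, here is a minimal sketch of a possible workaround. It assumes that db.parents(id, level=1) reliably returns the direct parents, and it climbs the hierarchy one level at a time instead of relying on the default lookup. The helper name all_ancestors is hypothetical and not part of gffutils.

```python
def all_ancestors(db, feature_id):
    """Collect every ancestor of feature_id by climbing one level at a time.

    Hypothetical helper, not part of gffutils. Assumes db.parents(id, level=1)
    yields the direct parents as objects with an .id attribute.
    """
    seen = {}              # ancestor id -> feature, in discovery order
    queue = [feature_id]   # ids whose direct parents still need to be looked up
    while queue:
        current = queue.pop(0)
        # level=1 asks only for the direct parents of the current feature
        for parent in db.parents(current, level=1):
            if parent.id not in seen:
                seen[parent.id] = parent
                queue.append(parent.id)
    return list(seen.values())
```

With the database above, all_ancestors(db, 'g1903.t1.d1.1') should, if the level=1 assumption holds, walk up through protein_match and mRNA to the gene. The seen dictionary also guards against visiting a shared parent twice.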

dariober, Jan 12 '23 14:01