
Value of Target attribute gains quotes when it shouldn't in round trip manipulation

Open photocyte opened this issue 10 months ago • 4 comments

Hi there,

I've loaded a GFF file into memory like this:

db = gffutils.create_db(args.gff, ":memory:", force=True, merge_strategy="create_unique")

If I do this:

for feature in db.all_features():
    # Remap the sequence ID and the start/end coordinates
    feature.seqid = pepid_to_scafid[feature.seqid]
    feature.start = json_data['p2g'][str(feature.start)]
    feature.end = json_data['p2g'][str(feature.end)]
    # Downstream tools only accept this output if gffutils is patched so that
    # feature.py prints the Target attribute without quotes
    print(feature)

The printed Target attribute value gains quotes when it shouldn't. Example input:

12B1-Scaf17	Gene3D	protein_match	2346889	2347135	6.9e-15	+	.	ID="pep1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";Target=pep1 40657 40739;Name="ACP39";status="T";Dbxref="InterPro:IPR036736"

Example output:

12B1-Scaf17	Gene3D	protein_match	2346889	2347135	6.9e-15	+	.	ID="pep1-1__G3DSA:1.10.1200.10_40657_40739";date="11-10-2023";Target="pep1 40657 40739";Name="ACP39";status="T";Dbxref="InterPro:IPR036736"

The value of the Target attribute should not have quotes per the GFF3 spec: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md#:~:text=23%20.%20.%20.%20ID%3DMatch1-,%3BTarget%3D,-EST23%201%2021

Should I be exporting the features a different way, or is Target gaining quotes a bug in how gffutils converts a feature to a string (the __str__ that print(feature) calls)?
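
For reference, the kind of stopgap I'd rather not have to rely on: a minimal sketch that post-processes each serialized line instead of patching gffutils. The helper name unquote_target is hypothetical (it is not part of gffutils), and it assumes the quoted Target value never contains a double quote itself.

```python
import re

# Hypothetical helper, not part of gffutils: removes the quotes around the
# Target value in column 9 of a serialized GFF3 feature line.
_QUOTED_TARGET = re.compile(r'(^|;)Target="([^"]*)"')


def unquote_target(gff_line: str) -> str:
    """Return gff_line with Target="..." rewritten as Target=... .

    Lines that do not have exactly nine tab-separated columns (comments,
    directives, blank lines) are returned unchanged. For feature lines,
    a trailing newline is dropped.
    """
    cols = gff_line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return gff_line
    # Only the attributes column is touched; other quoted values are kept.
    cols[8] = _QUOTED_TARGET.sub(r"\1Target=\2", cols[8])
    return "\t".join(cols)
```

In the loop above this would be used as print(unquote_target(str(feature))). It only changes the Target attribute and leaves the quotes on ID, Name, and the rest as they are.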

photocyte, Apr 05 '24 19:04