bioperl-live
Bio::Tools::GFF _gffX_string update
Hi, I would like to push some updates to the methods _gff2_string and _gff25_string to remove some inconsistencies with the format specifications (here is a review of the specifications that I have done).
Currently the difference between the two methods is that Target attributes are put first in the attribute list by _gff25_string.
point 1) As the order shouldn't matter, I was wondering whether we could remove the attribute sorting. The code is quite old (2004). I'm sceptical because of a comment saying "# need to put the target info before other tag/value pairs - mw", and because the description of the _gff25_string method says: "Function: To get a format of GFF that is peculiar to Gbrowse/Bio::DB::GFF". But why have a general method handle a specific case for Gbrowse? I guess Gbrowse has fixed this peculiarity since then...
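To make point 1 concrete, here is a minimal sketch of the two ordering behaviours. It is written in Python for illustration only and is not the module's Perl code; the function name `order_tags` and the `target_first` flag are hypothetical.

```python
def order_tags(tags, target_first=False):
    """Return the tags in the order they would be written out.

    With target_first=True, any tag named "Target" is moved to the front,
    which is the behaviour described above for _gff25_string. Otherwise
    the input order is kept unchanged.
    """
    if not target_first:
        return list(tags)
    targets = [t for t in tags if t == "Target"]
    others = [t for t in tags if t != "Target"]
    return targets + others


tags = ["ID", "Note", "Target"]
print(order_tags(tags, target_first=True))   # Target moved to the front
print(order_tags(tags, target_first=False))  # input order preserved
```

If the order really does not matter to downstream parsers, the `target_first=False` branch is all that would remain.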
Both produce an attribute list like this (note the two spaces before the semicolon; one would be enough):
tag1 "value 1"  ; tag2 value2
The _gff2_string method follows the GFF2 specification. About the attribute field, the specification says: "From version 2 onwards, the attribute field must have an tag value structure following the syntax used within objects in a .ace file, flattened onto one line by semicolon separators."
point 2) The specification does not ask for spaces around the semicolon; should we remove them? I guess that, to avoid potential compatibility issues, it's easier to keep them as they are...
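As an illustration of point 2, the sketch below joins tag/value pairs with a configurable separator, so the spaced and compact forms can be compared side by side. It is Python rather than the module's Perl, and the names `quote_value` and `gff2_attributes`, as well as the quoting rule (quote anything containing a character other than letters, digits or underscore), are assumptions made for the example.

```python
import re


def quote_value(value):
    """Quote a value unless it only contains letters, digits or underscores."""
    if value == "":
        return '""'
    if re.search(r"[^A-Za-z0-9_]", value):
        return '"' + value + '"'
    return value


def gff2_attributes(pairs, separator=" ; "):
    """Flatten (tag, value) pairs onto one line, joined by the separator."""
    return separator.join(f"{tag} {quote_value(value)}" for tag, value in pairs)


pairs = [("tag1", "value 1"), ("tag2", "value2")]
print(gff2_attributes(pairs))        # tag1 "value 1" ; tag2 value2
print(gff2_attributes(pairs, "; "))  # tag1 "value 1"; tag2 value2
```

Keeping the separator as a parameter would let the default stay as it is today while still allowing a stricter output.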
The _gff25_string method is similar to _gff2_string but should follow the GTF2 format (GFF2.5 = GTF). In that sense, the attributes must look like:
tag1 "value 1"; tag2 value2;
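A minimal sketch of that layout, again in Python for illustration and not taken from the module: every attribute ends with a semicolon, and consecutive attributes are separated by a single space. The names `quote_value` and `gtf_attributes` and the quoting rule are assumptions chosen to reproduce the example line above.

```python
import re


def quote_value(value):
    """Quote a value unless it only contains letters, digits or underscores."""
    if value == "":
        return '""'
    if re.search(r"[^A-Za-z0-9_]", value):
        return '"' + value + '"'
    return value


def gtf_attributes(pairs):
    """Terminate each attribute with ';' and separate attributes by one space."""
    return " ".join(f"{tag} {quote_value(value)};" for tag, value in pairs)


print(gtf_attributes([("tag1", "value 1"), ("tag2", "value2")]))
# tag1 "value 1"; tag2 value2;
```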
point 3) For me this is the most important point: the _gff25_string method must produce GTF2/GFF2.5 format, and not be a fix of the _gff2_string method adapted to the peculiar GBrowse case.
I am pinging @fangly @bosborne @hyphaltip @cjfields because I have seen that you have worked on this package at some point.
I will adapt my modifications according to your feedback. Best regards,
Jacques
@Juke34 Based on the documentation I think it would be good to have you involved with the GFF specification discussions, though those have gone a bit dormant in the last few years.
I'm all for updating to ensure the specifications are in place. @scottcain would you have any comments on the above, as it could affect GBrowse? Maybe it doesn't matter if everyone is moving to using JBrowse and/or Bio::DB::SeqFeature?