Support GFF3 Gap attribute

Open nathanweeks opened this issue 6 years ago • 8 comments

It would be convenient if JBrowse could render spliced alignments from tabix-indexed GFF3 files using the CIGAR string in the GFF3 Gap attribute (e.g., produced from gmap's gff3_match_est format), rather than requiring a less-storage-efficient approach of having a separate GFF3 line for each aligned segment.

nathanweeks avatar Jan 25 '18 15:01 nathanweeks

You might also consider the BigBed format, which keeps all the info on the same line. The gtf2bed script https://github.com/dasmoth/gtf2bed and this PR https://github.com/GMOD/jbrowse/pull/944 can go some way towards accomplishing this.

cmdcolin avatar Jan 29 '18 22:01 cmdcolin

Hi @cmdcolin, thanks for the heads up regarding the forthcoming BigBed support in JBrowse. BED files appear to lack support for arbitrary key/value pairs like the GFF3 9th column, and our use case is to display interspecific gene model primary transcript alignments with their functional annotation.

Until GFF3 Gap attribute support is available in JBrowse, it seems like modifying our processing pipelines to handle SAM / output BAM might be the most realistic alternative.

nathanweeks avatar Feb 06 '18 19:02 nathanweeks

It's not necessarily true that BED files don't allow arbitrary key/value pairs. The BED "autoSql" concept allows arbitrary extra fields to be encoded, and such files can be produced from other formats using the gtf2bed program mentioned above.

See https://github.com/dasmoth/gtf2bed/blob/master/gencode.as as an example

cmdcolin avatar Feb 06 '18 19:02 cmdcolin

If it is easier to support GFF3 Gap, I am all for that too. I just thought I'd point out the possible alternative :)

cmdcolin avatar Feb 06 '18 19:02 cmdcolin

Thanks for the pointer regarding AutoSQL support in BigBed; I had missed that. Will https://github.com/GMOD/jbrowse/pull/944 add support for that as well?

nathanweeks avatar Feb 06 '18 19:02 nathanweeks

@nathanweeks now there is some basic autosql support :)

cmdcolin avatar Feb 16 '18 08:02 cmdcolin

@cmdcolin : cool, looking forward to trying it out!

nathanweeks avatar Feb 16 '18 13:02 nathanweeks

Just to add a resource link which may be useful:

  • https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md#the-gap-attribute
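For anyone wondering what reading the attribute would involve, here is a rough Python sketch of turning a Gap string into aligned blocks on the reference. It is not JBrowse code: the function names `parse_gap` and `reference_blocks` are made up for illustration. It assumes a nucleotide-to-nucleotide alignment on the forward strand with 1-based inclusive GFF3 coordinates, and it only handles the `M`, `I`, and `D` operations (the frameshift operations `F` and `R` are rejected).

```python
import re

# One Gap operation is a letter followed by a length, e.g. "M8" or "D3".
_GAP_OP = re.compile(r"^([MIDFR])(\d+)$")


def parse_gap(gap):
    """Split a GFF3 Gap value such as "M8 D3 M6" into (op, length) pairs."""
    ops = []
    for token in gap.split():
        match = _GAP_OP.match(token)
        if match is None:
            raise ValueError(f"malformed Gap operation: {token!r}")
        ops.append((match.group(1), int(match.group(2))))
    return ops


def reference_blocks(start, gap):
    """Return aligned (start, end) blocks on the reference sequence.

    Assumes a nucleotide alignment on the forward strand, with `start`
    being the 1-based feature start from column 4 of the GFF3 line.
    """
    blocks = []
    pos = start
    for op, length in parse_gap(gap):
        if op == "M":
            # A match consumes reference bases; extend the previous block
            # if it ends immediately before the current position.
            if blocks and blocks[-1][1] == pos - 1:
                blocks[-1] = (blocks[-1][0], pos + length - 1)
            else:
                blocks.append((pos, pos + length - 1))
            pos += length
        elif op == "D":
            # Gap in the target: reference bases are skipped.
            pos += length
        elif op == "I":
            # Gap in the reference: extra target bases, no reference advance.
            pass
        else:
            raise ValueError(f"unsupported Gap operation in this sketch: {op}")
    return blocks


if __name__ == "__main__":
    print(reference_blocks(1, "M8 D3 M6 I1 M6"))
```

With the example Gap string from the spec (`M8 D3 M6 I1 M6`) and a feature starting at position 1, this gives the blocks `(1, 8)` and `(12, 23)`: the 3-base deletion splits the feature, while the 1-base insertion does not interrupt the reference span.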

nathanhaigh avatar Jul 04 '18 23:07 nathanhaigh