gosling.js

Support for GFF

Open manzt opened this issue 2 years ago • 4 comments

Several folks at ISMB asked about support for GFF for gene annotation data. It's well supported by many genome browsers and would be useful in Gos (probably indexed).

One thing I'm considering for Gos is to automatically index indexable files for the visualization when their index is missing.

manzt, Jul 13 '22 21:07

import gosling as gos

data = gos.gff3("./data.gff3")  # runs data preparation scripts if necessary?

manzt, Jul 13 '22 21:07

HiGlass has a repo for a GFF data fetcher, which uses @gmod/gff-js. However, it does not use index files and works only for small data (I was not able to display a ~20MB gz file).

JBrowse2 uses GFF3 with Tabix. We can get some ideas from their code:

  • https://github.com/GMOD/jbrowse-components/blob/4a35910bd2759ae5aa6174a15498953ff3e17f16/plugins/gff3/src/Gff3TabixAdapter/Gff3TabixAdapter.ts
  • https://github.com/GMOD/tabix-js

sehilyi, Jul 15 '22 14:07

I think this is a great idea. For this reason, we should require VCF and GFF files to be indexed: reading these formats without a tabix index is very inefficient. The JBrowse docs have the same requirement, which I think makes sense:

The following file formats are supported

  • Tabixed VCF
  • Tabixed BED
  • Tabixed GFF
  • BAM
  • CRAM
  • BigWig
  • BigBed
  • .hic file (Juicebox)
  • PAF

An idea is that Gos (since it has access to the system unlike the browser) could automatically index these formats with tabix if no index is found or provided.
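
A minimal sketch of that idea in Python, not an existing Gos API. The names `find_index` and `ensure_tabix_index` and the injectable `run` parameter are hypothetical, chosen only for illustration. It assumes the input is already bgzip-compressed and position-sorted (which tabix requires) and that `tabix` is on the PATH; `-p gff` is tabix's preset for GFF files.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional, Sequence


def find_index(data_path: str) -> Optional[Path]:
    """Return an existing tabix index next to data_path, or None."""
    # tabix writes <file>.tbi by default; .csi is the alternative index format.
    for suffix in (".tbi", ".csi"):
        candidate = Path(data_path + suffix)
        if candidate.exists():
            return candidate
    return None


def ensure_tabix_index(
    data_path: str,
    preset: str = "gff",
    run: Callable[[Sequence[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
) -> Path:
    """Index data_path with tabix unless an index is already present."""
    existing = find_index(data_path)
    if existing is not None:
        return existing
    # Assumption: data_path is bgzip-compressed and sorted by position.
    run(["tabix", "-p", preset, data_path])
    return Path(data_path + ".tbi")
```

Passing `run` explicitly makes it possible to log or dry-run the command instead of executing it, which may be useful if indexing should be opt-in rather than automatic.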

manzt, Jul 15 '22 15:07

An idea is that Gos (since it has access to the system unlike the browser) could automatically index these formats with tabix if no index is found or provided.

I really like the idea of automatic generation of missing index files in Gos. We currently use index files for VCF (tabix -p vcf example.vcf.gz) and BAM (samtools index example.bam) data. I think we can similarly support Tabixed BED, as well as Tabixed GFF.
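
As a rough illustration of how the indexing command could be chosen per format, here is a small Python sketch. The `INDEX_COMMANDS` table and `index_command` function are hypothetical names, not part of Gos; the VCF and BAM commands are the ones mentioned above, and the BED and GFF entries assume the corresponding tabix presets (`-p bed`, `-p gff`).

```python
from typing import List

# File suffix -> command prefix; the file path is appended at the end.
INDEX_COMMANDS = {
    ".vcf.gz": ["tabix", "-p", "vcf"],
    ".bed.gz": ["tabix", "-p", "bed"],
    ".gff.gz": ["tabix", "-p", "gff"],
    ".gff3.gz": ["tabix", "-p", "gff"],
    ".bam": ["samtools", "index"],
}


def index_command(path: str) -> List[str]:
    """Return the command that would build an index for path."""
    lower = path.lower()
    for suffix, prefix in INDEX_COMMANDS.items():
        if lower.endswith(suffix):
            return prefix + [path]
    # Uncompressed or unknown inputs are rejected rather than guessed at.
    raise ValueError(f"no known index command for {path!r}")
```

An uncompressed file such as `data.gff3` is rejected here on purpose: it would first need to be sorted and bgzip-compressed before tabix can index it.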

sehilyi, Jul 17 '22 17:07