goatools

Consider support for obographs json format

Open cmungall opened this issue 7 years ago • 6 comments

Note that we now have a proposed JSON representation of OBO that would obviate the need for special-purpose parsers. Your comments as a developer would be most welcome:

  • https://github.com/geneontology/obographs/

See also this post describing the motivation:

  • https://douroucouli.wordpress.com/2016/10/04/a-developer-friendly-json-exchange-format-for-ontologies/

cmungall, Oct 31 '16 20:10

@cmungall We will definitely add support for reading the gene ontologies from a JSON representation. This is great news and should make the runs faster. Do you know when the Gene Ontology Consortium might have their gene ontologies available in JSON format? Is there anything we can do to help?

dvklopfenstein, Nov 07 '16 08:11

We don't want to start making official releases yet, as we may want to change the format. For now there are converters, but this is not ideal for you, as it involves an external dependency.

cmungall, Nov 14 '16 17:11

Thanks @cmungall. Yes, the external dependencies would be an issue for our users. We are very much looking forward to using the JSON files. Please let us know if you need any code, tests, or testing from us.

dvklopfenstein, Nov 14 '16 23:11

FYI: Our speed tests show that reading GO DAGs from a JSON format is 287% faster than reading ASCII text from an obo file.

We ran the tests using the available go-plus.json, which contains more extensive information than the smaller ASCII text go-basic.obo that is most frequently used in current GOEAs by the GOATOOLS user community. In our speed calculations, we accounted for the different file sizes of go-plus.json and go-basic.obo: go-plus.json is information-rich, while go-basic.obo is lean but extremely functional.

We look forward to supporting GO DAGs available in the JSON format.

dvklopfenstein, Jul 02 '17 03:07
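For readers following the thread, here is a minimal sketch of what reading a GO DAG from an obographs-style JSON document might look like in plain Python. It is not the GOATOOLS implementation and not the code used for the speed test above. It assumes the obographs layout of a top-level `graphs` list whose entries carry `nodes` (with `id`, `lbl`, `type`) and `edges` (with `sub`, `pred`, `obj`). The embedded document is a tiny hand-written example rather than a real GO release, and `curie` and `load_is_a` are hypothetical helper names.

```python
import json
from collections import defaultdict

# Tiny hand-written document in the assumed obographs layout (not real release data).
EXAMPLE = """
{"graphs": [{
  "id": "example",
  "nodes": [
    {"id": "http://purl.obolibrary.org/obo/GO_0008150",
     "lbl": "biological_process", "type": "CLASS"},
    {"id": "http://purl.obolibrary.org/obo/GO_0009987",
     "lbl": "cellular process", "type": "CLASS"}
  ],
  "edges": [
    {"sub": "http://purl.obolibrary.org/obo/GO_0009987",
     "pred": "is_a",
     "obj": "http://purl.obolibrary.org/obo/GO_0008150"}
  ]
}]}
"""


def curie(iri):
    """Turn '.../obo/GO_0008150' into 'GO:0008150' (hypothetical helper)."""
    return iri.rsplit("/", 1)[-1].replace("_", ":", 1)


def load_is_a(doc):
    """Collect term names and is_a parents from an obographs-style dict."""
    names = {}
    parents = defaultdict(set)
    for graph in doc.get("graphs", []):
        for node in graph.get("nodes", []):
            if node.get("type") == "CLASS":
                names[curie(node["id"])] = node.get("lbl", "")
        for edge in graph.get("edges", []):
            # Only is_a edges are kept here; other predicates are skipped.
            if edge.get("pred") == "is_a":
                parents[curie(edge["sub"])].add(curie(edge["obj"]))
    return names, dict(parents)


names, parents = load_is_a(json.loads(EXAMPLE))
```

With a real file, the `json.loads(EXAMPLE)` call would be replaced by `json.load` on an open file handle. Most of the parsing work is then done by the standard-library JSON decoder rather than by a line-oriented OBO parser.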

@cmungall

We haven't talked in a while, not since we wrote up the GOATOOLS paper.

I'm cleaning up some old issues in the repo... I just found that we had a dialog back then on supporting the GO DAG in JSON format. Is this format stable now? I'd also like your thoughts on whether you would like to see some support in goatools (Python GO library), or whether you have other libraries covering this format that we could simply import. cc @dvklopfenstein

tanghaibao, May 23 '21 21:05

yes, it would be great to have support

the json format is simple and you should be able to use it easily, but we will also have some autogenerated python classes for it soon

We are also experimenting with some sql-based representations that can be stored efficiently in sqlite/postgres; these might be even faster to load from

cmungall, May 25 '21 00:05
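To illustrate the SQL idea mentioned in the last comment, the sketch below stores ontology edges in an in-memory SQLite database and walks `is_a` ancestors with a recursive query. The single `edge(sub, pred, obj)` table is an assumption made for this example, not the schema the GO developers were experimenting with. The function names and the two sample edges are likewise illustrative.

```python
import sqlite3


def build_db(edges):
    """Load (sub, pred, obj) triples into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (sub TEXT, pred TEXT, obj TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", edges)
    # Index the column used to walk from a child term to its parents.
    conn.execute("CREATE INDEX edge_sub ON edge (sub)")
    return conn


def ancestors(conn, term):
    """Return every term reachable from `term` by following is_a edges."""
    rows = conn.execute(
        """
        WITH RECURSIVE anc(id) AS (
            SELECT obj FROM edge WHERE sub = ? AND pred = 'is_a'
            UNION
            SELECT e.obj FROM edge e JOIN anc a ON e.sub = a.id
            WHERE e.pred = 'is_a'
        )
        SELECT id FROM anc
        """,
        (term,),
    )
    return {row[0] for row in rows}


# Illustrative sample edges, not loaded from a GO release.
EDGES = [
    ("GO:0009987", "is_a", "GO:0008150"),
    ("GO:0007049", "is_a", "GO:0009987"),
]
conn = build_db(EDGES)
```

Calling `ancestors(conn, "GO:0007049")` walks both sample edges. Whether this loads faster than JSON for a full ontology is not something this sketch measures.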