ols4
Disseminating ontology knowledge (download options)
This issue is a bit abstract at the moment, but I want to start the conversation.
I absolutely love the UniProt user interface. As a bioinformatician, I can browse the website using advanced search, find my results, and then, once I have the table, use the Download button to export it in anything from Excel to JSON to CSV.
I can also select the exact fields I want to include in my download.
I know there are conflicting opinions about whether we should bring ontologies to bioinformaticians (who in my experience strongly prefer flat files) or whether they should come to us via semantically enabled APIs/toolkits. But it seems like a really easy win to provide the ability to download cross-sections of ontologies as flat files in different formats which can be dropped straight into an R script or loaded as a pandas DataFrame.
We really need to go through some use cases, but I can imagine for instance:
- Downloading flattened hierarchies (a table of [class] [superclass]) based on a property (e.g. subClassOf or part_of) to allow child/parent/ancestor/descendant lookups
- Downloading specific branches of an ontology, or terms filtered by a property
- Pathing between ontologies (e.g. proteins encoded by genes linked to a disease)
- Selecting specific properties to include as columns in the output (e.g. xrefs from a specific database)
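To make the first use case concrete, here is a rough sketch of what a consumer could do with a flattened hierarchy file. Everything in it is an assumption for illustration: the `class`/`superclass` column names, the `EX:` term IDs, and the helper functions are made up, and nothing here reflects an existing OLS download format. It uses only the Python standard library, though the same table could just as well be loaded with pandas.

```python
import csv
import io
from collections import defaultdict

# Hypothetical flattened export: one row per (class, superclass) pair.
# Column names and term IDs are invented for this sketch.
FLAT_TSV = (
    "class\tsuperclass\n"
    "EX:heart\tEX:organ\n"
    "EX:organ\tEX:anatomical_structure\n"
    "EX:left_ventricle\tEX:heart_chamber\n"
    "EX:heart_chamber\tEX:anatomical_structure\n"
)


def load_parents(text):
    """Read the table into a mapping of term -> set of direct parents."""
    parents = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        parents[row["class"]].add(row["superclass"])
    return parents


def ancestors(term, parents):
    """Collect every term reachable by following parent links upwards."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def descendants(term, parents):
    """Collect every term that has `term` among its ancestors."""
    return {t for t in parents if term in ancestors(t, parents)}


parents = load_parents(FLAT_TSV)
heart_ancestors = ancestors("EX:heart", parents)
structure_descendants = descendants("EX:anatomical_structure", parents)
```

The point is that a two-column table is enough for child/parent/ancestor/descendant lookups without any ontology tooling on the consumer side.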
I don't think this is a bad idea! You can look at robot export as a good blueprint for this sort of thing: https://robot.obolibrary.org/export
Another option is https://github.com/biolink/kgx which has all edges and nodes as separate files (KG style).
Or you can hop on the KG Hub bandwagon which is already doing all this processing: http://kghub.org/kg-obo/getting_started.html#download-ontologies-in-kgx-format
Not sure what is best, but a push towards a more standardised table structure could be nice. KGX files, for example, can be readily translated into knowledge graphs (TTL, Neo4j, etc.).
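As a rough illustration of the nodes-plus-edges table layout, here is a small sketch that joins an edge table against a node table to get readable triples. The column names (`id`, `category`, `name` for nodes; `subject`, `predicate`, `object` for edges) are my assumption of what a KGX-style TSV looks like and should be checked against the KGX documentation. The rows and the `label_edges` helper are invented for the example.

```python
import csv
import io

# Hypothetical node and edge tables in a KGX-like layout.
# Column names are assumed; the rows are made up.
NODES_TSV = (
    "id\tcategory\tname\n"
    "EX:1\tbiolink:Disease\texample disease\n"
    "EX:2\tbiolink:Gene\texample gene\n"
)
EDGES_TSV = (
    "subject\tpredicate\tobject\n"
    "EX:2\tbiolink:related_to\tEX:1\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def label_edges(nodes, edges):
    """Swap node IDs in each edge for names, keeping the ID if no name is known."""
    names = {node["id"]: node["name"] for node in nodes}
    return [
        (
            names.get(edge["subject"], edge["subject"]),
            edge["predicate"],
            names.get(edge["object"], edge["object"]),
        )
        for edge in edges
    ]


labelled = label_edges(read_tsv(NODES_TSV), read_tsv(EDGES_TSV))
```

Two plain tables like this are easy to load in R or pandas, and they map directly onto a graph if someone wants one later.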
Thanks @matentzn ! Lots to think about.