ols4
Disseminating ontology knowledge (download options)
This issue is a bit abstract at the moment, but I want to start the conversation.
I absolutely love the UniProt user interface. As a bioinformatician, I can browse the website using advanced search, find my results, and then once I have the table I have a Download button...
[Screenshot 2023-03-17 at 01 53 16]
in which I can select anything from Excel, to JSON, to CSV:
[Screenshot 2023-03-17 at 01 53 42]
I can select the exact fields I want to include in my download:
[Screenshot 2023-03-17 at 01 54 05]
I know there are conflicting opinions about whether we should bring ontologies to bioinformaticians (who, in my experience, strongly prefer flat files) or whether they should come to us via semantically enabled APIs/toolkits. But it seems like a really easy win to let users download cross-sections of ontologies as flat files in different formats, which can be dropped straight into an R script or loaded as a pandas DataFrame.
We really need to go through some use cases, but I can imagine, for instance:
- Downloading flattened hierarchies (a table of [class] [superclass]) based on a property (e.g. subClassOf or part_of) to allow child/parent/ancestor/descendant lookups
- Downloading specific branches of an ontology, or terms filtered by a property
- Pathing between ontologies (e.g. proteins encoded by genes linked to a disease)
- Selecting specific properties to include as columns in the output (e.g. xrefs from a specific database)
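To make the first use case concrete, here is a rough sketch of what a flattened hierarchy download could look like and how someone might consume it. Everything in it is an assumption for illustration: the `EDGES` data, the `class`/`superclass` column names, and the helper functions are hypothetical and are not part of any existing OLS API.

```python
import csv
import io
from collections import defaultdict

# Hypothetical (class, superclass) pairs standing in for subClassOf edges.
# These are placeholder identifiers, not real ontology terms.
EDGES = [
    ("B", "A"),
    ("C", "A"),
    ("D", "B"),
    ("D", "C"),
]


def to_csv(edges):
    """Serialise the edges as a two-column flat file (the imagined download)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["class", "superclass"])
    writer.writerows(edges)
    return buf.getvalue()


def load_parents(csv_text):
    """Read the flat file back into a class -> set of direct superclasses map."""
    parents = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        parents[row["class"]].add(row["superclass"])
    return parents


def ancestors(term, parents):
    """Collect all transitive superclasses of a term by walking the parent map."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


flat_file = to_csv(EDGES)
parent_map = load_parents(flat_file)
print(sorted(ancestors("D", parent_map)))
```

The same two-column file could be read with `pandas.read_csv` or R's `read.csv`. The point is only that a plain table is enough for child/parent/ancestor lookups without any ontology tooling.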