Add "magnesium cation" ChEBI ID to GO:0016851

Open jpquast opened this issue 2 years ago • 3 comments

I added CHEBI:39127 (magnesium cation) to GO:0016851 ("magnesium chelatase activity").

It would be great if this entry could be updated accordingly.

jpquast avatar Oct 11 '22 21:10 jpquast

@jpquast thank you for the submission! I'm curious about your use case, because we may be able to provide many more such mappings to you. I have a script which uses the GO-to-Rhea cross-references to connect all such GO activities to ChEBI. Currently these aren't included in the ontology and are represented in a fairly complicated way to reflect the fact that the mapped Rhea reactions usually aren't directional. Would you be interested in that file?

We are working on ways to cleanly include all of this in the ontology, so that's why I'm wondering how you would use it. Are you using the go-plus file currently?

balhoff avatar Oct 12 '22 19:10 balhoff

@balhoff thank you very much for your comment! I am working on protein-metal interactions. To compile a ground-truth list of all current knowledge, I use GO annotations of proteins among other sources (I retrieve the entries through the QuickGO database API). I created a GO slim dataset that should (if I didn't miss any) contain all molecular function GO terms related to metal binding by a protein. If these terms carry a ChEBI ID, I can easily draw conclusions about the related metal. I can also combine this with my other sources, which also annotate metal binding mainly through ChEBI IDs. In the end I can find the most specific metal ChEBI ID for each protein. For example, if the protein is annotated as binding a divalent metal (CHEBI:60240) and there are GO terms indicating magnesium binding (CHEBI:39127), the most specific metal in this case is magnesium, since it is a child term of divalent metal.
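The selection step described above (keep the child term when both a parent and a child ChEBI ID are annotated) could be sketched roughly as follows. This is only an illustration: the `PARENTS` map and the function names are hypothetical, and the single is_a edge is taken from the example in this comment rather than looked up in ChEBI.

```python
# Hypothetical child -> parents map. The one edge below only mirrors the
# example given in the comment; a real map would be built from ChEBI itself.
PARENTS = {
    "CHEBI:39127": {"CHEBI:60240"},  # magnesium cation -> divalent metal (assumed)
}


def ancestors(term, parents):
    """Return every ancestor of `term` reachable through the parents map."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific(terms, parents):
    """Drop every term that is an ancestor of another term in the same set."""
    terms = set(terms)
    covered = set()
    for term in terms:
        covered |= ancestors(term, parents)
    return sorted(terms - covered)


# A protein annotated with both the general and the specific metal term.
print(most_specific(["CHEBI:60240", "CHEBI:39127"], PARENTS))
```

If only the general term is annotated, it is returned unchanged, so the function never discards the sole piece of evidence for a protein.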

Anyway, I noticed that my list of relevant GO terms contains 290 entries that have a ChEBI ID mapped, but 150 lack a ChEBI ID annotation. So I wanted to start mapping the entries with missing annotations, at least for these metal-related GO terms. This PR is just a test for one of these GO terms. But clearly something failed here, so I might have misunderstood how to annotate them correctly. But if I understand you correctly, you are already working on a way to include all ChEBI IDs related to Rhea IDs in the database?

jpquast avatar Oct 12 '22 20:10 jpquast

> But clearly something failed here so I might have misunderstood how I can correctly annotate them.

The syntax is not quite right; it would be something more like:

relationship: has_input CHEBI:39127 ! magnesium cation

but I think the editor group will need to come to an agreement on the design pattern for connecting these activities to chemicals. One issue is that they are usually mapped to undirected Rhea reactions, so we can't automatically specify whether a chemical is an input or an output, just a "participant". Would conflating inputs and outputs be okay for your use case?

> But if I understand you correctly, you are working already on a way to include all ChEBI IDs related to Rhea IDs in the database?

Yes, my script will extract the ChEBI IDs from Rhea and relate them to the GO terms. But it is outputting some complicated OWL right now, which won't be very user-friendly for most. I could flatten it all to "has participant" relations from GO to ChEBI (covering all the components of the reaction).

balhoff avatar Oct 14 '22 20:10 balhoff

For me it does not matter if the ChEBI ID is an input or output. It is just enough to know that this molecule interacts with the annotated protein. So I would be very interested in the flattened list that maps GO terms to ChEBI IDs that you have.

Regarding including this information also in the database, I assume it could still take a while until it is implemented. I understand the problem with the undirected reaction but I guess in this case a "has_participant" annotation is the best possible annotation.

jpquast avatar Oct 15 '22 21:10 jpquast

@jpquast I made some progress on generating the simplified file of ChEBI participants. Do you prefer OBO format? Or do you work with OWL (or RDF)?

balhoff avatar Oct 26 '22 18:10 balhoff

> @jpquast I made some progress on generating the simplified file of ChEBI participants. Do you prefer OBO format? Or do you work with OWL (or RDF)?

Actually this question isn't so relevant anymore; now we're thinking of including this directly into go-plus rather than as a separate file.

balhoff avatar Oct 26 '22 20:10 balhoff

@jpquast I added about 20,000 relations from GO to ChEBI in #24276. So GO:0016851 will now have these relations in the next release (CHEBI:18420 is magnesium(2+)):

[Term]
id: GO:0016851
relationship: RO:0000057 CHEBI:15377
relationship: RO:0000057 CHEBI:15378
relationship: RO:0000057 CHEBI:18420
relationship: RO:0000057 CHEBI:30616
relationship: RO:0000057 CHEBI:43474
relationship: RO:0000057 CHEBI:456216
relationship: RO:0000057 CHEBI:57306
relationship: RO:0000057 CHEBI:60492

Let me know if that will serve your purpose! In the meantime I'll close this PR since we're bringing in this info automatically.
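For anyone consuming these relations downstream, a stanza like the one above can be read with a few lines of Python. This is a minimal sketch, not a full OBO parser: `parse_relationships` is a hypothetical helper, it handles only the `id:` and `relationship:` tags, and it assumes one stanza per input string. RO:0000057 is the Relation Ontology ID for "has participant".

```python
def parse_relationships(stanza_text):
    """Return (term_id, [(relation, target), ...]) for a single OBO stanza."""
    term_id = None
    relations = []
    for raw in stanza_text.splitlines():
        # In OBO, "!" starts a trailing comment; drop it before parsing.
        line = raw.split("!", 1)[0].strip()
        if line.startswith("id:"):
            term_id = line[len("id:"):].strip()
        elif line.startswith("relationship:"):
            parts = line[len("relationship:"):].split()
            if len(parts) >= 2:
                relations.append((parts[0], parts[1]))
    return term_id, relations


# The stanza quoted in the comment above, copied verbatim.
STANZA = """[Term]
id: GO:0016851
relationship: RO:0000057 CHEBI:15377
relationship: RO:0000057 CHEBI:15378
relationship: RO:0000057 CHEBI:18420
relationship: RO:0000057 CHEBI:30616
relationship: RO:0000057 CHEBI:43474
relationship: RO:0000057 CHEBI:456216
relationship: RO:0000057 CHEBI:57306
relationship: RO:0000057 CHEBI:60492
"""

term_id, relations = parse_relationships(STANZA)
# Keep only "has participant" targets, ignoring any other relation types.
participants = [target for rel, target in relations if rel == "RO:0000057"]
print(term_id, participants)
```

Filtering on the relation ID rather than taking every target keeps the result correct if other relationship types appear in the same stanza.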

balhoff avatar Nov 01 '22 22:11 balhoff

Dear @balhoff, thanks a lot for including all this information in the database! This certainly helps a lot!

jpquast avatar Nov 06 '22 17:11 jpquast