CHEBI mappings for biolink:SmallMolecule

Open balhoff opened this issue 2 years ago • 9 comments

Are there any mappings to CHEBI that could be added for small molecule? If there is no single parent term, are there a few more specific terms that could be added as narrow mappings?

balhoff • Sep 30 '21 15:09

I thought we had requested small molecule from CHEBI and they had rejected it

However, I can't find a record of it in the CHEBI tracker

We talk about it here https://github.com/geneontology/go-ontology/issues/14047

But no one went ahead and made the actual CHEBI request.

I think we should just request this from CHEBI. I think everyone would find this super-helpful. I know like many higher level categories it's hard to find the most perfect ontological definition and there will be edge cases, but perfection is the enemy of the good.

cmungall • Sep 30 '21 20:09

It's certainly a useful term, though some complicating factors to consider:

(1) Many (but not all) sources classify small molecules as organic compounds, though I'm guessing Translator folks would want the class to be broader than that.

(2) Many (but not all) sources classify small molecules as polyatomic compounds, though Translator folks have established that they want monoatomic entities (particularly ions, at least biologically relevant ones) included.

(3) ChEBI defines "molecule" with its narrow chemical definition, i.e., of being electrically neutral, though I'm pretty sure Translator folks would want the class to be broader than that, in which case CHEBI:small molecule would not be subsumed by CHEBI:molecule.

So, to encompass these broader dimensions (i.e., organic and inorganic, polyatomic and monoatomic, electrically neutral and ionic), perhaps request and use small molecular entity?
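As a rough illustration of those three dimensions, here is a minimal Python sketch. The `Entity` fields and both predicate functions are hypothetical and are not ChEBI or Biolink definitions; no size cutoff for "small" is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Hypothetical record; the fields are illustrative only."""
    name: str
    organic: bool
    atom_count: int
    charge: int


def is_narrow_small_molecule(e: Entity) -> bool:
    # Applies all three restrictive readings at once:
    # organic, polyatomic, and electrically neutral.
    return e.organic and e.atom_count > 1 and e.charge == 0


def is_small_molecular_entity(e: Entity) -> bool:
    # Broad reading: organic or inorganic, monoatomic or polyatomic,
    # neutral or ionic. It only requires at least one atom.
    return e.atom_count >= 1


ethanol = Entity("ethanol", organic=True, atom_count=9, charge=0)
sodium_ion = Entity("sodium(1+)", organic=False, atom_count=1, charge=1)
```

Under these assumptions ethanol passes both predicates, while the sodium ion fails the narrow one on every count and is only captured by the broad one.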

mikebada • Oct 01 '21 03:10

Great points! Can you comment on https://github.com/ebi-chebi/ChEBI/issues/4140

cmungall • Oct 01 '21 03:10

Based on the latest response from the ChEBI developers, I don't think a small molecule term is coming anytime soon. Can someone with more chemistry expertise propose a collection of narrow mappings that we could put in the Biolink model? Right now our Translator KP doesn't return anything for small molecule because it doesn't map into ChEBI.

balhoff • Oct 15 '21 14:10

Attached is a file (CHEBI.txt) from Chris B. that shows mappings to CHEBI from the node normalizer.

sierra-moxon • Nov 09 '21 22:11

The latest Biolink has a narrow mapping from small molecule to CHEBI:59999 (chemical substance). This seems backwards. 'Small molecule' is more specific than 'chemical substance', right? In this case it would be a broad mapping.
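The direction question can be sketched in Python. This assumes the SKOS-style reading in which a narrow mapping points at a term more specific than the Biolink class and a broad mapping at one more general; the helper `mapping_kind` is hypothetical and not part of any Biolink tooling.

```python
def mapping_kind(target_vs_class: str) -> str:
    """Return the mapping slot for a target term relative to a Biolink class.

    target_vs_class is "more_specific", "more_general", or "equivalent".
    Assumes SKOS-style semantics; this helper is illustrative only.
    """
    kinds = {
        "more_specific": "narrow_mappings",
        "more_general": "broad_mappings",
        "equivalent": "exact_mappings",
    }
    return kinds[target_vs_class]
```

If 'chemical substance' really is more general than 'small molecule', `mapping_kind("more_general")` selects `broad_mappings` rather than `narrow_mappings`.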

balhoff • Jan 17 '23 16:01

Hi @sierra-moxon (cc: @vdancik),

You'll note @balhoff's comment today. This is partly triggered by failed SRI Testing unit tests on MolePro, involving 'parent' category traversal via 'subclass_of' relationships in the OntologyKP.

I'm wondering how best to cope with this in SRI Testing, to reflect the 'Translator' reality of these relationships. Is this a data fix (to the OntologyKP), would some kind of coding tweak in the SRI unit test (somehow invoking the node normalizer somewhere?) do the job, or is some other kind of creative somersault needed elsewhere?

RichardBruskiewich • Jan 17 '23 22:01

CHEBI:'chemical substance' subsumes things like mixtures and minerals (and not most things we readily characterize as small molecules--most of what we think of as small molecules is subsumed by the sibling class CHEBI:'molecular entity'). Though I think it's debatable, characterizing mixtures and minerals as small molecules seems unintuitive to me, so I'd vote for removing this narrow mapping.
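A toy subsumption check makes the sibling relationship explicit. The edges below are hand-written from the description above and are not loaded from ChEBI; the shared parent 'chemical entity' and the direct placement of 'mixture' and 'mineral' are simplifying assumptions.

```python
# Toy is_a edges (child -> parent); a simplified, hand-written hierarchy.
PARENT = {
    "chemical substance": "chemical entity",
    "molecular entity": "chemical entity",
    "mixture": "chemical substance",
    "mineral": "chemical substance",
}


def ancestors(term: str) -> list[str]:
    """Walk child -> parent edges from term up to the root."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_subsumed_by(term: str, candidate: str) -> bool:
    """True if candidate is a (possibly indirect) parent of term."""
    return candidate in ancestors(term)
```

In this sketch 'mixture' is subsumed by 'chemical substance', but 'molecular entity' is neither above nor below it, so under these assumptions a narrow or broad mapping between a molecular-entity-like class and 'chemical substance' would not follow from the hierarchy.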

mikebada • Jan 18 '23 08:01

Within Translator, I'm not quite sure which chemical entities are mapped onto biolink:SmallMolecule.

I heard a rumour that we consider atoms and ions as being mapped onto that category, which I suspect to be misleading and problematic: by definition, molecules are two or more atoms (covalently) bonded to one another.

The qualifier 'small' is likely a bit arbitrary, except with reference to large biological polymers like nucleic acid chains (DNA, RNA), proteins or (??) 'large' lipids. That said, some bioactive nucleic acid chains and lipids may be relatively small(?).

I can't comment on how CHEBI handles all of this diversity - I'm not well acquainted with CHEBI (yet...)

RichardBruskiewich • Jan 19 '23 02:01