ModelPolisher

Spurious annotations

Open mephenor opened this issue 5 years ago • 4 comments

During a quick look through the models I found that, among other things, H2O is additionally annotated as hydroxide. The question is whether this is correct or should be fixed. The problem can be reproduced by, e.g., polishing iCHOv1.json from https://github.com/SBRG/bigg_models_data/tree/master/models.

We need to check whether similar things happen for other species, reactions, etc., and whether the queries need to be made more restrictive or rewritten in a different way. However, this might be quite time-consuming, as annotations would need to be checked manually for plausibility.

mephenor, Jan 27 '20 17:01

Upon a bit of further investigation, annotations for some species reference the same entity but in different organisms. In some cases they also reference it across different compartments, because the code that retrieves a BiGGId from annotations currently cannot retrieve the correct compartment.

Two things can be done here:

  • check if queries can be adjusted to only get annotations for the correct organism
  • check if the code to retrieve the BiGGId from annotations can somehow retrieve compartment information

mephenor, Mar 29 '20 03:03

Hi @mephenor,

I think I'm late for the party.

I found the same thing while working on a small task to harmonize IDs across different models. In the case of H2O and OH-, both have the same annotation (the same holds for ammonia and ammonium). The problem runs deeper when we consider that some annotations refer specifically to water or to OH- (e.g. KEGG C00001 vs C01328), while others refer nonspecifically to both (e.g. XLYOFNOQVPJJNP-UHFFFAOYSA-M is used as the InChIKey for both).

Additionally, some annotations erroneously refer to water (e.g. MNXM2 = OH-) or are simply wrong, such as META:OXONIUM (OH3+).
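
One way such cases could be surfaced automatically, as a minimal sketch: map each annotation identifier to the chemical entity it denotes and flag any species whose annotations resolve to more than one entity. The lookup table here is a hypothetical, hand-written stand-in that contains only the two KEGG identifiers mentioned above; the function name and the data layout are assumptions for illustration and are not part of ModelPolisher.

```python
from collections import defaultdict

# Hypothetical hand-curated lookup: annotation identifier -> entity it denotes.
# Only the two KEGG identifiers mentioned in this thread are included.
ENTITY_OF = {
    "kegg.compound:C00001": "water",
    "kegg.compound:C01328": "hydroxide",
}


def conflicting_entities(annotations):
    """Return {entity: [identifiers]} if the annotations denote more than
    one entity, otherwise an empty dict. Unknown identifiers are ignored."""
    by_entity = defaultdict(list)
    for identifier in annotations:
        entity = ENTITY_OF.get(identifier)
        if entity is not None:
            by_entity[entity].append(identifier)
    return dict(by_entity) if len(by_entity) > 1 else {}


# Toy species annotated with both the water and the hydroxide identifier.
h2o_annotations = ["kegg.compound:C00001", "kegg.compound:C01328"]
print(conflicting_entities(h2o_annotations))
```

A real check would need a much larger, curated lookup, and would still not help with identifiers that are themselves attached to both entities, such as the InChIKey case above.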

OK... If you would like, we could collaborate to take a deeper look at the issue. Moreover, I would like to add that some models at BIGG have metabolites with the same ID, same name, same molecular formula, but different charges.

Best regards, Rodrigo

glucksfall, Mar 22 '21 22:03

Hi @glucksfall, and sorry for the very late response. I started a new job, did not get the notification, and haven't had much time to look into this issue. The whole Polisher is currently a bit stuck in limbo, with this being the major issue blocking a new release.

I have not found a solution yet, however, regarding your observation:

"Moreover, I would like to add that some models at BIGG have metabolites with the same ID, same name, same molecular formula, but different charges."

After having another look at the database, BIGG only seems to store charge information in the model_compartmentalized_component table, which then references the component table, where the bigg_id and name are stored. So the bigg_id does not actually discriminate between different charge states, and the obvious solution would be to add a filter on the annotations obtained. However, this would require resolving those annotations and reliably retrieving their charge information.
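
To illustrate the point about charges, here is a rough sketch using an in-memory SQLite database with a heavily simplified toy schema. Only the two table names and the bigg_id, name and charge columns come from the description above; the remaining columns (id, component_id, model), the direct foreign key, and all rows are made up for illustration, and the real database has more tables and columns. The query lists every bigg_id that appears with more than one distinct charge.

```python
import sqlite3

# Toy schema and data: NOT the real BiGG database, see the assumptions above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE component (id INTEGER PRIMARY KEY, bigg_id TEXT, name TEXT);
CREATE TABLE model_compartmentalized_component (
    id INTEGER PRIMARY KEY,
    component_id INTEGER REFERENCES component(id),
    model TEXT,
    charge INTEGER
);
INSERT INTO component VALUES (1, 'h2o', 'H2O'), (2, 'nh4', 'Ammonium');
INSERT INTO model_compartmentalized_component VALUES
    (1, 1, 'model_a', 0),
    (2, 1, 'model_b', -1),   -- invented: same bigg_id, different charge
    (3, 2, 'model_a', 1),
    (4, 2, 'model_b', 1);
""")


def ambiguous_bigg_ids(connection):
    """Return (bigg_id, number of distinct charges) for every bigg_id
    that is stored with more than one charge."""
    query = """
        SELECT c.bigg_id, COUNT(DISTINCT m.charge) AS n_charges
        FROM model_compartmentalized_component AS m
        JOIN component AS c ON c.id = m.component_id
        GROUP BY c.bigg_id
        HAVING COUNT(DISTINCT m.charge) > 1
        ORDER BY c.bigg_id
    """
    return connection.execute(query).fetchall()


print(ambiguous_bigg_ids(conn))
```

With the toy rows above, only 'h2o' is reported, since 'nh4' has the same charge in both models. A list like this would at least show how many identifiers a charge-based annotation filter would have to handle.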

mephenor, Jul 28 '21 08:07

see here for a list of all the annotations that are added to a minimal water species

Schmoho, Jul 26 '22 17:07