Generating InChIKeys for IUPAC names

Open egonw opened this issue 2 years ago • 9 comments

Great! Now, to the actual tasks:

  • [x] identify author
  • [x] write abstract
  • [ ] agree with editors on abstract
  • [x] write recipe
  • [ ] identify reviewer
  • [ ] conduct review
  • [ ] incorporate reviewer's comments
  • [ ] publish recipe

egonw avatar Apr 27 '22 10:04 egonw

Why this recipe?

It is a step in the larger work described in https://github.com/FAIRplus/the-fair-cookbook/issues/396

Author

Egon

Abstract

IUPAC names describe chemical structures, but one chemical compound can have multiple (equivalent) IUPAC names. That makes IUPAC names unsuited as globally unique identifiers. The InChIKey addresses this issue. This recipe describes how cheminformatics can be used to translate IUPAC names into unique InChIKeys. It uses the OPSIN library and shows both how the OPSIN website can be used and how the translation can be automated with Google Colab.

egonw avatar Apr 27 '22 10:04 egonw

Preview: https://egonw.github.io/cookbook-dev/content/recipes/infrastructure/iupac-names.html

egonw avatar Apr 27 '22 11:04 egonw

Hi @egonw, I like it! One thing, though: InChIKeys are not strictly unique either, are they? There is at least the theoretical chance of a key collision. But if that is pointed out as a caveat, I am more than happy! :)

robertgiessmann avatar Apr 28 '22 04:04 robertgiessmann

Maybe this could help with getting some content, or with directing readers to this resource: http://inchi.info/inchikey_overview_en.html. It would be good to highlight those differences, I think.
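To make the collision caveat raised above a bit more concrete: an InChIKey is a fixed-length hash of the InChI string, so two different InChIs can in principle map to the same key. The sketch below is only an illustration, not part of the recipe. It splits a standard InChIKey into its blocks; the helper name `split_inchikey` and the dictionary keys are hypothetical.

```python
import re

# Standard InChIKey layout: 14 letters, then 8 letters + flag + version, then 1 letter.
INCHIKEY_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])$")


def split_inchikey(key):
    """Split an InChIKey into its blocks (hypothetical helper, for illustration)."""
    match = INCHIKEY_PATTERN.match(key)
    if match is None:
        raise ValueError("not a well-formed InChIKey: " + repr(key))
    connectivity, other_layers, flag, version, protonation = match.groups()
    return {
        "connectivity": connectivity,   # hash of the formula/connectivity part of the InChI
        "other_layers": other_layers,   # hash of the remaining layers (stereo, isotopes, ...)
        "standard_flag": flag,          # "S" for a standard InChI, "N" for non-standard
        "version": version,             # "A" for InChI version 1
        "protonation": protonation,     # "N" for neutral
    }


parts = split_inchikey("VNWKTOKETHGBQD-UHFFFAOYSA-N")  # InChIKey of methane
print(parts["connectivity"], parts["standard_flag"])
```

Because both hash blocks have a fixed length, the number of possible keys is finite while the number of possible structures is not, which is why a collision can never be ruled out entirely.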

YojanaGadiya avatar Apr 28 '22 06:04 YojanaGadiya

// Fetch the Bacting managers for InChI and OPSIN, plus Log4j for logging
@Grab(group='io.github.egonw.bacting', module='managers-inchi', version='0.0.42')
@Grab(group='io.github.egonw.bacting', module='managers-opsin', version='0.0.42')
@Grab(group='org.apache.logging.log4j', module='log4j-api', version='2.18.0')
@Grab(group='org.apache.logging.log4j', module='log4j-core', version='2.18.0')

// Both managers take a workspace folder as their constructor argument
workspaceRoot = "../ws"
opsin = new net.bioclipse.managers.OpsinManager(workspaceRoot)
inchi = new net.bioclipse.managers.InChIManager(workspaceRoot)

// Parse the IUPAC name into a molecule, then generate its InChI and InChIKey
anInChI = inchi.generate(
  opsin.parseIUPACName("methane")
)
println "InChI: " + anInChI.value
println "InChIKey: " + anInChI.key

egonw avatar Jul 10 '22 19:07 egonw

Or in Python:

[image: the Python version of the code]
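The image itself is not reproduced in this text. As a rough stand-in, and not the code from the image, here is a minimal Python sketch that asks the public OPSIN web service for the standard InChIKey of a name. It assumes the service is reachable at `https://opsin.ch.cam.ac.uk/opsin/` and serves plain text for the `.stdinchikey` extension; the helper names `opsin_url` and `fetch_inchikey` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: base URL of the public OPSIN web service.
OPSIN_BASE_URL = "https://opsin.ch.cam.ac.uk/opsin/"


def opsin_url(iupac_name, extension="stdinchikey"):
    """Build the request URL for one IUPAC name (hypothetical helper)."""
    # Percent-encode the name so spaces and slashes are safe inside the URL path.
    return OPSIN_BASE_URL + quote(iupac_name, safe="") + "." + extension


def fetch_inchikey(iupac_name, timeout=10):
    """Request the standard InChIKey for a name; needs network access."""
    with urlopen(opsin_url(iupac_name), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling `fetch_inchikey("methane")` should return the same InChIKey as the Groovy script above, provided the service is available. For larger batches of names, a local OPSIN installation is likely the kinder option for the public server.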

egonw avatar Jul 11 '22 05:07 egonw

@egonw @robertgiessmann @YojanaGadiya @tabbassidaloii

How far is this from a PR?

Would you mind setting a milestone for this content and the associated issues (#396)?

proccaserra avatar Jul 27 '22 20:07 proccaserra

Before the end of August.

egonw avatar Jul 27 '22 20:07 egonw

My summer did not go as planned. But making the PR now.

egonw avatar Sep 28 '22 13:09 egonw