mychem.info
New Data Source: GSRS
URL: https://gsrs.ncats.nih.gov
It provides a downloadable .gsrs file, which is essentially a 7-zip-compressed file containing a list of JSON objects.
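As a rough sketch of how the list of JSON objects might be read once the .gsrs file has been decompressed: the code below assumes one JSON object per line of decompressed text (possibly padded with whitespace), which has not been confirmed here. The function name `iter_gsrs_records` and the sample field names `uuid` and `substanceClass` are illustrative, not taken from the GSRS documentation.

```python
import json
from typing import Iterable, Iterator


def iter_gsrs_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of already-decompressed dump text.

    Assumption: each record is a single JSON object on its own line.
    """
    for line in lines:
        line = line.strip()  # drop surrounding tabs, spaces, and newlines
        if not line:
            continue  # skip blank lines
        yield json.loads(line)


# Made-up sample lines standing in for the decompressed file content.
sample_lines = [
    '\t\t{"uuid": "a1", "substanceClass": "chemical"}\n',
    '\n',
    '{"uuid": "b2", "substanceClass": "protein"}\n',
]
records = list(iter_gsrs_records(sample_lines))
```

Reading from an iterable of lines keeps the parser independent of whichever decompression step ends up being used.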
NOTE: GSRS is likely the successor of the previous GINAS resource (https://ginas.ncats.nih.gov now redirects to https://gsrs.ncats.nih.gov). We can include both gsrs and ginas for now, and remove ginas when we no longer need it.
The JSON objects do not seem to include an inchi or inchikey field (a smiles field is available, though); we are still confirming this with the GSRS team.
We confirmed with the GSRS team that inchikey was calculated based on the smiles field.
The KNIME workflow has an example of how to calculate inchikeys from GSRS using RDKit nodes, if that helps. https://hub.knime.com/-/spaces/-/~8MCL_tgTaY7uA37U/current-state/
In this case, we might just use our existing mapping from smiles to inchikey at MyChem.info to get the inchikey value as the primary _id key.
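A minimal sketch of that idea, under several assumptions: the existing smiles-to-inchikey mapping is represented by a plain dict (`SMILES_TO_INCHIKEY`, a hypothetical stand-in), the SMILES string is assumed to live at `record["structure"]["smiles"]`, and the lookup is an exact string match. A real lookup would probably need SMILES canonicalization first. The helper `assign_id` and the `gsrs` wrapper key are also illustrative.

```python
from typing import Optional

# Hypothetical stand-in for the existing smiles -> inchikey mapping.
# The single entry is ethanol.
SMILES_TO_INCHIKEY = {"CCO": "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"}


def assign_id(record: dict, mapping: dict) -> Optional[dict]:
    """Wrap a record with an inchikey-based _id, or return None if unmapped."""
    # Assumed location of the SMILES string inside a record.
    smiles = (record.get("structure") or {}).get("smiles")
    inchikey = mapping.get(smiles) if smiles else None
    if inchikey is None:
        return None  # no structure, or SMILES not found in the mapping
    return {"_id": inchikey, "gsrs": record}


doc = assign_id({"structure": {"smiles": "CCO"}}, SMILES_TO_INCHIKEY)
no_doc = assign_id({"substanceClass": "protein"}, SMILES_TO_INCHIKEY)
```

Records that return None here would need some other identifier if they are to be kept.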
@newgene each record signifies either a chemical, concept, polymer, nucleic acid, protein, mixture, substance group, or diverse. Only chemicals and polymers have SMILES and InChI. What should each record in our API signify?
PS this dashboard is useful for data exploration
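To put numbers behind the record-type question, one option is to tally records by type and count how many fall into the types that carry a structure. This is only a sketch: the field name `substanceClass` and the class labels `chemical` and `polymer` are assumptions about the JSON, and `summarize_by_class` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumed labels for the record types that have SMILES and InChI.
STRUCTURE_CLASSES = {"chemical", "polymer"}


def summarize_by_class(records: Iterable[dict]) -> Tuple[Counter, int]:
    """Return per-type counts and the number of records in structure-bearing types."""
    counts = Counter(r.get("substanceClass", "unknown") for r in records)
    with_structure = sum(
        n for cls, n in counts.items() if cls in STRUCTURE_CLASSES
    )
    return counts, with_structure


# Made-up records for illustration only.
example_records = [
    {"substanceClass": "chemical"},
    {"substanceClass": "protein"},
    {"substanceClass": "polymer"},
    {"substanceClass": "concept"},
    {},
]
class_counts, n_with_structure = summarize_by_class(example_records)
```

The gap between the total and `n_with_structure` is the set of records that could not get an inchikey-based _id.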