webchem
Convert CAS to SMILES
Dear webchem developers, I am aware that converting CAS to SMILES is usually not that complicated. However, under the current circumstances I've spent quite some time on this problem without success so far, so I thought you might be able to help me out.
What I have tried
- Converting CAS to SMILES using the `cts_convert` function. This didn't return a result for any CAS I tried, so I visited the CTS Proxy, which showed me "Error 500 When calling /rest/to values".
- Converting CAS to SMILES using ChemSpider, i.e. the `cs_convert` function. On the ChemSpider web page I was able to turn a single CAS number into a molecule description including a SMILES code. I signed in to ChemSpider and created an API key to automate the process for multiple CAS numbers. However, `cs_convert` refused to accept the argument `from="CAS"`, yielding `Error in match.arg(from, choices=valid) : 'arg' should be one of "csid", "inchikey", "inchi", "smiles", "mol"`. If ChemSpider generally supports conversion of CAS registry numbers, it would be a nice feature to extend `cs_convert` to perform this conversion as well.
- I was able to convert CAS to SMILES using CACTUS, but I couldn't find support for this API in `webchem` (or any other R package), and I would rather not rely on shell scripts, as they are brittle and highly platform dependent. Did I perhaps overlook some existing support for CACTUS?
- I tried the `ci_query` function to retrieve the SMILES code, which ran into `Service not available. Returning NA`.

Any help is much appreciated.
Hi @pstahlhofen, thanks for raising this issue.
- `cts_convert()` should be able to convert CAS to SMILES, yet it doesn't, and other examples seem to be failing as well. I will look into it, thanks for flagging.
- While the ChemSpider website supports many things, the APIs are more limited, and last time I checked they did not offer conversions from/to CAS.
- CACTUS is supported through `cir_query()` and it seems to work! Example with ethanol: `cir_query("64-17-5", from = "cas", to = "smiles")`
- The example with ethanol works on my end: `ci_query("64-17-5", from = "rn")` returns a list, and `sapply(<list>, function(x) x$smiles)` returns the SMILES for the compound.
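The `sapply()` extraction in the last point fails with "$ operator is invalid for atomic vectors" if any element of the list is a bare `NA`, which is what a failed query tends to leave behind. A minimal sketch of a more defensive extractor follows. The helper name `extract_smiles` and the mock data are hypothetical, and the shape of the mock list is an assumption about what `ci_query()` returns, not something confirmed here.

```r
# Mock of the list shape described above: one element per query, each
# holding a `smiles` field; a failed query is represented as a bare NA.
# (Assumption: the real structure returned by ci_query() may differ.)
mock_result <- list(
  "64-17-5" = data.frame(name = "ethanol", smiles = "CCO",
                         stringsAsFactors = FALSE),
  "00-00-0" = NA
)

# Hypothetical helper: return one SMILES string per query, NA where missing.
extract_smiles <- function(res) {
  vapply(res, function(x) {
    if (is.list(x) && !is.null(x$smiles) && length(x$smiles) > 0) {
      as.character(x$smiles[[1]])  # keep only the first SMILES per query
    } else {
      NA_character_                # failed or malformed entry
    }
  }, character(1))
}

smiles <- extract_smiles(mock_result)
print(smiles)
```

Using `vapply()` with `character(1)` guarantees a character vector of the same length as the input, so the result can be bound back to the original CAS numbers without surprises.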
You can also use PubChem to get SMILES from CAS. Again for ethanol, `get_cid("64-17-5", from = "xref/rn")` returns the CID of the compound, and then `pc_sect(702, "canonical smiles")` returns the section "canonical smiles" from ethanol's PubChem page, https://pubchem.ncbi.nlm.nih.gov/compound/702#section=Canonical-SMILES
Let me know if these answer your question.
Also, I'll keep this issue open until `cts_convert()` is resolved.
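Since several of the services mentioned in this thread were unavailable at one time or another, it may help to try them in sequence and keep the first usable answer. The following is only a sketch of that idea in base R: `first_available` and the mock converters are hypothetical names, and in practice each converter would wrap one of the webchem calls quoted above.

```r
# Try each converter in turn and return the first usable (non-NA) answer.
# `converters` is a named list of functions that take one CAS string and
# return a character vector. All names here are hypothetical.
first_available <- function(cas, converters) {
  for (service in names(converters)) {
    result <- tryCatch(
      converters[[service]](cas),
      error = function(e) NA_character_  # treat a failed request as no answer
    )
    result <- as.character(unlist(result))
    result <- result[!is.na(result) & nzchar(result)]  # drop NA and ""
    if (length(result) > 0) {
      return(list(service = service, smiles = result[[1]]))
    }
  }
  list(service = NA_character_, smiles = NA_character_)
}

# Stand-in converters so the sketch runs offline.
mock_converters <- list(
  down_service    = function(cas) stop("Service not available"),
  empty_service   = function(cas) "",
  working_service = function(cas) c("64-17-5" = "CCO")
)

hit <- first_available("64-17-5", mock_converters)
print(hit)
```

The returned list records which service answered, which is useful when different databases disagree on the SMILES for the same CAS number.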
Hi @stitam, thanks for the quick answer! `cir_query` solved my problem :) See below for details.
- Hmm, the HTTP status is OK but the strings in the result always seem to be empty.
- Alright.
- `cir_query` works great!
- Aha, `ci_query` works with `from="rn"` but not with `from="cas"`. Thanks for the example. If this is permanent, you might want to update the documentation on `ci_query`, where it says that `cas` is also supported.
`get_cid("64-17-5", from = "xref/rn")` ran into `Service not available`, and so did `get_cid("64-17-5", from = "xref/RN")`, which is provided as an example in the docs.
It looks like CTS is down completely right now. Looks like someone has already opened an issue: https://bitbucket.org/fiehnlab/ctsproxy/issues/38/error-500
Looks like they haven't closed any issues in quite some time.
Related? https://github.com/ropensci/webchem/issues/257
Yes, I think so
To clarify, if I remember correctly, `cts_convert()` doesn't currently use CTS's REST API, because it was broken for some time. `cts_convert()` uses a more web-scraping type approach, but #257 was a reminder to switch to using the REST API if it ever started working again. (Edit: I just checked and it's still broken over a year later because of an expired SSL certificate.)
CTS has had a lot of issues in the past, probably because of all the API dependencies it has, and it might be worthwhile contacting someone at the Fiehn Lab to get an idea of their long-term goals for the project before putting any effort into changing/fixing `cts_convert()`. If the Fiehn Lab isn't planning on maintaining CTS long term (e.g. because they don't have funding or staff), then it's maybe time to consider `cts_convert()` soft deprecated / superseded.
Thanks @Aariq, that is correct, the CTS REST API is not yet implemented in webchem. I contacted them the last time the service was down. I'll contact them again and ask about their long-term goals, and then we can decide.
Hi All,
Update on this issue: the service is back online, but queries are still not working as they used to.
This one works:
webchem::cts_convert("3380-34-5", "cas", "inchikey")
#> $`3380-34-5`
#> [1] "XEFQLINVKFYRCS-UHFFFAOYSA-N" "ZRWRPGGXCSSBAO-UHFFFAOYSA-N"
Created on 2021-11-25 by the reprex package (v2.0.1)
This one doesn't:
webchem::cts_convert("triclosan", "chemical name", "inchikey")
#> $triclosan
#> [1] NA
Created on 2021-11-25 by the reprex package (v2.0.1)
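When running many conversions at once, it can be handy to list which queries came back empty. A small sketch, assuming the result is a named list shaped like the two outputs above (the mock data below simply copies them):

```r
# Mock of the two results shown above (structure copied from the reprex output).
res <- list(
  "3380-34-5" = c("XEFQLINVKFYRCS-UHFFFAOYSA-N", "ZRWRPGGXCSSBAO-UHFFFAOYSA-N"),
  "triclosan" = NA
)

# TRUE for every query whose result contains nothing but NA.
failed <- vapply(res, function(x) all(is.na(x)), logical(1))

# Names of the queries that need another look.
print(names(res)[failed])
```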