openff-toolkit icon indicating copy to clipboard operation
openff-toolkit copied to clipboard

`from_smiles` ignores mapped smiles

Open Yoshanuikabundi opened this issue 3 years ago • 3 comments

Describe the bug The Molecule.from_smiles() method silently ignores any atomic indexing/mapping present in the SMILES string.

To Reproduce

from openff.toolkit.topology.molecule import Molecule

unmapped_mol_from_smiles = Molecule.from_smiles("HOH")
mol_from_smiles = Molecule.from_smiles("[H:1][O:3][H:2]")
mol_from_mapped_smiles = Molecule.from_mapped_smiles("[H:1][O:3][H:2]")

>>> unmapped_mol_from_smiles.atoms
[Atom(name=, atomic number=1),
 Atom(name=, atomic number=8),
 Atom(name=, atomic number=1)]

>>>  mol_from_smiles.atoms
[Atom(name=, atomic number=1),
 Atom(name=, atomic number=8),
 Atom(name=, atomic number=1)]

>>> mol_from_mapped_smiles.atoms
[Atom(name=, atomic number=1),
 Atom(name=, atomic number=1),
 Atom(name=, atomic number=8)]

Computing environment (please complete the following information):

  • Arch Linux
Output of `conda list`
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
abseil-cpp                20210324.2           h9c3ff4c_0    conda-forge
alabaster                 0.7.12                     py_0    conda-forge
amberlite                 16.0                     pypi_0    pypi
ambertools                21.11            py39hc630cb1_0    conda-forge
amberutils                21.0                     pypi_0    pypi
amqp                      5.0.9              pyhd8ed1ab_0    conda-forge
anyio                     3.5.0            py39hf3d152e_0    conda-forge
argcomplete               2.0.0              pyhd8ed1ab_0    conda-forge
argon2-cffi               21.3.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0           py39h3811e60_1    conda-forge
arpack                    3.7.0                hdefa2d7_2    conda-forge
arrow-cpp                 6.0.1           py39h2c67c1e_5_cpu    conda-forge
asgiref                   3.5.0              pyhd8ed1ab_0    conda-forge
asttokens                 2.0.5              pyhd8ed1ab_0    conda-forge
astunparse                1.6.3              pyhd8ed1ab_0    conda-forge
attrs                     21.4.0             pyhd8ed1ab_0    conda-forge
autodoc-pydantic          1.6.1              pyhd8ed1ab_0    conda-forge
aws-c-auth                0.6.8                hadad3cd_1    conda-forge
aws-c-cal                 0.5.12               h70efedd_7    conda-forge
aws-c-common              0.6.17               h7f98852_0    conda-forge
aws-c-compression         0.2.14               h7c7754b_7    conda-forge
aws-c-event-stream        0.2.7               hd2be095_32    conda-forge
aws-c-http                0.6.10               h416565a_3    conda-forge
aws-c-io                  0.10.14              he836878_0    conda-forge
aws-c-mqtt                0.7.10               h885097b_0    conda-forge
aws-c-s3                  0.1.29               h8d70ed6_0    conda-forge
aws-c-sdkutils            0.1.1                h7c7754b_4    conda-forge
aws-checksums             0.1.12               h7c7754b_6    conda-forge
aws-crt-cpp               0.17.10              h6ab17b9_5    conda-forge
aws-sdk-cpp               1.9.160              h36ff4c5_0    conda-forge
babel                     2.9.1              pyh44b312d_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
basis_set_exchange        0.9                pyhd8ed1ab_0    conda-forge
beautifulsoup4            4.10.0                   pypi_0    pypi
billiard                  3.6.4.0          py39h3811e60_1    conda-forge
black                     21.12b0            pyhd8ed1ab_0    conda-forge
bleach                    4.1.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.0               h9c3ff4c_0    conda-forge
boost                     1.74.0           py39h5472131_4    conda-forge
boost-cpp                 1.74.0               h359cf19_5    conda-forge
brotli                    1.0.9                h7f98852_6    conda-forge
brotli-bin                1.0.9                h7f98852_6    conda-forge
brotlipy                  0.7.0           py39h3811e60_1003    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2021.10.8            ha878542_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                5.0.0              pyhd8ed1ab_0    conda-forge
cairo                     1.16.0            ha00ac49_1009    conda-forge
celery                    5.1.2              pyhd8ed1ab_0    conda-forge
certifi                   2021.10.8        py39hf3d152e_1    conda-forge
cffi                      1.15.0           py39h4bc2ebd_0    conda-forge
charset-normalizer        2.0.10             pyhd8ed1ab_0    conda-forge
chemper                   1.0.0              pyh9f0ad1d_0    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
click-didyoumean          0.3.0              pyhd8ed1ab_0    conda-forge
click-option-group        0.5.3              pyhd8ed1ab_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
click-repl                0.2.0              pyhd8ed1ab_0    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
commonmark                0.9.1                      py_0    conda-forge
cryptography              36.0.1           py39h95dcef6_0    conda-forge
css-html-js-minify        2.5.5                    pypi_0    pypi
cudatoolkit               11.6.0               habf752d_9    conda-forge
curl                      7.81.0               h2574ce0_0    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
cython                    0.29.26          py39he80948d_0    conda-forge
dataclasses               0.8                pyhc8e2a94_3    conda-forge
debugpy                   1.5.1            py39he80948d_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
deprecated                1.2.13             pyh6c4a22f_0    conda-forge
doctest-oxide             0+untagged.11.gc938329          pypi_0    pypi
docutils                  0.17.1           py39hf3d152e_1    conda-forge
entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
executing                 0.8.2              pyhd8ed1ab_0    conda-forge
fastapi                   0.67.0             pyhd8ed1ab_0    conda-forge
fftw                      3.3.10          nompi_h77c792f_102    conda-forge
flit-core                 3.6.0              pyhd8ed1ab_0    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.13.1            hba837de_1005    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.28.5           py39h3811e60_0    conda-forge
forcebalance              1.9.2            py39h5472131_1    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
future                    0.18.2           py39hf3d152e_4    conda-forge
geometric                 0.9.7.2                    py_0    conda-forge
gettext                   0.19.8.1          h73d1719_1008    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
glog                      0.5.0                h48cff8f_0    conda-forge
greenlet                  1.1.2            py39he80948d_1    conda-forge
grpc-cpp                  1.42.0               ha1441d3_1    conda-forge
h11                       0.12.0             pyhd8ed1ab_0    conda-forge
h2                        4.1.0            py39hf3d152e_0    conda-forge
h5py                      3.6.0           nompi_py39h7e08c79_100    conda-forge
hdf4                      4.2.15               h10796ff_3    conda-forge
hdf5                      1.12.1          nompi_h2750804_103    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
httpcore                  0.14.5             pyhd8ed1ab_0    conda-forge
httpx                     0.22.0           py39hf3d152e_0    conda-forge
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       69.1                 h9c3ff4c_0    conda-forge
idna                      3.3                pyhd8ed1ab_0    conda-forge
imagesize                 1.3.0              pyhd8ed1ab_0    conda-forge
importlib-metadata        4.10.1           py39hf3d152e_0    conda-forge
importlib_metadata        4.10.1               hd8ed1ab_0    conda-forge
importlib_resources       5.4.0              pyhd8ed1ab_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
ipykernel                 6.7.0            py39hef51801_0    conda-forge
ipython                   8.0.1            py39hf3d152e_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.6.5              pyhd8ed1ab_0    conda-forge
jbig                      2.1               h7f98852_2003    conda-forge
jedi                      0.18.1           py39hf3d152e_0    conda-forge
jinja2                    3.0.3              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
jsonschema                4.4.0              pyhd8ed1ab_0    conda-forge
jupyter_client            7.1.2              pyhd8ed1ab_0    conda-forge
jupyter_core              4.9.1            py39hf3d152e_1    conda-forge
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_widgets        1.0.2              pyhd8ed1ab_0    conda-forge
kiwisolver                1.3.2            py39h1a9c180_1    conda-forge
kombu                     5.2.3            py39hf3d152e_0    conda-forge
krb5                      1.19.2               hcc1bbae_3    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
lerc                      3.0                  h9c3ff4c_0    conda-forge
libblas                   3.9.0           13_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h7f98852_6    conda-forge
libbrotlidec              1.0.9                h7f98852_6    conda-forge
libbrotlienc              1.0.9                h7f98852_6    conda-forge
libcblas                  3.9.0           13_linux64_openblas    conda-forge
libcurl                   7.81.0               h2574ce0_0    conda-forge
libdeflate                1.8                  h7f98852_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               h9b69904_4    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 11.2.0              h1d223b6_12    conda-forge
libgfortran-ng            11.2.0              h69a702a_12    conda-forge
libgfortran5              11.2.0              h5c6108e_12    conda-forge
libglib                   2.70.2               h174f98d_1    conda-forge
libgomp                   11.2.0              h1d223b6_12    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0           13_linux64_openblas    conda-forge
libnetcdf                 4.8.1           nompi_hb3fd0d9_101    conda-forge
libnghttp2                1.43.0               h812cca2_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libprotobuf               3.19.3               h780b84a_0    conda-forge
libsass                   0.21.0                   pypi_0    pypi
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libssh2                   1.10.0               ha56f1ee_2    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_12    conda-forge
libthrift                 0.15.0               he6d91bd_1    conda-forge
libtiff                   4.3.0                h6f004c6_2    conda-forge
libutf8proc               2.7.0                h7f98852_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.2                h7f98852_1    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libxml2                   2.9.12               h885dcf4_1    conda-forge
libxslt                   1.1.33               h0ef7038_3    conda-forge
libzip                    1.8.0                h4de3113_1    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
lxml                      4.7.1            py39h107f48f_0    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
markdown-it-py            2.0.1              pyhd8ed1ab_0    conda-forge
markupsafe                2.0.1            py39h3811e60_1    conda-forge
matplotlib-base           3.5.1            py39h2fa2bec_0    conda-forge
matplotlib-inline         0.1.3              pyhd8ed1ab_0    conda-forge
mdit-py-plugins           0.3.0              pyhd8ed1ab_0    conda-forge
mdtraj                    1.9.7            py39h138c130_1    conda-forge
mdurl                     0.1.0              pyhd8ed1ab_0    conda-forge
mistune                   0.8.4           py39h3811e60_1005    conda-forge
mmpbsa-py                 16.0                     pypi_0    pypi
mock                      4.0.3            py39hf3d152e_2    conda-forge
msgpack-python            1.0.3            py39h1a9c180_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mypy_extensions           0.4.3            py39hf3d152e_4    conda-forge
myst-parser               0.16.1             pyhd8ed1ab_0    conda-forge
nbclient                  0.5.10             pyhd8ed1ab_1    conda-forge
nbconvert                 6.4.0            py39hf3d152e_0    conda-forge
nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
ncurses                   6.3                  h9c3ff4c_0    conda-forge
nest-asyncio              1.5.4              pyhd8ed1ab_0    conda-forge
netcdf-fortran            4.5.4           nompi_h2b6e579_100    conda-forge
networkx                  2.6.3              pyhd8ed1ab_1    conda-forge
nglview                   3.0.3              pyh8a188c0_0    conda-forge
nomkl                     1.0                  h5ca1d4c_0    conda-forge
notebook                  6.4.7              pyha770c72_0    conda-forge
numexpr                   2.8.0           py39hbd72853_100    conda-forge
numpy                     1.22.1           py39h91f2184_0    conda-forge
ocl-icd                   2.3.1                h7f98852_0    conda-forge
ocl-icd-system            1.0.0                         1    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openeye-toolkits          2021.2.0                 py39_0    openeye
openff-bespokefit         27rc4904+72.g60dd457           dev_0    
openff-forcefields        2.0.0              pyh6c4a22f_0    conda-forge
openff-fragmenter-base    0.1.2              pyhd8ed1ab_0    conda-forge
openff-qcsubmit           0.3.0              pyhd8ed1ab_0    conda-forge
openff-sphinx-theme       0.0.32+115.gcb16b0d          pypi_0    pypi
openff-toolkit-base       0.10.2             pyhd8ed1ab_0    conda-forge
openff-utilities          0.1.1              pyh6c4a22f_0    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openmm                    7.7.0            py39h792354b_0    conda-forge
openssl                   1.1.1l               h7f98852_0    conda-forge
orc                       1.7.1                h1be678f_1    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
packmol                   20.010               h86c2bf4_0    conda-forge
packmol-memgen            1.2.1rc0                 pypi_0    pypi
pandas                    1.4.0            py39hde0f152_0    conda-forge
pandoc                    2.17.0.1             h7f98852_0    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
parmed                    3.4.3            py39he80948d_1    conda-forge
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.8.3              pyhd8ed1ab_0    conda-forge
pathspec                  0.9.0              pyhd8ed1ab_0    conda-forge
pcre                      8.45                 h9c3ff4c_0    conda-forge
pdb4amber                 20.1                     pypi_0    pypi
perl                      5.32.1          1_h7f98852_perl5    conda-forge
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    8.4.0            py39ha612740_0    conda-forge
pint                      0.18               pyhd8ed1ab_0    conda-forge
pip                       22.0.2             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
platformdirs              2.3.0              pyhd8ed1ab_0    conda-forge
plotly                    5.5.0              pyhd8ed1ab_0    conda-forge
pluggy                    1.0.0            py39hf3d152e_2    conda-forge
prometheus_client         0.12.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.24             pyha770c72_0    conda-forge
prompt_toolkit            3.0.24               hd8ed1ab_0    conda-forge
psutil                    5.9.0            py39h3811e60_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
py                        1.11.0             pyh6c4a22f_0    conda-forge
py-cpuinfo                8.0.0              pyhd8ed1ab_0    conda-forge
pyarrow                   6.0.1           py39hff6fa39_5_cpu    conda-forge
pycairo                   1.20.1           py39hedcb9fc_1    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pydantic                  1.9.0            py39h3811e60_0    conda-forge
pygments                  2.11.2             pyhd8ed1ab_0    conda-forge
pymbar                    3.0.5            py39hce5d2b2_3    conda-forge
pyopenssl                 21.0.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.7              pyhd8ed1ab_0    conda-forge
pyrsistent                0.18.1           py39h3811e60_0    conda-forge
pysocks                   1.7.1            py39hf3d152e_4    conda-forge
pytables                  3.6.1            py39h2669a42_5    conda-forge
pytest                    6.2.5            py39hf3d152e_2    conda-forge
python                    3.9.10          h85951f9_1_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-slugify            5.0.2                    pypi_0    pypi
python_abi                3.9                      2_cp39    conda-forge
pytraj                    2.0.6                    pypi_0    pypi
pytz                      2021.3             pyhd8ed1ab_0    conda-forge
pyyaml                    6.0              py39h3811e60_3    conda-forge
pyzmq                     22.3.0           py39h37b5a0c_1    conda-forge
qcelemental               0.24.0             pyhd8ed1ab_0    conda-forge
qcengine                  0.22.0             pyhd8ed1ab_0    conda-forge
qcportal                  0.15.7             pyhd8ed1ab_0    conda-forge
rdkit                     2021.09.4        py39hccf6a74_0    conda-forge
re2                       2021.11.01           h9c3ff4c_0    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
redis-py                  4.1.2              pyhd8ed1ab_0    conda-forge
redis-server              6.2.6                h9b69904_0    conda-forge
regex                     2022.1.18        py39h3811e60_0    conda-forge
reportlab                 3.5.68           py39he59360d_1    conda-forge
requests                  2.27.1             pyhd8ed1ab_0    conda-forge
rfc3986                   1.5.0              pyhd8ed1ab_0    conda-forge
rich                      11.1.0             pyhd8ed1ab_0    conda-forge
s2n                       1.3.0                h9b69904_0    conda-forge
sander                    16.0                     pypi_0    pypi
scipy                     1.7.3            py39hee8e79c_0    conda-forge
send2trash                1.8.0              pyhd8ed1ab_0    conda-forge
setuptools                60.5.0           py39hf3d152e_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smirnoff99frosst          1.1.0              pyh44b312d_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
sniffio                   1.2.0            py39hf3d152e_2    conda-forge
snowballstemmer           2.2.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.3.1                    pypi_0    pypi
sphinx                    4.4.0              pyh6c4a22f_1    conda-forge
sphinx-click              3.0.3              pyhd8ed1ab_0    conda-forge
sphinxcontrib-applehelp   1.0.2                      py_0    conda-forge
sphinxcontrib-devhelp     1.0.2                      py_0    conda-forge
sphinxcontrib-htmlhelp    2.0.0              pyhd8ed1ab_0    conda-forge
sphinxcontrib-jsmath      1.0.1                      py_0    conda-forge
sphinxcontrib-qthelp      1.0.3                      py_0    conda-forge
sphinxcontrib-serializinghtml 1.1.5              pyhd8ed1ab_1    conda-forge
sqlalchemy                1.4.31           py39h3811e60_0    conda-forge
sqlite                    3.37.0               h9cd32fc_0    conda-forge
stack_data                0.1.4              pyhd8ed1ab_0    conda-forge
starlette                 0.14.2             pyhd8ed1ab_0    conda-forge
tenacity                  8.0.1              pyhd8ed1ab_0    conda-forge
terminado                 0.12.1           py39hf3d152e_1    conda-forge
testpath                  0.5.0              pyhd8ed1ab_0    conda-forge
text-unidecode            1.3                      pypi_0    pypi
tk                        8.6.11               h27826a3_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tomli                     1.2.2              pyhd8ed1ab_0    conda-forge
tornado                   6.1              py39h3811e60_2    conda-forge
torsiondrive              1.1.0              pyhd8ed1ab_0    conda-forge
tqdm                      4.62.3             pyhd8ed1ab_0    conda-forge
traitlets                 5.1.1              pyhd8ed1ab_0    conda-forge
typed-ast                 1.5.2            py39h3811e60_0    conda-forge
typing-extensions         4.0.1                hd8ed1ab_0    conda-forge
typing_extensions         4.0.1              pyha770c72_0    conda-forge
tzdata                    2021e                he74cb21_0    conda-forge
unidecode                 1.3.2                    pypi_0    pypi
urllib3                   1.26.8             pyhd8ed1ab_1    conda-forge
uvicorn                   0.17.1           py39hf3d152e_0    conda-forge
vine                      5.0.0              pyhd8ed1ab_1    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
widgetsnbextension        3.5.2            py39hf3d152e_1    conda-forge
wrapt                     1.13.3           py39h3811e60_1    conda-forge
xmltodict                 0.12.0                     py_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.7.2                h7f98852_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-libxt                1.2.1                h7f98852_2    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xtb                       6.4.1                hf06ca72_2    conda-forge
xtb-python                20.2             py39h3811e60_4    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zeromq                    4.3.4                h9c3ff4c_1    conda-forge
zipp                      3.7.0              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h36c2ea0_1013    conda-forge
zstd                      1.5.2                ha95c52a_0    conda-forge

Additional context from_mapped_smiles() exists, but my impression is that it's there to require a mapping (and raise an error if its incomplete). I think from_smiles() should make a best effort attempt to get any mapped atoms in the right spot. I'm not sure its obvious that from_smiles won't do this.

Also, BespokeFit (and possibly other downstream projects) allows users to specify a molecule with SMILES. I can imagine users of BespokeFit wanting to specify exact atom ordering, and being surprised when their mapped smiles neither raises an error nor reorders the atoms.

Yoshanuikabundi avatar Jan 31 '22 07:01 Yoshanuikabundi

Ah, I think this is a documentation(/"it does what Jeff thinks it should do, he just didn't tell anyone what that is") problem. It's hard to specify "correct" behavior here because atom indices in SMILES don't have a universal meaning. Molecule.from_smiles actually DOES read and store the map indices, it just considers them cosmetic.

from openff.toolkit.topology.molecule import Molecule

mol_from_smiles = Molecule.from_smiles("[H:1][O:3][H:2]")

print(mol_from_smiles.properties['atom_map'])

{0: 1, 1: 3, 2: 2}

This distinction in the API keeps us accepting of all input without needing to deal with a zillion corner cases (what would we do with the perfectly valid SMILES [H:1][H:1] or [H:5][H]?)

Do you have a sense for whether it'd be better to handle this on a docs or a programming level?

(possibly related: I kinda tried to figure out the "common" meaning of map indices in #522 but ran out of organizational bandwidth to carry it forward. )

j-wags avatar Feb 02 '22 21:02 j-wags

Another user encountered this recently (https://github.com/openmm/spice-dataset/issues/16), @Yoshanuikabundi do you think we can make it more explicit in the documentation. Another random thought, does adding an extra argument mapped=True, similar to to_smiles(mapped=True), and pointing it to from_mapped_smiles() internally work as an alternative?

pavankum avatar Feb 09 '22 15:02 pavankum

I've also been bit by this - not resulting in broken production, but resulting in me looking goofy passing mapped SMILES to Molecule.from_smiles. I didn't know from_mapped_smiles exists and wouldn't have come to know it if Jeff didn't catch my error and point me to it - it would have been a silent error forever if I was an independent downstream developer or scientist. I don't think the current implementation addresses this - both because the stored atom maps are separated from the ordering of the atoms in the molecule and because I wouldn't have known they were there. (In all honesty I might not have pulled up the docs, but this behavior is not there/not visible.) Being a silent error/gotcha, I think a behavior change is justified alongside documentation changes.

I'd be naively in favor of any of the following solutions:

  • Molecule.from_smiles checks to see if the SMILES is mapped as delegates to the the current functionality of Molecule.from_mapped_files, erroring out if it runs into confusing corner cases
  • Molecule.from_smiles checks to see if the SMILES is mapped as delegates to the the current functionality of Molecule.from_mapped_files, bailing out and processing it as un-mapped SMILES with a warning if something goes wrong while processing it as a mapped SMILES
  • Molecule.from_smiles accepts mapped=True (default False) as a named argument, consolidating behavior in one method in a way that's similar to Molecule.to_smiles

mattwthompson avatar Feb 09 '22 15:02 mattwthompson

Ran into this footgun again - okay sure, I should know better, but also I don't think silently ignoring or transforming information is how we should be treating inputs.

ipdb> Molecule.from_smiles("[H:2][O:1][H:3]").atoms
[Atom(name=, atomic number=1), Atom(name=, atomic number=8), Atom(name=, atomic number=1)]
ipdb> Molecule.from_smiles("[H:1][O:3][H:2]").atoms
[Atom(name=, atomic number=1), Atom(name=, atomic number=8), Atom(name=, atomic number=1)]

If the concern is handling ambiguous or ill-defined cases, why can't we just error out? I think that's what error handling is for

mattwthompson avatar Oct 17 '23 19:10 mattwthompson

If the concern is handling ambiguous or ill-defined cases, why can't we just error out?

It's quite common to have fully mapped SMILES, where the map indices mean something other than "put the atoms in this order in a connection table" so raising an error would go too far. This does seem to be a really frequent footgun, though, even for experienced users. Would adding a warning be a good solution?

j-wags avatar Oct 17 '23 20:10 j-wags

A warning would be great - until #522 is implemented I think the toolkit should be loud about what it does and does not handle

mattwthompson avatar Oct 17 '23 20:10 mattwthompson

Cool - I'll accept a PR that emits a warning when a fully mapped SMILES is passed to from_smiles!

j-wags avatar Oct 17 '23 20:10 j-wags

I've been bitten by this so many times! Thank you!

jchodera avatar Oct 18 '23 02:10 jchodera

#1474 and should land in 0.14.5 or 0.15.0

mattwthompson avatar Oct 18 '23 15:10 mattwthompson