MoleculeSTM
Docker Failed (RDKit)
Issue: Docker fails at installing rdkit.
Attempted workarounds: removing the rdkit version pin, and using pip install rdkit instead.
docker build -t molecule_stm .
[+] Building 40.7s (13/27) docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.39kB 0.0s
=> [internal] load metadata for nvcr.io/nvidia/pytorch:22.01-py3 1.6s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [ 1/24] FROM nvcr.io/nvidia/pytorch:22.01-py3@sha256:06f27ba669 0.0s
=> CACHED [ 2/24] RUN useradd -ms /bin/bash shengchaol 0.0s
=> CACHED [ 3/24] WORKDIR /home/shengchaol 0.0s
=> CACHED [ 4/24] RUN chmod -R 777 /home/shengchaol 0.0s
=> CACHED [ 5/24] RUN chmod -R 777 /usr/bin 0.0s
=> CACHED [ 6/24] RUN chmod -R 777 /bin 0.0s
=> CACHED [ 7/24] RUN chmod -R 777 /usr/local 0.0s
=> CACHED [ 8/24] RUN chmod -R 777 /opt/conda 0.0s
=> CACHED [ 9/24] RUN conda install -y python=3.7 0.0s
=> ERROR [10/24] RUN conda install -y -c rdkit rdkit=2020.09.1.0 39.1s
------
> [10/24] RUN conda install -y -c rdkit rdkit=2020.09.1.0:
1.165 Collecting package metadata (current_repodata.json): ...working... done
14.62 Solving environment: ...working... failed with initial frozen solve. Retrying with flexible solve.
30.82 Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
38.38
38.38 ResolvePackageNotFound:
38.38 - conda==4.11.0
38.38
------
Dockerfile:20
--------------------
18 | RUN conda install -y python=3.7
19 |
20 | >>> RUN conda install -y -c rdkit rdkit=2020.09.1.0
21 | # RUN pip install rdkit
22 | RUN conda install -y -c conda-forge -c pytorch pytorch=1.9.1
--------------------
ERROR: failed to solve: process "/bin/sh -c conda install -y -c rdkit rdkit=2020.09.1.0" did not complete successfully: exit code: 1
Hi @amelie-iska,
This exception is caused by the wrong rdkit version. One solution is as follows:
- First, use this command to find the available rdkit versions:
conda search rdkit -c rdkit. Here is what I got (from my current working env, not the one from NVIDIA):
rdkit 2014.03.1 np18py27_3 rdkit
rdkit 2014.09.1 np19py26_1 rdkit
rdkit 2014.09.1 np19py27_1 rdkit
...
...
...
rdkit 2020.03.3.0 py37hc20afe1_1 rdkit
rdkit 2020.09.1.0 py36hd50e099_1 rdkit
rdkit 2020.09.1.0 py37hd50e099_1 rdkit
- Then try one of the other versions instead of 2020.09.1.0.
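To make the version hunt above less manual, here is a minimal sketch that filters `conda search` rows down to the builds matching a given Python tag. The sample rows are copied from the output quoted above; the helper name `versions_for_python` and the assumption that each row has exactly four whitespace-separated fields (name, version, build, channel) are mine, not part of conda.

```python
from typing import List

# Sample rows copied from the `conda search rdkit -c rdkit` output quoted above.
SEARCH_OUTPUT = """\
rdkit 2020.03.3.0 py37hc20afe1_1 rdkit
rdkit 2020.09.1.0 py36hd50e099_1 rdkit
rdkit 2020.09.1.0 py37hd50e099_1 rdkit
"""


def versions_for_python(search_output: str, py_tag: str) -> List[str]:
    """Return rdkit versions whose build string contains py_tag (e.g. 'py37').

    Hypothetical helper: assumes rows look like 'name version build channel'.
    """
    versions: List[str] = []
    for line in search_output.splitlines():
        fields = line.split()
        # Skip header lines, blank lines, and elided rows such as '...'.
        if len(fields) != 4 or fields[0] != "rdkit":
            continue
        _, version, build, _ = fields
        if py_tag in build and version not in versions:
            versions.append(version)
    return versions


print(versions_for_python(SEARCH_OUTPUT, "py37"))
```

Since the Dockerfile pins python=3.7, the `py37` builds are the candidates worth trying in the `conda install` line.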
@amelie-iska I had the same issue. Changing most conda installs to pip installs fixed this for me. This is the Dockerfile that worked:
FROM nvcr.io/nvidia/pytorch:22.01-py3 as base
# create a new user
RUN useradd -ms /bin/bash shengchaol
# #change to this user
# USER shengchaol
#set working directory
WORKDIR /home/shengchaol
RUN chmod -R 777 /home/shengchaol
RUN chmod -R 777 /usr/bin
RUN chmod -R 777 /bin
RUN chmod -R 777 /usr/local
RUN chmod -R 777 /opt/conda
RUN conda install -y python=3.7
RUN pip install rdkit
RUN conda install -y -c conda-forge -c pytorch pytorch=1.9.1
RUN conda install -y -c pyg -c conda-forge pyg
RUN pip install requests
RUN pip install tqdm
RUN pip install matplotlib
RUN pip install spacy
# for SciBert
RUN pip install boto3
RUN pip install transformers
# for MoleculeNet
RUN pip install ogb
# install pysmilesutils
RUN python -m pip install git+https://github.com/MolecularAI/pysmilesutils.git
RUN pip install deepspeed
# install Megatron
RUN cd /tmp && git clone https://github.com/MolecularAI/MolBART.git --branch megatron-molbart-with-zinc && cd /tmp/MolBART/megatron_molbart/Megatron-LM-v1.1.5-3D_parallelism && pip install .
# install apex
RUN cd /tmp && git clone https://github.com/chao1224/apex.git
RUN cd /tmp/apex/ && pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
#expose port for Jupyter
EXPOSE 8888
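After the image builds, a quick way to confirm the key packages actually landed is to check that they can be located from inside the container. The sketch below is only an illustration: the module list is my assumption based on the install lines in the Dockerfile above (`torch_geometric` is the import name of the `pyg` package), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names assumed from the install lines in the Dockerfile above.
REQUIRED = ["rdkit", "torch", "torch_geometric", "transformers", "ogb", "deepspeed"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing if missing else "none")
```

Saved as a script (for example `check_env.py`, a name chosen here for illustration), it could be run with `python check_env.py` inside the container. Note that it only checks that each module can be found, not that it imports without errors.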
Thanks @kosonocky! Any chance you've had any luck with ProteinDT (the other CLIP-based model by @chao1224) as well?