Installation issue
Hello ESM team:
I am trying to set up ESM on our HPC cluster (OS: RHEL 7.9). I tried the following:
1) Create a conda environment named esm2-1.0.3:
   conda create -n esm2-1.0.3 python=3.7.3
2) conda activate esm2-1.0.3
3) (esm2-1.0.3) [ryao@tdragon4 esm]$ pip install fair-esm
4) (esm2-1.0.3) [ryao@tdragon4 esm]$ pip install torch
   Installing collected packages: typing-extensions, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cublas-cu11, nvidia-cudnn-cu11, torch
   Successfully installed nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 torch-1.13.0 typing-extensions-4.4.0
5) (esm2-1.0.3) [ryao@tdragon4 esm]$ module load cuda11.2/toolkit/11.2.0
   (esm2-1.0.3) [ryao@tdragon4 esm]$ which nvcc
   /cm/shared/apps/cuda11.2/toolkit/11.2.0/bin/nvcc
   (esm2-1.0.3) [ryao@tdragon4 esm]$ pip install fair-esm[esmfold]
   ......
   Collecting deepspeed==0.5.9
     Downloading deepspeed-0.5.9.tar.gz (510 kB)
        ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 510.3/510.3 kB 3.6 MB/s eta 0:00:00
     Preparing metadata (setup.py) ... error
     error: subprocess-exited-with-error

     × python setup.py egg_info did not run successfully.
     ......
     OSError: /rsrch3/home/itops/ryao/.conda/envs/esm2-1.0.3/lib/python3.7/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: symbol cublasLtGetStatusString, version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference
     [end of output]

   note: This error originates from a subprocess, and is likely not a problem with pip.
   error: metadata-generation-failed
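One thing that may be worth checking, offered as an unconfirmed guess rather than a diagnosis: the OSError names libcublasLt.so.11, and the cuda11.2 module loaded in step 5 may place its own copy of that library on LD_LIBRARY_PATH ahead of the one pip installed with nvidia-cublas-cu11. The sketch below lists every directory on a search path that provides the library. The helper name find_lib_in_path is hypothetical, and the script assumes bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ESM, torch, or CUDA): print each directory
# in a colon-separated search path that contains a file whose name starts
# with the given library name.
find_lib_in_path() {
    local lib="$1" search_path="$2" dir f
    local -a dirs
    # Split the search path on ':' into an array.
    IFS=':' read -r -a dirs <<< "$search_path"
    for dir in "${dirs[@]}"; do
        [ -n "$dir" ] || continue          # skip empty path entries
        for f in "$dir/$lib"*; do
            if [ -e "$f" ]; then           # the glob matched a real file
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}

# Which directories on LD_LIBRARY_PATH provide libcublasLt.so.11?
find_lib_in_path "libcublasLt.so.11" "${LD_LIBRARY_PATH:-}"
```

If this prints a directory under /cm/shared/apps/cuda11.2, that copy could be the one the loader resolves first. Retrying the install without the module loaded would be one way to test that assumption.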
Could you please advise on what the problem might be? Please let me know if you need more information.
Thank you, Rong Yao