alphafold doesn't work on an RTX 4090?
Environment: Win11 + WSL2 + Ubuntu 22.04, Docker Desktop for WSL 4.25.0, NVIDIA RTX 4090, CUDA 12.2.
conda list:
# packages in environment at /home/a22/anaconda3/envs/alphafold2-docker:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main defaults
_openmp_mutex 5.1 1_gnu defaults
absl-py 1.0.0 pypi_0 pypi
bzip2 1.0.8 h7b6447c_0 defaults
ca-certificates 2023.08.22 h06a4308_0 defaults
certifi 2023.11.17 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
docker 5.0.0 pypi_0 pypi
idna 3.6 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1 defaults
libffi 3.4.4 h6a678d5_0 defaults
libgcc-ng 11.2.0 h1234567_1 defaults
libgomp 11.2.0 h1234567_1 defaults
libstdcxx-ng 11.2.0 h1234567_1 defaults
libuuid 1.41.5 h5eee18b_0 defaults
ncurses 6.4 h6a678d5_0 defaults
openssl 3.0.12 h7f8727e_0 defaults
pip 23.3.1 py311h06a4308_0 defaults
python 3.11.5 h955ad1f_0 defaults
readline 8.2 h5eee18b_0 defaults
requests 2.31.0 pypi_0 pypi
setuptools 68.0.0 py311h06a4308_0 defaults
six 1.16.0 pypi_0 pypi
sqlite 3.41.2 h5eee18b_0 defaults
tk 8.6.12 h1ccaba5_0 defaults
tzdata 2023c h04d1e81_0 defaults
urllib3 2.1.0 pypi_0 pypi
websocket-client 1.6.4 pypi_0 pypi
wheel 0.41.2 py311h06a4308_0 defaults
xz 5.4.2 h5eee18b_0 defaults
zlib 1.2.13 h5eee18b_0 defaults
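(A quick way to confirm the GPU is reachable from containers under this WSL2 setup, using the docker SDK already present in the environment above, is a sketch like the following; the CUDA base image tag is only an example, any recent one should do.)

import docker

# Run nvidia-smi inside a throwaway CUDA container. If this fails, the problem
# is the Docker/GPU plumbing under WSL2 rather than AlphaFold itself.
client = docker.from_env()
out = client.containers.run(
    'nvidia/cuda:12.2.0-base-ubuntu22.04',  # example image tag
    'nvidia-smi',
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])],
    remove=True,
)
print(out.decode())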
(alphafold2-docker) a22@C10H15N:~/alphafold$ docker-compose version
Docker Compose version v2.23.0-desktop.1
(alphafold2-docker) a22@C10H15N:~/alphafold$ bash run.sh
I1130 21:29:25.508691 140208603498304 run_docker.py:116] Mounting /home/a22/alphafold/project/6y4f -> /mnt/fasta_path_0
I1130 21:29:25.508810 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/uniref90 -> /mnt/uniref90_database_path
I1130 21:29:25.508872 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/mgnify -> /mnt/mgnify_database_path
I1130 21:29:25.508915 140208603498304 run_docker.py:116] Mounting /home/a22/afdata -> /mnt/data_dir
I1130 21:29:25.508964 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/pdb_mmcif/mmcif_files -> /mnt/template_mmcif_dir
I1130 21:29:25.509018 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/pdb_mmcif -> /mnt/obsolete_pdbs_path
I1130 21:29:25.509066 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/pdb70 -> /mnt/pdb70_database_path
I1130 21:29:25.509122 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/uniref30 -> /mnt/uniref30_database_path
I1130 21:29:25.509181 140208603498304 run_docker.py:116] Mounting /home/a22/afdata/bfd -> /mnt/bfd_database_path
I1130 21:29:26.290893 140208603498304 run_docker.py:258] /sbin/ldconfig.real: /usr/lib/x86_64-linux-gnu/libcuda.so.1 is not a symbolic link
I1130 21:29:26.290986 140208603498304 run_docker.py:258]
I1130 21:29:31.820214 140208603498304 run_docker.py:258] I1130 13:29:31.819571 139803493142784 templates.py:858] Using precomputed obsolete pdbs /mnt/obsolete_pdbs_path/obsolete.dat.
I1130 21:29:34.677552 140208603498304 run_docker.py:258] I1130 13:29:34.676982 139803493142784 xla_bridge.py:353] Unable to initialize backend 'tpu_driver': NOT_FOUND: Unable to find driver in registry given worker:
I1130 21:29:35.058179 140208603498304 run_docker.py:258] I1130 13:29:35.057624 139803493142784 xla_bridge.py:353] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: Interpreter Host CUDA
I1130 21:29:35.058630 140208603498304 run_docker.py:258] I1130 13:29:35.058142 139803493142784 xla_bridge.py:353] Unable to initialize backend 'tpu': module 'jaxlib.xla_extension' has no attribute 'get_tpu_client'
I1130 21:29:35.058690 140208603498304 run_docker.py:258] I1130 13:29:35.058253 139803493142784 xla_bridge.py:353] Unable to initialize backend 'plugin': xla_extension has no attributes named get_plugin_device_client. Compile TensorFlow with //tensorflow/compiler/xla/python:enable_plugin_device set to true (defaults to false) to enable this.
I1130 21:29:35.059734 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.059441: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 103026786304 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.060191 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.059927: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 92724109312 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.060706 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.060484: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 83451699200 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.061046 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.060830: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 75106525184 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.061371 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.061158: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 67595870208 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.061750 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.061529: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 60836282368 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.062093 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.061863: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 54752653312 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.062418 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.062212: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 49277386752 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.062761 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.062543: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 44349648896 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.063068 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.062867: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 39914684416 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
I1130 21:29:35.063122 140208603498304 run_docker.py:258] 2023-11-30 13:29:35.062887: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:767] failed to alloc 35923214336 bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
Old GPUs do not support Unified Memory.
Try removing these lines from the run script:
'TF_FORCE_UNIFIED_MEMORY': '1',
'XLA_PYTHON_CLIENT_MEM_FRACTION': '4.0',
Maybe the workaround in #863 can help you...
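For context, those two variables are part of the container environment built in docker/run_docker.py, which the launcher passes to client.containers.run(..., environment=...). A minimal sketch of the change (paraphrased, not the verbatim upstream block; 'all' stands in for the script's --gpu_devices flag):

environment = {
    'NVIDIA_VISIBLE_DEVICES': 'all',  # the upstream script derives this from its --gpu_devices flag
    # Dropped for WSL2 / RTX 4090: forcing unified memory is what produces the
    # failed ~100 GB allocations in the log above (4.0 x ~24 GB of VRAM); without
    # these flags XLA stays within the card's own device memory.
    # 'TF_FORCE_UNIFIED_MEMORY': '1',
    # 'XLA_PYTHON_CLIENT_MEM_FRACTION': '4.0',
}

Commenting the two entries out in the launcher is enough; nothing inside the image needs to change.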