Hi guys, how can I solve this issue?
I have set the TensorFlow version to 2.15.1.
Python 3.11
The other requirements follow the general AlphaFold setup.
Does CUDA 18.0 have a problem with the RTX 4090?
I1119 14:49:41.348441 140491874862912 run_docker.py:258] 2023-11-19 05:49:41.347342: W external/org_tensorflow/tensorflow/compiler/xla/stream_executor/gpu/asm_compiler.cc:234] Used ptxas at ptxas
I1119 14:49:41.358434 140491874862912 run_docker.py:258] 2023-11-19 05:49:41.357955: E external/org_tensorflow/tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:628] failed to get PTX kernel "shift_right_logical" from module: CUDA_ERROR_NOT_FOUND: named symbol not found
I1119 14:49:41.358683 140491874862912 run_docker.py:258] 2023-11-19 05:49:41.358008: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2153] Execution of replica 0 failed: INTERNAL: Could not find the corresponding function
I1119 14:49:41.358900 140491874862912 run_docker.py:258] Traceback (most recent call last):
I1119 14:49:41.359112 140491874862912 run_docker.py:258] File "/app/alphafold/run_alphafold.py", line 570, in <module>
I1119 14:49:41.359349 140491874862912 run_docker.py:258] app.run(main)
I1119 14:49:41.359516 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/absl/app.py", line 312, in run
I1119 14:49:41.359711 140491874862912 run_docker.py:258] _run_main(main, args)
I1119 14:49:41.359894 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/absl/app.py", line 258, in _run_main
I1119 14:49:41.360071 140491874862912 run_docker.py:258] sys.exit(main(argv))
I1119 14:49:41.360242 140491874862912 run_docker.py:258] File "/app/alphafold/run_alphafold.py", line 543, in main
I1119 14:49:41.360380 140491874862912 run_docker.py:258] predict_structure(
I1119 14:49:41.360506 140491874862912 run_docker.py:258] File "/app/alphafold/run_alphafold.py", line 284, in predict_structure
I1119 14:49:41.360635 140491874862912 run_docker.py:258] prediction_result = model_runner.predict(processed_feature_dict,
I1119 14:49:41.360762 140491874862912 run_docker.py:258] File "/app/alphafold/alphafold/model/model.py", line 167, in predict
I1119 14:49:41.360888 140491874862912 run_docker.py:258] result = self.apply(self.params, jax.random.PRNGKey(random_seed), feat)
I1119 14:49:41.361012 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/random.py", line 132, in PRNGKey
I1119 14:49:41.361175 140491874862912 run_docker.py:258] key = prng.seed_with_impl(impl, seed)
I1119 14:49:41.361302 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/prng.py", line 267, in seed_with_impl
I1119 14:49:41.361433 140491874862912 run_docker.py:258] return random_seed(seed, impl=impl)
I1119 14:49:41.361556 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/prng.py", line 580, in random_seed
I1119 14:49:41.361679 140491874862912 run_docker.py:258] return random_seed_p.bind(seeds_arr, impl=impl)
I1119 14:49:41.361800 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/core.py", line 329, in bind
I1119 14:49:41.361921 140491874862912 run_docker.py:258] return self.bind_with_trace(find_top_trace(args), args, params)
I1119 14:49:41.362047 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/core.py", line 332, in bind_with_trace
I1119 14:49:41.362167 140491874862912 run_docker.py:258] out = trace.process_primitive(self, map(trace.full_raise, args), params)
I1119 14:49:41.362291 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/core.py", line 712, in process_primitive
I1119 14:49:41.362415 140491874862912 run_docker.py:258] return primitive.impl(*tracers, **params)
I1119 14:49:41.362536 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/prng.py", line 592, in random_seed_impl
I1119 14:49:41.362657 140491874862912 run_docker.py:258] base_arr = random_seed_impl_base(seeds, impl=impl)
I1119 14:49:41.362808 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/prng.py", line 597, in random_seed_impl_base
I1119 14:49:41.362939 140491874862912 run_docker.py:258] return seed(seeds)
I1119 14:49:41.363066 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/prng.py", line 832, in threefry_seed
I1119 14:49:41.363230 140491874862912 run_docker.py:258] lax.shift_right_logical(seed, lax_internal._const(seed, 32)))
I1119 14:49:41.363362 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/lax/lax.py", line 515, in shift_right_logical
I1119 14:49:41.363493 140491874862912 run_docker.py:258] return shift_right_logical_p.bind(x, y)
I1119 14:49:41.363621 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/core.py", line 329, in bind
I1119 14:49:41.363751 140491874862912 run_docker.py:258] return self.bind_with_trace(find_top_trace(args), args, params)
I1119 14:49:41.363877 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/core.py", line 332, in bind_with_trace
I1119 14:49:41.364001 140491874862912 run_docker.py:258] out = trace.process_primitive(self, map(trace.full_raise, args), params)
I1119 14:49:41.364126 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/core.py", line 712, in process_primitive
I1119 14:49:41.364253 140491874862912 run_docker.py:258] return primitive.impl(*tracers, **params)
I1119 14:49:41.364377 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/dispatch.py", line 115, in apply_primitive
I1119 14:49:41.364502 140491874862912 run_docker.py:258] return compiled_fun(*args)
I1119 14:49:41.364630 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/dispatch.py", line 200, in <lambda>
I1119 14:49:41.364755 140491874862912 run_docker.py:258] return lambda *args, **kw: compiled(*args, **kw)[0]
I1119 14:49:41.364878 140491874862912 run_docker.py:258] File "/opt/conda/lib/python3.10/site-packages/jax/_src/dispatch.py", line 895, in _execute_compiled
I1119 14:49:41.365002 140491874862912 run_docker.py:258] out_flat = compiled.execute(in_flat)
I1119 14:49:41.365125 140491874862912 run_docker.py:258] jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function
Bro, I solved this problem a few days ago. Change the Dockerfile from
ARG CUDA=11.1.1
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu18.04
to
ARG CUDA=11.8.0
FROM nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu22.04
Then rebuild the Docker image. That should fix it.
By the way, I initially used
ARG CUDA=11.8.0
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu18.04
which did not work.
Then I changed to
ARG CUDA=11.8.0
FROM nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu18.04
It works!
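The combinations reported above suggest that both the CUDA version (11.8.0) and the image flavor (devel rather than runtime) matter. A likely reason, stated here as an assumption, is that the devel images ship the CUDA compiler tools such as ptxas, which the runtime images do not. Below is a minimal Python sketch that reads a Dockerfile and checks whether its base image matches the combination reported to work in this thread. The function names `parse_base_image` and `matches_reported_fix` are hypothetical helpers, not part of AlphaFold, and the check for CUDA 11.8 only reflects what was reported here, not a general compatibility rule.

```python
import re

# Matches tags shaped like "11.8.0-cudnn8-devel-ubuntu22.04" (the shape used in this thread).
TAG_RE = re.compile(
    r"^(?P<cuda>\d+\.\d+(?:\.\d+)?)-cudnn\d+-(?P<flavor>runtime|devel)-(?P<os>ubuntu[\d.]+)$"
)


def parse_base_image(dockerfile_text):
    """Return (cuda, flavor, os) for the first nvidia/cuda FROM line, resolving ARG defaults."""
    args = {}
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        m = re.match(r"ARG\s+(\w+)=(\S+)", line)
        if m:
            args[m.group(1)] = m.group(2)
            continue
        m = re.match(r"FROM\s+nvidia/cuda:(\S+)", line)
        if m:
            # Substitute ${NAME} with the ARG default seen earlier, if any.
            tag = re.sub(r"\$\{(\w+)\}", lambda v: args.get(v.group(1), v.group(0)), m.group(1))
            parsed = TAG_RE.match(tag)
            if parsed is None:
                raise ValueError(f"unrecognised tag: {tag}")
            return parsed.group("cuda"), parsed.group("flavor"), parsed.group("os")
    raise ValueError("no nvidia/cuda FROM line found")


def matches_reported_fix(dockerfile_text):
    """True if the base image is CUDA 11.8.x with the devel flavor (assumption: as reported in this thread)."""
    cuda, flavor, _ = parse_base_image(dockerfile_text)
    major, minor = (int(p) for p in cuda.split(".")[:2])
    return (major, minor) == (11, 8) and flavor == "devel"


ORIGINAL = "ARG CUDA=11.1.1\nFROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu18.04\n"
FIXED = "ARG CUDA=11.8.0\nFROM nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu22.04\n"

print(matches_reported_fix(ORIGINAL))  # False
print(matches_reported_fix(FIXED))     # True
```

The Ubuntu version is parsed but not checked, since both ubuntu18.04 and ubuntu22.04 were reported to work with the 11.8.0 devel image.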
Thanks, this worked for me too!