jax
installation error: No library found under: /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.1
Running "python3 build.py --enable_cuda" fails with:
No library found under: /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.1
That directory contains libcudart.so.11.0 and libcudart.so.11.1.74, but no libcudart.so.11.1.
My CUDA version is 11.1, and it passed the test.
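To make the mismatch described above easy to see, here is a minimal diagnostic sketch. It only lists the libcudart.so* entries in a directory and reports whether the exact name from the error message is there. The helper name find_cudart is hypothetical, the default directory and file name are copied from the error message, and the script does not reproduce the actual check in cuda_configure.bzl.

```python
import glob
import os


def find_cudart(lib_dir, expected="libcudart.so.11.1"):
    """List libcudart.so* entries in lib_dir and report whether `expected` exists.

    Hypothetical helper for diagnosis only; it is not part of jax or build.py.
    """
    pattern = os.path.join(lib_dir, "libcudart.so*")
    present = sorted(os.path.basename(p) for p in glob.glob(pattern))
    # os.path.exists follows symlinks, so a dangling symlink counts as missing.
    has_expected = os.path.exists(os.path.join(lib_dir, expected))
    return present, has_expected


if __name__ == "__main__":
    # Directory taken from the error message; adjust for your installation.
    lib_dir = "/usr/local/cuda-11.1/targets/x86_64-linux/lib"
    present, has_expected = find_cudart(lib_dir)
    print("found:", present)
    print("libcudart.so.11.1 present:", has_expected)
```

If this prints the two versioned files but reports the expected name as missing, that matches the situation reported here.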
I haven't tried building with CUDA 11.1, only 11.0, so it's possible something's wrong with the build script. Can you post the full output of the build.py command?
Alternatively, if you can downgrade to CUDA 11.0, you can use our pre-built CUDA wheels, and building from source may also be easier with that version.
Yes, of course:
Starting local Bazel server and connecting to it...
Bazel binary path: ./bazel-2.0.0-linux-x86_64
Python binary path: /usr/bin/python3
Python version: 3.6
MKL-DNN enabled: yes
-march=native: no
CUDA enabled: yes
CUDA compute capabilities: 3.5,5.2,6.0,6.1,7.0
Building XLA and installing it in the jaxlib source tree...
./bazel-2.0.0-linux-x86_64 run --verbose_failures=true --config=short_logs --config=mkl_open_source_only --config=cuda --define=xla_python_enable_gpu=true :install_xla_in_source_tree /home/reza/jax/build
INFO: Options provided by the client:
Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'run' from /home/reza/jax/.bazelrc:
Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'run' from /home/reza/jax/.bazelrc:
Inherited 'build' options: --repo_env PYTHON_BIN_PATH=/usr/bin/python3 --python_path=/usr/bin/python3 --repo_env TF_NEED_CUDA=1 --action_env TF_CUDA_COMPUTE_CAPABILITIES=3.5,5.2,6.0,6.1,7.0 --distinct_host_configuration=false --copt=-Wno-sign-compare -c opt --apple_platform_type=macos --macos_minimum_os=10.9 --announce_rc --define open_source_build=true --define=no_aws_support=true --define=no_gcp_support=true --define=no_hdfs_support=true --define=no_kafka_support=true --define=no_ignite_support=true --define=grpc_no_ares=true --spawn_strategy=standalone --strategy=Genrule=standalone --cxxopt=-std=c++14 --host_cxxopt=-std=c++14
INFO: Found applicable config definition build:short_logs in file /home/reza/jax/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:mkl_open_source_only in file /home/reza/jax/.bazelrc: --define=tensorflow_mkldnn_contraction_kernel=1
INFO: Found applicable config definition build:cuda in file /home/reza/jax/.bazelrc: --crosstool_top=@local_config_cuda//crosstool:toolchain --define=using_cuda=true --define=using_cuda_nvcc=true
Loading:
Loading: 0 packages loaded
INFO: Call stack for the definition of repository 'local_config_cuda' which is a cuda_configure (rule definition at /home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl:1407:18):
- /home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/tensorflow/workspace.bzl:98:5
- /home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/tensorflow/workspace.bzl:77:5
- /home/reza/jax/WORKSPACE:46:1
ERROR: An error occurred during the fetch of repository 'local_config_cuda':
Traceback (most recent call last):
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1377
_create_local_cuda_repository(<1 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1054, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 599, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 501, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.1
ERROR: Skipping ':install_xla_in_source_tree': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1377
_create_local_cuda_repository(<1 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1054, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 599, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 501, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.1
WARNING: Target pattern parsing failed.
ERROR: no such package '@local_config_cuda//cuda': Traceback (most recent call last):
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1377
_create_local_cuda_repository(<1 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1054, in _create_local_cuda_repository
_find_libs(repository_ctx, <2 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 599, in _find_libs
_check_cuda_libs(repository_ctx, <2 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 501, in _check_cuda_libs
execute(repository_ctx, <1 more arguments>)
File "/home/reza/.cache/bazel/_bazel_reza/a75292cea3133afbeddb51835094cc1e/external/org_tensorflow/third_party/remote_config/common.bzl", line 208, in execute
fail(<1 more arguments>)
Repository command failed
No library found under: /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudart.so.11.1
INFO: Elapsed time: 0.970s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
ERROR: Build failed. Not running target
FAILED: Build did NOT complete successfully (0 packages loaded)
Traceback (most recent call last):
File "build.py", line 380, in <module>
main()
File "build.py", line 375, in main
shell(command)
File "build.py", line 47, in shell
output = subprocess.check_output(cmd)
File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
**kwargs).stdout
File "/usr/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['./bazel-2.0.0-linux-x86_64', 'run', '--verbose_failures=true', '--config=short_logs', '--config=mkl_open_source_only', '--config=cuda', '--define=xla_python_enable_gpu=true', ':install_xla_in_source_tree', '/home/reza/jax/build']' returned non-zero exit status 1.
For some reason, I cannot downgrade. Any suggestions on how to fix it?
Hey, sorry for the delay on this. Unfortunately, I'm not sure what's going on with your build, but I was able to build CUDA 11.1 jaxlib wheels in a Docker container (via build_jaxlib_wheels.sh). Can you use the just-released prebuilt CUDA 11.1 wheel, or are you trying to build from source for other reasons?
I haven't updated the installation instructions in the README yet but you should be able to use this command:
pip install --upgrade jax jaxlib==0.1.56+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html
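The version pin in the command above carries a local tag, "+cuda111", which names the CUDA release the wheel targets. As a small illustration, the sketch below splits such a string into its base version and a dotted CUDA version. The helper name split_jaxlib_version is hypothetical, and reading the last digit of the tag as the minor version (so "cuda111" means 11.1) is an assumption drawn from this thread, not a documented jaxlib rule.

```python
import re


def split_jaxlib_version(version):
    """Split a string like '0.1.56+cuda111' into (base_version, cuda_version).

    Hypothetical helper. Assumes the final digit of the 'cudaNNN' tag is the
    minor version. Returns None for the CUDA part when no such tag is present.
    """
    base, _, local = version.partition("+")
    match = re.fullmatch(r"cuda(\d+)(\d)", local)
    if match is None:
        return base, None
    return base, f"{match.group(1)}.{match.group(2)}"


print(split_jaxlib_version("0.1.56+cuda111"))  # ('0.1.56', '11.1')
```

Comparing the CUDA part with the locally installed toolkit version is a quick way to confirm that the wheel and the system agree before installing.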
I think we can close this issue since there's been no activity for a long time, and there was no response to the suggestions about how to fix things.