xla icon indicating copy to clipboard operation
xla copied to clipboard

Introduce hermetic CUDA in Google ML projects.

Open copybara-service[bot] opened this issue 1 year ago • 0 comments

Introduce hermetic CUDA in Google ML projects.

  1. Hermetic CUDA rules allow building wheels with GPU support on a machine without GPUs, as well as running Bazel GPU tests on a machine with only GPUs and NVIDIA driver installed. When --config=cuda is provided in Bazel options, Bazel will download CUDA, CUDNN and NCCL redistributions in the cache, and use them during build and test phases.

    Default location of CUNN redistributions

    Default location of CUDA redistributions

    Default location of NCCL redistributions

  2. To include hermetic CUDA rules in your project, add the following in the WORKSPACE of the downstream project dependent on XLA.

    Note: use @local_tsl instead of @tsl in Tensorflow project.

    load(
       "@tsl//third_party/gpus/cuda/hermetic:cuda_json_init_repository.bzl",
       "cuda_json_init_repository",
    )
    
    cuda_json_init_repository()
    
    load(
       "@cuda_redist_json//:distributions.bzl",
       "CUDA_REDISTRIBUTIONS",
       "CUDNN_REDISTRIBUTIONS",
    )
    load(
       "@tsl//third_party/gpus/cuda/hermetic:cuda_redist_init_repositories.bzl",
       "cuda_redist_init_repositories",
       "cudnn_redist_init_repository",
    )
    
    cuda_redist_init_repositories(
       cuda_redistributions = CUDA_REDISTRIBUTIONS,
    )
    
    cudnn_redist_init_repository(
       cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
    )
    
    load(
       "@tsl//third_party/gpus/cuda/hermetic:cuda_configure.bzl",
       "cuda_configure",
    )
    
    cuda_configure(name = "local_config_cuda")
    
    load(
       "@tsl//third_party/nccl/hermetic:nccl_redist_init_repository.bzl",
       "nccl_redist_init_repository",
    )
    
    nccl_redist_init_repository()
    
    load(
       "@tsl//third_party/nccl/hermetic:nccl_configure.bzl",
       "nccl_configure",
    )
    
    nccl_configure(name = "local_config_nccl")
    

copybara-service[bot] avatar Mar 18 '24 20:03 copybara-service[bot]