
How to use cublas in a non-root bazel module?

Open appthumb opened this issue 1 year ago • 7 comments

I'm using rules_cuda in a Bazel module A, and some of my cuda_library targets need to link with -lcublas and -lcublasLt.

Naturally, I'm defining local_cuda as in examples/cublas/BUILD.bazel. In the MODULE.bazel of module A:

bazel_dep(
    name = "rules_cuda",
    version = "0.2.1",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
cuda.local_toolchain(
    name = "local_cuda",
    toolkit_path = "",
)
use_repo(cuda, "local_cuda")

And in my BUILD.bazel, I have the cuda_library target:

cuda_library(
    name = ...,
    srcs = ...,
    deps = [
       "@local_cuda//:cublas",
    ],
    ...
)

This works fine when I build module A. However, when another Bazel module B depends on module A, I cannot build module B, because local_cuda can only be declared in the root module. I get this error:

ERROR: Traceback (most recent call last):
    File "/private/var/tmp/_bazel_dev/67de6cda420db4eb86e6ad3f1fd2b6e4/external/rules_cuda~/cuda/extensions.bzl", line 15, column 21, in _init
        fail("Only the root module may override the path for the local cuda toolchain")
Error in fail: Only the root module may override the path for the local cuda toolchain

This is from this line: https://github.com/bazel-contrib/rules_cuda/blob/8f2f2e6d64d38e46d09538c921304c7c902a2564/cuda/extensions.bzl#L15.
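
For context, a root-only check in a module extension usually has roughly the following shape. This is a minimal sketch, not the actual rules_cuda source: only the names toolchain, _init, local_toolchain, name, toolkit_path and the error message are taken from this thread; the attribute types and the overall structure are assumptions.

```starlark
# Hypothetical sketch of a root-only check in a module extension.
# Not the real rules_cuda implementation.

def _init(module_ctx):
    # module_ctx.modules lists every module that uses this extension.
    for mod in module_ctx.modules:
        for toolchain in mod.tags.local_toolchain:
            # A local_toolchain tag from any non-root module is rejected.
            if not mod.is_root:
                fail("Only the root module may override the path for the local cuda toolchain")
            # The root module's tag (toolchain.name, toolchain.toolkit_path)
            # would be used here to create the local_cuda repository.

_local_toolchain = tag_class(
    attrs = {
        "name": attr.string(),  # assumed attribute type
        "toolkit_path": attr.string(),  # assumed attribute type
    },
)

toolchain = module_extension(
    implementation = _init,
    tag_classes = {"local_toolchain": _local_toolchain},
)
```

With a check of this shape, the cuda.local_toolchain(...) tag in module A triggers fail() as soon as A is loaded as a dependency of B, even when toolkit_path is left empty.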

Is it possible to use rules_cuda in a Bazel module that other modules can depend on? I don't really need to customize my CUDA path, as the default path works fine for me. Is there a way to avoid the above error?

appthumb avatar Apr 18 '24 22:04 appthumb

With the current implementation, it is not possible.

But it is very reasonable to let upstream projects expose targets that depend on rules_cuda, say, kernels wrapped in a C library with public C interfaces. The current project and downstream projects should be able to use it without pain.

I'll see how we should improve the situation.

cloudhan avatar Apr 19 '24 00:04 cloudhan

Thanks for looking into this! Really appreciated.

Do you mean that my upstream module A uses cuda_library internally, and exposes it with cc_library, and my downstream module B depends on it?

This doesn't seem to work. It looks like as long as I use the cuda_library rule in module A, I need to add local_cuda to the MODULE.bazel of module A, and this forces A to be a top-level module.

I tried to remove everything that's referring to local_cuda in module A, so in A's MODULE.bazel file I only have:

bazel_dep(
    name = "rules_cuda",
    version = "0.2.1",
)

and I use cuda_library in A's BUILD.bazel file:

load("@rules_cuda//cuda:defs.bzl", "cuda_library")

cuda_library(
    name = "kernel",
    srcs = ["kernel.cu"],
    hdrs = ["kernel.h"],
)

then A fails to build. Running bazel build kernel gives this error:

Analysis of target '//my_project:kernel' failed; build aborted: module extension "toolchain" from "@@rules_cuda~//cuda:extensions.bzl" does not generate repository "local_cuda", yet it is imported as "local_cuda" in the usage at https://bcr.bazel.build/modules/rules_cuda/0.2.1/MODULE.bazel:10:26

This is referring to https://github.com/bazel-contrib/rules_cuda/blob/8f2f2e6d64d38e46d09538c921304c7c902a2564/MODULE.bazel#L10

Adding local_cuda to A's MODULE.bazel makes A build, but then module B cannot depend on it.

appthumb avatar Apr 19 '24 02:04 appthumb

Oh, didn't see a fix is in the making! Looking forward to it 👍

appthumb avatar Apr 19 '24 02:04 appthumb

I think it's possible to just do:

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
use_repo(cuda, "local_cuda")

in B's MODULE.bazel, which will make local_cuda available to B. It's not ideal, but I think this works. I have something similar in one of my projects.

jsharpe avatar Apr 19 '24 08:04 jsharpe

Yes, this would work for building B. The problem is that A itself won't build if A has a cuda_library target. This can be annoying, e.g., all the compilation and testing of the CUDA code in A now has to be done through module B.

appthumb avatar Apr 19 '24 22:04 appthumb

You would leave your A module with the use_repo that you had above. The above snippet makes local_cuda visible in B, and you can use cuda_library in A or B.

jsharpe avatar Apr 21 '24 14:04 jsharpe

Thanks for your response! I kind of got it to work for my purpose by using different MODULE.bazel files in my local Bazel registry and in repo_a. Here's a summary of what I have found so far.

My setup:

  • A bazel_registry folder that serves as my local Bazel registry. It has a module repo_a using local_path (https://bazel.build/external/registry#index_registry), so that repo_b can find repo_a. The registry doesn't serve repo_a's source files; it just points to the actual source directory of repo_a through the local_path feature. Specifically, bazel_registry contains the subfolder bazel_registry/modules/repo_a/1.0.0 as the registry entry for repo_a, which notably includes the file bazel_registry/modules/repo_a/1.0.0/MODULE.bazel.

  • repo_a, which holds repo_a's source code, BUILD.bazel files, and a different MODULE.bazel file.

  • repo_b, which depends on repo_a through my local bazel_registry.

Now this is the complete file content of bazel_registry/modules/repo_a/1.0.0/MODULE.bazel:

"""repo A."""

module(
    name = "repo_a",
    version = "1.0.0",
)

bazel_dep(
    name = "bazel_skylib",
    version = "1.5.0",
)

bazel_dep(
    name = "rules_cc",
    version = "0.0.9",
)

bazel_dep(
    name = "rules_cuda",
    version = "1.0.0",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")
use_repo(cuda, "local_cuda")

This is the complete content of repo_b/MODULE.bazel:

"""repo_b"""

module(
    name = "repo_b",
    version = "1.0.0",
)

bazel_dep(
    name = "bazel_skylib",
    version = "1.5.0",
)

bazel_dep(
    name = "rules_cc",
    version = "0.0.9",
)

bazel_dep(
    name = "repo_a",
    version = "1.0.0",
)

bazel_dep(
    name = "rules_cuda",
    version = "1.0.0",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")

cuda.local_toolchain(
    name = "local_cuda",
    toolkit_path = "",
)
use_repo(cuda, "local_cuda")

This works, and bazel build under repo_b succeeds. Note:

  • cuda.local_toolchain is declared in repo_b/MODULE.bazel, and not in bazel_registry/modules/repo_a/1.0.0/MODULE.bazel. Adding it to the latter leads to the fail("Only the root module may override the path for the local cuda toolchain") error.

  • Whether or not bazel_registry/modules/repo_a/1.0.0/MODULE.bazel has the line use_repo(cuda, "local_cuda") makes no difference.

Now repo_b works fine, but I also want repo_a to work, i.e., running bazel build under repo_a should succeed. If I use the exact content of bazel_registry/modules/repo_a/1.0.0/MODULE.bazel as repo_a/MODULE.bazel, I get the following error when I run bazel build under repo_a:

failed; build aborted: module extension "toolchain" from "@@rules_cuda~//cuda:extensions.bzl" does not generate repository "local_cuda", yet it is imported as "local_cuda" in the usage at /home/dev/temp/cuda_test/repo_a/MODULE.bazel:23:21

To work around this, I need to use slightly different MODULE.bazel content under repo_a. This is the complete content of repo_a/MODULE.bazel:

"""repo A."""

module(
    name = "repo_a",
    version = "1.0.0",
)

bazel_dep(
    name = "bazel_skylib",
    version = "1.5.0",
)

bazel_dep(
    name = "rules_cc",
    version = "0.0.9",
)

bazel_dep(
    name = "rules_cuda",
    version = "1.0.0",
)

cuda = use_extension("@rules_cuda//cuda:extensions.bzl", "toolchain")

cuda.local_toolchain(
    name = "local_cuda",
    toolkit_path = "",
)
use_repo(cuda, "local_cuda")

Note that I added cuda.local_toolchain to it. This makes bazel build under repo_a work without any issue. It doesn't break the build under repo_b, since the latter uses a different MODULE.bazel file for repo_a.

So far I have both repo_a and repo_b working, by using a different MODULE.bazel file in my local Bazel registry than in the actual repo, to get around the requirement that toolchains must be defined in the top-level module.

I'm not sure this is the canonical way of setting up local dependencies, and it feels like a hack. It would be nice to remove the limitation on toolchain declarations. That would avoid all these tricky situations, and the MODULE.bazel file in the repo would no longer have to diverge from the one in the Bazel registry.

(PS: I'm using the head version of rules_cuda from this GitHub repository. The rules_cuda in the Bazel Central Registry, https://registry.bazel.build/modules/rules_cuda, is 5 months behind the head version here. To avoid confusion, I point to the head version of rules_cuda in my local Bazel registry as well, which is why the version of rules_cuda above is 1.0.0. I also tried the published version 0.2.1, and the result is the same.)

appthumb avatar Apr 23 '24 02:04 appthumb