Return, don't fail, when extension used by non-root module
This fixes the following scenario:
- Two bazel modules, both of which depend on toolchains_llvm
- Both modules are bzlmod enabled with
MODULE.bzlmodfiles - Both
MODULE.bzlmodfiles specify a llvm toolchain - The second modules depends on the first via
bazel_dep
Expected results
- the second module can depend on the first, and operations use the llvm toolchain that it defines
Actual results
- build fails with
Error in fail: Only the root module can use the 'llvm' extension
Resolution
Instead of failing, we just return. This keeps the behaviour of only being able to actually use the llvm extension in the root module, but we don't error out.
A more complex but similar example is that from rules_python where they will still register the toolchain, but it will not be marked as the default toolchain if it is not called from the root module: https://github.com/bazelbuild/rules_python/blob/main/python/private/bzlmod/python.bzl#L102
As per @fmeum's comment in #243, dev_dependency is one "solution" for this. I'm definitely open to the tack this PR takes though; the dev_dependency route doesn't seem ideal.
But, from a composability standpoint, doing nothing for non-root modules doesn't seem ideal either though; I see a couple of issues:
- rdeps of the module that uses
toolchains_llvmwill have to pull in and configuretoolchains_llvmthemselves — even if these rdeps don't themselves make direct use of a C/C++/etc toolchain or anything else fromtoolchains_llvm - If I understand correctly how bzlmod extensions (with
isolate = False) work, rdeps configuringtoolchains_llvmwill need to be careful to use the same repo names as their dependencies so thatuse_repo/register_toolchainscalls in their deps'MODULE.bazelfiles don't error
For the common use case (using toolchains_llvm solely via the registered C/C++ toolchain and not using other outputs like @llvm_toolchain//:omp, using llvm_toolchain as the repo name, rdeps that themselves would need/want toolchains_llvm anways) the above is perhaps acceptable: configuring a C/C++ toolchain is arguably something that's best decided by the top-level module anyways. (Though dev_dependency is maybe a better solution here; it sidesteps the repo name issue...)
However for modules that use other outputs from llvm_toolchain than just the C/C++ toolchain (i.e. binaries like clang-format, libraries like OpenMP) the first issue seems unacceptable — I should be able to properly encapsulate my module's OpenMP dep, users of my module shouldn't need to be aware of it (afaik this is not possible under a "skip non-root modules" scheme, even with isolate = True).
The least-worst solution I can come up with that addresses the above is to have toolchains_llvm create repositories for all modules and to handle conflicts (in repo names) by choosing the tag that's closest to the root module (as rules_python does) and by warning users/nudging them towards isolate = True.
re: gating toolchain resolution for non-root modules: I'm not super convinced that we need to do this (my understanding is that toolchains registered by the root module/by modules closer to the root module have precedence in toolchain resolution; rules_python's reasoning for this restriction is here) but if we want to, it should be easy to emulate with :all patterns that potentially expand to nothing or by using the hub pattern that rules_python uses (unlike rules_python we don't provide a "default" toolchain in toolchains_llvm though).
@jsharpe @fmeum: WDYT?
Yeah, using dev_dependency=True is totally fair for a lot of use cases. If we want to keep the fail call then perhaps we should update the message to suggest this solution.
Based on your analysis, I believe it means that my change shouldn't be needed for simple cases, and doesn't solve the problem for more complex ones. Is that accurate?
I see two separate threads here:
- Having one "official" way of using
toolchains_llvmin a module that is also meant to be used by other modules. - Supporting transitive module's needs to access LLVM toolchains without the root module's support.
For 1., I think that we should stick to and recommend dev_dependency: It's built right into Bazel, which means that it works uniformly for all extensions without special support, and is also self-descriptive in that you can immediately tell a certain extension tag won't do anything if the module isn't the root module.
Keeping the fail but having its message suggest using dev_dependency is thus my favorite approach within the scope of this PR.
Dealing with 2. is much more difficult, which is why I would prefer to keep the fail until we have properly thought through the interactions around this. It likely entails moving away from user-supplied register_toolchains calls, as rules_python and rules_go have done. This does add a fair share of complexity though. isolate = True is also not a sufficient answer as it can result in linking the same library twice in different versions.
For tools such as clang-format, having a module register an LLVM C++ toolchain just to access it sounds overly complicated to me. We could offer a more lightweight way to achieve this (say by not registering the cc_toolchain, thus avoiding the problem of reconciling toolchain requirements) or direct such modules to accept a label pointing to clang-format as part of their configuration instead of bringing it in themselves.
I do lack the knowledge about OpenMP to assess what a good way would be to bring it in. Does it have be kept in sync with compiler versions?