Switch to the new C++ toolchains API
The new BUILD file based C++ toolchains promise to be much easier to maintain.
https://github.com/bazelbuild/rules_cc/tree/main/examples/rule_based_toolchain
What do you all think about switching over to using those? I'm happy to help.
The example makes it look like we can reuse pieces from rules_cc.
cc_toolchain(
name = "host_clang",
args = select({
"@platforms//os:linux": [
"//toolchain/args:linux_sysroot",
],
"//conditions:default": [],
}) + [
"//toolchain/args:no_canonical_prefixes",
"//toolchain/args:warnings",
],
enabled_features = ["@rules_cc//cc/toolchains/args:experimental_replace_legacy_action_config_features"],
known_features = ["@rules_cc//cc/toolchains/args:experimental_replace_legacy_action_config_features"],
tool_map = "//toolchain/tools:all_tools",
)
toolchain(
name = "host_cc_toolchain",
toolchain = ":host_clang",
toolchain_type = "@bazel_tools//tools/cpp:toolchain_type",
)
note that this repo mostly just doesn't use any of the old stuff for this either since it just shells out to the default toolchain setup code. https://github.com/bazel-contrib/toolchains_llvm/blob/bb3f7d0171129e8f168f51f3deebdbe010e2bf00/toolchain/cc_toolchain_config.bzl#L368
Yup. I'm trying to upstream a bunch of fixes we've built up over the years that let us cross compile against sysroots. I'm quickly finding all the places where new flags need to be injected in very specific spots through unix_cc_toolchain_config which don't exist.
I think this will enable use of rule-based sysroots. Currently toolchains_llvm seems to only set correct paths for http_archive or for in-source filegroup, but not for genrule outputs or other ways to produce sysroots. While new API in rules_cc looks more promising handling generated sysroots (Note: there's still a bit of hacking needed to have a rule that produces multiple files, just producing sysroot archive is much more simpler but unusable as sysroot directly)
Keith and I (more Keith than me) sent some changes upstream to rules_cc to get things closer to working there, but most of those changes haven't landed yet. I think it is clear from playing with the new API that it isn't fully debugged and tested.
FWIW I rolled out a toolchain internally using the new stuff with a custom sysroot and vendored clang and it's working well. But it does still require some patches on rules_cc
@keith, are the patches needed just the ones you've sent out for review on rules_cc? Or are there more?
All submitted, no private ones. But I do have a significant amount of internal setup that isn't clear how it should fit upstream yet. For example sanitizer flags etc.
Would you be able to share current patch to toolchains_llvm that sets up new API (to give it a try ahead of rules_cc accepting the changes)?
I'm not using toolchains_llvm at all. since I don't need the conditional setup of versions of llvm or the other things it does when im vendoring exactly what i need.
That works too, I'm just not quite sure how much setup is needed to get to a "simple hermetic toolchain" setup using new APIs, bare-bones pointing tools to compiler binaries seems to miss many default or required flags and also differs noticeably from toolchains_llvm (it's nice that aquery can be used to discover differences but a proven starting point would be nice too 🙂 )
If you want a proven starting point, https://github.com/RealtimeRoboticsGroup/aos/tree/main/third_party/bazel-toolchain is mine. It supports x86 debian, and a NVIDIA Orin yocto. It is based off an old version of toolchains_llvm. I'm slowly trying to get the patches upstreamed to stop maintaining a fork, but keep running into gaps in what is there in rules_cc.
there is also https://github.com/bazelbuild/rules_cc/blob/main/examples/rule_based_toolchain/toolchains/ but it's pretty bare bones
there's also https://github.com/google/pigweed/tree/main/pw_toolchain/cc but it's a bit interesting since much of it is targeting embedded things
This repo was mostly written following the https://bazel.build/tutorials/ccp-toolchain-config. New rule based toolchains seem much cleaner. I'd be also happy to give it a try.
I think this will enable use of rule-based sysroots
I experimented with rule based toolchains a bit, unfortunately it does not help much performance since sysroot is expected to be extract during loading phase, which is not so different than traditional cc toolchains. I filed https://github.com/bazelbuild/bazel-skylib/issues/566 to see what can be done there.
I think this will enable use of rule-based sysroots
I experimented with rule based toolchains a bit, unfortunately it does not help much performance since sysroot is expected to be extract during loading phase, which is not so different than traditional cc toolchains. I filed bazelbuild/bazel-skylib#566 to see what can be done there.
I haven't reached a fully working setup yet, but so far it looks workable with approximately
load("@rules_cc//cc/toolchains/args:sysroot.bzl", "cc_sysroot")
load("@rules_cc//cc/toolchains:toolchain.bzl", "cc_toolchain")
load("@rules_cc//cc/toolchains:tool_map.bzl", "cc_tool_map")
genrule(
name = "sysroot.extract",
srcs = [targets_producing_tarballs],
outs = [magically_generated_listing_of_sysroot],
cmd = ...just untar...
)
filegroup(
name = "sysroot",
srcs = [magically_generated_listing_of_sysroot]
)
cc_sysroot(
name = "linux_sysroot",
data = [":sysroot"],
sysroot = ":sysroot",
)
cc_tool_map(
name = "toolmap",
tools = {
"@rules_cc//cc/toolchains/actions:c_compile": "@@toolchains_llvm++llvm+llvm_toolchain_linux_exec_llvm//:bin/clang",
"@rules_cc//cc/toolchains/actions:cpp_compile": "@@toolchains_llvm++llvm+llvm_toolchain_linux_exec_llvm//:bin/clang++",
}
)
cc_toolchain(
name = "cctoolchain",
args = [":linux_sysroot"],
tool_map = ":toolmap",
)
Will post back once I have it fully working on some codebase, it lacks some essential flags in the shape above to make all includes resolve properly
I have began the work to migrate toolchains_llvm to rules based cc toolchain, will make a PR early next week.
haven't reached a fully working setup yet, but so far it looks workable with approximately
This still does not work as you need to know the list of files in each directory and declare them with declare_file which defeats the purpose of using a treeartifact since it'll be still slow, maybe not as slow if the sysroot got extracted during repository rule loading.
I have made a good progress on this one. There are some blockers that i'd like to get out of the way before this becomes a reality. this repo also needs a bit heavy refactoring to allow use of new api.
I'll update this as i find new issues.
Current blockers/issues
-
Use bare paths everywhere which does not work with the new rules based toolchains. I have migrated some of it, but there are parts that can't be converted easily, eg paths that are available on the host system. See: https://github.com/bazelbuild/rules_cc/issues/277
-
Sysroot logic assumes that the label provided by the user will already contain all the files extracted and laid out on disk, we need a contract there based on new
directoryandsubdirectoryrules. -
On MacOS, sysroot is provided by the host which can't be tracked by bazel unless we copy all of into via repository_rule. https://github.com/bazelbuild/rules_cc/issues/242#issuecomment-2828575745