rules_rust
rules_rust copied to clipboard
crate_universe: Rewrite generator in starlark
This is just an idea, I'd really appreciate some feedback and ideas.
Is it conceivably possible to rewrite the generator in pure starlark? This would remove the bootstrapping problem of distributing generator binaries or hackily building the generator with the native cargo toolchain.
https://github.com/bazelbuild/rules_rust/blob/59fab4e79f62bfa13551ac851a40696c24c0c3a4/crate_universe/deps_bootstrap.bzl#L9
We're using a custom generator to work around some issues, but this makes the analysis phase almost a minute slower as repository rules are not remotely cacheable.
Finished release [optimized] target(s) in 38.77s
Failing a full rewrite, some more ideas:
- Rewrite only some functionality in starlark.
- Move some functionality out of the repository rule and into just a normal rule. This could be achieved with a macro. The normal rule could then check if repinning is required, whilst allowing the repository rule to generate the build files it needs.
Thanks for taking the time to read this :) Please don't hesitate to ask any questions.
I'm not entirely sure the generator can be entirely written in starlark but the renderer side of it very likely could be and think that'd be great!
We're using a custom generator to work around some issues, but this makes the analysis phase almost a minute slower as repository rules are not remotely cacheable.
Finished release [optimized] target(s) in 38.77s
To clarify, are you building a new generator in repository rules? crate_universe
as never designed for users to be compiling the generator in repository rules. Instead, the expectation is that users who want to compile their own binaries can do so and upload them to some static asset host (s3, artifactory) where devs and CI would pull them down for use in their builds.
That said though, if the renderer were to be implemented in starlark, then there would be no need to download or build a generator in most cases since if you have a trusted lockfile, the only computation left to do is write BUILD files from pre-defined templates which is something doable in most lanugages.
Sounds great, thanks! I have some pretty good ideas on how to tackle this and I think the implementation could be quite nice.
I was thinking about the generator a bit and don't see why that couldn't also happen in Starlark. The cargo binary will happy output JSON which makes it easy to work with.
I've made a prototype implementation and it looks pretty good so far. It has shown some weirdness with the format of the lockfile, but has been easy to work with otherwise.
The blocker at the moment is figuring out how to handle selects and cfg expressions (conditional compilation, syntax reference). Example:
cfg(target_family = "unix")
The lockfile isn't quite as data-oriented as it initially seems. It expects tools which work with the lockfile to be able to parse and evaluate these expressions.
Does it seem reasonable to suggest a slight change to the lockfile? I don't see any drawbacks for an explicit list of targets as opposed to a cfg expression.
@UebelAndre
Does it seem reasonable to suggest a slight change to the lockfile? I don't see any drawbacks for an explicit list of targets as opposed to a cfg expression.
Can you provide an example of the specific change you're thinking of?
Yep, of course.
So here's an excerpt from the JSON lockfile (represented as CUE for brevity):
"sha2 0.10.2": common_attrs: deps: selects: {
"cfg(any(target_arch = \"aarch64\", target_arch = \"x86_64\", target_arch = \"x86\"))": [{
id: "cpufeatures 0.2.2"
target: "cpufeatures"
}]
}
Currently the "selects" deps are a map of cfg expressions to deps. I would propose that instead it should be a list of objects which include a list of platform names and deps. Such:
"sha2 0.10.2": common_attrs: deps: selects: [{
platforms: [
"aarch64-apple-darwin",
"aarch64-apple-ios",
"aarch64-linux-android",
"aarch64-unknown-linux-gnu",
"i686-apple-darwin",
"i686-linux-android",
"i686-pc-windows-msvc",
"i686-unknown-freebsd",
"i686-unknown-linux-gnu",
"x86_64-apple-darwin",
"x86_64-apple-ios",
"x86_64-linux-android",
"x86_64-pc-windows-msvc",
"x86_64-unknown-freebsd",
"x86_64-unknown-linux-gnu",
]
deps: [{
id: "cpufeatures 0.2.2"
target: "cpufeatures"
}]
}
That'd probably be fine but seems like it'd be noisy. The reason I wrote the cfg string is because it was shorter and slightly more meaningful. The lockfile includes a mapping of those strings to a list of platforms so they only need to be defined once. Would it still work to use that?
"conditions": {
"cfg(any(target_arch = \"aarch64\", target_arch = \"x86_64\", target_arch = \"x86\"))": [
"aarch64-apple-darwin",
"aarch64-apple-ios",
"aarch64-linux-android",
"aarch64-unknown-linux-gnu",
"i686-apple-darwin",
"i686-linux-android",
"i686-pc-windows-msvc",
"i686-unknown-freebsd",
"i686-unknown-linux-gnu",
"x86_64-apple-darwin",
"x86_64-apple-ios",
"x86_64-linux-android",
"x86_64-pc-windows-msvc",
"x86_64-unknown-freebsd",
"x86_64-unknown-linux-gnu"
]
}
I completely overlooked the conditions
in the lockfile, thanks! I think that will work fine.
Is there a reference which explains the format of the lockfile? Something which isn't source code? Might be helpful for future readers.
re: it'd be noisy
I agree, though the lockfile is already very large.
❯ cat Cargo.Bazel.lock | wc -l
10719
So, I got it all working! :)
Just need to clean some stuff up. Hopefully will open a PR for this within the next few days.
If you're interested, this is the dev branch: https://github.com/bazelbuild/rules_rust/compare/main...uhthomas:1287.
There are a few more things to do, but it's possible to build and run the cargo-bazel binary with these changes which is a huge milestone.
❯ bazel run @rules_rust//crate_universe:bin
(11:08:42) INFO: Invocation ID: 94c7a23a-a6fa-4f96-8d8a-3639c3d5fa1a
(11:08:42) INFO: Current date is 2022-06-13
(11:08:42) INFO: Analyzed target @rules_rust//crate_universe:bin (0 packages loaded, 0 targets configured).
(11:08:42) INFO: Found 1 target...
Target @rules_rust//crate_universe:cargo_bazel_bin up-to-date:
bazel-bin/external/rules_rust/crate_universe/cargo_bazel_bin
(11:08:42) INFO: Elapsed time: 0.226s, Critical Path: 0.01s
(11:08:42) INFO: 1 process: 1 internal.
(11:08:42) INFO: Build completed successfully, 1 total action
(11:08:42) INFO: Running command line: bazel-bin/external/rules_rust/crate_universe/cargo_bazel_b(11:08:42) INFO: Build completed successfully, 1 total action
cargo-bazel 0.2.2
USAGE:
cargo_bazel_bin <SUBCOMMAND>
OPTIONS:
-h, --help Print help information
-V, --version Print version information
SUBCOMMANDS:
generate Generate Bazel Build files from a Cargo manifest
help Print this message or the help of the given subcommand(s)
query Query workspace info to determine whether or not a repin is needed
splice Splice together disjoint Cargo and Bazel info into a single Cargo workspace
manifest
vendor Vendor BUILD files to the workspace with either repository definitions or `cargo
vendor` generated sources
@UebelAndre Would you mind checking out my branch and trying it out? It's not feature complete, though rules_rust builds with it (bazel builld //...
) and we're also now using it internally. It seems to work well and bazel run @crate_universe_crate_index//:repin
is blissful in comparison to the bazel sync
stuff.
The vendoring of BUILD files had to be removed, apologies if this is a problem. The way the manifest is interpreted by starlark makes it infeasible.
However bazel query --output=build @crate_universe_crate_index//...
or bazel query --output=build @crate_universe_crate_index__toml-0.5.9//...
does show the expanded macros.
Looking forward to hearing your experience!