allwpilib
[bazel] MVP for building with bazel
This PR introduces the ability to build parts of wpilib with bazel
General principles for the MVP:
- Build C++ statically only
- Do not build JNI code
- Do not run Java tests (because there is no JNI)
Libraries Built:
- wpiutil / wpinet / ntcore / hal - Java, C++ w/ tests
- wpimath / cscore / cameraserver / wpilibj - Java only
I'm still fighting with a landable solution for the recent protobuf change which is blocking wpimath. Once I get that going we can build up to wpilibc.
Other cool things I left out
- You can run the pregeneration and check that it produces the correct results during a build. Because of the cache it will only get run "once", but this can remove the need for a separate CI step.
- PMD and checkstyle for Java
- Building the examples, which can be easily run with bazel run //wpilibjExamples:rapidreact. I use the pregen to generate a build script so I left it out.
- Building for raspbian / bullseye.
Implementation Notes
For my fork I made the decision to get close to the "one bazel file per directory" structure. This results in 2-6x as many BUILD.bazel files, but they are more compact and easier to understand, e.g.
wpiutil/src/examples/....
wpiutil/src/dev/java/BUILD.bazel
wpiutil/src/dev/native/BUILD.bazel
wpiutil/src/main/java/BUILD.bazel
wpiutil/src/main/native/BUILD.bazel
wpiutil/src/test/java/BUILD.bazel
wpiutil/src/test/native/BUILD.bazel
This could all be merged into a single file, as happens with Gradle or CMake:
wpiutil/BUILD.bazel
I also chose to only implement the "legacy" version of dependencies. Bazel has introduced a new scheme called bzlmod that is supposed to mimic the simplicity of maven / pypi. It is easier to pull in non-bazel-ified things with the WORKSPACE version than with the MODULE version if needed.
This is a bit heavier than an MVP. The first commit is landable, and I can make a smaller PR with that if wanted.
If interested in a single-file-per-project version, you can take a look at this. It reduces the file count from 76 to 27, although the 9 projects' build files become quite a bit larger.
I like the single file per project version. I think that keeping everything together is better than having several short build files scattered throughout the repo, since it makes it easier to track down the references to dependencies.
I think I agree. It more closely follows the way the other two build systems work, which might make it easier to get used to. I don't think I realized how much it ballooned after I started making sure every file had a bazel target attached to it.
I updated the PR to be the amalgamated version.
I'm not convinced supporting Bazel as a third build system is worth it.
Supporting Gradle and CMake is already enough of a maintenance burden and I don't see the role Bazel fills.
Right now Gradle provides the simple, easy, no-thought-required option that the vast majority of teams use in the form of GradleRIO. CMake provides a better developer experience for a lot of things, plus support for coprocessors in a variety of cases Gradle struggles with (needing a specific OpenCV version, needing to be compiled on the coprocessor).
Bazel can't fill the second role, as CMake fills it perfectly: it's available almost everywhere, which Bazel isn't, at least not without a hassle.
Bazel could possibly fill the first role that Gradle fills, but this PR simply doesn't do that. It doesn't have the vast majority of the requirements for team code that Gradle does have. Even if it did, this would still be a huge decision to make: we now have years of experience working with Gradle, and we would be moving off our reasonably mature Gradle ecosystem to something that isn't the industry standard for either Java or C++ (the argument for moving C++ builds to CMake).
This PR is just an MVP to demonstrate it and discuss architecture and layout. PJ does have a complete build working. The end state would likely be replacement of Gradle with Bazel, not maintaining 3 build systems indefinitely.
We are starting to find dependencies we need which are bzlmod only, and it is clear that Bazel and the community are going that way. We are starting to ponder our own migration. I'd highly encourage you to start with bzlmod, especially for a new project like this. The cleanest way to do this long term would be to publish wpilib as a bzlmod module to BCR and either have it depend on toolchains also published to BCR, or provide the MODULE.bazel lines to add the toolchains. That would make the lines required by a new team to get started quite small.
Also, fair warning, for something like wpilib, including flags (like C++ version, etc) in the .bazelrc files is an anti-pattern. It makes it very hard to depend on wpilib: you have to depend on the targets in wpilib and also figure out how to include the flags across the repository boundary, or duplicate them and keep them up to date. The better way to do this is to move the flags into the toolchain.
Thanks for pushing on this! We've wanted upstream bazel support in wpilib for a long time, really exciting to see some momentum. We are very much a linux shop on 971, but happy to review for sure.
If there is interest, I have some contacts in the remote execution world and might be able to get access to resources for CI.
We are starting to find dependencies we need which are bzlmod only, and it is clear that Bazel and the community is going that way.
My fork supports bzlmod as do all of the FIRST related dependencies in the bzlmodRio org, published to my fork of the BCR. I thought going the WORKSPACE route would be the best candidate for a minimum viable product as it still has the best documentation, and it is easier to patch in third party libraries that have zero bazel support (usually just with an http_archive and build_file_content). Another reason to go that way with this MVP is that I've had trouble getting the roborio toolchain to cross compile on windows with bzlmod; it works on my machine but not on github CI.
Also, fair warning, for something like wpilib, including flags (like C++ version, etc) in the .bazelrc files is an anti-pattern
I agree. I didn't want to muck around with the auto-discovering native toolchains. I could add the native-tools defaults for the cross compilers, but it felt like a big lift to make custom toolchains for native linux/osx/windows.
I started this as a hobby when I joined a company that used bazel and was super frustrated at how long gradle builds took on my old laptop. I'm sure there are a lot of things that I'm doing stupidly and that you and 971 can help do the right way. If you want, feel free to reach out to me on CD to discuss, one bazel stan to another.
Ah, great, I missed that. There's a difference between an MVP and a final product. I think you've got a good long term view there.
Do you have a handy list of the dependencies which were missing from bzlmod which you need help with?
Do you have links to the windows toolchain failures to share, and some way to reproduce them? I've fought a toolchain or two in my life...
The context helps. Our story is similar. James has started prodding some folks on 971 (that's why I'm replying now), so it sounds like we should definitely collaborate.
I've brought up Bazel in FRC before to some folks at the Bazel conference in previous years, and gotten more excitement than expected. Are you or anyone else here going to be at this year's conference?
Do you have links to the windows toolchain failures to share, and some way to reproduce them
I made an issue in my toolchains repo with the failure. To reproduce it, you can simply build my allwpilib fork with --config=windows --enable_bzlmod.
Do you have a handy list of the dependencies which were missing from bzlmod which you need help with?
I was mostly referring to publishing bzlmod wrappers for all the vendor dep things (i.e. photonvision, ctre, rev, etc). Those all consume the pre-built maven artifacts. I did the same for wpi's forks of opencv and apriltag rather than building them from source. I've experimented with all of the open source vendordeps and got them building with bazel, but I think the average team would be fine using the maven artifacts rather than building everything from source.
Are you or anyone else here going to be at this year's conference?
No bazelcon for me, but someone from my company is a presenter.
Also fyi, I'm satisfied with the protobuf generation approach and have an MVP for everything but gui stuff and java tests ready to go. I could merge it into here but want to avoid this
Just to get things straight, is the reason we're looking into Bazel as a replacement for Gradle because of faster build times? Are there additional reasons?
I put some reasons here. To me one of the most compelling things is that it is designed to have first class java / cpp / python support. Gradle is fantastic for java, but requires lots of custom stuff for C++, cmake is fantastic and considered the industry standard for C++, but similarly requires hacks for java.
Gradle has a bus factor of 1. With bazel, the people on the bus can help with the build system too.
A few more things I'd like to see:
- CI should show how we build all 4 binary variants (debug/release x static/shared)
- CI Maven artifact publishing of native libs
- Is a Win32 CI build variant (of ntcoreffi only) possible?
We need to verify this addresses all the windows symbol issues we've had with protobuf et al, and that the resulting libraries are actually usable on Windows.
I think before we merge this we'd also want to fork the key repos (e.g. bzlmod) currently residing outside wpilibsuite into our org and repoint to them.
We also need a README file showing how to do local build/publish, etc, in the same style as the current Gradle and cmake ones. As a community project, it's very important people know how to build it starting from a git clone.
- I can add debug builds after work.
- I'm purposely not building the shared libraries in the MVP, because of the potential problems on Windows. That one will need a thorough inspection by Thad and I didn't want to bog down this PR with that.
- Once they are added, they are automatically built by default, with no additional command line necessary.
- There isn't a great option for maven publishing something like this out of the box. The "default" rule probably works great for Java only, but I think it adds dependencies to the pom file, which we currently do not. It also probably doesn't work for cpp things. I can easily make a script though.
- I'll have to look into ntcoreffi.
/format
How does build caching work with bazel? Does it just use ccache/sccache, or is there something more integrated like Gradle build caches? This is critical for our CI to be reasonably speedy on PRs.
It is a core component of bazel, integrated in the build system. I've been using BuildBuddy for my remote cache solution. You can see in my fork that build times can get down to the ~4 minute mark per OS if the changes aren't changing something at the bottom of the dependency tree like wpiutil / wpimath. The same commit took ~20 minutes to build on linux.
(Note: my fork's main line is only building release and not debug, but the majority of that time is spent warming bazel up for a noop build.)
Bazel creates a Merkle tree covering every action, file, intermediate object, etc. A cold, cacheless build will therefore take longer than with other build systems while it builds up the cache, but once the cache exists, incremental builds and builds that can download from the remote server are incredibly fast. I've never had a problem with a cache hit false negative, which recently happened with the gradle cache system.
Also worth noting, the cache is always published to the remote server. That means you can build locally, make a huge change, switch branches and run some builds, and then go back and have the cache ready for you. This would also apply to any CI builds; you don't just have to go off the main branch's cache.
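To make the Merkle-tree description above concrete, here is a minimal Python sketch of content-addressed action keys. It is not Bazel's implementation: the four-target graph, the `cc -c` command string, the file contents, and the helper names (`digest`, `action_key`, `build`) are all assumptions for illustration, and real Bazel hashes far more (tool versions, environment, dependency outputs rather than dependency keys).

```python
import hashlib

# Hypothetical dependency graph, loosely modelled on the libraries in this PR.
GRAPH = {
    "wpiutil": {"cmd": "cc -c", "srcs": ["util.cpp"], "deps": []},
    "wpinet": {"cmd": "cc -c", "srcs": ["net.cpp"], "deps": ["wpiutil"]},
    "ntcore": {"cmd": "cc -c", "srcs": ["nt.cpp"], "deps": ["wpinet"]},
    "wpimath": {"cmd": "cc -c", "srcs": ["math.cpp"], "deps": ["wpiutil"]},
}
ALL = ["wpiutil", "wpinet", "ntcore", "wpimath"]


def digest(*parts):
    """SHA-256 over length-prefixed parts, so part boundaries are unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        data = part.encode()
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def action_key(target, files):
    """Key = hash of the command, the source contents, and the keys of all deps."""
    node = GRAPH[target]
    src_digests = [digest(name, files[name]) for name in sorted(node["srcs"])]
    dep_keys = [action_key(dep, files) for dep in sorted(node["deps"])]
    return digest(node["cmd"], *src_digests, *dep_keys)


def build(targets, files, cache):
    """Return the targets that missed the cache and had to be rebuilt."""
    rebuilt = []
    for target in targets:
        key = action_key(target, files)
        if key not in cache:
            cache[key] = f"output of {target}"  # stand-in for a real artifact
            rebuilt.append(target)
    return rebuilt


files = {"util.cpp": "v1", "net.cpp": "v1", "nt.cpp": "v1", "math.cpp": "v1"}
cache = {}  # stands in for the shared remote cache

cold = build(ALL, files, cache)                             # everything is built
warm = build(ALL, files, cache)                             # nothing is built
leaf_edit = build(ALL, {**files, "nt.cpp": "v2"}, cache)    # only ntcore
root_edit = build(ALL, {**files, "util.cpp": "v2"}, cache)  # everything again
back = build(ALL, files, cache)                             # nothing: old keys still cached
```

In this toy model an edit near the top of the tree (`nt.cpp`) invalidates one key, while an edit at the bottom (`util.cpp`) changes every key above it, which is the same shape as the 4-minute versus 20-minute difference described above. Because old keys stay in the cache, switching back to the original contents rebuilds nothing.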
Well, in our CI case, we would store the cache on Artifactory, and probably still only push to it from main builds. Can you configure levels of caching, so that e.g. local builds could benefit from our cache of main even though you couldn't push to it?
Is there a reason to only cache main?
It looks like you can set up artifactory to be a remote cache, and there are flags you can use to make it selectively read only, but I don't really see the point.
We'd cache only main for the same reason GitHub Actions has separate caches for different PRs: preventing cache poisoning.
I could reluctantly get behind the argument that users should be read only, but I think PRs that run on GitHub's CI machines should be able to write. It would be nice if a /format commit runs in 4 minutes because there is a warm cache to pull from instead of however long the delta from main takes to build.
Cache poisoning is not as much of a problem with bazel in general. Most issues I saw from a google search are super old or involve versions before bazel 4.0.0.
Allowing PRs to share caches still scares me a little (I'm just gonna make a really bad PR and submit it), but PRs are more visible so it's easier to check for potential malicious content, so I'm fine with it.
Practically speaking, we can’t allow caching from a PR, unless we allow our cache to be world-writable, which is a very bad idea. GitHub does not allow secrets to be shared with PRs from forks due to the obvious security risks (a malicious PR could easily extract the secret).
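The read-only versus read-write split discussed above could be wired into CI roughly as follows. This is a sketch under assumptions, not a proposal for the actual wpilib setup: the cache URL, the `remote_cache_flags` helper, and the bearer-token header are hypothetical, and whether Artifactory accepts that header format is not confirmed here. The Bazel flags themselves (`--remote_cache`, `--remote_header`, `--remote_upload_local_results`) are existing options.

```python
# Hypothetical cache endpoint; the real Artifactory URL is not given in this thread.
CACHE_URL = "https://cache.example.invalid/bazel"


def remote_cache_flags(cache_url, write_token=None):
    """Pick Bazel remote-cache flags based on whether a write credential exists.

    Fork PRs on GitHub Actions do not receive secrets, so write_token is None
    for them and the build only reads from the cache.
    """
    flags = [f"--remote_cache={cache_url}"]
    if write_token:
        # Assumption: the cache server authenticates writers with a bearer token.
        flags.append(f"--remote_header=Authorization=Bearer {write_token}")
        flags.append("--remote_upload_local_results=true")
    else:
        # Read-only client: download cache hits, never attempt uploads.
        flags.append("--remote_upload_local_results=false")
    return flags


main_flags = remote_cache_flags(CACHE_URL, write_token="s3cr3t")  # trusted main build
fork_flags = remote_cache_flags(CACHE_URL)                        # PR from a fork
```

The client-side flag only stops a well-behaved build from trying to upload. The protection against poisoning still has to come from the server rejecting unauthenticated writes, which is why a fork PR without access to the secret ends up read-only either way.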