Add gencquery to execute configured queries in-build
https://github.com/bazelbuild/bazel/issues/5594 made progress on removing the host JDK dependency, but we still have a dependency on the host_jdk that I don't know how to break.
$ bazel query --output=build //vehicle/simulation:simulation_lite_query
# /home/austin/local/peloton-tech/vehicle/simulation/BUILD:295:1
genquery(
name = "simulation_lite_query",
testonly = True,
scope = ["//vehicle/simulation:simulation_lite_test", "//util:event", "//util:notifier"],
expression = "deps(//vehicle/simulation:simulation_lite_test) intersect set(//util:event //util:notifier)",
)
austin[151624] aschuh-debian-z620 (phil) ~/local/peloton-tech
$ bazel query 'somepath(//vehicle/simulation:simulation_lite_test,@local_jdk//:jdk)' --output=label_kind
cc_test rule //vehicle/simulation:simulation_lite_test
java_binary rule @bazel_tools//tools/test/CoverageOutputGenerator/java/com/google/devtools/coverageoutputgenerator:Main
alias rule @bazel_tools//tools/jdk:jdk
java_runtime rule @local_jdk//:jdk
We started bazel with --javabase=@openjdk_linux_archive//:runtime --host_javabase=@openjdk_linux_archive//:runtime
The goal is to never use the host JDK.
/cc
Unfortunately, this doesn't lend itself to an easy solution.
The problem is that there is a genquery() rule with a java_binary in its scope, and it therefore loads the package with the default target javabase, which is @local_jdk. And if JAVA_HOME points to something that's not a JDK, badness happens.
And that cannot be fixed with a command line option, because bazel query does not have an option to change the javabase it uses.
We could implement gencquery(), if we knew how to, but other than that, I don't have clever ideas.
Or figure something out so that @AustinSchuh does not need genquery(), but that still leaves is with a bug in Bazel...
Would it be possible to have bazel query take the javabase argument(s), and then just set them to the same value(s) as necessary to make bazel build work? It seems like that should be possible, because the values you pass obviously affect the output.
Note that there's been previous discussion around adding similar flags for the C++ rules at #967 that eventually got "use cquery instead" as the end result.
I think that's @lberki 's point. gencquery would be the way to get the flags in.
Yep, the principled alternaive would be to implement gencquery to gain access to the configuration. Unfortunately, that's not easy to do.
/cc @juliexxia @janakdr @gregestren
So, yea, I got around this in a pretty terrifying way. Please tell me how to do it better :)
In bazel, I did:
diff --git a/tools/jdk/BUILD b/tools/jdk/BUILD
index 9339ee1..c11c28a 100644
--- a/tools/jdk/BUILD
+++ b/tools/jdk/BUILD
@@ -73,17 +73,17 @@ java_runtime_files(
alias(
name = "java",
- actual = "@local_jdk//:java",
+ actual = "@openjdk_linux_archive//:java",
)
alias(
name = "jar",
- actual = "@local_jdk//:jar",
+ actual = "@openjdk_linux_archive//:jar",
)
alias(
name = "javac",
- actual = "@local_jdk//:javac",
+ actual = "@openjdk_linux_archive//:javac",
)
# On Windows, executables end in ".exe", but the label we reach it through
And more in both that file. I did similarly bad things to src/main/java/com/google/devtools/build/lib/bazel/rules/java/jdk.WORKSPACE
In our WORKSPACE file, I'm then doing:
new_http_archive(
name = "openjdk_linux_archive",
build_file_content = """
java_runtime(
name = 'jdk',
srcs = glob(['**']),
visibility = ['//visibility:public']
)
""",
sha256 = "f27cb933de4f9e7fe9a703486cf44c84bc8e9f138be0c270c9e5716a32367e87",
strip_prefix = "zulu9.0.7.1-jdk9.0.7-linux_x64-allmodules",
urls = [
"https://mirror.bazel.build/openjdk/azul-zulu-9.0.7.1-jdk9.0.7/zulu9.0.7.1-jdk9.0.7-linux_x64-allmodules.tar.gz",
],
)
Surprisingly, the end result actually works. I'm flipping to the new JDK via --javabase=@openjdk_linux_archive//:jdk --host_javabase=@openjdk_linux_archive//:jdk
Well, if patching Bazel is on the table, that works, but this of course can't be merged -- the target javabase has to default to JAVA_HOME.
Yea... I'll probably take the patch in the short term until there's a proper fix.
re: genCquery
tl;dr: genCquery presents a difficult implementation challenge because a genquery query executes during analysis phase and this poses incremental correctness concerns. We aren't actively working on implementing genCquery right now because of this.
The cquery engine requires information about the build graph structure that isn't usually exposed at analysis time to the classes that handle building individual configured targets, like a genquery rule instance. Having individual nodes of the build graph not know things about the entire graph is a fairly deliberate choice. Opening up that channel of information opens up the possibility of some intense incremental correctness issues re: when nodes need to be updated now that they store information about more than just themselves and their providers. Generally, genCquery would definitely be a non-trivial undertaking and would require some pretty invasive changes to bazel infrastructure to make work. Unfortunately, right now, we can't think of any suitable workarounds to the incremental correctness problem.
At the recent BazelCon, @lberki proposed an approach: introduce a new node into the graph, TransitiveConfiguredTargetValue, which would depend on a configured target, and on TransitiveConfiguredTargetValues for each of the configured target's deps. This would be in close analogy with TransitiveTargetValue. I didn't think of it at the time when talking with him, but the issue with this approach is that ConfiguredTargets don't expose their deps. Thus, the TransitiveConfiguredTargetValue wouldn't be know what to request. Posting this here in case @lberki would like to respond.
How bad would it be to implement TransitiveConfiguredTargetValue by fetching the direct ConfiguredTarget deps via skyframe (i.e.., graph.getDirectDeps())?
Fetching via Skyframe is exactly what @juliexxia was referring to by "information about the build graph structure that isn't usually exposed at analysis time to the classes that handle building individual configured targets." If we went that route, we could just expose the direct deps to gencquery directly, rather than via the TransitiveConfiguredTargetValue indirectly.
Sure, but I was thinking the advantage of TransitiveConfiguredTargetValue would be that you could have a tree of them, so that every gencquery wouldn't have to take direct skyframe dependencies on all transitive configured targets.
No comments, other than that I'd really like that approach to happen :) (with, of course, rewriting regular bazel cquery to use it, too -- no reason to maintain two implementations)
But Lukács, how does your proposal address the graph introspection issue? Is my description above inaccurate? If you don't avoid introspection, then these additional nodes don't do anything. If your proposal includes separating the dependency discovery of configured targets into separate nodes and storing the dependency list in the value, then that's a much bigger proposal.
Oh, sorry, I didn't notice the last sentence that calls out the flaw.
Well, nothing comes for free... dependency discovery is already somewhat separated out (the DependencyResolver we all know and love and also ConfiguredTargetFunction#computeDependencies()). Yes, it would come at the cost of either computing dependencies twice (CPU) or storing the dependencies the the ConfiguredTargetValue (RAM), but I think it's fair to ask people who use gencquery to swallow the CPU cost of computing dependency resolution twice.
I think the bigger issue is the maintenance cost of keeping all dependency declaration in the factored-out module. Maybe we can do it cleanly: if the SkyFunctionEnvironment can be totally hidden from the rest of ConfiguredTargetFunction, then this should be fine. I guess Julie can decide if this is feasible :)
Let's see... apart from error reporting, SkyFunction.Environment is used for two things:
-
To resolve late-bound attributes on split transitions, for which one needs a configuration. I believe this is only used for
android_binary.:cc_toolchain_splitand can be implemented by a regular split transition that points to@bazel_tools//tools/cpp:current_cc_toolchain. -
For mapping a direct dependency to a target (in
#getTargets()). I think that either needs to say or need to be passed in a label-to-target map somehow.
DependencyResolver is already an abstract base class, so one could also argue that the code is already factored out.
Thinking again, what I said in the previous comment is absolutely true, but I now don't understand why dependency resolution would be an issue. The configured equivalent of TransitiveTargetValue (say, TransitiveConfiguredTargetValue) would be created by a SkyFunction, therefore, it can just reuse SkyframeDependencyResolver without much ado. That returns an attribute-to-dependency map, which is exactly what TransitiveConfiguredTargetValue needs.
Did this get anywhere? We are sooo close to no longer needing to patch Bazel when upgrading. I think this is our last issue.
No, unfortunately, we haven't made any progress since the end of October :(
Moving to the Core team since this isn't specific to Java and is rather an issue with bazel query not being very smart.
Is this issue still open? I have tested and don't see errors about a missing @local_jdk when doing bazel query, but there's a lot more detail in the comments. Is this issue even about local JDK and query anymore?
Did anyone say there were errors? I thought OP just never wants @local_jdk to appear in a query result. :)
It's true this issue did sort of morph into us speculating on how to implement a gencquery rule...
There definately were errors. I uninstalled the JDK from the machine in question, set JAVA_HOME to /dev/null, and got an error. (At the time, and @bsilver8192 can correct me, too new a local JDK would also trigger a very fun error for all the same reasons. That's how we started down this path, trying to never ever reference the local JDK because developers would go break it)
It will take me a bit of work to confirm that this is indeed still broken, but I very much suspect it still is. @lberki , do you happen to know?
Testing in docker, it seems that a java_library does have a dependency on @local_jdk, although I didn't bother tracing the path, so this should still be open (and I don't know if anyone is still working on it).
I think I was confused by the conversation about gencquery. Thanks for helping to clarify.
Maybe this should be closed and a different issue should be opened about why java_library depends on local_jdk? I think btw this is for large targets and I’ve already opened an issue for it. I’ll try to find the issue
This issue should probably be closed and forked: one for java->@local_jdk dependencies, and one for gencquery.
I plan to wait until tomorrow in case any further comments arise, and then handle that.
Since there is a lot of good info here on the gencquery, can we leave this one open and clarify rather than closing and creating 2 new ones?
FWIW I do think making gencquery a reality is still on the table. I think the next step on that is getting clear guidance from Bazel core folks (i.e. Skyframe maintainers) on what kind of new API punctures into Skyframe they find acceptable.
Okay, renaming this and forking just the local_sdk issue as #10468.