bazel Accessing aspects from cquery Starlark output

Accessing aspects from cquery Starlark output

Open aherrmann opened this issue 3 years ago • 8 comments

Description of the feature request:

Make providers returned by aspects requested on the command-line available to cquery Starlark output.

Feature requests: what underlying problem are you trying to solve with this feature?

I would like to process information, that is extracted from a target by an aspect, in a cquery using the Starlark output mode.

In particular, I am trying to access the value of a rule attribute that is not accessible through any of the providers returned by the rule itself.

For a minimal repro:

Checkout https://github.com/aherrmann/bazel-query-aspects

Run

$ bazel cquery \
  --include_aspects=True \
  --aspects=//:defs.bzl%my_aspect \
  //... \
  --output=starlark --starlark:expr '"CQUERY {}: {}".format(target.label, providers(target).keys())'

Output

DEBUG: .../defs.bzl:20:10: ASPECT struct(content = "foo")
CQUERY //:rule: ["FileProvider", "FilesToRunProvider", "OutputGroupInfo"]

Observe that the aspect is executed, but the returned provider MyProvider is not listed in the cquery output.

What operating system are you running Bazel on?

Ubuntu 21.04

What's the output of `bazel info release`?

$ bazel info release
release 4.1.0

Aug 18 '21 13:08 aherrmann

I'm a bit surprised that doesn't already work. But cquery has to do some special processing to handle aspects, so maybe the Starlark output logic missed that. For example https://github.com/bazelbuild/bazel/commit/5a4be74d99af5885a99ef9e9543d5adac8491d93.

CC @aiuto, who may have better insights.

Aug 25 '21 21:08 gregestren

Whoops. Lost this. I think this is non-obviously hard. IIRC, I've noticed that the aspect evaluation on the command line don't all get evaluated before the target evaluation completes. So, I think there is a deficiency here, but I don't have a quick fix.

One of the workarounds I have used is to have the aspect dump results to a file and examine that. In a sense, it is not quite as flexible as cquery on the command line. You would have to edit the print routine in .bzl code from a file. I don't consider that a disadvantage, as the better way to use cquery --output=starlark is with --starlark:file instead of expr. Using expr and iterating on the query you want keeps dumping the analysis cache. Using a file does not dump the cache. This saves me a lot of debug time.

Nov 10 '21 04:11 aiuto

One of the workarounds I have used is to have the aspect dump results to a file and examine that.

Yes, that's also the workaround that we are using.

Using expr and iterating on the query you want keeps dumping the analysis cache. Using a file does not dump the cache. This saves me a lot of debug time.

Yes, for an interactive use case that is a good point. Our use case is not an interactive one, but part of our CI, so this seems less of a concern.

For reference I'll describe that use-case here. This goes a few shaved yaks down, but I'll try to keep it brief:

We use rules_scala's scala_test_suite macro which generates one scala_test per scala source file in srcs and bundles them in a test_suite. scala_test_suite generates names for the scala_test based on the source file path on MacOS and Linux, but uses a numbering scheme on Windows instead to avoid exceeding MAX_PATH.
We run our CI on Linux, MacOS, and Windows. And we collect the build event stream and profile for reporting and analysis purposes.
We want to analyze how the same test behaves across platforms, e.g. if its runtime increased over time, or it became flaky. So, we want to make sure that test names are identical on all OSs.
To achieve this we have an aspect that we apply to all scala_test targets to generate a mapping from the shortened name on Windows to the long name on Linux and MacOS. We can then use this mapping to normalize the data.

That last point is where we want to extract data produced by an aspect in a structured format across all relevant targets. Having to go through intermediate output files just adds a bit of complexity to this use case, because we now have to find all the files that were produced and then collect the data from all those files. The aspect is defined here and it is called here where we then collect the data from all the generated files.

Nov 10 '21 17:11 aherrmann

Thanks for the deeper details. That leads to two distinct replies.

I totally see the need here. I am facing it myself in a different situation and do not have any good answers right now.
I have an idea that might work for you. Your aspect appears to be easily transformable directly into a cquery expression. Instead of having the tests to see if the test rule is eligible and then returning either [] or writing the pair of short & long label, what if the expression just returned nothing for ineligible things and returned the pair for the rest? Instead of running the aspect, it would just be a bigger query.

Nov 10 '21 18:11 aiuto

2. Your aspect appears to be easily transformable directly into a cquery expression.

As I recall the issue that kept me from going that way was that I need access to the srcs attribute of the scala_test. I need the source file path to reconstruct the long name on Windows. In the cquery Starlark expression I have access to the Target and its providers, but I did not find a way to gain access to the attributes of the rule that defines the target. That's what led me down the aspect route, since aspects also have access to the rule's attributes.

Nov 11 '21 09:11 aherrmann

@gregestren Do you think we can find cycles to bump this to P2 and execute on it during Q1/Q2? It might tie in with the attempt to parallelize analysis of explicit targets and aspect evaluation. IIRC, there was a thread about that in November/December.

Feb 19 '22 21:02 aiuto

I don't remember that thread.

I'm not sure we have a consensus on the nature of the challenge. I suggest that as a next step to vet what we can realistically expect to achieve.

This also reminds me of https://github.com/bazelbuild/bazel/issues/14799#issuecomment-1050450128 - separate issues with cquery that both are IMO caused by aspects "kind of implicitly being processed behind the scenes" vs. being explicit output nodes. i.e. the graph cquery shows to the user doesn't match the actual graph which had nodes for both configured targets and aspects. That mismatch causes complication and weirdness.

So there's evidence cquery + aspects will continue to struggle if we don't address that root cause. That, I think, is a not-tiny but not HUGE design change to cquery, including user-visible changes.

Another possible approach is to try to leverage MergedConfiguredTarget - which contains both a target and its attached aspects. I'm not sure that'd actually be accessible to cquery though (I'm not sure that's retrievable via a Skyframe call).

So yeah, I can imagine this as a Q2 candidate. We'd just want to start with some investigatory work to ensure it scopes well and the ideas above are viable.

Mar 03 '22 00:03 gregestren

FYI I prototyped some code last week for cquery to output aspects as first-class citizens (i.e. both configured targets and aspects appear as peers with clear labels identifying each). That's the "root cause" fix I mentioned above. As expected it's a pretty code-invasive change. So it's basically experimental code now. If I can get some more confidence in it I'll see if I can make it more accessible.

Sep 19 '22 19:09 gregestren

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

Nov 24 '23 01:11 github-actions[bot]

Aspect support is being tried: https://github.com/bazelbuild/bazel/issues/16310#issuecomment-1830745008.

Nov 28 '23 21:11 gregestren

bazel bazel copied to clipboard

Accessing aspects from cquery Starlark output

Description of the feature request:

Feature requests: what underlying problem are you trying to solve with this feature?

What operating system are you running Bazel on?

What's the output of bazel info release?

bazel
bazel copied to clipboard

What's the output of `bazel info release`?