bazel icon indicating copy to clipboard operation
bazel copied to clipboard

Add dirname variants of predefined source/output path variables.

Open jmillikin opened this issue 1 year ago • 5 comments

These are similar to existing $(execpath) and friends, but return the path of the artifact's directory rather than of the artifact itself.

Fixes https://github.com/bazelbuild/bazel/issues/23516

jmillikin avatar Sep 05 '24 05:09 jmillikin

@illicitonion Didn't you file an issue for this a while ago? Can't find it right now, but it also had relevant discussion.

fmeum avatar Sep 05 '24 12:09 fmeum

Nvm, found it: https://github.com/bazelbuild/bazel/issues/23139

fmeum avatar Sep 05 '24 12:09 fmeum

any idea who we should ping to review this one?

keith avatar Oct 18 '24 23:10 keith

Reviewing this change is probably quite easy compared to validating its design choice (@tjgq suggested an alternative above, https://github.com/bazelbuild/bazel/issues/23139 has another one).

I would personally benefit from a short write-up that lists the use cases and how each of the proposed APIs would address them (or not).

fmeum avatar Oct 19 '24 07:10 fmeum

Proposal A: $(execpath_dirname //some:label)

  • Returns the directory path of a single-file label, equivalent to File.dirname.
  • As proposed, it doesn't work for multi-file labels.
    • There would be no semantics problem if restricted to cases where all files identified by the label are in the same directory, but that also limits its utility.
    • Unclear semantics if allowed for labels that have multiple files in different directories, what path should be returned for a target filegroup(srcs = ["a/b", "a/c", "a/d/e"]) ?

Proposal A-2: $(execpath_dirname //some:multi-file-label)

  • Returns the unique set of directory paths for all files in the label, similar to the dirname shell command.
  • Unclear what the user would do with such an output, since usually there needs to be some interpolation (like adding -isystem between paths to include files' directories)
  • Could have an API somewhat like Args: $(execpath_dirname "-iquote" //some:label) might produce "-iquote a/b -iquote c/d" but that's blurring the line pretty badly between make variables (conceptually simple) and Starlark (... generally not considered quite as simple).

Proposal B: $(execpath //some/package:__pkg__)

  • Returns the path to a package, equivalent to $(execpath_dirname //some/package:BUILD).
  • The provided example is obtaining the path to a third-party C library's include/ directory for computing copts=, but $(execpath ...) is probably not the right approach for that anyway because it ignores _virtual_includes and would break for labels containing both source files and generated files.
  • Doesn't work with aliases.
  • Generally with trying to inspect paths for Files the thing the user wants is related to the File, not its generating package.

Proposal C: $(dirname $(execpath //some:label))

  • Instead of $(execpath_dirname ...) and $(rootpath_dirname ...), the $(dirname ...) function generically allows the directory name to be extracted from File paths.
  • Implementation would require either reparsing the path (bad), or changing $(execpath) and friends to return Path instead of String (and plumb that through various layers).
  • If $(execpath //:foo) returns foo.txt, what does $(dirname $(dirname $(execpath //:foo))) return? Is the answer the same on different platforms?
    • If it returns "." then $(dirname ...) is implicitly scoped to the build root. Is it allowed to generate paths that are "in between" the build root and the artifacts, such as "bazel-bin/" from being called on a generated file more times than its path has components?
    • If it returns ".." then $(dirname ...) is allowed to generate arbitrary paths outside the build root, which would be surprising behavior for a make-variable builtin function.

Proposal D: $(execpath-1 //some:label)

  • Allows more concise notation for nested $(dirname ...), for example $(dirname $(dirname $(dirname //some:label))) could be written $(execpath-3 //some:label).
  • Same benefits (more combinator-like API) and drawbacks (unclear semantics if the result would be a valid filesystem path but not a valid Bazel path).
  • The word dirname doesn't make an appearance, and the -N notation for stripping path components isn't widespread enough for users to guess its purpose without a deep inspection of the make variable docs.

Proposal E: $(user_defined_execpath_dirname //some:label)

  • Similar to how make variables can be injected into a target via the toolchains= parameter, allow Starlark functions to be injected.
  • A user who has complex make-variable expansion needs could write execpath_dirname.bzl that has execpath_dirname = make_variable_function(_execpath_dirname, args = [attr.label()]) and returns it via a TemplateVariableInfo.
  • Fully self-service for authors of build rules (and rulesets). rules_cc could provide things like $(cc_iquote_include_dirs //some:label).
  • Much more API surface, more difficult to implement (requires lots of plumbing).
  • It would be nice if common use cases such as "dirname of a single-file label" could be satisfied without the user having to learn a new corner of the extension API.

jmillikin avatar Oct 19 '24 10:10 jmillikin

Gentle ping @fmeum Was the above list of use cases and proposals what you were looking for, or would you prefer a more formal design doc -style writeup?

jmillikin avatar Nov 20 '24 12:11 jmillikin

This is what I (personally) have been looking for. It's very rich in information, thanks!

Regarding $(dirname ...): Since paths obtained via location expansion are always slash-separated, I wonder whether we could restrict this to operating on slash-separated paths as well. If we then cap it off at . for the execroot (failing if used again), I could see this being a pretty versatile approach. It would still be limited to single-file labels.

My favorites are A and C, but I don't know which of them I'd prefer.

fmeum avatar Nov 28 '24 10:11 fmeum

@matts1 added ways to represent directories to bazel-skylib, it'd be really nice if we could ensure whatever path we take plays well with that.

armandomontanez avatar Dec 02 '24 17:12 armandomontanez

I've worked around a need for this feature before. The main use case I've had for this has been when a tool's behaviour to find another tool or resource is to look for it in the PATH, so I need to configure the PATH environment variable to contain the directory of that tool's path. Proposal A seems the most easy to understand and use but the flexibility of E is somewhat attractive too.

mentalnote avatar Dec 22 '24 03:12 mentalnote

Gentle ping

jmillikin avatar Feb 20 '25 10:02 jmillikin

@jmillikin Not sure whether you intentionally deleted your fork and don't want to continue to work on this, but I'm now firmly in favor of approach C (which aligns with @tjgq's suggestion in https://github.com/bazelbuild/bazel/pull/23518#pullrequestreview-2282928393).

More precisely, I would suggest making it so that $(dirname ...):

  • operates on forward slash separated paths, regardless of the host system.
  • does not give special meaning to bazel-out/... paths and thus supports traversing up to paths such as ., bazel-out and bazel-out/k8-fastbuild from the path of a generated file.
  • fails with an error when it would be applied to a path without a known parent (the empty string, ., .. etc.)
  • possibly (not sure about this part) never returns a path with a .. segment, that is, it always cleans the resulting path.

What do you think about that? My reasons for preferring this approach are that it introduces a single function that composes well with all other functions and that it matches precisely what one would do in shell scripts.

@matts1 @armandomontanez Could you check which of John's proposed new location functions would work well with Skylib's directory primitives? I don't think that anyone in this thread apart from you knows them well enough to be certain about this.

fmeum avatar Feb 24 '25 11:02 fmeum

GitHub has so many interesting ways to accidentally and irrevocably change PR state.

The branch link up at the top (jmillikin:location-dirname) continues to work, but that code is for proposal A.

I'm having a difficult time understanding the semantics you describe (would $(dirname ...) on Windows really only operate on forward slashes...?), so someone with a clearer view of how that would work should probably take the lead on implementation.

jmillikin avatar Feb 24 '25 12:02 jmillikin

@matts1 @armandomontanez Friendly ping

@jmillikin Yes, I actually think that handling slash-separated paths only also on Windows is the only behavior that actually makes sense: all Make variables and location expansions produce Unix paths and whether a particular path in a BUILD file will be "executed" on a Windows or Unix machine can't really be determined.

fmeum avatar May 30 '25 17:05 fmeum

Ah, missed this, sorry.

My work essentially allows you to write target[DirectoryInfo].path in rules. I don't think there's much overlap here with this work.

You could, of course, special case targets with DirectoryInfo when doing $(location) or $(location_dirname), but I don't think that's happening, since even if it was a good idea (which I don't believe), DirectoryInfo isn't native to bazel - it's in skylib.

matts1 avatar Jun 02 '25 02:06 matts1