bazel icon indicating copy to clipboard operation
bazel copied to clipboard

Add `$(dirname ...)` location function

Open fmeum opened this issue 6 months ago • 2 comments

Work towards #23516

fmeum avatar Jun 09 '25 09:06 fmeum

@jmillikin FYI, I started to work on this.

fmeum avatar Jun 13 '25 15:06 fmeum

@comius @tjgq I need to add support for nested location functions ($(dirname $(dirname foo/bar/baz))) for this to be really useful. Any concerns?

fmeum avatar Jun 13 '25 15:06 fmeum

@jmillikin This should now be complete, could you take a look and verify that it solves your use cases?

fmeum avatar Jun 29 '25 16:06 fmeum

This solves a different problem than what I have, I think, in that I want a way to expose path.dirname to avoid writing trivial Starlark rules, and this code is adding the equivalent of Go's "path".Dir() function to do text manipulation of path-like strings.

I won't object since the presence of $(dirname) does not preclude the possible addition of $(location_dirname) in the future, but marking the PR as "Fixes #23516" seems inaccurate. It might be better to either have a separate tracking issue for $(dirname), or just merge this without a "Fixes ..." annotation.

jmillikin avatar Jun 30 '25 02:06 jmillikin

@jmillikin This is meant to implement your proposal C from https://github.com/bazelbuild/bazel/pull/23518#issuecomment-2423740537 - I added more tests and edited the RELNOTES to point this out. Does this address your concern?

fmeum avatar Jun 30 '25 08:06 fmeum

The feature I requested in https://github.com/bazelbuild/bazel/issues/23516 is a $(location_dirname) that can be used to resolve the directory of a label. Similar to how $(location) returns a label's path, I wanted a function to return a label's path.dirname.

In the discussion for the implementation you asked for a list of use cases and proposals to address them, which I provided. I tried to provide a reasonably good survey of the various ideas people had come up that were related to the issue of directory names passed as arguments to genrules (etc). The original feature request would have been solved by proposal A.

This PR implements a different function from the one I requested. That's fine, lots of things people will want to do in their builds. But since it solves a problem that is different from the one I care about, I don't think the PR should be marked as solving my feature request.

I've kept the feature request issue open in the hopes that $(location_dirname) might still happen one day. If you view $(location_dirname) and $(dirname) as duplicative, then my feature request can be closed as "not planned".

jmillikin avatar Jun 30 '25 09:06 jmillikin

But since it solves a problem that is different from the one I care about, [...]

I understand that you would have preferred the $(location_dirname ...) syntax that was your initial proposal (and also the "proposal A"), but I don't understand why $(dirname ...) doesn't solve your original use case (albeit with slightly different syntax).

Your original example from #23516 would look like this:

load("@bazel_skylib//rules:run_binary.bzl", "run_binary")

run_binary(
    name = "javacc_stage1_out",
    tool = "//bootstrap:javacc_stage1",
    srcs = ["src/main/javacc/JavaCC.jj"],
    outs = [
        "src/main/java/org/javacc/parser/JavaCCParser.java",
        "src/main/java/org/javacc/parser/JavaCCParserConstants.java"
    ],
    args = [
        # vvv New feature used here to specify the output directory vvv
        "-OUTPUT_DIRECTORY:$(dirname $(location src/main/java/org/javacc/parser/JavaCCParser.java))",
        "$(location src/main/javacc/JavaCC.jj)",
    ],
)

It would help me understand your concern if you could share a concrete example that is solved by $(location_dirname ...) but not by $(dirname ...) combined with existing location functions.

Just like $(location_dirname ...) in its originally proposed form, it doesn't support labels resolving to multiple paths, but this could be supported in the future if we can figure out appropriate semantics.

Edit: I modified the implementation so that we could add support for $(dirname $(execpaths ...)) in the future without a breaking change.

fmeum avatar Jun 30 '25 10:06 fmeum

I'm not worried about the call site syntax -- $(location_dirname //foo/bar) and $(dirname $(location //foo/bar)) aren't much different -- but about the behavior.

Describing what $(location_dirname) does is simple, because it exposes an existing property that has a well-specified value. Starlark code that uses someartifact.path.dirname is isomorphic to my proposed $(location_dirname). I think this simplicity of specification is useful for Bazel built-in functions, which can't be versioned like imported Starlark can be.

In contrast, the implementation of $(dirname) as contained in this PR is based on parsing text that's expected to be some unspecified subset of file paths. I can tell from the implementation and the tests that it has rules like "no spaces" and some sort of quoting, but there's no documentation on the syntax and I don't know whether its behavior for any given input will change in the future.

In the tests, there is this code:

    assertThat(expander.expand("--out=$(dirname $(rootpath :file2))"))
        .isEqualTo("--out='other_files/sub dir with spaces'");
    assertThat(expander.expand("--out=$(dirname $(dirname $(rootpath :file1)))"))
        .isEqualTo("--out=other_files");

... so $(dirname) will sometimes do its own quoting? I can't use it directly in an attribute that gets passed to args, it would need to be re-parsed but again I don't know what the syntax is or whether it might change in the future.

If $(dirname) operated on path values and was rendered to a string lazily then $(dirname $(location)) could be the same as $(location_dirname) and I'd be fine with it. Alternatively, if $(dirname $(location)) was parsed specially (disallowing non-$(func) inputs like $(dirname some/path/here) and Bazel would just do out = out.dirname in a loop then that would be fine too.

jmillikin avatar Jul 01 '25 11:07 jmillikin

We can document the special case of $(dirname $(location ...)) in as much detail as possible and we can also guarantee in docs that it will always behave just like a hypothetical $(location_dirname ...) could. I could even write a differential fuzz test to add strong test coverage for this property.

The could here is important: I have been working with the assumption that both $(dirname ...) and $(location_dirname ...) should have the same behavior as the existing location functions when it comes to quoting. $(execpath ...) is not just File.path, it shell quotes certain most special characters, including spaces.

The quoting isn't a consequence of the particular implementation or the choice to replace $(location_dirname ...) with $(dirname $(location ...)), it just matches what all the other functions do. $(dirname ...) explicitly unquotes its argument so that paths returned by $(location ...) and friends are preserved exactly.

The only artificial restriction is to fail on backslashes. That's just a choice on my part to avoid bad surprises for Windows-minded folks and could apply equally to $(location_dirname ...).

If you think that this kind of auto quoting doesn't make sense outside of genrule (and maybe not even there), then I fully agree with you. But introducing a new location function with subtly different quoting behavior doesn't seem like the right way to address that. Instead, we could document a special tag (say "no-location-quoting") that disables quoting on a given target. That probably renders the plural variants of the location functions useless, so it definitely needs more thought and discussion.

fmeum avatar Jul 01 '25 14:07 fmeum

It sounds like the behavior I want / was hoping for -- a way to easily format input/output paths into args without writing a custom rule -- might not be compatible with the constraint of Bash-style quotes for some paths generated by the Make variable functions.

I'll write a comment to that effect on the original feature request and close it.

From the discussion there it seems there are people who do want a way to have a portable built-in version of the UNIX dirname tool, so I'm sure that even though my exact use case isn't supported there will still be plenty of excitement for $(dirname) as you've implemented it (for use in genrule, etc). Other than the Fixes: ... I don't have any opinion about $(dirname) existing.

jmillikin avatar Jul 01 '25 15:07 jmillikin

Wouldn't the combination of $(dirname ...) and some way to avoid the quoting inherent to all location functions fully address your need? If so, then you could file a new issue for this delta.

fmeum avatar Jul 01 '25 15:07 fmeum