remote-apis icon indicating copy to clipboard operation
remote-apis copied to clipboard

build.bazel.remote.execution.v2.Command.arguments[0] should specify a path separator.

Open rubensf opened this issue 3 years ago • 5 comments

Motivation: https://github.com/bazelbuild/bazel/issues/11636#issuecomment-651154740

Throughout the API, both for inputs (in SymlinkNodes) and in outputs, we specify that separators should be single forward slashes (/). However, we're not explicit on the Command arguments, which currently reads:

  // The arguments to the command. The first argument must be the path to the
  // executable, which must be either a relative path, in which case it is
  // evaluated with respect to the input root, or an absolute path.
repeated string arguments = 1;

The issue here comes from Windows cmd.exe not being able to naively recognize relative paths using / separators as executables, eg:

(cmd) $ foo/bar.bat
'foo' is not recognized as an internal or external command, operable program or batch file.

While there are many ways for the server to go around that restrictions (eg converting / -> \, wrapping the command in quotes, etc), we should it explicit whose responsibility it is to deal with this:

  1. The server can responsible to process the path according to the executing platform;
  2. The client can be responsible to ensure that argument[0] is executable without any additional processing;
  3. We could specify that the separator should be the platform's default;
  4. Technically, we could make it that either / or \ are valid, though I find little benefit to this approach, as it's a more ambiguous than 1 but forces the server to apply processing anyway (and maybe there exists a bizarre scenario where both are valid and yield different results?)

I'm personally prefer 1. as the server may be already doing other processing on the command and figuring out the platform specific separator shouldn't be challenging, but I don't have a strong preference here as the approach is explicitly stated.

rubensf avatar Mar 09 '21 05:03 rubensf

I think that solution 2 is the only logical one. Solution 1 would only make sense if you only call into programs inside the input root. It’s perfectly valid to also call into programs installed on the system: C:\Program Files\foo. Using forward slashes there would be really weird.

Furthermore, this problem also applies to successive arguments. Those most likely need backslashes as well. It would be inconsistent if argv[0]’s conventions differ from the rest.

EdSchouten avatar Mar 09 '21 06:03 EdSchouten

+1 to Ed - we could standardize arg0 in some way perhaps, but we cannot safely standardize anything about the remainder - we don't actually know if an argument is a path (to be transformed) or a string that happens to contain slashes (which servers should not touch in any way). And different tools will have different support for paths, especially on Windows. (UNC paths, etc). And at that point, it seems best to just say that the client is responsible for providing OS-specific and tool-understandable paths/arguments for all arguments, including arg0.

One wrinkle to that might be if cmd.exe adds any specific constraints that aren't "Windows standard" - my preference for documentation would be something along the lines of "args must be formatted correctly for the OS and for the tool they'll be passed to", but if there are ways to comply with that but also not be amenable to passing to cmd.exe directly, I guess that's on our server to wrap/transform as necessary anyways to bridge the gap.

On Tue, Mar 9, 2021 at 1:22 AM Ed Schouten [email protected] wrote:

I think that solution 2 is the only logical one. Solution 1 would only make sense if you only call into programs inside the input root. It’s perfectly valid to also call into programs installed on the system: C:\Program Files\foo. Using forward slashes there would be really weird.

Furthermore, this problem also applies to successive arguments. Those most likely need backslashes as well. It would be inconsistent if argv[0]’s conventions differ from the rest.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bazelbuild/remote-apis/issues/187#issuecomment-793449017, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABREW4HJZE2UWHBKXFCF23TCW5A5ANCNFSM4Y22JH4A .

EricBurnett avatar Mar 09 '21 16:03 EricBurnett

+1 to option 2. The client already has some knowledge (or makes assumptions) about what the remote environment looks like--for example, that executable /foo/bar/bas will exist in the remote environment--and this feels similar. I also agree that the problem of universally processing argv[n] is not solvable by the server, and that consistency is valuable here.

On Tue, Mar 9, 2021 at 11:44 AM Eric Burnett [email protected] wrote:

+1 to Ed - we could standardize arg0 in some way perhaps, but we cannot safely standardize anything about the remainder - we don't actually know if an argument is a path (to be transformed) or a string that happens to contain slashes (which servers should not touch in any way). And different tools will have different support for paths, especially on Windows. (UNC paths, etc). And at that point, it seems best to just say that the client is responsible for providing OS-specific and tool-understandable paths/arguments for all arguments, including arg0.

One wrinkle to that might be if cmd.exe adds any specific constraints that aren't "Windows standard" - my preference for documentation would be something along the lines of "args must be formatted correctly for the OS and for the tool they'll be passed to", but if there are ways to comply with that but also not be amenable to passing to cmd.exe directly, I guess that's on our server to wrap/transform as necessary anyways to bridge the gap.

On Tue, Mar 9, 2021 at 1:22 AM Ed Schouten [email protected] wrote:

I think that solution 2 is the only logical one. Solution 1 would only make sense if you only call into programs inside the input root. It’s perfectly valid to also call into programs installed on the system: C:\Program Files\foo. Using forward slashes there would be really weird.

Furthermore, this problem also applies to successive arguments. Those most likely need backslashes as well. It would be inconsistent if argv[0]’s conventions differ from the rest.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/bazelbuild/remote-apis/issues/187#issuecomment-793449017 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/AABREW4HJZE2UWHBKXFCF23TCW5A5ANCNFSM4Y22JH4A

.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bazelbuild/remote-apis/issues/187#issuecomment-794143759, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADMU23ZH7I5K62FQVUHZ52TTCZF7JANCNFSM4Y22JH4A .

bergsieker avatar Mar 09 '21 17:03 bergsieker

I intended option 1 to apply exclusively to argv[0], since it should always (?) be a non-path separated function or command in the $PATH, or be the canonical path to an executable.

Maybe it also makes sense to define some reasonable entry point on the server side? The user or the server could have weird setups (eg a user specified docker container with a cygwin bash entrypoint) that might be incompatible with the OS default (eg cygwin). It'd be unreasonable for bazel as a client tool to make those assumptions?

rubensf avatar Mar 12 '21 22:03 rubensf

The client also specifies the docker container, and that should allow it to make assumptions about the contents thereof.

ulfjack avatar Feb 08 '22 15:02 ulfjack