bazel icon indicating copy to clipboard operation
bazel copied to clipboard

Runfiles: support paths with spaces

Open laszlocsomor opened this issue 7 years ago • 43 comments

Update: this bug no longer claims to be Windows-specific, see https://github.com/bazelbuild/bazel/issues/4327#issuecomment-468274037


Fix this TODO: https://github.com/bazelbuild/bazel/blob/7b423ccd9506c6fb500b5c4998e1f26aebf28912/src/main/java/com/google/devtools/build/lib/windows/runfiles/WindowsRunfiles.java#L54-L55

laszlocsomor avatar Dec 19 '17 13:12 laszlocsomor

Hmm, this problem also applies to other platforms as well, say macOS, when running bazel build //... on a workspace living under a directory with name containing space(s), bael will tell me:

ERROR: bazel does not currently work properly from paths containing spaces (com.google.devtools.build.lib.runtime.BlazeWorkspace@5dc63968).

izzyleung avatar Dec 19 '17 19:12 izzyleung

@izzyleung thanks for the info! In that case I think P2 or even P3 is safe. I thought this worked on other platforms.

laszlocsomor avatar Dec 20 '17 08:12 laszlocsomor

Update: I haven't looked into fixing this issue and I'm most likely not going to for the next couple months.

laszlocsomor avatar May 16 '18 12:05 laszlocsomor

Dropping priority. Spaces aren't well supported on Linux either. It'd be nice if they were fully supported, but this doesn't seem like a critical feature.

Demo time:

Directory structure:

  $ find .
.
./WORKSPACE
./sub
./sub/a b
./sub/a b/bar
./sub/a b/bar/x.txt
./bin.sh
./BUILD

BUILD file:

sh_test(
    name = "bin",
    srcs = ["bin.sh"],
    data = glob(["sub/**"]),
)

genrule(
    name = "gen",
    outs = ["gen.txt"],
    srcs = glob(["sub/**"]),
    cmd = "( for f in $(SRCS); do echo $$f; done ; ) > $@",
)

bin.sh:

#!/bin/bash
echo "Hello test"
find .
false

Space aren't supported in data dependencies:

  $ bazel build :bin
INFO: Analysed target //:bin (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /tmp/aaaaaaaa.aaa/runfile_spaces/BUILD:1:1: Creating runfiles tree bazel-out/k8-fastbuild/bin/bin.runfiles failed: build-runfiles failed: error executing command 
  (cd /usr/local/google/home/laszlocsomor/.cache/bazel/_bazel_laszlocsomor/b8ecdd60a1b4498304771172bdb140cf/execroot/__main__ && \
  exec env - \
    PATH=/usr/local/google/home/laszlocsomor/bazel-releases/current/bin:/usr/local/google/home/laszlocsomor/bazel-releases/current/bin:/usr/lib/google-golang/bin:/usr/local/buildtools/java/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /usr/local/google/home/laszlocsomor/.cache/bazel/_bazel_laszlocsomor/install/4cfcf40fe067e89c8f5c38e156f8d8ca/_embedded_binaries/build-runfiles bazel-out/k8-fastbuild/bin/bin.runfiles_manifest bazel-out/k8-fastbuild/bin/bin.runfiles): Process exited with status 1: Process exited with status 1
/usr/local/google/home/laszlocsomor/.cache/bazel/_bazel_laszlocsomor/install/4cfcf40fe067e89c8f5c38e156f8d8ca/_embedded_binaries/build-runfiles (args bazel-out/k8-fastbuild/bin/bin.runfiles_manifest bazel-out/k8-fastbuild/bin/bin.runfiles): link or target filename contains space on line 3: '__main__/sub/a b/bar/x.txt /tmp/aaaaaaaa.aaa/runfile_spaces/sub/a b/bar/x.txt'

Target //:bin failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.187s, Critical Path: 0.01s
INFO: 0 processes.
FAILED: Build did NOT complete successfully

Spaces don't play well with genrule's $(SRCS).

  $ bazel build :gen && cat bazel-genfiles/gen.txt
...
INFO: Build completed successfully, 1 total action
sub/a
b/bar/x.txt

laszlocsomor avatar Feb 28 '19 13:02 laszlocsomor

Adding team-Core as it is crosscutting through Skyframe-related things

dslomov avatar Jul 23 '19 09:07 dslomov

This issue is cropping up when I try to use an in-build py_runtime. Assuming I have a way of acquiring a python installation (I'm using conda in my case), it so happens that it will have filenames with spaces in them (for pretty core packages like setuptools).

Here's an example:

BUILD

py_runtime(
    name = "py3_runtime",
    files = ["@conda//:all_files"],
    interpreter = "@conda//:python",
    python_version = "PY3",
)

py_runtime_pair(
    name = "py_runtime_pair",
    py3_runtime = ":py3_runtime",
)

toolchain(
    name = "py_toolchain",
    toolchain = ":py_runtime_pair",
    toolchain_type = "@rules_python//python:toolchain_type",
)

@conda//BUILD

package(default_visibility = ["//visibility:public"])

exports_files(["env"])

filegroup(
    name = "all_files",
    srcs = glob(["env/**"]),
)

filegroup(
    name = "python",
    srcs = ["env/bin/python"],
)

I have a conda-based repository_rule which fetches the conda environment into the env directory of the @conda repository.

OK, now let's run something, we get the following error:

ERROR: /***/tools/analysis/BUILD:25:1: Creating runfiles tree bazel-out/darwin_x86_64-opt/bin/tools/analysis/analysis_test.runfiles failed: build-runfiles failed: error executing command
...
/var/tmp/***/install/8772b695a4a7b498a4dd3eff54505d9c/_embedded_binaries/build-runfiles: link or target filename contains space on line 20923: '***/external/conda/env/lib/python3.7/site-packages/setuptools/script (dev).tmpl /private/var/tmp/***/79803ee528d2751e2a32d6192f74c66d/external/conda/env/lib/python3.7/site-packages/setuptools/script (dev).tmpl'

My workaround at the moment is to exclude filenames with spaces in them in the all_files filegroup, but this means I'm getting some partial python environment.

flaub avatar Aug 30 '19 01:08 flaub

I am running into this problem, and in my case, an easy fix (I think) is to change the runfiles manifest file format to use some character other than space as a separator. I briefly tested this (using TAB as separator) and it seems to be working, but I didn't test in much depth since I wanted to ask here first if such a change would be acceptable. (The main challenge was bootstrapping Bazel with a change to the manifest file format, but I got that to work, too.)

Would it be OK to change the separator in manifest files to another character? If so, what character? A single-byte character other than NUL or \n would be the smallest change.

Keeping a character reserved would remain a limitation, and would be a regression for anyone who depends on that character to work; but in practice, it would be a step forward since spaces in filenames are much more common than TAB or 0xff or whatever separator we choose.

ohler avatar Sep 08 '19 03:09 ohler

We could use a different character than space, and adjust code that reads manifest files to be backwards-compatible, i.e. check for the new character first, then fall back to splitting on space.

laszlocsomor avatar Sep 09 '19 07:09 laszlocsomor

More things rely on the current manifest format than I anticipated:

  • build-runfiles.cc
  • runfiles libraries (@bazel_tools//tools/<lang>/runfiles)
  • Bash runfiles library init code, which is copy-pasted wherever used
  • diff_test rule in Skylib: https://github.com/bazelbuild/bazel-skylib/blob/90ea6feaf33e8ef12fdac9981b2efccc6a25c727/rules/diff_test.bzl#L41 character 61, and https://github.com/bazelbuild/bazel-skylib/blob/90ea6feaf33e8ef12fdac9981b2efccc6a25c727/rules/diff_test.bzl#L49 character 61
  • rules_nodejs stub: https://github.com/bazelbuild/rules_nodejs/blob/efae65c0e83d7d236c6b0f3b62a7a4f6ffc9cbfe/internal/common/windows_utils.bzl#L50, character 93

If we updated the manifest format, the change should be as little disruptive as possible. Preferably not break anything, i.e. code that expects the current format should still work.

I think we could keep space as the separator and use another character only when an entry has space in it. We could then update manifest consumers one by one. Old consumers would still work with new Bazel as long as no file has spaces in them (and so the delimiter is space), but they would break with files names with spaces because they'd split on the first space and not the alternative delimiter, but that's OK because they fail today already anyway.

laszlocsomor avatar Sep 09 '19 09:09 laszlocsomor

Thanks for working on this! That compatibility feature of leaving manifest entries without spaces in the paths unchanged seems like the easiest way to get Bazel bootstrapping to work (when I was testing, I got the impression that, during bootstrapping, Bazel produces manifests with one version and then reads them with another; but I could be mistaken).

An alternative approach that is also compatible would be to replace embedded spaces with the new reserved character. That way, existing code unprepared for the new format would still find the right split points (since the delimiter would remain a space), and merely have the wrong paths. Making sure the top-level structure doesn’t get misinterpreted would make things a little easier to reason about. (A concrete benefit is that error messages from code that doesn’t understand the new format will perhaps be less confusing; although perhaps it’s also obvious enough if paths get truncated at the first space.)

One advantage of introducing a new separator (as in your PR) is that we can eventually switch over to using that exclusively, whereas replacing spaces would remain more complex. However, if we don’t plan to do phase out spaces as separators (and instead want to preserve compatibility indefinitely), I would lean towards replacing spaces.

But also: Have you considered just biting the bullet and replacing spaces with an escape sequence like \40? That gives us an obvious way to allow any character.

ohler avatar Sep 09 '19 18:09 ohler

Thanks for the suggestions!

An alternative approach that is also compatible would be to replace embedded spaces with the new reserved character.

Interesting idea!

  • Pro: consistency, the first space on the line is always the delimiter.
  • Con: must replace space with special character on every manifest line when writing the manifest, and every time we look up paths in the manifest.

Current PR's approach is using the special delimiter only if the link path has space:

  • Pro: no path escaping necessary
  • Con: more complicated file syntax, need to look up paths twice (for the two delimiters)

Your idea seems like the better approach overall, the convincing argument is the single lookup -- I adjusted the PR, let's see if tests pass.

laszlocsomor avatar Sep 10 '19 14:09 laszlocsomor

If anyone's following: I'm working on this bug now.

laszlocsomor avatar Sep 18 '19 07:09 laszlocsomor

Can the "Runfiles: support space in file paths" https://github.com/bazelbuild/bazel/pull/9351 PR be resurrected?

I ran into this again in rule_nodejs. Some npm packages contain files with spaces and these are excluded from runfiles as it is not currently supported by Bazel.

Angular team wants to use puppeteer to manage the chrome binary version for web testing in their repo. This npm package downloads and extracts the chrome binary to node_modules/puppeteer/.local-chromium and some of the files for chrome on OSX contain spaces. For example, node_modules/puppeteer/.local-chromium/mac-706915/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Versions/79.0.3945.0/Chromium Framework. I'm thinking of work-arounds to this but the simplest solution would be for Bazel to support paths with spaces in runfiles.

gregmagolan avatar Jan 28 '20 14:01 gregmagolan

WindowsRunfiles doesn't exist anymore.

meisterT avatar May 12 '20 09:05 meisterT

This is not windows specific. It's happening on linux as well.

cristifalcas avatar May 12 '20 12:05 cristifalcas

The flag --experimental_inprocess_symlink_creation may provide a workaround for this issue. The main caveat is that it can't recover from a file or directory with bad mod bits (compared to the tool-based strategy). I also haven't looked at its performance implications (it may be slower).

Edit: Correct the flag name.

ulfjack avatar May 12 '20 15:05 ulfjack

The flag seems to work well from my experiments with it so far! I'll write about any incompatibilities I find along the way

mrmeku avatar Jun 15 '20 17:06 mrmeku

@gregmagolan I am facing the same issue with Electron on OSX (it uses Chromium under the hood, so it is just the same problem). Did you found a workaround I can take inspiration from?

marmos91 avatar Jun 30 '20 10:06 marmos91

@marmos91 Did you see the last two updates on this bug?

ulfjack avatar Jun 30 '20 12:06 ulfjack

@marmos91 Did you see the last two updates on this bug?

@ulfjack do you mean the flag --experimental_inprocess_symlink_creation? If so, I tried it without luck :(

marmos91 avatar Jul 02 '20 10:07 marmos91

Can you say what happened when you tried it?

ulfjack avatar Jul 02 '20 11:07 ulfjack

I have just run into this writing a sanity test for a windows installer using py_test and a data dependency on the install fileset. In my case the bazel root doesn't contain spaces, but the file set under test does courtesy of Program Files.

An example line:

myapp/Installer/package/program files/mycompany/my.exe C:/dev/_bazel/xxxxxx/execroot/myapp/bazel-out/x64_windows-fastbuild/bin/Installer/package/program files/mycompany/my.exe

I have tried --experimental_inprocess_symlink_creation when I look at the runfiles dictionary via

r = runfiles.Create()
print(r._strategy._runfiles)

I can see the last entry in the manifest split on the first space after program before files. So that option doesn't fix the problem for me. I would see this: {"myapp/Installer/package/program" : "files/mycompany/my.exe"}

I notice that if the workspace root doesn't contain a space, then you always get an odd number of spaces in the line. In which case I wondered about prototyping a fix which splits on the middle space in the line.

carpenterjc avatar Dec 02 '20 09:12 carpenterjc

Patching runfiles.py:179 with the following code works for me.

  @staticmethod
  def _LoadRunfiles(path):
    """Loads the runfiles manifest."""
    result = {}
    with open(path, "r") as f:
      for line in f:
        line = line.strip()
        if line:
          tokens = line.split(" ")
          if len(tokens) == 1:
            result[line] = line
          else:
            result[" ".join(tokens[0:int(len(tokens)/2)])] = " ".join(tokens[int(len(tokens)/2):])
    return result

Same algorithm should work across many languages.

carpenterjc avatar Dec 02 '20 10:12 carpenterjc

The latest version of Boost (1.75) contains a header with a space in it, which is how I was bitten by this issue. boostorg/serialization#221

HackAttack avatar Feb 09 '21 04:02 HackAttack

How is the progress on this issue? This is the only thing preventing me from considering bazel on Windows.

songyang-dev avatar Apr 22 '21 21:04 songyang-dev

Disclaimer: Just some hint/trick that might be useful as a workaround for some folks. This still needs to be fixed.

I just wanted to post what we have been doing in the majority of Bazel projects for rules_webtesting compatibility (where the Chromium darwin binaries contain a space). In short: The runfile space error only seems to occur when the runfile build links are being created. With the --nobuild_runfile_trees flag the runfile tree creation is deferred until actually needed (this had CI cache improvements with RBE for us as well).

The flag solves the problem nicely (for us at least) because the runfile trees are not actually created on Darwin where the sandbox strategy does not rely on runfile build links at all (more details on this here: https://github.com/bazelbuild/bazel/commit/03246077f948f2790a83520e7dccc2625650e6df)

# Do not build runfile forests by default. If an execution strategy relies on runfile
# forests, the forest is created on-demand. See: https://github.com/bazelbuild/bazel/issues/6627
# and https://github.com/bazelbuild/bazel/commit/03246077f948f2790a83520e7dccc2625650e6df
# Note: This also works around an issue where Bazel does not support spaces in runfiles. The
# Chromium app files for Darwin contain files with spaces and this would break. For darwin though,
# the sandbox strategy is used anyway and runfile forests are not needed.
# Related Bazel bug: https://github.com/bazelbuild/bazel/issues/4327.
build --nobuild_runfile_links

devversion avatar Sep 17 '21 22:09 devversion

I am still running into this in the very latest Basel 6.0.0 pre-release :(

WARNING: Output user root "C:/Users/Max Gerhardt/_bazel_Max Gerhardt" contains a space. This will probably break the build. You should set a different --output_user_root.
ERROR: bazel does not currently work properly from paths containing spaces (C:/users/max gerhardt/downloads/examples-main/cpp-tutorial/stage1).
Sync finished
Sync failed

My Windows username is dictated by my company, I can't change it..

maxgerhardt avatar Oct 29 '21 15:10 maxgerhardt

I am still running into this in the very latest Basel 6.0.0 pre-release :(

Hey @maxgerhardt , can you try the --output_user_root flag? Example:

bazel --output_user_root=c:/b build //foo:bar

laszlocsomor avatar Nov 03 '21 07:11 laszlocsomor

One more thing @maxgerhardt , I think this bug is more relevant: https://github.com/bazelbuild/bazel/issues/5273

Is that correct?

laszlocsomor avatar Nov 03 '21 08:11 laszlocsomor

I've just run into this issue whilst trying to use rules_webtesting. Unfortunately, neither --nobuild_runfile_links nor --experimental_inprocess_symlink_creation works (this is running on MacOS) and both generate errors like the one below (i.e. Failed to create runfiles symlinks). One other thing I noticed is that there is a difference in behaviour between bazel test and bazel run. For example, on rules_webtesting where is set:

$ bazel run --nobuild_runfile_links //javatests/com/google/testing/web:WebTestTest_chromium-local
ERROR: Error creating runfiles: Failed to create runfiles symlinks: build-runfiles failed: error executing command
  (cd /private/var/tmp/_bazel_tom/f90c16530db64ba46a3872bcb8b04839/execroot/io_bazel_rules_webtesting && \
  exec env - \
  /var/tmp/_bazel_tom/install/71ed47cad951a20fff87381f54639763/build-runfiles bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles_manifest bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles): Process exited with status 1
/var/tmp/_bazel_tom/install/71ed47cad951a20fff87381f54639763/build-runfiles (args bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles_manifest bazel-out/darwin_arm64-fastbuild/bin/javatests/com/google/testing/web/WebTestTest_chromium-local.sh.runfiles): link or target filename contains space on line 132: 'io_bazel_rules_webtesting/external/org_chromium_chromium_macos_arm64/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Chromium Framework /private/var/tmp/_bazel_tom/f90c16530db64ba46a3872bcb8b04839/external/org_chromium_chromium_macos_arm64/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Chromium Framework'

whereas:

bazel test --nobuild_runfile_links //javatests/com/google/testing/web:WebTestTest_chromium-local

works fine.

Are there any other workarounds for this issue?

tomgr avatar Nov 04 '21 11:11 tomgr