bazel-buildfarm

Cache Directory Has No Write Privilege

Open lixin-wei opened this issue 3 years ago • 5 comments

My build rule needs to create some files by running custom scripts. This works fine when I build locally.

But when I use buildfarm to build remotely, I get a permission error:

ERROR: /home/admin/.cache/bazel/_bazel_admin/65c94b62544ddd3fd6adce5fe1096d8b/external/com_github_antirez_redis/BUILD.bazel:9:8: Executing genrule @com_github_antirez_redis//:bin failed (Exit 2): bash failed: error executing command
  (cd /home/admin/.cache/bazel/_bazel_admin/65c94b62544ddd3fd6adce5fe1096d8b/execroot/com_github_ray_project_ray && \
  exec env - \
    PATH=.....\
    PYENV_SHELL=bash \
  /bin/bash -c '{....some scripts..... too long so I delete them}')
Execution platform: @local_config_platform//:host
./mkreleasehdr.sh: line 11: touch: cannot touch ‘release.h’: Permission denied

As the log above shows, the action runs mkreleasehdr.sh, which does not have permission to create a file.

After some debugging, I found that the cache directory on the remote machine (/tmp/worker/cache in my case) contains entries that are not writable by any user.

[admin@ray ray.buildfarm-0.inc.alipay.net /tmp]
$ find ./ -name 'mkreleasehdr.sh'
./worker/cache/c798dd35a15dcb6ecc3debfd4edd59dc89fe695997e12e2db66d9789ce20517d_dir/com_github_antirez_redis/src/mkreleasehdr.sh

[admin@ray ray.buildfarm-0.inc.alipay.net /tmp]
$ ls -lh ./worker/cache/ | grep c798dd35a15dcb6ecc3debfd
dr-xr-xr-x   4 admin admin 4.0K Oct 21 23:00 c798dd35a15dcb6ecc3debfd4edd59dc89fe695997e12e2db66d9789ce20517d_dir
-rw-r--r--   1 admin admin  57K Oct 21 23:00 c798dd35a15dcb6ecc3debfd4edd59dc89fe695997e12e2db66d9789ce20517d_dir_inputs

In my case, the script is /tmp/worker/cache/c798dd35a15dcb6ecc3debfd4edd59dc89fe695997e12e2db66d9789ce20517d_dir/com_github_antirez_redis/src/mkreleasehdr.sh, and it creates files in its working directory: /tmp/worker/cache/c798dd35a15dcb6ecc3debfd4edd59dc89fe695997e12e2db66d9789ce20517d_dir/com_github_antirez_redis/src/

So how can I configure the permissions of these cache files? Could anyone help?

lixin-wei avatar Oct 21 '21 15:10 lixin-wei

Could anyone have a look?

lixin-wei avatar Oct 22 '21 15:10 lixin-wei

If your action were to declare the input directory path (presumably external/com_github_antirez_redis/src) as the location of the output file that it creates named release.h, that would make it writable.

However, that script seems invasive, and the intent - to create a header containing the current release, and one that would be wholly non-deterministic (since it includes a date) - could probably be easily substituted, perhaps with a separate genrule. Since redis is not bazel-ready, I'm not sure where you're getting your build definitions from, but this may require some manual intervention to create this file as something more congenial to a bazel build.

werkt avatar Oct 23 '21 03:10 werkt

@werkt Thank you for your reply! But this action can be executed successfully in local mode.

I'm wondering why the remote worker disables the write access to the cache folder, while the local worker doesn't?

Is there any way to configure the privilege of the cache file of the remote worker, so I can work around the problem? The build rule of Redis is too complex, I don't want to dive into it...

lixin-wei avatar Oct 23 '21 06:10 lixin-wei

What bazel allows or doesn't allow to be written locally is mostly not up to me; that is a client detail.

The write privilege on the directory in question is removed because of the nature of input directory reuse - we disallow writes because we may use those directories repeatedly across multiple actions, possibly concurrently. If you write into those directories, you will be competing with concurrent actions and affecting subsequent actions, and the directories' contents will no longer reflect that digest which was used to refer to them, since their names are, like other content entries, defined exclusively by their contents.
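
To illustrate the point about names being defined by contents, here is a minimal shell sketch. It is not buildfarm's code: `dir_digest` is a hypothetical helper, and `cksum` only stands in for a real content digest. It shows that once an action adds release.h to an input directory, the value derived from the directory's contents no longer matches the one computed before the write.

```shell
#!/bin/sh
# Hypothetical helper: summarize a directory by its file names and contents.
# cksum (CRC plus byte count) is only a stand-in for a real content digest.
dir_digest() {
  (
    cd "$1" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      printf '%s\n' "$f"
      cat "$f"
    done
  ) | cksum
}

work=$(mktemp -d)
mkdir "$work/input_dir"
printf 'echo generating header\n' > "$work/input_dir/mkreleasehdr.sh"

before=$(dir_digest "$work/input_dir")

# An action that writes into the shared input directory changes its contents,
# so a name derived from the old contents no longer describes it.
: > "$work/input_dir/release.h"
after=$(dir_digest "$work/input_dir")

if [ "$before" != "$after" ]; then
  echo "directory contents no longer match the original digest"
fi
```

Any other action that is handed the same directory would then see a file that its declared inputs never contained.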

You can disable this directory reuse on a worker by disabling link_input_directories, but this is a global option that has major performance effects across all actions, whether well behaved or not, and it is no guarantee that the directory will be writable from the outset. In this instance you could simply do a chmod on the directory.
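
The chmod step mentioned above could look roughly like the following sketch. The temporary directory is a stand-in for the real input directory, and mode 555 mimics the dr-xr-xr-x listing from the worker cache; in practice the `chmod u+w` line would have to run inside the action's own command, before the script. The concurrency and digest concerns described earlier still apply to any directory that is shared between actions.

```shell
#!/bin/sh
work=$(mktemp -d)
src="$work/com_github_antirez_redis/src"   # stand-in for the input directory
mkdir -p "$src"
chmod 555 "$src"      # mimic the dr-xr-xr-x mode from the worker cache listing

# What the action's command could do before running the script:
chmod u+w "$src"      # give the owner write permission back
touch "$src/release.h"

mode=$(ls -ld "$src" | cut -c1-10)
echo "$mode"
```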

You could in this instance simply not execute this action remotely, instead using tags = ["local"] on the rule definition.

Despite not wanting to dive into this, I believe the utility of this particular script is minimal: bazel does not base its action invalidation on timestamps, so the intended effect of touching these header files and the source file referenced in the script is moot. You should use another mechanism, possibly another genrule, to generate the header contents. Are the bazel build definitions for your redis repository available publicly?
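
As a rough sketch of what such a replacement command could run, the following writes a header with fixed values instead of a host name and date. The macro names are an assumption based on the Redis sources and are not taken from this thread, the values are placeholders, and `write_release_header` is a hypothetical helper. In a genrule, the path would be the declared output rather than a temporary directory.

```shell
#!/bin/sh
# Write a release.h with fixed values so the output is deterministic.
# Macro names are assumed from the Redis sources; values are placeholders.
write_release_header() {
  cat > "$1" <<'EOF'
#define REDIS_GIT_SHA1 "00000000"
#define REDIS_GIT_DIRTY "0"
#define REDIS_BUILD_ID "bazel"
EOF
}

out_dir=$(mktemp -d)   # stand-in for the genrule's declared output location
write_release_header "$out_dir/release.h"

line_count=$(grep -c '^#define REDIS_' "$out_dir/release.h")
echo "wrote $line_count macros to $out_dir/release.h"
```

Because the file is a declared output with constant contents, nothing needs to be written into the input directory and the action stays cacheable.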

werkt avatar Oct 24 '21 04:10 werkt

> You could in this instance simply not execute this action remotely, instead using tags = ["local"] on the rule definition.

Oh, that's a good tag. I can work around the problem with this, thanks!

> If you write into those directories, you will be competing with concurrent actions and affecting subsequent actions

How about allowing writes until the action completes, and removing write access once it has finished?

> Are the bazel build definitions for your redis repository available publicly?

Yeah, it's here: https://github.com/ray-project/ray/blob/master/bazel/BUILD.redis#L9
The Bazel HTTP archive is here: https://github.com/ray-project/ray/blob/master/bazel/ray_deps_setup.bzl#L95

lixin-wei avatar Oct 25 '21 05:10 lixin-wei