Optimize `copy_file`
ctx.actions.symlinkcan be used on all platforms and falls back to a copy if a symlink is unsupported. It is also heavily optimized, avoiding the need to hash the output file again.- Bazel never guarantees that an input to an action is staged as a non-symlink, so whether an output is a symlink or a hard copy only matters for top-level outputs consumed outside Bazel and handled by tools that don't follow symlinks by default, which should be extremely rare and could be worked around by explicitly setting
allow_symlinktoFalsein user-controlled code. - For file copy actions that do not go through
ctx.actions.symlink, caching is extremely cheap since the CAS entry of the input will be reused as the CAS entry of the output. Allowing remote execution and caching enables BwoB for the copy, which can avoid downloads of both the input and the output file.
The same changes are not applied to copy_directory as source directories are not officially supported by Bazel and any kind of change could cause subtle incorrectness.
Fixes #562
@ulfjack @tjgq Could you review this?
FYI @kormide, I think that most of the exec requirements are actually harmful for copy_file. Do you see any downside to dropping them as long as they are kept on copy_directory?
I wonder if it would be helpful to have a mechanism to pass in execution_requirements manually?
I wonder if it would be helpful to have a mechanism to pass in
execution_requirementsmanually?
You mean on a per-target level (per-mnemonic level is possible with --modify_execution_info)? I think that exec_properties can be used for this (with empty values).
So the idea of this PR is to
- use
ctx.actions.symlinkby default - effectively revert https://github.com/bazelbuild/bazel-skylib/pull/372 for the case of copy_file?
Seems reasonable.