Support for squashing multiple input directories in pkg_files
If you have custom rules that generate a directory with ctx.actions.declare_directory, you cannot squash them using pkg_files. By squashing I mean removing the top-level directory specified by ctx.actions.declare_directory. This works with pkg_tar but not with the new pkg_files.
strip_prefix fails immediately with an error saying it is not supported for TreeArtifacts, and renames does not support multiple sources mapping to the same destination path. That restriction makes sense when you want to guarantee nothing is overwritten, but it would be great to have an option to allow this behavior.
If you don't need multiple directories squashed into a single place, renames should work fine (e.g. the options in the commit message of 760536837d5770f23e8520a9aa778d997712561f).
If you do need to squash multiple directories into the same location, that support is indeed absent. To clarify, is that what you're suggesting?
Yes I mean exactly that, "squash multiple directories into the same location" (sorry for not being clear). We have a lot of these things in our final "release" packaging stage. REMOVE_BASE_DIRECTORY works but can only be used once (for directories).
I've found a hack to squash two directories: set one rename value to REMOVE_BASE_DIRECTORY and the other to "./". Since they differ as paths but resolve to the same location, it works. But this is really ugly, won't work if you have more than two directories to squash, and is not something I want to use since it is really hard to understand.
Gotcha. Thanks for the additional information.
There are two pieces to implementing this:
- A change to pkg_files to allow for collisions with TreeArtifacts at analysis time.
- A change to the package manifest backend to detect collisions at packaging time. Perhaps this should be made into a library?
Agree, that would be preferred.
If the second point is hard, a simple "allow_directory_overwrites" flag on pkg_files could suffice as well; you could toggle it when you want to squash multiple directories into the same destination. This flag would mean "I know the files underneath directories are unknown to Bazel and squashing such directories might result in undetected overwrites, but I don't care."
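To make the flag's semantics concrete, here is a rough Python model of the analysis-time check. It is not rules_pkg code: Source, check_destinations, and the shape of the data are all hypothetical, and only the flag name comes from the suggestion above. The assumption is that analysis knows each input's destination and whether it is a tree artifact, but not the files inside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical stand-in for one pkg_files input."""
    label: str
    dest: str
    is_directory: bool  # True for a declare_directory output (tree artifact)


def check_destinations(sources, allow_directory_overwrites=False):
    """Group sources by destination, failing on disallowed collisions.

    A collision is tolerated only when the flag is set and every source
    claiming the destination is a directory.
    """
    by_dest = {}
    for src in sources:
        by_dest.setdefault(src.dest, []).append(src)

    for dest, group in by_dest.items():
        if len(group) == 1:
            continue
        all_dirs = all(s.is_directory for s in group)
        if not (allow_directory_overwrites and all_dirs):
            labels = ", ".join(s.label for s in group)
            raise ValueError(f"destination {dest!r} is claimed by: {labels}")
    return by_dest
```

With the flag off this behaves like today's strict check. With it on, only directory-on-directory collisions pass, and any overwrite among the files inside them goes undetected, which is exactly the trade-off the flag asks the user to accept.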
This is a slight blocker for us moving from bazel_tools's pkg_tar to the rules_pkg version: we were relying on remap_paths to squash several input tree artifacts into a base directory.
I say slight blocker because it looks like the workaround given in https://github.com/bazelbuild/rules_pkg/issues/450#issuecomment-951993171 is still possible, but that's definitely not behavior we want to assume will always work.
@nacl, @aiuto: are we in agreement that the suggested changes to allow collisions are the best path forward? I'm open to adding tests and attempting a fix for this behavior.
I'm not sure what the suggested changes are. I think the summary is
- add "allow_directory_overwrites" in pkg_*
- if two base directories map to the same place, still allow it.
- if you get an accidental file overwrite, silently allow it.
I want to propose something different.
- allow multiple tree artifacts to land at the same spot
- detect collisions as we do (or should do) today
- if the files (or directories), including attributes are exact matches, silently do the right thing
- if they differ, fail.
- disjoint tree artifacts just work.
I think that would be a behavior that just defaults to doing the right thing for the widest set of valid inputs.
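As a rough illustration of that proposal, here is a Python sketch of a packaging-time merge. Nothing here is the actual manifest backend: Entry, entry_for, and merge_trees are hypothetical names, and the sketch assumes each expanded tree artifact can be represented as a mapping from destination path to content digest plus mode.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical manifest entry: content digest plus file mode."""
    digest: str
    mode: int


def entry_for(content: bytes, mode: int = 0o644) -> Entry:
    return Entry(hashlib.sha256(content).hexdigest(), mode)


def merge_trees(trees):
    """Merge {dest_path: Entry} mappings that land at the same spot.

    Exact duplicates (same content and attributes) are accepted silently;
    any other collision is an error. Disjoint trees merge trivially.
    """
    merged = {}
    for tree in trees:
        for path, entry in tree.items():
            existing = merged.get(path)
            if existing is None:
                merged[path] = entry
            elif existing != entry:
                raise ValueError(f"conflicting entries for {path!r}")
            # Otherwise the entries are identical and nothing needs to change.
    return merged
```

Because the comparison covers the mode as well as the digest, two files with the same bytes but different permissions still count as a conflict, in line with the "including attributes" condition.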