rules_pkg
rules_pkg copied to clipboard
document that directories are flattened unless data_path, strip_prefix are set to "."
In bazel 0.4.1 (and I think 0.3.2), if you give docker_build a filegroup with a glob that contains multiple directories of files, those files will be flattened into one directory.
Here's the repo I made for the reproduction. It needed a bit of set up. (You'll need xz
installed to get the wheezy rootfs file unpacked.)
If you run bazel run //subdir:buggy_flatten_image
, I expected a /foo/quux
and a /bar/quux
file to exist in the created image but instead only a /quux
exists (verifiable with a docker run -it bazel/subdir:buggy_flatten_image
). A warning about duplicate filepaths is given but not error occurs.
The addition of data_path = "."
as in the //subdir:has_subdirs_image
target fixes this and that's very surprising.
I think it's inheriting some behavior from pkg_tar's strip_prefix maybe, but it's definitely not what I expected and finding the solution wasn't easy, either. This also makes it difficult to merge together multiple filegroups with subdirectories into one image because you have to give each of them their own docker_build (or, perhaps, pkg_tar) in order to keep them from ending up at /
.
Seems like this, or something similar, was addressed in bazelbuild/bazel#677 / https://github.com/bazelbuild/bazel/commit/0e222581e674b0f7c186c4bc40be2af680e94643 but maybe something has happened in between then and now.
(I noticed this when a build was throwing up warnings, but not errors, of duplicate filepaths being included. Maybe that should be another ticket?)
Environment info
- Operating System:
macOS 10.12 Sierra
- Bazel version (output of
bazel info release
):
release 0.4.1-homebrew
This look like a bug to me. Normally default should be "." so not strip the path.
Yeah, this same flattening-every-dir problem seems to be affecting pkg_tar. I've just added some targets that demonstrate it in pkg_tar, too.
I also added another file at bar/okay/welp
so that you can see that the welp is being dropped at the very top of the fs unless strip_prefix = "."
is included.
Changing pkg_tar to use strip_prefix = "."
by default would mean that filegroups with srcs like glob("somedir/*")
would start getting somedir
included in them. That might be too much of an API change? Not sure.
There might be a middle ground we need to hit where the default doesn't flatten all of the directories, but also doesn't include the dirs that the filegroup was made by globbing over. That'd get us closer to what people might expect, but not being able to rewrite filegroups target directories easily when creating a pkg_tar or docker_build from multiple targets would still be a bummer.
Lmk what you think.
This sounds reasonable to me. Can you ask the question on bazel-discuss@ to see what other thinks?
Wrapping it with a macro for people might be an acceptable workaround for people that complains.
Done! https://groups.google.com/forum/#!topic/bazel-discuss/LHjopuUvZks
Nobody bit on that thread! Seems like it's up to y'all.
Ok thanks, this will get done next quarter. Happy holidays!
I know I'm late to the discussion here. However, I just ran into this problem, so I thought I'd cast a vote in the event that it's not too late.
Personally, I'd prefer to see the default behavior be to preserve the directory structure below the BUILD file containing the filegroup rule, then allow options like strip_prefix
or a hypothetical flatten
to modify the paths from there (probably on the filegroup rather than on the docker rule, although maybe I'm overlooking some use case...). Following the original directory structure preserves the most information, while allowing the options provides the necessary flexibility. If the behavior change is well documented in release notes and is easy to reverse by simply applying flatten
, hopefully it won't cause too much of a problem for people.
@damienmg FWIW +1 to making the default preserve directory structure. I just got bit by this playing with some samples and it probably would have burned way more time if I didn't have a feint recollection of this issue from triaging docker_build
issues.
Just got hit by this again. Any updates?
docker_build should be fixed IIUC
Still happening at container_image
I'm no longer seeing this behavior in container_image
, at least in the cases I've looked at. I'm running 3caddbe7f75fde6afb2e2c63654b5bbeeeedf2ac of rules_docker.
(In fact, using my data_path = "."
now does a weird thing by including an extra subdirectory naned something like amd64_stripped
for Go binaries)
Oh, the data_path = "."
is doing weird things with Go binaries but the data_path
still needed on any container_image
that that a container_image
that includes a go_binary
is using as its base
.
So now we're in a really weird spot where you have to split up your container_image
calls in order to get data_path = "."
to do the workaround stuff without being too weird.
IIUC that change was something that bit go_image
too, the rules_go
changes something about their output paths in a breaking way. I don't believe that change had anything to do with data_path
.
Having been bitten by this strange default behavior multiple times myself, I'm completely on-board that this behavior is bizarre and not what users expect. :)
We should probably add a flag to restore the behavior as it exists today, in case folks are broken by it. It would be useful to have a Googler to shepherd this into the mono-repo (I don't have time right now).
Having to add strip_prefix = "."
to pkg_tar
to preserve the directory structure is really weird.
These rules, pkg_tar
and container_layer
, are pretty much broken. strip_prefix
/data_path
doesn't seem to work as advertised at all. We have two repos which are select
ed based on a config_setting
. We want to be able to preserve the directory structure of a filegroup
we are using within each repo. Let's say we have a filegroup
named tools
in each repo, and our repos are named local
and prebuilt
.
If we add the strip_prefix
/data_path
to preserve the directory structure we end up with the root of the repo in the member names of the tar file.
./../prebuilt/tools/some_script.sh
or
./../local/tools/some_script.sh
It also ignores the package_dir
attribute if strip_prefix
is set.
We even tried setting the strip_prefix
based on which repo we are selecting, but it doesn't matter. The layer gets unpacked with the repo name as the top level folder in the container we are using. It doesn't matter if the container_image
was created with container_layer
s or pkg_tar
s.
In case anyone else runs into this, I just wasted a few hrs figuring out that:
The strip_prefix="." only works if the pkg_tar target is in the top level BUILD file. If it is down a couple levels it does not work and you get a flattened tree.
You can see that the helper function _short_path_dirname special cases top level targets
From docs in rules_docker (https://github.com/bazelbuild/rules_docker#container_image)
Hint: if you want to put files in specific directories inside the image use pkg_tar
rule to create the desired directory structure and pass that to container_image via tars attribute. Note you might need to set strip_prefix = "."
or strip_prefix = "{some directory}"
in your rule for the files to not be flattened. See https://github.com/bazelbuild/bazel/issues/2176 and https://github.com/bazelbuild/rules_docker/issues/317 for more details.
@shettich , can you try with strip_prefix = "{some directory}"
and see if that works for you?
If not please let us know so that we at least can improve documentation.
I was not building a container image, just a tar.
You can see in the code that the prefix to strip off is calculated by compute_data_path() based on the output path and the strip_prefix your provide. If you provide strip_prefix="." it will use _short_path_dirname() on the output file as the prefix. That function special cases top level target paths, which is why the strip_prefix trick works for some targets but not all.
I agree with @cgoy-nvidia that pkg_tar is pretty much broken. We need a new design for how fileset's and/or other targets are combined into a tar/zip/deb/rpm, ... file. Moving this to rules_pkg so we can work on it there.
Having to add strip_prefix = "." to pkg_tar to preserve the directory structure is really weird
... it's still pretty weird in 1.0; might a PR be accepted to note this on the pkg_tar doc page, between now and when pkg_tar gets revamped?
Just ran into this - in my case I was using pkg_tar with a filegroup in a external repo. In this case I had to set strip_prefix = "./../
As the original reporter who just got bit by this again today, it would be lovely to see a fix. What direction do y'all want to take it in?
@ncal and I are tag teaming work on sensible replacements for strip prefix. #273 is the latest bit in that.
Oh, wonderful! Thank you so much for the update!
Related to #354