layer: clarify attributes for implied directories
The image specification currently does not describe how conformant implementations should handle the case of a layer that contains "implied directories" -- entries that imply parent directories exist through their path, without those parent directories having their own entires in the archive.
As such, this behavior is currently implementation-defined and may not be consistent, even in the same implementation (e.g. moby/moby#44106).
To resolve this, we explicitly define what behavior is expected in this situation, selecting 'neutral' attributes (e.g. using the container USER's UID/GID, and using 0755 for mode, as derived from the default umask(2) of 0022).
It has been observed that it may be desirable to allow (MAY) conformant implementations to instead elect to not handle this case, and to fail with a useful error message instead. I lack the insight into the wider ecosystem to consider if this is a useful behavior, but it provides another aspect to consider when discussing this change.
Fixes #737
cc @vasiliy-ul @thaJeztah
Does this mean / would be chowned to the user, with 755 permissions? Are there implementations that do this already?
/ cannot be present as all paths in a tar archive are relative. Implementations should handle that already, though I would not be opposed to an explicit Note: that calls that out.
Hmm, I suppose I misspoke, as . will serve the same purpose even if it's not a literal /. Moby has always ignored the root directory when creating parent directories, and the new implementation that is driving this PR makes it a bit more explicit:
https://github.com/moby/moby/blob/b9921a5560e22a2744bd7f51cca0475bbe20162e/pkg/archive/archive.go#L1154-L1173
When I first saw this, I was thinking of the COPY --link feature in buildkit, where you don't necessarily know the parent folder state, but inferring it is probably a worse option (e.g. creating a user folder in /home shouldn't change the permission of /home itself).
Since we are saying layer creators SHOULD include the parent folders, and this is just an exception case, then this LGTM. I've added it to tomorrow's meeting agenda to get some more eyes on it.
Capturing from the call -
- Create an issue on runtime spec to get inputs from runtime authors
- Consider how layers shared across behaves with this proposal of implicit directories. @cpuguy83 -
Possibly need to look at the parent dir to get perms Maybe with consideration for SOURCE_DATE_EPOCH or other such configurability.
I'm still mulling over what changes to this PR should look like, but as I experiment and consider, others might benefit from links to existing implementations:
- podman/cri-o: https://github.com/containers/storage/blob/4a4d8f88a707a3118c1fa61672772a428ac7e460/pkg/archive/diff.go#L79-L93
- containerd: https://github.com/containerd/containerd/blob/bb0c3804c637d640a029a8979916d97f8dcc5a8e/archive/tar.go#L425-L483 https://github.com/containerd/containerd/blob/bb0c3804c637d640a029a8979916d97f8dcc5a8e/archive/tar_unix.go#L143-L175
- moby: https://github.com/moby/moby/blob/b9921a5560e22a2744bd7f51cca0475bbe20162e/pkg/archive/archive.go#L1154-L1173
To rephrase the shared layer concern: layers aren't attached to a single image, they can be included in multiple images. So when runtimes share layers, the filesystem permissions and ownership can vary based on the order the images are pulled.
The fix is to pick settings not based on the image, either static values (root), or imply values based on the first entry in the tar file that creates the implied directories.
umoci uses 0755 and the root UIDs, btw. We also set the ctime and mtime to the Unix epoch (to have at least some hope of producing a reproducible container) but I think no-one else does that. I don't think we should say that implementations shouldn't modify / -- a lot of container images don't have a / (or . if you prefer) entry and so you can end up with irreproducible container images if you don't allow container image tools to set / (or any other implied directory) to some fixed value.
I've spent a bit of time experimenting over the last few days, and come to some conclusions:
- As nice as supporting
SOURCE_DATE_EPOCHsounds, there's not really a way to interpret it that makes sense given the process model of existing implementations; it also has the potential to not be deterministic across runtime instance restarts. - Containerd currently recurses looking to copy metadata back down the tree when it is missing; this is rather complex, and hard to specify, and doesn't seem to offer many advantages when we are trying to define an edge case image authors should avoid.
- Picking the most neutral metadata possible (e.g.
root:root, with the Unix epoch time) seems to be the most reasonable across the board.
I've pushed up a new version that adjusts things to root:root and the Unix epoch; however I still have not considered atime and ctime as they are not mentioned elsewhere in the existing spec. Does anyone think we should consider those as well?
I'd like to hear more about the motivation behind these specific recommendations/requirements. A user might naively expect parent directories to be created with the same ownership and permissions as the file being created, and I can't immediately think of an obvious reason to not do that. A different expectation might be that the ownership percolates down from the parent of the first implied directory.
i.e. If you're creating A/B/C/D and A/B already exists then the ownership and permissions of D will be specified but C is the focus of this PR. This proposal dictates specific attributes for C. I might also suggest that C's attributes could be copied from B or from D.
however I still have not considered atime and ctime as they are not mentioned elsewhere in the existing spec. Does anyone think we should consider those as well?
atime is not practical to set on Linux (there's no interface to change it directly, you can only change the system time -- possible in a time namespace -- and hope when you touch the file it matches the time you wanted) so we should ignore it. I think ctime should be set to the same as mtime.
(I wrote this draft message last year and it got lost in my tabs. I only just got around to posting it. Sorry...)
@sparr At the moment the behaviour is completely unspecified so different image-spec implementations can produce different results. If you feel that your suggested algorithm makes more sense, then you should still want some text in the spec to describe the required behaviour.
As for your suggestion, there are a few issues that would need to be considered:
- What about xattrs and other file metadata? Should only the owner and mode be copied? What about file attributes? ACLs?
- What if
A/B/C/Dis a file, with the parent being an implied directory? Using the file mode doesn't make sense, so we have to pick a default for this case anyway. The trickle-down suggestion doesn't have this problem, but it could cause security issues if the top directory had the sticky bit set and you ended up setting the sticky bit for every implied directory. - Looking at precedent, GNU tar creates implied directories as the current user and sets the mode of directories to
0755(though I suspect this is affected by umask). In other words, for root users the behaviour matches this proposal (and the way umoci does it, btw).
My personal view is that there needs to just be a simple fallback behaviour for this edge-case -- the suggestions of copying attributes up or down just adds more complexity for something that basically no image is going to run into. No image generation tool that I'm aware of generates tar archives with implied directories (arguably it would be perfectly correct for an image-spec tool today to error out if you had an archive with implied directories).
In addition, a new MUST change like this should take into account what existing tools do -- AFAIK most tools either handle this in the "naive" way (without noticing this might be an issue, they just make implied directories) or they use similar defaults to the ones described in this PR.
Friendly ping @neersighted -- I think changing that one MUST to SHOULD is the only feedback preventing this from moving :eyes: :heart: