
Any chance of changing the whiteout file approach?

Open timthelion opened this issue 8 years ago • 46 comments

It seems unfortunate that a new standard should use a method which unnecessarily limits the standard. With the .wh file approach, base images can no longer contain arbitrary data. This means, for example, that you cannot have an image with example OCI image-spec data stored in it. Is there any possibility of changing this to use, for example, a whiteout list instead, so that the files that make up a layer would be:

VERSION
layer.tar
whiteouts
json

That would mean that truly arbitrary data could be stored in the images, which would be really nice :)
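
(For illustration only: such a whiteouts file could simply list the layer-relative paths to delete, one per line. The paths below are invented, not part of any proposal in this thread.)

etc/old.conf
usr/lib/libfoo.so.1
var/cache/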

timthelion avatar Apr 14 '16 21:04 timthelion

To be clear, you are saying that the .wh approach is limited? And the whiteout file is preferred?

If so, I agree.


vbatts avatar Apr 14 '16 21:04 vbatts

Creating a list of files to be whited out, and keeping that outside of layer.tar is better than putting .wh files in layer.tar. We wouldn't want a situation like this: http://git-annex.branchable.com/forum/Storing_git_repos_in_git-annex/

timthelion avatar Apr 14 '16 21:04 timthelion

I wholly agree and intend to see it done as a whiteout file list.


vbatts avatar Apr 14 '16 21:04 vbatts

I think this is something we should think about for post v1.0.0. I wholeheartedly agree, but in the ideal case the initial v1 serialization spec is binary compatible with the existing Docker serialization, to save the sanity of all of the existing registry folks like acr, gcr.io, Quay, Hub, etc.

philips avatar Apr 14 '16 22:04 philips

Would it be possible to use the whiteout list file IFF a whiteout list file exists, and otherwise use the .wh approach?

timthelion avatar Apr 14 '16 22:04 timthelion

@timthelion Yes, that would be the right way of approaching it. It would be a schema bump which would be a version break. Happy to consider adding this feature to fix the issue but I do want to hold off until we get post v1.0.0 (in a couple of months).

philips avatar Apr 15 '16 00:04 philips

@philips so basically, you want to have the 1.0 release be supported by Docker/CoreOS/whatever, without actually having to change Docker/CoreOS/whatever's code? Basically, there will be no actual technical changes to the spec before 1.0?

timthelion avatar Apr 15 '16 08:04 timthelion

@timthelion No breaking technical changes.

cyphar avatar Jul 20 '16 20:07 cyphar

On Thu, Apr 14, 2016 at 02:26:16PM -0700, Timothy Hobbs wrote:

Creating a list of files to be whited out, and keeping that outside of layer.tar is better than putting .wh files in layer.tar.

If we don't mind picking up a non-standard tar entry, star (Schilling's tar) uses pax extension headers and defines SCHILY.filetype with (among other things) a "whiteout" value representing a BSD whiteout directory entry 1. And as far as I can tell, that's the same sort of whiteout we're interested in. So if we want a way to represent whiteouts without leaving tar or restricting the legal filename space, that's probably a good choice.
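
(For illustration, a minimal sketch of what emitting such an entry could look like with Go's archive/tar, which exposes pax records through Header.PAXRecords. The path name is invented and none of this is spec text.)

package main

import (
    "archive/tar"
    "log"
    "os"
)

func main() {
    tw := tar.NewWriter(os.Stdout)
    defer tw.Close()

    // A zero-length entry whose pax extended header carries star's
    // SCHILY.filetype=whiteout keyword. The entry name (invented here)
    // stays an ordinary, unrestricted path; the whiteout marker lives in
    // metadata rather than in the name.
    hdr := &tar.Header{
        Name:     "etc/old.conf",
        Typeflag: tar.TypeReg,
        Mode:     0644,
        Size:     0,
        Format:   tar.FormatPAX,
        PAXRecords: map[string]string{
            "SCHILY.filetype": "whiteout",
        },
    }
    if err := tw.WriteHeader(hdr); err != nil {
        log.Fatal(err)
    }
}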

wking avatar Sep 29 '16 03:09 wking

Now that v1.0.0-rc1 is out, it's the last chance to consider this before v1.0.0 is final. After that, this will be such an incompatible change that it will have to wait for v2.0.0 (assuming SemVer).

aecolley avatar Oct 16 '16 01:10 aecolley

You could add a new whiteout approach in 1.1. You'd only need to go to 2.0 if you remove or make backward-incompatible changes to an existing approach.

wking avatar Oct 16 '16 02:10 wking

A backwards-compatible change means either that a 1.1 image must be processed correctly by a 1.0 extractor, or that a 1.0 image must be processed correctly by a 1.1 extractor, depending on your point of view. In the case of 1.1-image-on-1.0-extractor, the extractor will not know about any way to produce a file named .wh.foo, regardless of how 1.1 decides to represent it. In the case of 1.0-image-on-1.1-extractor, the image cannot contain a layer with any .wh. file, because the 1.0 spec states unambiguously that there is no representation which can produce such a file.

Either way, it seems to me that it's impossible to construct an image which extracts a .wh.foo file into the unpacked bundle, unless either (a) both image and extractor are version 1.1, which is not backwards-compatible by definition; or (b) the image produces different bundle contents in 1.0 extractors and 1.1 extractors, which is an incompatibility all on its own.

Perhaps there's something I'm missing. Perhaps these limitations are acceptable to the project members. Otherwise, it's something that should be addressed before v1.0.0, IMHO.

aecolley avatar Oct 17 '16 22:10 aecolley

Interesting. There is an option of approaching whiteouts like overlayfs does, by setting the device number to 0 on non-directories and an xattr on directories.
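
(As a sketch of what an extractor might check for under that encoding, assuming Go's archive/tar and overlayfs's trusted.overlay.opaque attribute for opaque directories; this encoding was never adopted by the spec.)

package layer

import "archive/tar"

// isOverlayStyleWhiteout reports whether a tar entry looks like an
// overlayfs-style whiteout: a 0/0 character device stands in for a deleted
// file, and the trusted.overlay.opaque xattr (surfaced by archive/tar as a
// SCHILY.xattr pax record) marks an opaque directory. Illustrative only.
func isOverlayStyleWhiteout(hdr *tar.Header) (whiteout, opaqueDir bool) {
    if hdr.Typeflag == tar.TypeChar && hdr.Devmajor == 0 && hdr.Devminor == 0 {
        return true, false
    }
    if hdr.Typeflag == tar.TypeDir &&
        hdr.PAXRecords["SCHILY.xattr.trusted.overlay.opaque"] == "y" {
        return false, true
    }
    return false, false
}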


vbatts avatar Oct 17 '16 22:10 vbatts

Unfortunately, support for extended attributes varies widely and incompatibly among tar implementations. And that gets us into the tarpit of the #342 discussion (pun accidental, I swear). But it works if the chosen tar format supports it.

aecolley avatar Oct 17 '16 23:10 aecolley

On Mon, Oct 17, 2016 at 03:30:51PM -0700, Adrian Colley wrote:

A backwards-compatible change means either that a 1.1 image must be processed correctly by a 1.0 extractor, or that a 1.0 image must be processed correctly by a 1.1 extractor; depending on your point of view.

This should be made very clear in the spec; leaving it open to interpretation is going to make SemVer largely useless. In runtime-spec, opencontainers/runtime-spec#523 made it clear that you may need a new runtime after bumping your config's minor version.

In the case of 1.1-image-on-1.0-extractor, the extractor will not know about any way to produce a file named .wh.foo, regardless of how 1.1 decides to represent it.

Agreed, which is why I want a 1.0 extractor to error out if you give it a 1.1 image.

In the case of 1.0-image-on-1.1-extractor, the image cannot contain a layer with any .wh. file, because the 1.0 spec states unambiguously that there is no representation which can produce such a file.

Well, you can have ‘.wh.foo’ entries. They're just interpreted as “remove foo” and not “please create a file (or directory, or …) at .wh.foo”. If 1.1 declares a new media type (e.g. application/vnd.oci.image.layer.v1.1.tar) supporting SCHILY.filetype whiteout 1 or device 0 2 or some other way to avoid the current path overloading, then 1.0 extractors will correctly die (with “I've never heard of application/vnd.oci.image.layer.v1.1.tar”) and 1.1 and later 1.x extractors will correctly unpack the layer (including any .wh.foo files it contains).

Either way, it seems to me that it's impossible to construct an image which extracts a .wh.foo file into the unpacked bundle, unless either (a) both image and extractor are version 1.1, which is not backwards-compatible by definition; or (b) the image produces different bundle contents in 1.0 extractors and 1.1 extractors; which is an incompatibility all on its own.

Backwards-compat means “the old stuff still works with the new tools”. So 1.0 images would still work with 1.1 tools. But if .wh.foo files are impossible in 1.0 (which seems like the current path), then yeah, no 1.0 images are going to have them. But if you want a .wh.foo file, and are willing to create a 1.1 image and use a 1.1+ extractor, that will work. You don't need to bump to 2.0 (and throw out all of your other 1.0 images).

wking avatar Oct 17 '16 23:10 wking

@wking OK, I see your point. 1.0 images can work on 1.x extractors so long as they don't attempt to create non-whiteout .wh. files, which is fine. I withdraw my position.

aecolley avatar Oct 17 '16 23:10 aecolley

After spending some more intimate time with whiteout files (.wh.), the approach currently employed in this specification is really as good as any.

What good does changing this provide? I don't see how this allows layers to have "arbitrary data".

stevvooe avatar Oct 19 '16 23:10 stevvooe

On Wed, Oct 19, 2016 at 04:47:27PM -0700, Stephen Day wrote:

After spending some more intimate time with whiteout files (.wh.), the approach currently employed in this specification is really as good as any.

What good does changing this provide? I don't see how this allows layers to have "arbitrary data".

As @timthelion describes in the topic post 1, the current approach makes it impossible to distribute .wh.* files because the entry path is overloading “unpack me to here” and “delete the stuff there”. The alternatives discussed here:

a. An external ‘whiteouts’ file 1.
b. SCHILY.filetype set to whiteout 2.
c. Device set to 0 3.

all either give us a non-overloaded place to store the “delete the stuff there” bit (a and b) or pick a location where the overloading is less restrictive (c).

wking avatar Oct 19 '16 23:10 wking

@wking Is there a realistic use case for distributing .wh. files, other than packing up a container runtime into an image?

Please, for the love of god, stop doing this.

stevvooe avatar Oct 20 '16 00:10 stevvooe

On Wed, Oct 19, 2016 at 05:53:25PM -0700, Stephen Day wrote:

@wking Is there a realistic use case for distributing .wh. files, other than packing up a container runtime into an image?

I don't have a personal use case for it, but I would like to avoid complication when explaining what folks can put into layers. And from an implementation perspective both SCHILY.filetype set to whiteout and device set to 0 would be very easy to implement in code that already uses the .wh.* approach.
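
(To make that claim concrete, a sketch in Go, assuming archive/tar; only the .wh.* branch reflects what the current spec text describes, and the other two branches are the hypothetical extensions discussed in this thread.)

package layer

import (
    "archive/tar"
    "path"
    "strings"
)

// isWhiteout shows how small the change to an existing .wh.*-based extractor
// would be if either alternative encoding were adopted.
func isWhiteout(hdr *tar.Header) bool {
    if strings.HasPrefix(path.Base(hdr.Name), ".wh.") {
        return true // current convention: the name itself is the marker
    }
    if hdr.PAXRecords["SCHILY.filetype"] == "whiteout" {
        return true // pax keyword: marker carried in metadata, name unrestricted
    }
    // overlay-style: a 0/0 character device stands in for the deleted file
    return hdr.Typeflag == tar.TypeChar && hdr.Devmajor == 0 && hdr.Devminor == 0
}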

wking avatar Oct 20 '16 03:10 wking

@wking I don't really get this: the proposed layout is not how images are laid out. Such a provision requires either having in-band whiteout or a container format, which complicates unpacking. A tar of a tar, while we do it in image layout, should be avoided.

If we use overlay-style device whiteouts, how do you implement devices on Windows with NTFS? Sure, you can put them in the tar file, but what happens when they are unpacked? How do you encode these in other archive formats that don't have device support, like zip?

stevvooe avatar Oct 21 '16 18:10 stevvooe

On Fri, Oct 21, 2016 at 11:40:17AM -0700, Stephen Day wrote:

A tar of a tar, while we do it in image layout, should be avoided.

I agree, which is why I prefer the SCHILY.filetype whiteout approach or the device 0 approach. Both of those are in-band (like the current .wh.* approach), but SCHILY.filetype is not overloaded at all and device 0 is a less-restrictive overload.

If we use overlay-style device whiteouts…

I think you're talking about @vbatts's device 0 approach here.

… how do you implement devices on Windows with NTFS? Sure, you can put them in the tar file, but what happens when they are unpacked?

If we land “device set to zero means whiteout” docs (for application/vnd.oci.image.layer.v1.tar or application/vnd.oci.image.layer.v1.1.tar), then an unpacker handling such a tarball will invoke the whiteout operation whenever it hits a device 0 entry. I don't see how the OS comes into it, since whiteouts are a cross-platform idea. Windows unpackers can still fail if they encounter a device 1 entry, etc.

The SCHILY.filetype-set-to-whiteout approach would also be cross-platform.

How do you encode these in other archive formats that don't have device support, like zip?

You don't, but that's not a big deal. We don't have an application/vnd.oci.image.layer.v1.zip format now, and I don't hear anyone calling for one. If there is a future need for zip-based layers, the authors of the zip-layer spec will need to figure out a scheme for marking whiteouts. Maybe they'll use .wh.*, and maybe they'll use something else, but I don't think the potential presence of a future zip-based format is a good reason to overload the path as a whiteout marker in tar.

wking avatar Oct 21 '16 22:10 wking

I don't know what the best way to change the spec is, but I personally think that it would include adding a directory to the tarball with a whiteout list and any other information that we might want to add in the future, so as to make the spec extensible.

That is, the / directory of the layer should include a /opencontainer-data directory.

I have several use-cases in mind.

The first use case is that of Professor Dr. Janette Lang in the year 2033. Dr. Lang is one of the smartest ecologists in the world, and is currently working on a simulation of wetland health to be run on the world's fastest supercomputer. Dr. Lang is not even aware that the simulation will be deployed using Docker, because dev-ops does the deployment. Dev-ops doesn't really think too hard about their choice of Docker, because Docker is completely transparent for Dr. Lang's use case, as the simulation does not need devices or even network access.

As the simulation is extremely large and runs for a long time, it is occasionally serialized to files with names that look like:

.wh.<wetland-name>.yyyy.MM.dd.hh.mm.ss

These serializations are associated with user-editable metadata files named <wetland-name>.yml.

It takes several weeks for the simulation to complete. At some point during this time, dev-ops is running an upgrade. They have to migrate the containers, and so they shut everything down, run a docker commit, and then launch everything again. No problem, Dr. Lang's team included restarting the simulation right in their automated test suite. But this time it doesn't work. The entire simulation starts over again at 0%. A routine procedure turns into a night filled with long-distance phone calls and dev-ops sweating their beards out trying to figure out why the hell things aren't working. No one has a clue what is going on or why everything didn't go as planned, and everyone is blaming everyone else; no one even thinks to blame Docker, just as they don't think to blame btrfs or the kernel, or the physical storage system, or any of the other fundamental system utilities that they are using.

Their final solution is to let the simulation run its full course all over again, with stern orders from Dr. Lang to the dev-ops team that they are "Not to fuck with anything." And a tired and anxious dev-ops team spends their morning cursing over coffee about how these damned scientists can't learn to program properly, and how they assume that the serialization file was never saved in the first place because it's "Just not there".


The second use case is that I want to build an image within an image.


The third use case is that of Enrico Xavier, who is a Spanish student of computer science and a great fan of Esperanto, the language which was specifically designed not to have any exceptions with regard to verb conjugation. He hears his lecturer say that you can put any file into an image except one which starts with .wh., and Enrico Xavier correctly assumes that it's stupid to waste your time trying to memorize random exceptions and therefore decides never to use containers or images.


The fourth use case also comes from Dr. Lang's faculty, and this time it is quite nefarious. Dr. Lang allows students to submit two tarballs each, one containing a set of wetland simulation config files, and a second containing some raw data from real wetlands. She instructs dev-ops that they should extract these two tarballs into the directory containing the simulator script: wetland-simulator.py. Dev-ops does so, ensuring that the proper flags to prevent tar bombs are set. If the tarballs happened to be nefarious and attempted to overwrite the simulator script file, then the tar extraction command would simply error out. They enter the tar extraction commands into their Dockerfile with one RUN command per line. Thus new layers are created for each command. However, the first tarball does NOT contain wetland-simulator.py. It contains .wh.wetland-simulator.py. The second tarball does contain a file wetland-simulator.py; however, the students have created a modified version, which allows them to take control of the supercomputer and mine bitcoins on it while the simulation is running. The tar extraction command never errors out, as no files are overwritten. No one is any the wiser, and the students are all the richer.

timthelion avatar Oct 22 '16 10:10 timthelion

PS: With the fourth example, it is of course reasonable to ask "Why did Dr. Lang decide to put untrusted test data and an executable into a single directory? That's just bad practice!" And it is also, of course, reasonable to ask "Why would any storage system restrict file names? That's just bad practice. We already went through that **** with FAT16 and have decided that restricting the content of file names is just a royal pain in the arse and distinctly archaic."

timthelion avatar Oct 22 '16 11:10 timthelion

While I think @timthelion is being a bit melodramatic with his examples, I agree with his general point.

Why are we treating a rejected filesystem's serialisation format (AUFS) as the "right way of doing things in a standard"? Standards survive for a very long time in general, and I don't think it's a great idea to enshrine legacy parts of Docker (the original image code was just based on how AUFS did things because AUFS was the only real union filesystem at the time) into this particular part of the standard. IMO even something as primitive as a simple file list stored in each layer with whiteout file paths would be a much better way of handling things. A better solution is to use metadata within the tar archive (such as the SCHILY.filetype or device=0 approaches) to tag a particular path as being whited-out -- thus making sure that the data about whiteouts is never separated from the layer data. We also get to avoid a bunch of ambiguity and double-parsing in the spec if we make white-outs a metadata tag for a path rather than a different kind of path.

There is no reason IMO to be artificially limiting what the filenames inside container images can be. As @timthelion said, we should have learned our lesson about restrictive filenames from the old days of filesystems. Let's not repeat history's mistakes.

cyphar avatar Oct 22 '16 16:10 cyphar

I think we should definitely pursue this, but as previously discussed it's more likely for a 1.1+ horizon.

jonboulle avatar Oct 24 '16 13:10 jonboulle


On Tue, Oct 25, 2016 at 11:12:15AM -0700, Timothy Hobbs wrote:

I don't know what the best way to change the spec is, but I personally think that it would include adding a directory to the tarball with a whiteout list and any other information that we might want to add in the future, so as to make the spec extensible.

You can extend at any time by minting new media types. You don't have to add support for something like this now on the off-chance that we'll use it later.

That is, the / directory of the layer should include a /opencontainer-data directory.

This has the same path-restriction as the current .wh.* approach, although it limits the restriction to a single path. POSIX pax extended headers provide the same functionality without overloading the entry path. And SCHILY.filetype is one example of what you can do with those extended headers.
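
(For reference, a pax extended-header record is just length-prefixed keyword=value text that precedes and applies to the next entry, with the length counting the whole record including its trailing newline. Star's whiteout marker would therefore be carried as the record below, while the path being deleted remains an ordinary, unrestricted entry name.)

28 SCHILY.filetype=whiteout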

wking avatar Oct 25 '16 18:10 wking

Excellent. I of course don't want to restrict the possible paths at all, and did not know that a better method was possible. Somehow, I did not understand from your discussion that this SCHILY thing is a way of storing information inside the tar that remains outside of the file tree.


timthelion avatar Oct 25 '16 18:10 timthelion

@timthelion Besides the contrived examples, do you have an extant proof that someone has actually run into a naming collision with whiteout files? I think there are problems in nested container scenarios but that can be solved with filesystem passthrough.

Look, I am not saying that AUFS whiteouts are ideal, but I thought the goal here was to define a container standard based on working systems. Especially, one that people actually use.

If we always focus on the limitations, we'll never realize the benefits.

stevvooe avatar Oct 25 '16 19:10 stevvooe