containerregistry icon indicating copy to clipboard operation
containerregistry copied to clipboard

Properly handle opaque whiteout entries in layer flattener

Open JoshRosen opened this issue 6 years ago • 6 comments

This PR fixes a bug in the flattener's handling of layers with opaque whiteout entries (.wh..wh..opq files). This bug could lead to deleted files re-appearing in flattened output.

Bug symptoms

According to the OCI spec on whiteouts:

An opaque whiteout entry is a file with the name .wh..wh..opq indicating that all siblings are hidden in the lower layer.

The existing flattening logic does not take these files into account, leading to weird anomalies when flattening images where directories are deleted in lower layers and re-created in higher layers. As an example, imagine that I have a Dockerfile which looks something like this:

FROM somewhere
RUN mkdir /directory && touch /directory/old1 /directory/old2
RUN rm -r directory && mkdir /directory && touch /directory/new

Per the spec, these filesystem changes can be represented via the following tar layers:

directory/
directory/old1
directory/old2

and

directory/
directory/.wh..wh..opq
directory/new

The expected flattened layer output is

directory/new

The current flattening implementation does not respect the .wh..wh..opq file, so the flattened output incorrectly contains both the old and new files:

directory/new
directory/old1
directory/old2

Handling of opaque whiteout files seems to be a problem for other projects, too; see https://github.com/sylabs/singularity/issues/1962 and https://github.com/moby/moby/issues/34300, for example.

Solution

This PR addresses this problem by modifying the client v2.2's flattening function to properly account for opaque whiteout files. The flattener now tracks the set of directories in higher layers which have opaque whiteouts.

Testing

I added a new unit test suite client_v2_2 and wrote tests for the extract() flattening function.

JoshRosen avatar Sep 22 '18 06:09 JoshRosen

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

:memo: Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

googlebot avatar Sep 22 '18 06:09 googlebot

I consider this a work-in-progress until the following TODOs and questions are resolved:

  • [ ] Does a similar change need to be made in the older client versions, or is it okay to change only in V2.2?
  • [ ] Does this need to be feature-flagged? Is it possible that anyone was relying on the old, incorrect behavior?
  • [x] Add comments to explain the new code in extract().
  • [x] Add a simple test of file contents (to make sure that the flattener picks the topmost version of a given file).
  • [x] Is it possible for a tar to contain an opaque whiteout file for a directory but to add no new files under that directory? In this case, should the directory be present-but-empty or should it be removed? I need to see how other projects handle this corner-case. I may be able to test this behavior by simulating this case with my Docker environment and then ensuring that the flattener output agrees with my Docker output.
    • Update: yes. It appears that the right behavior is to preserve the now-empty parent directory. This is consistent with the bin/ example given at https://github.com/opencontainers/image-spec/blob/7b1e489870acb042978a3935d2fb76f8a79aff81/layer.md#opaque-whiteout.
    • I added a test for this.

JoshRosen avatar Sep 22 '18 08:09 JoshRosen

I signed it!

JoshRosen avatar Sep 27 '18 17:09 JoshRosen

CLAs look good, thanks!

googlebot avatar Sep 27 '18 17:09 googlebot

@KaylaNguyen, do you have any time to take a look at this? We ran into this bug again the other day.

kevininderhees avatar Feb 25 '19 16:02 kevininderhees

Bump on this PR

ahirreddy avatar Jul 19 '19 20:07 ahirreddy