container-diff yields different results when running command from different locations.
Expected behavior
When running
container-diff diff <path to tar file1> <path to tar file2> --type=file
multiple times from different directories (modifying paths to tar files as needed to point at the same tar files), I expected to get the same correct output summary.
Actual behavior
Changing dirs and running container-diff for the same pair of tar files produces different diff summaries.
Information
- container-diff version: v0.15.0
- Operating system: linux
Steps to reproduce the behavior
Notes before reproducing
The provided image tars have one file that is different that is
/etc/ssl/certs/java/cacerts
In the process of building the image, some files were generated and deleted. So the final image does not contain the following files:
/var/cache/fontconfig/d589a48862398ed80a3d6066f4f56f4c-le64.cache-6
/var/cache/fontconfig/7ef2298fde41cc6eeb7af42e48b7d293-le64.cache-6
/var/cache/fontconfig/3830d5c3ddfd5cd38a049b759396e72e-le64.cache-6
/var/cache/fontconfig/4c599c202bc5c08e2d34565a40eac3b2-le64.cache-6
- Copy the pair of tar files with my existing directory structure.
$ gsutil cp -r gs://container-diff-test . - Run the container-diff tool from the first location.
$ cd container-diff-test$ container-diff diff tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar tests/contrib/rbe-test-xenial_repro_test_test_img2_outs/img_outs/rbe-test-xenial-with-pkgs.tar --type=fileThis produces an output saying that the non-existing files (from above) are different between the two images. Also it says nothing about the actual file that is different. - Run the container-diff tool from another location (assuming your current dir is
container-diff-test.$ cd tests/contrib$ container-diff diff rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar rbe-test-xenial_repro_test_test_img2_outs/img_outs/rbe-test-xenial-with-pkgs.tar --type=fileThis produces the correct output saying that only/etc/ssl/certs/java/cacertsis different.
Issue Update
The issue described above came from the way container-diff caches image tars being provided to the container-diff diff command.
From observations, the container-diff tool caches provided image tars as follows:
running the command
container-diff diff a/b/c/image1.tar a/b/image2.tar --type=file
crates two directories under ~/.container-diff/cache called abcimage1.tar and abimage2.tar.
Failing Scenarios
This behaviour can cause confusion and output incorrect results in different scenarios. For example:
- Having an image tar at
a/b/c/image.tarand running thecontainer-diffusing this path creates the cache entry as described above. Assume, I now overwrite that existinga/b/c/image.tarfile with another file (without changing the name). Then I want to rerun the container-diff tool, which will ignore thata/b/c/image.taris now a different file and will use what it cached previously. This is actually the scenario I ran into originally and is described in the original issue post. You are not able to reproduce it (with the instructions I provided) because you started with a clean cache (and have never overwritten tar files being compared). - Running the
container-diff diffcommand from different directories and comparing image tars with the same names can lead to incorrect results. Assume we have two image tars located at~/dir1/image_dir/image.tarand~/dir2/image_dir/image.tar. Assume, the images have differences. With a clean cache, insidedir1run:container-diff diff image_dir/image.tar ../dir2/image_dir/image.tar --type=fileThis will produce the correct result and create one cache entry/directory (not sure why not two) calledimage_dirimage.tar. I assume that this directory caches the image at~/dir1/image_dir/image.tar. Now, fromdir2compare both images again:container-diff diff image_dir/image.tar ../dir1/image_dir/image.tar --type=file. The output will now say that images are the same (and no new cache entries are created). I assume that now container-diff compares../dir1/image_dir/image.tarwhich was previously cached with the cached image (so compares the image to itself and ignores the actual imageimage_dir/image.tarthat was provided).
Suggestions
- Detect that specified image has changed, so cached image shouldn't be used.
- Make caching smarter (don't cache the same image multiple times if relative path to the same image changes when running container-diff).
- Document behaviour of container-diff caching mechanism.
- Add option to clear cache.
- Add flag to not use cache.
- Add flag to not cache provided images.
I've just been through the bug list and there are other bugs that are based on getting caching right. I closed one of them, but I'll probably link it to this later.
So, while there is a --no-cache option already, which we use heavily in our automated use of container-diff, that is not sufficient. At a minimum we need to:
- provide a clear-cache option
- provide feedback to users about when a cached image is being used and why
- create logic that checks that the source image (whether it be a local file or a remote one) has not changed
Actually, let's amend that tactic a little:
- caching logic should include first checking if image paths given are just local filepaths. If they are local, then don't do any of the caching logic.
Oh man how much do we need a better --help output??? Turns out setting -v INFO gets you lots of debug messages:
donmccasland@rocinante:~/go/src/github.com/GoogleContainerTools/container-diff/bug-314-tests/container-diff-test$ container-diff diff tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar tests/contrib/rbe-test-xenial_repro_test_test_img2_outs/img_outs/rbe-test-xenial-with-pkgs.tar --type=file -v INFO
INFO[0000] starting diff on images tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar and tests/contrib/rbe-test-xenial_repro_test_test_img2_outs/img_outs/rbe-test-xenial-with-pkgs.tar, using differs: [file]
INFO[0000] retrieving image: tests/contrib/rbe-test-xenial_repro_test_test_img2_outs/img_outs/rbe-test-xenial-with-pkgs.tar
INFO[0000] retrieving image: tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar
INFO[0000] retrieving image ref from tar took 0.000448 seconds
INFO[0000] retrieving image ref from tar took 0.000412 seconds
INFO[0009] time elapsed retrieving image digest: 9.831305s
INFO[0009] using cached filesystem in /usr/local/google/home/donmccasland/.container-diff/cache/testscontribrbe-test-xenial_repro_test_test_img2_outsimg_outsrbe-test-xenial-with-pkgs.tar
INFO[0009] time elapsed retrieving image digest: 9.839328s
INFO[0009] using cached filesystem in /usr/local/google/home/donmccasland/.container-diff/cache/testscontribrbe-test-xenial_repro_test_test_img1_outsimg_outsrbe-test-xenial-with-pkgs.tar
INFO[0009] computing diffs
-----File-----
These entries have been added to tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar: None
These entries have been deleted from tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar: None
These entries have been changed between tests/contrib/rbe-test-xenial_repro_test_test_img1_outs/img_outs/rbe-test-xenial-with-pkgs.tar and tests/contrib/rbe-test-xenial_repro_test_test_img2_outs/img_outs/rbe-test-xenial-with-pkgs.tar:
FILE SIZE1 SIZE2
/etc/ssl/certs/java/cacerts 171.6K 171.6K
Okay, well in the category of "well that's yer problem right thar.." I will point to https://github.com/GoogleContainerTools/container-diff/blob/5b5e6275784c21963d9b0932f63da863fad6530a/pkg/util/image_utils.go
In particular:
func GetFileSystemForLayer(layer v1.Layer, root string, whitelist []string) error {
empty, err := DirIsEmpty(root)
if err != nil {
return err
}
if !empty {
logrus.Infof("using cached filesystem in %s", root)
return nil
}
So, just because the cache dir isn't empty doesn't mean the cache is fresh.
Thank you for your continuous investigation on this, Don. Yes, this is definitely the simplest check that could be to decide whether or not to use the cache :)