rules_docker

Strip legacy `repositories` in config_stripper.py to ensure reproducibility of tars

Open gergelyfabian opened this issue 2 years ago • 0 comments

PR Checklist

Please check if your PR fulfills the following requirements:

  • [x] Tests for the changes have been added (for bug fixes / features)
  • [x] Docs have been added / updated (for bug fixes / features): stripping the config seems to be an undocumented implementation detail. Stripping manifests is not documented at the moment either.

PR Type

What kind of change does this PR introduce?

  • [x] Bugfix
  • [ ] Feature
  • [ ] Code style update (formatting, local variables)
  • [ ] Refactoring (no functional changes, no api changes)
  • [ ] Build related changes
  • [ ] CI related changes
  • [ ] Documentation content changes
  • [ ] Other... Please describe:

In a sense, this is a bugfix for the reproducibility of generated tar files when installing packages. It also fixes `repositories` for the 1.0 Docker Image Specification.

What is the current behavior?

Currently, if you install packages into a Docker container, you still won't get a reproducible tar file, even if you ensure that all the files in the generated tar are identical (e.g. by using the `installation_cleanup_commands` feature). The reason is a `repositories` file in the generated tar, which is similar to `manifest.json` but represents the data for a legacy Docker API. `config_stripper.py` cleans up `manifest.json` but not `repositories`. As a result, `repositories` in the current implementation doesn't point to an existing layer: stripping changes the layer names, but `repositories` stays unchanged.
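The mismatch described above can be checked with a short script. This is only a sketch, not code from this PR: the helper name `dangling_repository_ids` is hypothetical, and it assumes the usual `docker save` layout, where `manifest.json` lists layers as `<id>/layer.tar` and `repositories` maps `name -> tag -> <id>`.

```python
import io
import json
import posixpath
import tarfile


def dangling_repository_ids(tar_bytes):
    """Return layer ids named in `repositories` that manifest.json does not list.

    Hypothetical helper; assumes layers are stored as "<id>/layer.tar".
    """
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        if "repositories" not in tar.getnames():
            return set()
        manifest = json.load(tar.extractfile("manifest.json"))
        repositories = json.load(tar.extractfile("repositories"))

    # Layer ids known to the (possibly stripped) manifest.
    layer_ids = {
        posixpath.dirname(layer)
        for entry in manifest
        for layer in entry.get("Layers") or []
    }
    # Layer ids the legacy file still points at.
    referenced = {
        layer_id
        for tags in repositories.values()
        for layer_id in tags.values()
    }
    return referenced - layer_ids
```

A non-empty result means `repositories` refers to a layer name that no longer exists in the tar.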

Issue Number: N/A

What is the new behavior?

Also clean up `repositories`, based on the stripped data from the manifest. This fixes the semantics of `repositories` and also fixes reproducibility.
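A minimal sketch of the idea, not the actual change to `config_stripper.py`: derive `repositories` from the already-stripped manifest so both files agree. The function names are hypothetical, and the code assumes that layers are listed as `<id>/layer.tar`, that the last layer of each entry is the top layer, and that a missing tag means `latest`.

```python
import json
import posixpath


def rebuild_repositories(manifest):
    """Build the legacy `repositories` mapping from a parsed manifest.json."""
    repositories = {}
    for entry in manifest:
        layers = entry.get("Layers") or []
        if not layers:
            continue
        # Assumption: the legacy format points each tag at the top layer id.
        top_layer_id = posixpath.dirname(layers[-1])
        for repo_tag in entry.get("RepoTags") or []:
            name, sep, tag = repo_tag.rpartition(":")
            # No tag present, or the colon belonged to a registry port.
            if not sep or "/" in tag:
                name, tag = repo_tag, "latest"
            repositories.setdefault(name, {})[tag] = top_layer_id
    return repositories


def dump_deterministic(obj):
    """Serialize with sorted keys and fixed separators for stable bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")) + "\n"
```

Serializing with sorted keys keeps the bytes of the rewritten file independent of dictionary insertion order, which matters for a reproducible tar.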

Does this PR introduce a breaking change?

  • [ ] Yes
  • [x] No

This assumes that older Docker implementations can parse the new version of `repositories`. If it does turn out to be a breaking change, I assume we'd need a different solution (e.g. removing the legacy `repositories` file from the tar when some option is set).

Other information

Any alternative ideas on how to fix the lack of reproducibility caused by the existing `repositories` file would be welcome.

gergelyfabian · May 24 '22 09:05