rules_docker

Lazy container pull (aka only download image when it is really needed)

Open igorgatis opened this issue 3 years ago • 8 comments

I'd like to be able to list several docker images as part of my WORKSPACE but only download the actual image if the target I'm building needs it.

Right now, if you list a container image in the WORKSPACE, it is downloaded even for bazel queries.

How hard is it to have a lazy version of container_pull?

This is a problem at my company. We have a monorepo with several large container images. Even though each developer spends most of their time targeting a specific container, eventually they need to download everything, for example when a presubmit hook runs a bazel query on deps to figure out which tests are affected. This has been a growing pain.

igorgatis avatar Mar 06 '21 11:03 igorgatis

Good news: I have a working prototype. The general idea is rather simple:

  • Modified puller by adding a -metadata-only flag, which skips the layer download.
  • Introduced a lazy_download parameter to container_pull, which is false by default.
  • When lazy_download is true, puller downloads only the metadata and adds a genrule to the generated BUILD file.

The generated BUILD file looks like this:

package(default_visibility = ["//visibility:public"])
load("@io_bazel_rules_docker//container:import.bzl", "container_import")

genrule(
    name = "download",
    message = "Fetching",
    outs = ["000.tar.gz", "001.tar.gz", "002.tar.gz", "003.tar.gz", "004.tar.gz", "005.tar.gz", "006.tar.gz", "000.sha256", "001.sha256", "002.sha256", "003.sha256", "004.sha256", "005.sha256", "006.sha256"],
    cmd = "$(location @io_bazel_rules_docker//container/go/cmd/puller:puller) -directory $(RULEDIR) -os linux -os-version  -os-features  -architecture amd64 -variant  -features -name someregistry/elasticsearch/elasticsearch:v1 -timeout 6000",
    tools = ["@io_bazel_rules_docker//container/go/cmd/puller:puller"],
)

container_import(
    name = "image",
    config = "config.json",
    layers = ["000.tar.gz", "001.tar.gz", "002.tar.gz", "003.tar.gz", "004.tar.gz", "005.tar.gz", "006.tar.gz"],
    base_image_registry = "someregistry",
    base_image_repository = "elasticsearch/elasticsearch",
    base_image_digest = "sha256:6c36fa585104d28bba9e53c799a4e20058445476cadb3b3d3e789d3793eed10a",
    tags = [""],
)

exports_files(["image.digest", "digest"])

When lazy_download is false, here is the generated BUILD file:

package(default_visibility = ["//visibility:public"])
load("@io_bazel_rules_docker//container:import.bzl", "container_import")

container_import(
    name = "image",
    config = "config.json",
    layers = ["000.tar.gz", "001.tar.gz", "002.tar.gz", "003.tar.gz", "004.tar.gz", "005.tar.gz", "006.tar.gz"],
    base_image_registry = "someregistry",
    base_image_repository = "elasticsearch/elasticsearch",
    base_image_digest = "sha256:6c36fa585104d28bba9e53c799a4e20058445476cadb3b3d3e789d3793eed10a",
    tags = [""],
)

exports_files(["image.digest", "digest"])
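
To make the difference between the two generated files easier to see, here is a rough Python sketch of a template function that emits either variant. It is only an illustration: the real logic would live in Starlark in container/pull.bzl, the name render_build_file is hypothetical, and the platform flags, timeout, and base_image_* attributes shown above are left out for brevity.

```python
import json

# Label of the puller binary, as it appears in the generated BUILD file above.
PULLER = "@io_bazel_rules_docker//container/go/cmd/puller:puller"


def render_build_file(image_name, layer_count, lazy_download):
    """Return BUILD file text for a pulled image (hypothetical helper).

    With lazy_download=True a genrule is emitted that fetches the layers at
    build time; otherwise the layers are assumed to be on disk already.
    """
    layers = ["%03d.tar.gz" % i for i in range(layer_count)]
    checksums = ["%03d.sha256" % i for i in range(layer_count)]

    lines = [
        'package(default_visibility = ["//visibility:public"])',
        'load("@io_bazel_rules_docker//container:import.bzl", "container_import")',
        "",
    ]
    if lazy_download:
        # json.dumps renders a list of strings in a form that is also a
        # valid Starlark list literal.
        lines += [
            "genrule(",
            '    name = "download",',
            "    outs = %s," % json.dumps(layers + checksums),
            '    cmd = "$(location %s) -directory $(RULEDIR) -name %s",' % (PULLER, image_name),
            '    tools = ["%s"],' % PULLER,
            ")",
            "",
        ]
    lines += [
        "container_import(",
        '    name = "image",',
        '    config = "config.json",',
        "    layers = %s," % json.dumps(layers),
        ")",
        "",
        'exports_files(["image.digest", "digest"])',
    ]
    return "\n".join(lines) + "\n"
```

In both variants container_import references the same layer file names. What changes is whether those files already exist in the external repository or are declared as outputs of the download genrule, which means Bazel only runs the puller when a target that depends on the image is actually built.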

Here is the list of changed files:

 container/go/cmd/puller/puller.go | 15 +++++++++++----
 container/go/pkg/compat/write.go  |  6 +++---
 container/pull.bzl                | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 65 insertions(+), 16 deletions(-)

It would be a HUGE time saver and pain reliever for my development team. Would you guys consider this a useful feature worth sending a PR for?

igorgatis avatar Mar 08 '21 08:03 igorgatis

Here is the PR https://github.com/bazelbuild/rules_docker/pull/1749

igorgatis avatar Mar 08 '21 17:03 igorgatis

FYI @alexeagle @pcj @gravypod

Thanks for filing the issue @igorgatis and sorry for the delay getting back here. We have some new community maintainers for rules_docker (cc'd) ramping up and we should be able to start addressing open issues and PRs within a few weeks.

The major blocker right now is that the CI setup for the e2e tests needs to be fixed, and those tests are crucial for validating puller and pusher functionality. Once the e2e tests are up and running again, reviewing your PR should be unblocked.

smukherj1 avatar Mar 16 '21 23:03 smukherj1

Any news?

igorgatis avatar Jul 13 '21 18:07 igorgatis

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

github-actions[bot] avatar Jan 10 '22 02:01 github-actions[bot]

Can we push some container_layers directly and define a function container_manifest, so that we can commit an image the way git rebase does?

dashjay avatar Jan 10 '22 03:01 dashjay

I'd like to provide a tool named 'image_rebaser'. We could call it like image_rebaser base:tag target:tag layer1.tar layer2.tar ... The tool just pushes all the tars as blobs and commits a manifest. (We need to copy all base image blobs to the target repository if they do not exist there or cannot be mounted.)

But how to make it compatible with container_bundle or containerd_push is a troublesome question.
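
As a rough sketch of the manifest side of that idea, the Python below appends new layers to a base image's manifest and config without touching a registry. It rests on assumptions not stated in the thread: the base image uses OCI media types, the new layers are uncompressed tarballs (so each layer's diff_id equals its blob digest), and the function names are made up for illustration. A real tool would also have to upload or mount every blob before committing the manifest.

```python
import hashlib
import json


def descriptor(media_type, blob):
    """Return an OCI content descriptor for a blob held in memory."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def rebase(base_manifest, base_config, new_layers):
    """Stack uncompressed layer tarballs on top of a base image.

    Returns (manifest, config_bytes). Nothing is pushed here.
    """
    layer_type = "application/vnd.oci.image.layer.v1.tar"
    added = [descriptor(layer_type, blob) for blob in new_layers]

    # Deep-copy the config so the caller's dict is left untouched.
    config = json.loads(json.dumps(base_config))
    # For uncompressed layers the diff_id is the digest of the blob itself.
    config["rootfs"]["diff_ids"] = config["rootfs"]["diff_ids"] + [
        d["digest"] for d in added
    ]
    config_bytes = json.dumps(config, sort_keys=True).encode()

    manifest = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": descriptor("application/vnd.oci.image.config.v1+json", config_bytes),
        "layers": base_manifest["layers"] + added,
    }
    return manifest, config_bytes
```

The base layers are reused by digest only, which is why the target repository must already contain them, or be able to mount them, before the new manifest is accepted.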

dashjay avatar Jan 24 '22 07:01 dashjay

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!

github-actions[bot] avatar Jul 24 '22 03:07 github-actions[bot]

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

github-actions[bot] avatar Aug 23 '22 04:08 github-actions[bot]