rules_docker
Lazy container pull (aka only download image when it is really needed)
I'd like to be able to list several docker images as part of my WORKSPACE but only download the actual image if the target I'm building needs it.
Right now, if you list a container image as part of the WORKSPACE, it will be downloaded even for bazel queries.
How hard is it to have a lazy version of container_pull?
This is a problem at my company. We have a monorepo with several large container images. Even though each individual developer spends most of their time targeting a specific container, eventually they need to download everything, for example when a pre-submit hook performs a bazel query over deps to figure out which tests are affected. This has been a growing pain.
Good news: I have a working prototype. The general idea is rather simple:
- Modified the puller by adding a -metadata-only flag which skips downloading layers.
- Introduced a lazy_download parameter to container_pull, which is false by default (see the WORKSPACE sketch after this list).
- When lazy_download is true, the puller downloads only the metadata and adds a genrule to the generated BUILD file.
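For reference, here is a minimal WORKSPACE sketch of how the proposed parameter might be used. The repository name elasticsearch is just an example, and the registry/repository/tag values mirror the generated example below; only lazy_download is new:
load("@io_bazel_rules_docker//container:pull.bzl", "container_pull")

# Sketch only: lazy_download is the proposed attribute, not an existing one.
container_pull(
    name = "elasticsearch",
    registry = "someregistry",
    repository = "elasticsearch/elasticsearch",
    tag = "v1",
    # With the proposed attribute set, the repository rule fetches only the
    # image metadata; the layer tarballs are downloaded by the generated
    # genrule when a consuming target is actually built.
    lazy_download = True,
)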
When lazy_download is true, the generated BUILD file looks like this:
package(default_visibility = ["//visibility:public"])

load("@io_bazel_rules_docker//container:import.bzl", "container_import")

genrule(
    name = "download",
    message = "Fetching",
    outs = [
        "000.tar.gz", "001.tar.gz", "002.tar.gz", "003.tar.gz", "004.tar.gz", "005.tar.gz", "006.tar.gz",
        "000.sha256", "001.sha256", "002.sha256", "003.sha256", "004.sha256", "005.sha256", "006.sha256",
    ],
    cmd = "$(location @io_bazel_rules_docker//container/go/cmd/puller:puller) -directory $(RULEDIR) -os linux -os-version -os-features -architecture amd64 -variant -features -name someregistry/elasticsearch/elasticsearch:v1 -timeout 6000",
    tools = ["@io_bazel_rules_docker//container/go/cmd/puller:puller"],
)

container_import(
    name = "image",
    config = "config.json",
    layers = ["000.tar.gz", "001.tar.gz", "002.tar.gz", "003.tar.gz", "004.tar.gz", "005.tar.gz", "006.tar.gz"],
    base_image_registry = "someregistry",
    base_image_repository = "elasticsearch/elasticsearch",
    base_image_digest = "sha256:6c36fa585104d28bba9e53c799a4e20058445476cadb3b3d3e789d3793eed10a",
    tags = [""],
)

exports_files(["image.digest", "digest"])
When lazy_download is false, here is the generated BUILD file:
package(default_visibility = ["//visibility:public"])

load("@io_bazel_rules_docker//container:import.bzl", "container_import")

container_import(
    name = "image",
    config = "config.json",
    layers = ["000.tar.gz", "001.tar.gz", "002.tar.gz", "003.tar.gz", "004.tar.gz", "005.tar.gz", "006.tar.gz"],
    base_image_registry = "someregistry",
    base_image_repository = "elasticsearch/elasticsearch",
    base_image_digest = "sha256:6c36fa585104d28bba9e53c799a4e20058445476cadb3b3d3e789d3793eed10a",
    tags = [""],
)

exports_files(["image.digest", "digest"])
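A downstream target would consume the lazily pulled image exactly as it does today. In the sketch below, the external repository from the WORKSPACE sketch above is assumed to be named elasticsearch; building such a target runs the :download genrule, while a plain bazel query only loads the BUILD file and never executes it:
load("@io_bazel_rules_docker//container:container.bzl", "container_image")

# Hypothetical consumer target; the name is illustrative.
container_image(
    name = "es_with_plugins",
    base = "@elasticsearch//:image",
)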
Here is the list of changed files:
container/go/cmd/puller/puller.go | 15 +++++++++++----
container/go/pkg/compat/write.go | 6 +++---
container/pull.bzl | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
3 files changed, 65 insertions(+), 16 deletions(-)
It would be a HUGE time saver and pain reliever for my development team. Would you guys consider this a useful feature worth sending a PR for?
Here is the PR https://github.com/bazelbuild/rules_docker/pull/1749
FYI @alexeagle @pcj @gravypod
Thanks for filing the issue @igorgatis and sorry for the delay getting back here. We have some new community maintainers for rules_docker (cc'd) ramping up and we should be able to start addressing open issues and PRs within a few weeks.
The major blocker right now is that the CI setup for the e2e tests needs to be fixed; those tests are crucial for validating puller and pusher functionality. Once the e2e tests are up and running again, reviewing your PR should be unblocked.
Any news?
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!
Could we push some container_layers directly and define a container_manifest rule, so that we can commit an image much like a git rebase?
I'd like to provide a tool named 'image_rebaser' that we could call like image_rebaser base:tag target:tag layer1.tar layer2.tar .... The tool would just push all the tars as blobs and commit a manifest. (We would need to copy the whole base image into the target repository if its blobs do not already exist there, or if cross-repository blob mounting cannot be used.)
But how to make it compatible with container_bundle or container_push is the troublesome part.
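To make the idea more concrete, here is a purely illustrative BUILD sketch. container_manifest is not an existing rules_docker rule; its name, attributes, and the image references are made up for discussion:
load("@io_bazel_rules_docker//container:container.bzl", "container_layer")

# Layers are built with the existing container_layer rule.
container_layer(
    name = "plugin_layer",
    tars = ["layer1.tar"],
)

# Hypothetical rule: push the listed layers as blobs on top of an
# already-pushed base image and commit a new manifest in the target
# repository, without pulling or rebuilding the base image locally.
container_manifest(
    name = "rebased_image",
    base = "someregistry/base:tag",
    layers = [":plugin_layer"],
    target = "someregistry/target:tag",
)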
This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!
This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"