Use `helm pull` for downloading charts
Since helm is already available, use helm pull to download charts instead of attempting to duplicate helm's logic.
This will ensure any use cases handled by helm can be handled by rules_helm.
Downloaded charts are cached in the workspace by Bazel since the repository rule isn't always re-executed (https://bazel.build/external/repo#when_is_the_implementation_function_executed).
A potential separate feature to add is pinning charts by their SHA-256 checksum, similar to https://github.com/bazel-contrib/rules_jvm_external?tab=readme-ov-file#pinning-artifacts-and-integration-with-bazels-downloader. But I think most Helm charts are small enough that I'm not concerned if the same chart has to be downloaded in multiple workspaces instead of coming from the repository cache.
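To make the pinning idea concrete, here is a minimal sketch of what a checksum check on a pulled chart archive could look like. It is written in Python for illustration only (the rule itself would be Starlark), and the function names `sha256_of_file` and `verify_pinned_chart` are hypothetical, not part of rules_helm.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned_chart(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the chart archive does not match its pinned checksum.

    `expected_sha256` is the hypothetical pinned value a user would record
    next to the chart reference.
    """
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
```

This is the same value that `sha256sum` prints for the .tgz, so a pin could be recorded once from a trusted pull and checked on every later one.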
helm is being cached and charts are only being downloaded if an attribute of the build rule changes. Rebuilding //tests/with_chart_deps doesn't show the repository rule is always being executed.
Actually, we can return a repo_metadata object to indicate if the rule is reproducible.
Unfortunately, we can't determine the exact attributes required for reproducibility unless we parse the YAML output of helm show chart (shown here with stderr discarded):
$ helm show chart oci://registry-1.docker.io/bitnamicharts/grafana 2> /dev/null
annotations:
  category: Analytics
  images: |
    - name: grafana
      image: docker.io/bitnami/grafana:12.1.1-debian-12-r1
    - name: os-shell
      image: docker.io/bitnami/os-shell:12-debian-12-r50
  licenses: Apache-2.0
  tanzuCategory: application
apiVersion: v2
appVersion: 12.1.1
dependencies:
- name: common
  repository: oci://registry-1.docker.io/bitnamicharts
  tags:
  - bitnami-common
  version: 2.x.x
description: Grafana is an open source metric analytics and visualization suite for
  visualizing time series data that supports various types of data sources.
home: https://bitnami.com
icon: https://dyltqmyl993wv.cloudfront.net/assets/stacks/grafana/img/grafana-stack-220x234.png
keywords:
- analytics
- monitoring
- metrics
- logs
maintainers:
- name: Broadcom, Inc. All Rights Reserved.
  url: https://github.com/bitnami/charts
name: grafana
sources:
- https://github.com/bitnami/charts/tree/main/bitnami/grafana
version: 12.1.8
I filed https://github.com/helm/helm/issues/31205 for having helm pull print the pulled version, but that will require waiting for Helm v4, and I don't think it would be backported to v3.
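Until helm pull reports the version itself, the resolved name and version could be scraped from the helm show chart output. The sketch below is Python for illustration and avoids a YAML dependency by matching only top-level keys; it assumes nested keys (such as the dependency's `version: 2.x.x`) are indented or prefixed with `- ` in the real output, and the function name `chart_name_and_version` is hypothetical.

```python
import re

# Match only keys that start at column 0, so nested entries such as
# "- name: common" or "  version: 2.x.x" are ignored.
_TOP_LEVEL_KEY = re.compile(
    r"^(name|version):[ \t]*(\S.*?)[ \t]*$", re.MULTILINE
)


def chart_name_and_version(show_chart_output: str) -> dict:
    """Extract the top-level `name` and `version` from `helm show chart` output."""
    fields = {}
    for key, value in _TOP_LEVEL_KEY.findall(show_chart_output):
        # Keep the first occurrence and drop optional YAML quotes.
        fields.setdefault(key, value.strip("'\""))
    return fields
```

A real implementation would more likely use a proper YAML parser; the regex only keeps the sketch self-contained.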
What about maintaining two options? helm_pull to allow using helm and its resolver logic to download a chart, which wouldn't be cacheable, and helm_import_url to download a specific chart using the Bazel downloader.
If something is simply not possible with repository_ctx alone, then I would be in favor of another rule that does it. However, I don't think there should be overlap in functionality. I would want to do something that discourages the use of a helm invocation if there's a repository_ctx way to do the same thing.
A couple of benefits of using the real helm pull for pulling charts:
- Allows helm to handle authentication instead of reimplementing it in Bazel rules (which is already done for OCI registries; I don't know how to handle it for HTTP repositories)
- Pull by version constraint instead of a specific version. This wouldn't be reproducible, but could log the pulled version and warn to specify a specific version
- Allow verifying the provenance file (I don't have a use case for this, but maybe some other users do)
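For the version-constraint case, the "log and warn" behavior could look roughly like the sketch below (Python for illustration; the real check would live in the Starlark rule). The regex is a simplified semver test, not Helm's actual constraint parser, and both function names are hypothetical.

```python
import re
from typing import Optional

# Simplified check for an exact MAJOR.MINOR.PATCH version with optional
# pre-release and build metadata. Anything else (ranges, wildcards such as
# "2.x.x", "^1.2.0") is treated as a constraint.
_EXACT_VERSION = re.compile(
    r"^v?\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def is_exact_version(version: str) -> bool:
    """Return True if `version` names a single version rather than a constraint."""
    return bool(_EXACT_VERSION.match(version.strip()))


def reproducibility_warning(requested: str, pulled: str) -> Optional[str]:
    """Return a warning if `requested` is a constraint, else None."""
    if is_exact_version(requested):
        return None
    return (
        f"version constraint '{requested}' resolved to {pulled}; "
        f"set version = \"{pulled}\" for a reproducible pull"
    )
```

This mirrors the rules_oci approach of accepting a tag but nudging the user toward a pinned value.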
But even using repository_ctx.download doesn't always mean the result will be reproducible or cacheable. If pulling a chart from a repository or via an OCI version tag, multiple HTTP calls are required before pulling the chart package.
For example, rules_oci does allow specifying a version tag (1.2.3) instead of a digest but warns in that case.
One more issue I thought of for OCI URLs: URLs that use a version digest (@sha123..) have the SHA256 sum of the entire container, but we only care about the chart blob, which has its own digest that's not known until pulling the manifest (either by tag or digest).
Little demo of helm pull:
$ helm pull oci://registry-1.docker.io/bitnamicharts/grafana@sha256:93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d
Pulled: registry-1.docker.io/bitnamicharts/grafana@sha256:93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d
Digest: sha256:93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d
zachburg:/tmp/tmp.dV1XrHuMga
$ ls
grafana@sha256-93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d.tgz
zachburg:/tmp/tmp.dV1XrHuMga
$ sha256sum grafana@sha256-93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d.tgz
e7227a3b9e77777fed9d781745dafdc384982f2326d859605101396611e928ed grafana@sha256-93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d.tgz
zachburg:/tmp/tmp.dV1XrHuMga
The SHA256 sum of the chart is not related to the OCI URL.
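To illustrate where the chart's own digest comes from, here is a small sketch that picks the chart layer out of an OCI image manifest. It is Python for illustration, assumes the manifest JSON has already been fetched, and relies on the Helm chart content media type `application/vnd.cncf.helm.chart.content.v1.tar+gzip`; the function name `chart_blob_digest` is hypothetical.

```python
import json

# Media type Helm uses for the chart archive layer in an OCI manifest.
HELM_CHART_LAYER_MEDIA_TYPE = "application/vnd.cncf.helm.chart.content.v1.tar+gzip"


def chart_blob_digest(manifest_json: str) -> str:
    """Return the digest of the chart archive layer listed in an OCI manifest."""
    manifest = json.loads(manifest_json)
    for layer in manifest.get("layers", []):
        if layer.get("mediaType") == HELM_CHART_LAYER_MEDIA_TYPE:
            return layer["digest"]
    raise ValueError("manifest has no Helm chart content layer")
```

Since OCI blobs are content-addressed, I'd expect that layer digest to match the sha256sum of the .tgz above, which is the value a Bazel download would actually need, but I haven't verified that against this particular chart.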
@abrisco, thoughts on https://github.com/abrisco/rules_helm/pull/197#issuecomment-3254851548?
Looking at https://github.com/bazel-contrib/rules_jvm_external and https://github.com/bazel-contrib/rules_python, they use coursier and pip respectively, with optional support to write a file for downloading dependencies via the Bazel downloader. I think the possible equivalent here is to intercept the logs from helm pull/helm show and write a file of chart deps that could be downloaded via Bazel, if desired.
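As a rough sketch of that idea, the `Pulled:` and `Digest:` lines that helm pull prints for OCI charts (as in the demo above) could be turned into a pin file entry. This is Python for illustration only; the lock file layout and the function names are hypothetical, and HTTP repositories would need a different source for the resolved URL.

```python
import json
import re

# Line formats taken from the `helm pull` demo output for an OCI chart.
_PULLED = re.compile(r"^Pulled:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
_DIGEST = re.compile(r"^Digest:[ \t]*(sha256:[0-9a-f]{64})[ \t]*$", re.MULTILINE)


def lock_entry_from_pull_log(log: str, chart_sha256: str) -> dict:
    """Build one lock entry from captured `helm pull` output.

    `chart_sha256` is the checksum of the downloaded .tgz, computed separately.
    """
    pulled = _PULLED.search(log)
    digest = _DIGEST.search(log)
    if pulled is None or digest is None:
        raise ValueError("could not find 'Pulled:' and 'Digest:' lines in helm output")
    return {
        "ref": pulled.group(1),
        "manifest_digest": digest.group(1),
        "chart_sha256": chart_sha256,
    }


def render_lock_file(entries: list) -> str:
    """Serialize lock entries as stable, diff-friendly JSON (hypothetical layout)."""
    return json.dumps({"charts": entries}, indent=2, sort_keys=True) + "\n"
```

A pure-Bazel rule could then read such a file and fetch the pinned artifacts with the Bazel downloader, keeping helm as an optional resolver step.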
Hey! Sorry for the radio silence here. Things have been hectic 😓
I'm extremely familiar with rules_python and its use of pip, and that exact implementation is what solidified my stance that repository rules should ideally use repository_ctx to download and not rely on any external tools or caching mechanism. The middle ground I've found to be palatable is if an external tool was used to query the data needed to use the Bazel downloader (which is similar to the original implementation).
If there were to be a helm pull backed repository rule, my current thought is that it should always say it's not reproducible and should recommend using the pure Bazel rule.
Though maybe this is something we could better discuss on slack at this point. I don't mean to be halting progress here.
I don't have easy access to Slack, but you can email [email protected] instead.