
Use `helm pull` for downloading charts

Open zachburg opened this issue 4 months ago • 9 comments

Since helm is already available, use helm pull to download charts instead of reimplementing helm's download logic.

This will ensure any use cases handled by helm can be handled by rules_helm.

Downloaded charts are cached in the workspace by Bazel since the repository rule isn't always re-executed (https://bazel.build/external/repo#when_is_the_implementation_function_executed).

A potential separate feature to add is pinning charts by their SHA-256 checksum, similar to https://github.com/bazel-contrib/rules_jvm_external?tab=readme-ov-file#pinning-artifacts-and-integration-with-bazels-downloader. But I think most Helm charts are small enough that I'm not concerned if the same chart has to be downloaded in multiple workspaces instead of coming from the repository cache.

zachburg avatar Aug 27 '25 00:08 zachburg

helm is being cached, and charts are only downloaded again if an attribute of the build rule changes. Rebuilding //tests/with_chart_deps shows that the repository rule is not re-executed on every build.

zachburg avatar Aug 27 '25 16:08 zachburg

Actually, we can return a repo_metadata object to indicate if the rule is reproducible.

Unfortunately, we can't determine the exact attributes required for reproducibility unless we parse the YAML that helm show chart prints (stderr is discarded here):

$ helm show chart oci://registry-1.docker.io/bitnamicharts/grafana 2> /dev/null
annotations:
  category: Analytics
  images: |
    - name: grafana
      image: docker.io/bitnami/grafana:12.1.1-debian-12-r1
    - name: os-shell
      image: docker.io/bitnami/os-shell:12-debian-12-r50
  licenses: Apache-2.0
  tanzuCategory: application
apiVersion: v2
appVersion: 12.1.1
dependencies:
- name: common
  repository: oci://registry-1.docker.io/bitnamicharts
  tags:
  - bitnami-common
  version: 2.x.x
description: Grafana is an open source metric analytics and visualization suite for
  visualizing time series data that supports various types of data sources.
home: https://bitnami.com
icon: https://dyltqmyl993wv.cloudfront.net/assets/stacks/grafana/img/grafana-stack-220x234.png
keywords:
- analytics
- monitoring
- metrics
- logs
maintainers:
- name: Broadcom, Inc. All Rights Reserved.
  url: https://github.com/bitnami/charts
name: grafana
sources:
- https://github.com/bitnami/charts/tree/main/bitnami/grafana
version: 12.1.8

I filed https://github.com/helm/helm/issues/31205 to have helm pull print the pulled version, but that will require waiting for Helm v4, and I don't think it would be backported to v3.
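
Until helm pull reports the version itself, the resolved version could be read from the helm show chart output above. A minimal sketch in Python, assuming the YAML arrives on stdout as in the example; the function name chart_name_and_version is hypothetical, and the regex only handles plain top-level scalars rather than full YAML:

```python
import re


def chart_name_and_version(chart_yaml: str) -> tuple[str, str]:
    """Return (name, version) from the stdout of `helm show chart`.

    Only unindented `name:` and `version:` keys are matched, so the entries
    nested under `dependencies:` and `maintainers:` are ignored, as are
    `apiVersion:` and `appVersion:`.
    """
    fields = {}
    for key in ("name", "version"):
        match = re.search(
            rf"^{key}:[ \t]*(\S.*?)[ \t]*$", chart_yaml, flags=re.MULTILINE
        )
        if match is None:
            raise ValueError(f"missing top-level {key!r} in chart metadata")
        # Strip optional YAML quotes around the scalar value.
        fields[key] = match.group(1).strip("'\"")
    return fields["name"], fields["version"]
```

A repository rule could run the equivalent logic on the captured stdout and log the resolved version so that users can pin it.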

zachburg avatar Aug 27 '25 22:08 zachburg

What about maintaining two options? helm_pull to allow using helm and its resolver logic to download a chart, which wouldn't be cacheable, and helm_import_url to download a specific chart using the Bazel downloader.

zachburg avatar Aug 28 '25 18:08 zachburg

What about maintaining two options? helm_pull to allow using helm and its resolver logic to download a chart, which wouldn't be cacheable, and helm_import_url to download a specific chart using the Bazel downloader.

If there is something that's just not possible with just repository_ctx then I would be in favor of another rule that does that. However, I don't think there should be overlap in functionality. I would want to do something that discourages the use of a helm invocation if there's a repository_ctx way to do the same thing.

abrisco avatar Aug 30 '25 20:08 abrisco

A couple of benefits of using the real helm pull to pull charts:

  • It lets helm handle authentication instead of reimplementing it in Bazel rules (this is already done for OCI registries; I don't know how to handle it for HTTP repositories)
  • It can pull by version constraint instead of a specific version. This wouldn't be reproducible, but the rule could log the pulled version and warn the user to specify an exact version
  • It allows verifying the provenance file (I don't have a use case for this, but maybe some other users do)

But even using repository_ctx.download doesn't always mean the result will be reproducible or cacheable. If pulling a chart from a repository or via an OCI version tag, multiple HTTP calls are required before pulling the chart package.

For example, rules_oci does allow specifying a version tag (1.2.3) instead of a digest but warns on that case.
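
A warning along those lines could be driven by a small classifier. This is a rough sketch in Python, not anything rules_helm or rules_oci actually implements; pin_level and pin_warning are hypothetical names, and the exact-version pattern is a simplified semver check that treats everything else (empty versions, ranges such as 2.x.x) as floating:

```python
import re
from typing import Optional

# An OCI reference pinned by digest ends in "@sha256:<64 hex chars>".
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")
# Simplified exact semantic version: MAJOR.MINOR.PATCH plus optional suffixes.
_EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def pin_level(url: str, version: str = "") -> str:
    """Classify a chart reference as "digest", "version", or "floating"."""
    if _DIGEST.search(url):
        return "digest"
    if _EXACT_VERSION.match(version):
        return "version"
    return "floating"


def pin_warning(url: str, version: str = "") -> Optional[str]:
    """Return a warning message for references that are not digest-pinned."""
    level = pin_level(url, version)
    if level == "floating":
        return f"{url}: version {version!r} is a constraint; pin an exact version"
    if level == "version":
        return f"{url}: version tags are mutable; consider pinning a digest"
    return None
```

Only the digest case is treated as fully pinned here, since a version tag can be re-pushed with different content.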

zachburg avatar Sep 04 '25 17:09 zachburg

One more issue I thought of for OCI URLs: URLs that use a version digest (@sha123..) carry the SHA256 sum of the entire container, but we only care about the chart blob, which has its own digest that isn't known until the manifest has been pulled (either by tag or by digest).

Little demo of helm pull:

$ helm pull oci://registry-1.docker.io/bitnamicharts/grafana@sha256:93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d
Pulled: registry-1.docker.io/bitnamicharts/grafana@sha256:93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d
Digest: sha256:93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d
zachburg:/tmp/tmp.dV1XrHuMga
$ ls
grafana@sha256-93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d.tgz
zachburg:/tmp/tmp.dV1XrHuMga
$ sha256sum grafana@sha256-93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d.tgz 
e7227a3b9e77777fed9d781745dafdc384982f2326d859605101396611e928ed  grafana@sha256-93158fcd5ca9f61687bee4f4bd5e9867af4285860f794696e7771b403f46149d.tgz
zachburg:/tmp/tmp.dV1XrHuMga

The SHA256 sum of the chart is not related to the OCI URL.
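
To illustrate why the two sums differ, here is a small self-contained Python sketch. It builds a toy OCI manifest in memory instead of talking to a registry; the blob contents are placeholders, and the media types are the ones Helm uses for chart artifacts as far as I know:

```python
import hashlib
import json


def sha256_digest(data: bytes) -> str:
    """Format a SHA-256 sum the way OCI digests are written."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Placeholder bytes standing in for the chart package (.tgz) and its config.
chart_blob = b"placeholder chart archive bytes"
config_blob = b"{}"

# The manifest only *references* the chart blob by digest and size.
manifest = {
    "schemaVersion": 2,
    "config": {
        "mediaType": "application/vnd.cncf.helm.config.v1+json",
        "digest": sha256_digest(config_blob),
        "size": len(config_blob),
    },
    "layers": [
        {
            "mediaType": "application/vnd.cncf.helm.chart.content.v1.tar+gzip",
            "digest": sha256_digest(chart_blob),
            "size": len(chart_blob),
        }
    ],
}
manifest_bytes = json.dumps(manifest, sort_keys=True).encode()

# The digest after "@" in the OCI URL is computed over the manifest document.
manifest_digest = sha256_digest(manifest_bytes)
# The digest that sha256sum reports for the .tgz is the layer's digest.
chart_digest = manifest["layers"][0]["digest"]

print("manifest:", manifest_digest)
print("chart:   ", chart_digest)
```

The chart's checksum can only be read out of the manifest, so a rule that wants to hand an expected SHA-256 to the Bazel downloader has to fetch the manifest first.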

zachburg avatar Sep 04 '25 20:09 zachburg

@abrisco, thoughts on https://github.com/abrisco/rules_helm/pull/197#issuecomment-3254851548?

Looking at https://github.com/bazel-contrib/rules_jvm_external and https://github.com/bazel-contrib/rules_python, they use coursier and pip respectively, with optional support to write a file for downloading dependencies via the Bazel downloader. I think the possible equivalent here is to intercept the logs from helm pull/helm show and write a file of chart deps that could be downloaded via Bazel, if desired.
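
As a rough illustration of the log interception, this Python sketch parses the Pulled: and Digest: lines that helm pull printed in the demo above and renders them as a lock file. The lock file layout and the names parse_pull_log and render_lock are made up for the example; note that the recorded digest is the one from the OCI reference, so the chart package's own SHA-256 would still have to be added separately:

```python
import json
import re


def parse_pull_log(log: str) -> dict:
    """Extract the pulled reference and digest from `helm pull` OCI output."""
    pulled = re.search(r"^Pulled:[ \t]*(\S+)[ \t]*$", log, flags=re.MULTILINE)
    digest = re.search(
        r"^Digest:[ \t]*(sha256:[0-9a-f]{64})[ \t]*$", log, flags=re.MULTILINE
    )
    if pulled is None or digest is None:
        raise ValueError("no 'Pulled:'/'Digest:' lines found in helm output")
    return {"ref": pulled.group(1), "digest": digest.group(1)}


def render_lock(entries: list) -> str:
    """Render entries as a JSON lock file (hypothetical format), sorted by ref."""
    charts = sorted(entries, key=lambda entry: entry["ref"])
    return json.dumps({"charts": charts}, indent=2) + "\n"
```

Sorting the entries keeps the generated file stable across runs, which matters if it is checked in.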

zachburg avatar Oct 01 '25 22:10 zachburg

@abrisco, thoughts on #197 (comment)?

Looking at https://github.com/bazel-contrib/rules_jvm_external and https://github.com/bazel-contrib/rules_python, they use coursier and pip respectively, with optional support to write a file for downloading dependencies via the Bazel downloader. I think the possible equivalent here is to intercept the logs from helm pull/helm show and write a file of chart deps that could be downloaded via Bazel, if desired.

Hey! Sorry for the radio silence here. Things have been hectic 😓

I'm extremely familiar with rules_python and its use of pip. That exact implementation is what solidified my stance that repository rules should ideally use repository_ctx to download and not rely on any external tools or caching mechanisms. The middle ground I've found palatable is using an external tool to query the data needed for the Bazel downloader (which is similar to the original implementation).

If there were to be a helm pull backed repository rule, my current thought is that it should always say it's not reproducible and should recommend using the pure Bazel rule.

Though maybe this is something we could better discuss on Slack at this point. I don't mean to halt progress here.

abrisco avatar Oct 14 '25 02:10 abrisco

I don't have easy access to Slack, but you can email me at [email protected] instead.

zachburg avatar Oct 14 '25 17:10 zachburg