helm-diff

v3.12.2 breaks with "ensure CRDs are installed first" with `--disable-validation`

Open z0rc opened this issue 6 months ago • 4 comments

Helmfile version 1.1.2.

After upgrading helm-diff from v3.12.1 to v3.12.2, I started to see errors when running helmfile apply on an empty cluster. My helmfile.yaml.gotmpl has a number of releases, where one release installs CRDs (for example, using https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-operator-crds) and a second release deploys a CR. The second release needs the first release and also has disableValidationOnInstall: true.

My env vars:

% env | grep HELM
HELM_DIFF_COLOR=true
HELM_DIFF_NORMALIZE_MANIFESTS=true
HELM_DIFF_OUTPUT_CONTEXT=5
HELM_DIFF_THREE_WAY_MERGE=true
HELM_DIFF_USE_UPGRADE_DRY_RUN=true

I tried to change HELM_DIFF_USE_UPGRADE_DRY_RUN=false, but it didn't have an effect on the error. https://github.com/databus23/helm-diff/issues/795 might be related.

Error:

PATH:
  /opt/homebrew/bin/helm

ARGS:
  0: helm (4 bytes)
  1: --kube-context (14 bytes)
  2: poc-01 (6 bytes)
  3: diff (4 bytes)
  4: upgrade (7 bytes)
  5: --allow-unreleased (18 bytes)
  6: node-local-dns (14 bytes)
  7: deliveryhero/node-local-dns (27 bytes)
  8: --version (9 bytes)
  9: 2.1.8 (5 bytes)
  10: --disable-validation (20 bytes)
  11: --kube-context (14 bytes)
  12: poc-01 (6 bytes)
  13: --namespace (11 bytes)
  14: kube-system (11 bytes)
  15: --values (8 bytes)
  16: /var/folders/r0/96rf0nss61v_4twdyp_k4zf40000gn/T/helmfile248792395/kube-system-node-local-dns-values-699c4c4fb7 (111 bytes)
  17: --reset-values (14 bytes)
  18: --detailed-exitcode (19 bytes)
  19: --color (7 bytes)

ERROR:
  exit status 1

EXIT STATUS
  1

STDERR:
  Enabled three way merge via the envvar
  Enabled normalize manifests via the envvar
  Error: unable to generate manifests: unable to build kubernetes objects from new release manifest: resource mapping not found for name: "node-local-dns" namespace: "kube-system" from "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
  ensure CRDs are installed first
  Error: plugin "diff" exited with error

COMBINED OUTPUT:
  Enabled three way merge via the envvar
  Enabled normalize manifests via the envvar
  Error: unable to generate manifests: unable to build kubernetes objects from new release manifest: resource mapping not found for name: "node-local-dns" namespace: "kube-system" from "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
  ensure CRDs are installed first
  Error: plugin "diff" exited with error
err 11: command "/opt/homebrew/bin/helm" exited with non-zero status:

z0rc avatar Jun 23 '25 14:06 z0rc

Here is a sample helmfile.yaml.gotmpl for reproduction:

---
repositories:
- name: prometheus-community
  url: https://prometheus-community.github.io/helm-charts
- name: deliveryhero
  url: https://raw.githubusercontent.com/deliveryhero/helm-charts/refs/heads/master/

releases:
- name: prometheus-operator-crds
  namespace: monitoring
  chart: prometheus-community/prometheus-operator-crds
  version: 21.0.0
  suppressDiff: true
- name: node-local-dns
  namespace: kube-system  # it must be in kube-system as it depends on kube-dns service in that namespace
  chart: deliveryhero/node-local-dns
  version: 2.1.8
  disableValidationOnInstall: true
  needs:
  - monitoring/prometheus-operator-crds
  values:
  - config:
      dnsServer: 8.8.8.8  # put the real IP of the kube-dns service here
      bindIp: true
      commProtocol: prefer_udp
      prefetch:
        enabled: true
    serviceMonitor:
      enabled: true

With helm-diff v3.12.1, helmfile apply is successful:

% helmfile apply
Adding repo prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories

Adding repo deliveryhero https://raw.githubusercontent.com/deliveryhero/helm-charts/refs/heads/master/
"deliveryhero" has been added to your repositories

Listing releases matching ^node-local-dns$
Comparing release=prometheus-operator-crds, chart=prometheus-community/prometheus-operator-crds, namespace=monitoring

Comparing release=node-local-dns, chart=deliveryhero/node-local-dns, namespace=kube-system
Upgrading release=prometheus-operator-crds, chart=prometheus-community/prometheus-operator-crds, namespace=monitoring
Release "prometheus-operator-crds" does not exist. Installing it now.
NAME: prometheus-operator-crds
LAST DEPLOYED: Tue Jun 24 12:52:25 2025
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None

Listing releases matching ^prometheus-operator-crds$
prometheus-operator-crds        monitoring      1               2025-06-24 12:52:25.345657 +0300 EEST   deployed        prometheus-operator-crds-21.0.0 v0.83.0

Upgrading release=node-local-dns, chart=deliveryhero/node-local-dns, namespace=kube-system
Release "node-local-dns" does not exist. Installing it now.
NAME: node-local-dns
LAST DEPLOYED: Tue Jun 24 12:52:37 2025
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None

Listing releases matching ^node-local-dns$
node-local-dns  kube-system     1               2025-06-24 12:52:37.5851 +0300 EEST     deployed        node-local-dns-2.1.8    1.26.4


UPDATED RELEASES:
NAME                       NAMESPACE     CHART                                           VERSION   DURATION
prometheus-operator-crds   monitoring    prometheus-community/prometheus-operator-crds   21.0.0         12s
node-local-dns             kube-system   deliveryhero/node-local-dns                     2.1.8           8s

With helm-diff v3.12.2, helmfile apply fails:

% helmfile apply
Adding repo prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories

Adding repo deliveryhero https://raw.githubusercontent.com/deliveryhero/helm-charts/refs/heads/master/
"deliveryhero" has been added to your repositories

Listing releases matching ^node-local-dns$
Comparing release=prometheus-operator-crds, chart=prometheus-community/prometheus-operator-crds, namespace=monitoring

Comparing release=node-local-dns, chart=deliveryhero/node-local-dns, namespace=kube-system
in ./helmfile.yaml.gotmpl: command "/opt/homebrew/bin/helm" exited with non-zero status:

PATH:
  /opt/homebrew/bin/helm

ARGS:
  0: helm (4 bytes)
  1: diff (4 bytes)
  2: upgrade (7 bytes)
  3: --allow-unreleased (18 bytes)
  4: node-local-dns (14 bytes)
  5: deliveryhero/node-local-dns (27 bytes)
  6: --version (9 bytes)
  7: 2.1.8 (5 bytes)
  8: --disable-validation (20 bytes)
  9: --namespace (11 bytes)
  10: kube-system (11 bytes)
  11: --values (8 bytes)
  12: /var/folders/r0/96rf0nss61v_4twdyp_k4zf40000gn/T/helmfile1347727151/kube-system-node-local-dns-values-666c444b6b (112 bytes)
  13: --reset-values (14 bytes)
  14: --detailed-exitcode (19 bytes)
  15: --color (7 bytes)

ERROR:
  exit status 1

EXIT STATUS
  1

STDERR:
  Enabled three way merge via the envvar
  Enabled normalize manifests via the envvar
  Error: Failed to render chart: exit status 1: Error: unable to build kubernetes objects from release manifest: resource mapping not found for name: "node-local-dns" namespace: "kube-system" from "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
  ensure CRDs are installed first
  Error: plugin "diff" exited with error

COMBINED OUTPUT:
  Enabled three way merge via the envvar
  Enabled normalize manifests via the envvar
  Error: Failed to render chart: exit status 1: Error: unable to build kubernetes objects from release manifest: resource mapping not found for name: "node-local-dns" namespace: "kube-system" from "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"
  ensure CRDs are installed first
  Error: plugin "diff" exited with error

z0rc avatar Jun 24 '25 09:06 z0rc

@z0rc got it.

yxxhero avatar Jun 24 '25 22:06 yxxhero

@z0rc could you post more logs by adding --debug flag?

yxxhero avatar Jun 24 '25 23:06 yxxhero

@yxxhero I'm attaching a file with the output of helmfile --debug apply for helm-diff v3.12.2 (the one with the bug). Keep in mind it's 70K lines, because the debug output contains the full prometheus-operator CRD definitions.

helmfile_debug.txt

z0rc avatar Jun 25 '25 15:06 z0rc

@z0rc could you post the v3.12.1 logs?

yxxhero avatar Jul 05 '25 09:07 yxxhero

@yxxhero here are the logs of helmfile --debug apply with helm-diff v3.12.1.

helmfile_debug_3.12.1.txt

z0rc avatar Jul 07 '25 14:07 z0rc

@z0rc I know the issue. There is a fun thing: the exit code of a Go panic is 2, but exit code 2 is also a special exit code for helm-diff and helmfile. I will try to find a better way to fix the issue.
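
To illustrate the ambiguity described above, here is a minimal sketch. It assumes the usual --detailed-exitcode convention (0 = no changes, 1 = error, 2 = changes detected); the helper name describe_diff_exit is hypothetical and is not part of helm-diff or helmfile.

```shell
# Hypothetical helper: label an exit status as a caller of
# `helm diff upgrade --detailed-exitcode` might interpret it.
# Assumed convention: 0 = no changes, 1 = error, 2 = changes detected.
describe_diff_exit() {
  case "$1" in
    0) echo "no changes" ;;
    1) echo "error" ;;
    # An unrecovered Go panic also exits with status 2, so the caller
    # cannot tell a crash apart from "changes detected" by status alone.
    2) echo "changes detected, or a Go panic (ambiguous)" ;;
    *) echo "unexpected exit status $1" ;;
  esac
}

# A subshell exiting with status 2 stands in for either situation.
status=0
(exit 2) || status=$?
describe_diff_exit "$status"
```

A caller that needs to distinguish the two cases would have to look at stderr as well as the exit status.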

yxxhero avatar Jul 08 '25 01:07 yxxhero

@z0rc dry-run mode can be set to client or none.

yxxhero avatar Jul 08 '25 10:07 yxxhero

@yxxhero please elaborate. I have the env var HELM_DIFF_USE_UPGRADE_DRY_RUN=true set, which is the same as "client": https://github.com/databus23/helm-diff/blob/a1945d55b847e404a403f182b20728902655b2c6/cmd/upgrade.go#L73

Setting it to none or false is not what I want.

z0rc avatar Jul 08 '25 10:07 z0rc

@z0rc I saw --dry-run=server in your logs.

yxxhero avatar Jul 08 '25 12:07 yxxhero

@yxxhero I don't explicitly set this argument, so it's probably helmfile behaviour. As I said previously, here are my current helm-diff related env vars:

% env | grep HELM_DIFF
HELM_DIFF_COLOR=true
HELM_DIFF_NORMALIZE_MANIFESTS=true
HELM_DIFF_OUTPUT_CONTEXT=5
HELM_DIFF_THREE_WAY_MERGE=true
HELM_DIFF_USE_UPGRADE_DRY_RUN=true

z0rc avatar Jul 08 '25 13:07 z0rc

@z0rc so could you post the logs when HELM_DIFF_USE_UPGRADE_DRY_RUN=false on helm-diff 3.12.2?

yxxhero avatar Jul 08 '25 23:07 yxxhero

Here are the logs when HELM_DIFF_USE_UPGRADE_DRY_RUN is unset and when HELM_DIFF_USE_UPGRADE_DRY_RUN=false:

% env | grep HELM
HELM_DIFF_NORMALIZE_MANIFESTS=true
HELM_DIFF_THREE_WAY_MERGE=true
HELM_DIFF_COLOR=true
HELM_DIFF_OUTPUT_CONTEXT=5
HELM_DIFF_USE_UPGRADE_DRY_RUN=true
% unset HELM_DIFF_USE_UPGRADE_DRY_RUN
% env | grep HELM
HELM_DIFF_NORMALIZE_MANIFESTS=true
HELM_DIFF_THREE_WAY_MERGE=true
HELM_DIFF_COLOR=true
HELM_DIFF_OUTPUT_CONTEXT=5
% helmfile --debug apply &>helmfile_debug_3.12.2_dry_run_unset.txt
% export HELM_DIFF_USE_UPGRADE_DRY_RUN=false
% env | grep HELM
HELM_DIFF_NORMALIZE_MANIFESTS=true
HELM_DIFF_THREE_WAY_MERGE=true
HELM_DIFF_COLOR=true
HELM_DIFF_OUTPUT_CONTEXT=5
HELM_DIFF_USE_UPGRADE_DRY_RUN=false
% helmfile --debug apply &>helmfile_debug_3.12.2_dry_run_false.txt

helmfile_debug_3.12.2_dry_run_unset.txt

helmfile_debug_3.12.2_dry_run_false.txt

As far as I can tell this didn't have any effect.

z0rc avatar Jul 09 '25 09:07 z0rc

HELM_DIFF_THREE_WAY_MERGE=false

yxxhero avatar Jul 09 '25 10:07 yxxhero

@z0rc

yxxhero avatar Jul 09 '25 10:07 yxxhero

@yxxhero it is provided in my post already.

% env | grep HELM
HELM_DIFF_NORMALIZE_MANIFESTS=true
HELM_DIFF_THREE_WAY_MERGE=true
HELM_DIFF_COLOR=true
HELM_DIFF_OUTPUT_CONTEXT=5
HELM_DIFF_USE_UPGRADE_DRY_RUN=false
% helmfile --debug apply &>helmfile_debug_3.12.2_dry_run_false.txt

z0rc avatar Jul 09 '25 10:07 z0rc

@z0rc I meant that you should set HELM_DIFF_THREE_WAY_MERGE=false

yxxhero avatar Jul 12 '25 01:07 yxxhero

@yxxhero here is the result of HELM_DIFF_THREE_WAY_MERGE=false helmfile apply --debug &> helmfile_debug_3.12.2_three_way_merge_false.txt:

helmfile_debug_3.12.2_three_way_merge_false.txt

It didn't have any effect on the issue.

z0rc avatar Jul 14 '25 12:07 z0rc

Unsetting both the HELM_DIFF_THREE_WAY_MERGE and HELM_DIFF_USE_UPGRADE_DRY_RUN env vars seems to address the issue for the reproduction case. I'll experiment with unsetting these options for production workloads and will report back later.

helmfile_debug_3.12.2_unset.txt

z0rc avatar Jul 14 '25 12:07 z0rc

So far unsetting both HELM_DIFF_THREE_WAY_MERGE and HELM_DIFF_USE_UPGRADE_DRY_RUN works for my helmfile use cases.
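
The workaround can be sketched as a small shell helper. This is only an illustration: the function name unset_helm_diff_workaround_vars is hypothetical, and the two variable names are the ones listed in this thread.

```shell
# Hypothetical helper: clear the two helm-diff env vars named in this
# thread from the current shell session. Other HELM_DIFF_* settings
# (color, output context, normalize manifests) are left untouched.
unset_helm_diff_workaround_vars() {
  unset HELM_DIFF_THREE_WAY_MERGE HELM_DIFF_USE_UPGRADE_DRY_RUN
}

unset_helm_diff_workaround_vars

# Show which HELM_DIFF_* variables are still set, if any.
env | grep '^HELM_DIFF_' || true

# Then run helmfile as usual, for example:
#   helmfile apply
```

If the variables are exported from a shell profile, they need to be removed there too, otherwise new sessions will set them again.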

I might revisit this later, as I have deployments that aren't updated very often. But for now I consider it resolved, so I'm closing the issue. Thanks for your assistance on this matter.

z0rc avatar Jul 15 '25 14:07 z0rc