Breaking changes in Flux due to Kustomize v4
Starting with version 0.15.0, Flux and its controllers have been upgraded to Kustomize v4. While Kustomize v4 comes with many improvements and bug fixes, it introduces a couple of breaking changes.
Remote archives
Due to the removal of hashicorp/go-getter from Kustomize v4, the set of URLs accepted by Kustomize in the resources field is reduced to file system paths, URLs to plain YAMLs, and values compatible with git clone.
This means you can no longer use resources from archives (zip, tgz, etc.).
No longer works:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://github.com/rook/rook/archive/refs/heads/master.zip//rook-master/cluster/examples/kubernetes/ceph/crds.yaml
Works:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- https://raw.githubusercontent.com/rook/rook/v1.6.0/cluster/examples/kubernetes/ceph/crds.yaml
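If you were pulling an archive to get at a whole directory of manifests, one Flux-native alternative is to fetch the repository with a GitRepository source and point a Flux Kustomization at the path (a sketch only; the names, interval, and tag below are placeholders):
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: rook
  namespace: flux-system
spec:
  interval: 10m
  url: https://github.com/rook/rook
  ref:
    tag: v1.6.0
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: rook-crds
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: rook
  path: ./cluster/examples/kubernetes/ceph
  prune: true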
Non-string YAML keys
Due to a bug in Kustomize v4, if you have non-string keys in your manifests, the controller will fail to build the final manifest.
The non-string keys bug affects Helm releases like the nginx-ingress one, for example:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: nginx-ingress
spec:
  values:
    tcp:
      2222: "app/server:2222"
The above will fail with map[interface {}]interface {}{2222:"app/server:2222"}: json: unsupported type: map[interface {}]interface {}.
To fix this issue, you have to make the YAML keys into strings, e.g.:
values:
  tcp:
    "2222": "app/server:2222"
Duplicate YAML keys
The Kustomize YAML parser (kyaml) does not accept duplicate keys: while Helm drops the duplicates, Kustomize errors out. This impacts helm-controller, as it uses kustomize/kyaml to label objects reconciled by a HelmRelease.
For example, a chart that adds the app.kubernetes.io/name label more than once will result in a HelmRelease install failure:
map[string]interface {}(nil): yaml: unmarshal errors:
line 21: mapping key "app.kubernetes.io/name" already defined at line 20
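To illustrate, a minimal manifest fragment that triggers this error might look like the following (a made-up example, not from any specific chart):
metadata:
  labels:
    app.kubernetes.io/name: my-app
    app.kubernetes.io/name: my-app # duplicate key, rejected by kyaml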
YAML formatting
Due to a bug in Kustomize v4 that makes the image-automation-controller crash when YAMLs contain non-ASCII characters, we had to update the underlying go-yaml package to fix the panics.
The gopkg.in/yaml.v3 update means that the indentation style changed:
From:
spec:
  containers:
  - name: one
    image: image1:v1.0.0 # {"$imagepolicy": "automation-ns:policy1"}
  - name: two
    image: image2:v1.0.0 # {"$imagepolicy": "automation-ns:policy2"}
To:
spec:
  containers:
    - name: one
      image: image1:v1.0.0 # {"$imagepolicy": "automation-ns:policy1"}
    - name: two
      image: image2:v1.0.0 # {"$imagepolicy": "automation-ns:policy2"}
Due to the removal of hashicorp/go-getter from Kustomize v4, the set of URLs accepted by Kustomize in the resources field is reduced to only file system paths or values compatible with git clone. This means you can no longer use resources from archives (zip, tgz, etc.).
Does this mean standard URLs do not work anymore? e.g.
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
commonLabels:
  grafana_dashboard: "1"
resources:
- https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/v0.8.0/manifests/grafana-dashboardDefinitions.yaml
@onedr0p I've updated the issue with examples, let me know if it answers your question.
Due to the removal of hashicorp/go-getter from Kustomize v4...
🤦‍♂️
I am pretty shocked at how easily the kustomize crowd breaks established standards, given how dogmatic they are about their templating philosophy. On that note, maybe flux should not include such massive breaking changes in minor releases.
From an operational perspective, this is a nightmare. I guess we will stay on flux 0.14.0 for some time until this has settled.
@stefanprodan Thank you for pushing back on this and clearly documenting the impacts. 👍
maybe flux should not include such massive breaking changes in minor releases.
You may not be aware, but flux2 has no GA release yet, so we can't bump the major version before going GA, aka 2.0.0. Every minor release of flux2 could come with breaking changes; we try to communicate those ahead of time. In the case of Kustomize v4, I documented the whole thing months ago here: https://github.com/fluxcd/flux2/issues/918
I've faced the "Non-string YAML keys" problem, but in the context of Helm itself; more specifically, a Helm template has an integer key. As far as I understand, this is because of post-rendering kustomization, so basically helm-controller renders the Helm templates and then runs Kustomize on the output, am I correct?
The helm-controller does run a default Kustomize plugin to be able to trace resources that originate from a HelmRelease by adding labels.
The impact of this may however have been underestimated with the recent changes in Kustomize v4, and we may want to provide some sort of configuration flag to disable this default behavior for charts it does not cope with.
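Conceptually, that post-render labeling behaves like the Kustomization below (the helm.toolkit.fluxcd.io label keys are the ones helm-controller uses for tracing, to the best of my knowledge; the release name and namespace are placeholders):
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
commonLabels:
  helm.toolkit.fluxcd.io/name: my-release        # HelmRelease name
  helm.toolkit.fluxcd.io/namespace: my-namespace # HelmRelease namespace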
I stumbled upon the "Duplicate YAML keys" problem in one of my releases. Fixing it is rather easy.
I'm a little concerned about how to avoid this kind of failure in the future. What is the test that needs to be added to CI so it would break before merging to master/develop?
I found a very awkward way to do it, but I wonder if someone found something more sustainable...
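One possible shape for such a check (a sketch only; the chart path and release name below are made up) is to render the chart locally and feed the result through kustomize, whose kyaml parser rejects duplicate keys just as helm-controller does:
# render the chart the way helm-controller would, then parse it with kyaml
mkdir -p /tmp/dup-check && cd /tmp/dup-check
helm template my-release ./charts/my-chart > rendered.yaml
cat > kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - rendered.yaml
EOF
kustomize build . > /dev/null # exits non-zero on duplicate keys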
@or-shachar I also have the same issue. How did you work around this?
I need to update the serviceMonitor key in the values for the HelmRelease. Originally I did it this way:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: promtail
  namespace: log
spec:
  interval: 1h
  chart:
    spec:
      chart: promtail
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
  values:
    serviceMonitor:
      enabled: true
      labels:
        release: mon
But now I get the "Duplicate YAML keys" error...
I have two initContainers in my deployment and I cannot proceed (I could merge the commands, but still, this is one of the first apps I'm porting to Flux 2, and I don't want to guess at the issues I'll find in the other ones). Is this bug related?
{"level":"error","ts":"2021-09-09T22:35:34.794Z","logger":"controller.helmrelease","msg":"Reconciler error","reconciler group":"helm.toolkit.fluxcd.io","reconciler kind":"HelmRelease","name":"sendy","namespace":"sendy","error":"Helm upgrade failed: error while running post render on
│ files: map[string]interface {}(nil): yaml: unmarshal errors:\n line 93: mapping key \"name\" already defined at line 91\n line 107: mapping key \"name\" already defined at line 105"}
initContainers:
  - name: create-csvs
    image: "{{ .Values.image }}"
    command:
      - mkdir
      - -p
      - /var/www/html/sendy/uploads/csvs
    volumeMounts:
      - name: data
        mountPath: /var/www/html/sendy/uploads
        name: sendy-data
  - name: take-data-dir-ownership
    image: "{{ .Values.image }}"
    command:
      - chown
      - -R
      - www-data:www-data
      - /var/www/html/sendy/uploads
    volumeMounts:
      - name: data
        mountPath: /var/www/html/sendy/uploads
        name: sendy-data
@masterkain:
is this bug related?
No. That is an issue with your YAML: you are defining the volumeMounts name twice in both initContainers.
volumeMounts:
  - name: data #### Here...
    mountPath: /var/www/html/sendy/uploads
    name: sendy-data #### ... and here.
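For reference, a corrected block keeps a single name per mount; assuming data is the intended volume name, the stray sendy-data line is simply dropped:
volumeMounts:
  - name: data
    mountPath: /var/www/html/sendy/uploads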
thanks @endrec, that definitely slipped under my tired eyes 👍
Looks like this is fixed upstream now: kubernetes-sigs/kustomize#3675
Can't wait for the update to this newer version of kustomize in Flux; anchor support is amazing.
And there's already a new kustomize release which includes the fix: https://github.com/kubernetes-sigs/kustomize/releases/tag/kustomize%2Fv4.4.0
@stefanprodan Here's the comment as per your request
I noticed this by accident, and I've been lucky so far in that it hasn't caused issues yet, but I think it's just a matter of time.
kustomize wraps lines that are longer than 80 chars in the resulting YAML manifests, meaning that from the 81st character the line continues on a new line.
This still happens with the latest version of kustomize (4.4.0).
There is a PR open on the kustomize repo to fix this, but it's missing something before it's ready to be merged: https://github.com/kubernetes-sigs/kustomize/pull/4222
@stefanprodan if I use this version of kustomize:
➜ kustomize version
{Version:kustomize/v4.4.0 GitCommit:63ec6bdb3d737a7c66901828c5743656c49b60e1 BuildDate:2021-09-27T16:13:36Z GoOs:darwin GoArch:amd64}
I do not get the dupe key error. But with current flux v0.20.1, I am seeing the dupe key error on an ingress-nginx spec.
➜ flux get kustomization ingress-nginx
NAME READY MESSAGE REVISION SUSPENDED
ingress-nginx False Deployment/ingress-nginx/ingress-nginx-controller dry-run failed, error: failed to create manager for existing fields: failed to convert new object (apps/v1, Kind=Deployment) to smd typed: .spec.template.spec.containers[name="controller"].ports: duplicate entries for key [containerPort=80,protocol="TCP"] main/5f7e56ed5328481798d7feff415036e220d32178 False
However, the section of the spec it is complaining about does not have a dupe entry:
ports:
  - name: http
    containerPort: 80
    protocol: TCP
  - name: https
    containerPort: 443
    protocol: TCP
  - name: tohttps
    containerPort: 2443
    protocol: TCP
  - name: webhook
    containerPort: 8443
    protocol: TCP
It used to. Before it was specified like this:
ports:
  - name: http
    containerPort: 80
    protocol: TCP
  - name: https
    containerPort: 80
    protocol: TCP
  - name: tohttps
    containerPort: 2443
    protocol: TCP
  - name: webhook
    containerPort: 8443
    protocol: TCP
...but I pushed a commit to change the port for https from 80 to 443, then did flux get kustomization ingress-nginx --with-source, but it is still complaining about the same dupe key entry. Could it be cached?
We're getting umpteen million alerts in a slack chan on this, so I'm trying to make it go away.
Hi @stefanprodan, we are affected by the duplicate keys issue. We are actually using Helm charts from another company we work with. Would there maybe be a way to ignore those errors?
Would there maybe be a way to ignore those errors?
No, there is no workaround. I think Helm itself will soon error out the same as Flux, once they update their YAML processor package.
I stumbled upon the "Duplicate YAML keys" problem in one of my releases. Fixing it is rather easy.
I'm a little concerned about how to avoid this kind of failure in the future. What is the test that needs to be added to CI so it would break before merging to master/develop?
I found a very awkward way to do it, but I wonder if someone found something more sustainable...
and what is your way?
I really need a workaround for charts that have lots of duplicate keys in them, like GitLab.
@or-shachar can you share your method please? @marianobilli have you found a workaround?
I guess one option might be to do a Kustomize PostRenderer to patch all of the affected YAMLs, but that would take AGES.
EDIT: Tried fixing with a Kustomize PostRenderer and it doesn't seem to work; it errors out before even attempting the patch.
I really need a workaround for charts that have lots of duplicate keys in them, like GitLab.
@or-shachar can you share your method please? @marianobilli have you found a workaround?
I guess one option might be to do a Kustomize PostRenderer to patch all of the affected YAMLs, but that would take AGES.
EDIT: Tried fixing with a Kustomize PostRenderer and it doesn't seem to work; it errors out before even attempting the patch.
I had to fix the helm template with the duplicate keys.
I just upgraded to v0.29.0 and noticed that a kustomization like this is no longer supported. Is this safe to assume now?
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- github.com/rancher/system-upgrade-controller?ref=v0.9.1
I looked into it a bit more and discovered this PR https://github.com/kubernetes-sigs/kustomize/pull/4453, which makes it seem like the below would work:
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- git::https://github.com/rancher/system-upgrade-controller?ref=v0.9.1
But I am getting an error:
kustomize build failed: accumulating resources: accumulation err='accumulating resources from 'system-upgrade': read /tmp/apps2651878754/cluster/apps/system-upgrade: is a directory': recursed accumulation of path '/tmp/apps2651878754/cluster/apps/system-upgrade': accumulating resources: accumulation err='accumulating resources from 'system-upgrade-controller': read /tmp/apps2651878754/cluster/apps/system-upgrade/system-upgrade-controller: is a directory': recursed accumulation of path '/tmp/apps2651878754/cluster/apps/system-upgrade/system-upgrade-controller': accumulating resources: accumulation err='accumulating resources from 'git::https://github.com/rancher/system-upgrade-controller?ref=v0.9.1': open /tmp/apps2651878754/cluster/apps/system-upgrade/system-upgrade-controller/git::https:/github.com/rancher/system-upgrade-controller?ref=v0.9.1: no such file or directory': fs-security-constraint abs /tmp/kustomize-356128291: path '/tmp/kustomize-356128291' is not in or below '/tmp/apps2651878754'
@onedr0p this is a newly introduced security constraint set too tight. I will get this sorted now, and ensure a regression test is added.
Have a confirmed fix, but need to do the required writing of more extensive tests. Will aim to have it available before EOD UTC.
A computer is currently doing the required work to produce a ghcr.io/fluxcd/kustomize-controller:v0.24.1 image. Once available, I will patch the Flux CLI as soon as CI allows me to. If impatient, manually patching the kustomize-controller version in your Git repository would be a workaround.
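For reference, such a manual patch could use the Kustomize images transformer in the flux-system kustomization.yaml (a sketch, assuming the standard flux bootstrap layout):
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
images:
  - name: ghcr.io/fluxcd/kustomize-controller
    newTag: v0.24.1 # pin the patched controller until the CLI release lands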
Users running into issues after updating to v0.29.0 should see smooth operation again with v0.29.1. Sorry about any inconvenience it may have caused; the Terraform provider release will follow shortly.
Hi, it's day 1 for me as a new user, and I'm wondering if this report belongs here. I am also seeing a duplicate key error.
❯ flux get kustomizations --watch
NAME REVISION SUSPENDED READY MESSAGE
flux-system main/bb9797709c06bde527638850c0d29e91475bb057 False False Node/k0 dry-run failed, error: failed to create manager for existing fields: failed to convert new object (/v1, Kind=Node) to smd typed: .status.addresses: duplicate entries for key [type="InternalIP"]
I do indeed have multiple InternalIP addresses.
❯ k get node k0 -o json | jq '.status.addresses'
[
  {
    "address": "172.16.15.20",
    "type": "InternalIP"
  },
  {
    "address": "fc15::20",
    "type": "InternalIP"
  },
  {
    "address": "k0",
    "type": "Hostname"
  }
]
I'm getting a similar error with flux version 0.30.2. It used to work with 0.28.0.
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- gotk-components.yaml
- gotk-sync.yaml
- ../../base
patchesStrategicMerge:
- ../../gotk-patches.yaml
✗ accumulating resources: accumulation err='accumulating resources from '../../base': fs-security-constraint read /tmp/flux-bootstrap-3660114182/clusters/base: path '/tmp/flux-bootstrap-3660114182/clusters/base' is not in or below '/tmp/flux-bootstrap-3660114182/clusters/staging'': fs-security-constraint abs /tmp/flux-bootstrap-3660114182/clusters/base: path '/tmp/flux-bootstrap-3660114182/clusters/base' is not in or below '/tmp/flux-bootstrap-3660114182/clusters/staging'
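For context, the error paths suggest a layout roughly like the one below (reconstructed from the error messages, so the exact names are a guess), where ../../base resolves to clusters/base, outside the clusters/staging constraint root:
clusters/
  base/
    kustomization.yaml
  gotk-patches.yaml
  staging/
    flux-system/
      gotk-components.yaml
      gotk-sync.yaml
      kustomization.yaml # resources include ../../base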
@tomaszduda23 can you confirm the directory structure in https://github.com/fluxcd/kustomize-controller/pull/657 matches yours? As the tests for this appear to pass.