kata-deploy: Add Helm Chart
For easier handling of kata-deploy, we can leverage a Helm chart to get rid of all the bases and overlays for the various components.

By default, the appVersion is set to VERSION; one can override it by simply saying:

```sh
helm install --set image.tag=latest kata-deploy-0.1.0.tgz
```

A default

```sh
helm install ./kata-deploy-0.1.0.tgz
```

would give one the latest release.

The Helm chart can be easily hosted via github.io on kata-containers. With this we can also automate chart publishing for each release: https://github.com/helm/chart-releaser
@beraldoleal Take a look at the second commit to see how we can use Helm to render the correct YAMLs without changing the YAMLs in place, and hence without making the repository dirty.
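As a rough sketch of that rendering flow (the chart path and value names here are assumptions based on this PR's layout, not confirmed flags of the final chart), `helm template` produces the final manifests entirely out-of-tree:

```shell
#!/bin/sh
# Hypothetical sketch: render the kata-deploy manifests with `helm template`,
# leaving the checked-in YAMLs untouched and the repository clean.
# Chart path and value names are assumptions for illustration.
render_manifests() {
    helm template kata-deploy \
        tools/packaging/kata-deploy/helm-chart/kata-deploy \
        --set k8sDistribution=k3s \
        --set image.tag=latest
}
# Usage: render_manifests > kata-deploy-rendered.yaml
```

The rendered output can be applied or inspected directly, so nothing in the repository is modified.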
FYI @fidencio @ryansavino
kata-deploy.yaml and kata-cleanup.yaml are the same manifests and scripts, just with different arguments; the next commit will clean this up.
cleanup_kata_deploy is now really simple: just uninstall the Helm release if one is found. All the needed data is encapsulated in the deployed release.
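A minimal sketch of what that cleanup boils down to (the release name `kata-deploy` is an assumption for illustration, not necessarily the exact name used by the script):

```shell
#!/bin/sh
# Hypothetical sketch of cleanup_kata_deploy: if a Helm release named
# "kata-deploy" exists, uninstall it; otherwise there is nothing to do.
# Helm itself tears down everything the release created.
cleanup_kata_deploy() {
    if helm status kata-deploy >/dev/null 2>&1; then
        helm uninstall kata-deploy
    else
        echo "no kata-deploy release found, nothing to clean up"
    fi
}
```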
With this we should be able to remove *all* base/overlay kustomize manifests.
TODO follow up PRs:
- [ ] Add Helm chart publishing to release.yaml
- [ ] Add Helm chart for each PR build
- [ ] Latest PR build will also upload a latest Helm chart.
- [ ] We need a "place" where to publish our Helm charts. It would be nice to have one Helm chart per PR.
I've built a simple GHA job to automatically upload a Helm chart for each PR to a specific repository. In my case it's zvonkok/helm-charts, hosted via github.io. Replace zvonkok/helm-charts with kata-containers/helm-charts moving forward; I'm just using zvonkok/helm-charts for illustration purposes.
```sh
helm repo add kata-containers https://zvonkok.github.io/helm-charts
helm repo update
helm search repo --devel -o json
```

```json
[{"name":"kata-containers/kata-deploy","version":"3.6.0-dev+24-4d45fd9818726d7d3a37cfd4ad1281bea29b67c2","app_version":"3.6.0-dev+24-4d45fd9818726d7d3a37cfd4ad1281bea29b67c2","description":"A Helm chart for deploying Kata Containers"}]
```
The input.tag (pr-githash) is defined by the GHA workflow, and I am using VERSION plus the tag to create a new SemVer version for Helm to consume.
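A sketch of that version construction (the concrete VERSION and tag values are illustrative, taken from the search output above):

```shell
#!/bin/sh
# Build a SemVer-compliant chart version from the VERSION file plus the
# per-PR tag (PR number + git hash). SemVer treats everything after "+"
# as build metadata, so the PR tag rides along without breaking ordering.
VERSION="3.6.0-dev"
TAG="24-4d45fd9818726d7d3a37cfd4ad1281bea29b67c2"
CHART_VERSION="${VERSION}+${TAG}"
echo "${CHART_VERSION}"
# → 3.6.0-dev+24-4d45fd9818726d7d3a37cfd4ad1281bea29b67c2
# helm package would then be invoked roughly as:
#   helm package kata-deploy --version "${CHART_VERSION}" --app-version "${CHART_VERSION}"
```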
Additionally, values.yaml is updated so the chart points to the correct payload:
```yaml
imagePullPolicy: Always
imagePullSecrets: []
image:
  reference: ghcr.io/zvonkok/kata-deploy-ci/kata-deploy
  tag: 24-b9d7f9333087e5ee789af8f37dd26df3b3e308e0
# k8s-dist can be k8s, k3s, rke2, k0s
k8sDistribution: "k8s"
env:
  debug: "false"
  shims: "clh cloud-hypervisor dragonball fc qemu qemu-coco-dev qemu-runtime-rs qemu-sev qemu-snp qemu-tdx stratovirt qemu-nvidia-gpu qemu-nvidia-gpu-snp qemu-nvidia-gpu-tdx"
  defaultShim: "qemu"
  createRuntimeClasses: "false"
  createDefaultRuntimeClass: "false"
  allowedHypervisorAnnotations: ""
  snapshotterHandlerMapping: ""
  agentHttpProxy: ""
  agentNoProxy: ""
  pullTypeMapping: ""
  hostOS: ""
```
For each PR we would have a Helm chart kata-deploy-VERSION-dev+{{ input.tag }}.tar.gz, and for each release we would have a kata-deploy-VERSION.tar.gz with an updated values.yaml, also pushed as an artifact in the release payload.
Users can then do a helm repo update; without the --devel flag they will not see any kata-deploy-VERSION-dev+{{ input.tag }}.tar.gz charts, only the release charts.
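This works because Helm treats the `-dev+...` suffix as a SemVer pre-release, which `helm search` and `helm install` skip unless `--devel` or an explicit `--version` is given. A hypothetical consumer flow for picking up a specific PR build (the version string is illustrative):

```shell
#!/bin/sh
# Hypothetical sketch: install a specific per-PR dev chart from the repo
# added earlier. Pre-release versions require --devel or an explicit
# --version to be selectable; release charts need neither.
install_dev_chart() {
    helm install kata-deploy kata-containers/kata-deploy \
        --devel \
        --version "3.6.0-dev+24-4d45fd9818726d7d3a37cfd4ad1281bea29b67c2"
}
```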
I was looking through the failed test runs for the AMD node jobs. Let me know if you want to spend some time troubleshooting the failures together. It looks like it wasn't able to pull the kata-deploy image.
@ryansavino Found the error. Thanks for the offer :)
@mkulke Good point, adding this to the list of follow up items: https://github.com/kata-containers/kata-containers/issues/9924
Please rebase this PR onto main when you want to re-trigger the whole set of checks (by pushing something, etc.), as #9923 resolves the issue for the zvsi tests. Thanks.
@zvonkok, it's taking me some time to get to this, but I'd like to ensure this works with TDX first. I will be force-pushing to your branch.
I forced-pushed here to rebase.
Found the issue: agent.https_proxy is not being properly set!
`git diff`:

```diff
diff --git a/tools/packaging/kata-deploy/helm-chart/kata-deploy/templates/kata-deploy.yaml b/tools/packaging/kata-deploy/helm-chart/kata-deploy/templates/kata-deploy.yaml
index 714e172e44..0d3565da38 100644
--- a/tools/packaging/kata-deploy/helm-chart/kata-deploy/templates/kata-deploy.yaml
+++ b/tools/packaging/kata-deploy/helm-chart/kata-deploy/templates/kata-deploy.yaml
@@ -47,7 +47,7 @@ spec:
         - name: SNAPSHOTTER_HANDLER_MAPPING
           value: {{ .Values.env.snapshotterHandlerMapping | quote }}
         - name: AGENT_HTTPS_PROXY
-          value: {{ .Values.env.agentHttpProxy | quote }}
+          value: {{ .Values.env.agentHttpsProxy | quote }}
         - name: AGENT_NO_PROXY
           value: {{ .Values.env.agentNoProxy | quote }}
         - name: PULL_TYPE_MAPPING
diff --git a/tools/packaging/kata-deploy/helm-chart/kata-deploy/values.yaml b/tools/packaging/kata-deploy/helm-chart/kata-deploy/values.yaml
index 004137a147..b1f195d1f1 100644
--- a/tools/packaging/kata-deploy/helm-chart/kata-deploy/values.yaml
+++ b/tools/packaging/kata-deploy/helm-chart/kata-deploy/values.yaml
@@ -13,7 +13,7 @@ env:
   createDefaultRuntimeClass: "false"
   allowedHypervisorAnnotations: ""
   snapshotterHandlerMapping: ""
-  agentHttpProxy: ""
+  agentHttpsProxy: ""
   agentNoProxy: ""
   pullTypeMapping: ""
   hostOS: ""
```
This will solve the issue.
Force pushed, probably fixing the issue.
For snp, the k8s-policy-hard-coded.bats test keeps failing. Looks like the pods aren't starting. I think this is probably unrelated, but I'd like to troubleshoot and diagnose a bit further. The nightly CI seems to be passing fine.
Do you think a rebase may help here?
> For snp, the k8s-policy-hard-coded.bats test keeps failing. Looks like the pods aren't starting. I think this is probably unrelated, but I'd like to troubleshoot and diagnose a bit further. The nightly CI seems to be passing fine.
What happened here was that the PR was rebased before the commit adding the test was added, and at that time the SNP CI was broken, with failures happening even before starting the tests. You guys fixed the issue there, but meanwhile the test you mentioned was merged, and the auto-rebase would pick that up as part of the tests to run, and then we ended up with that test failing.
I've rebased now, and this should give us a green CI everywhere.
Great. Thanks for explaining that. I was a bit confused. Approving.