garden icon indicating copy to clipboard operation
garden copied to clipboard

Garden does not support all k8s resources within Helm charts

Open kszymans opened this issue 3 years ago • 12 comments

Bug

Current Behavior

I am trying install my own Helm charts with k8s jobs only, with garden deploy. Unfortunately it keeps stuck at the deployment phase, although Helm release has been already deployed. Same story is with pods. When I change resource to deployment garden won't stuck since detects status of deployment. I think this is related to serviceResource limitation which supports: "Deployment", "DaemonSet", "StatefulSet" only

Expected behavior

Garden should support also other k8s resources from Helm charts

Reproducible example

Here is the snippet of the helm chart files: https://snipit.io/public/snippets/61497 When I try deploy them with garden deploy command it's stuck and timeouts after default 300 seconds of Helm release process

Workaround

Add dummy deployment resource to Helm charts.

Suggested solution(s)

I think garden should watch the Helm releases status at Helm release level not at the k8s resources like Deployments or StatefuSets

Additional context


Your environment

  • OS: macOS BigSur
  • How I'm running Kubernetes: GKE

garden version 0.12.33

kszymans avatar Dec 27 '21 09:12 kszymans

Hi @kszymans , thanks for filing the issue. I am able to reproduce this with your snippet. I can also confirm that deploying the same job as a kubernetes module works without problems. I believe this has to do with the way we wait for resources to become ready in helm modules. Thoughts @thsig @edvald ?

twelvemo avatar Jan 04 '22 07:01 twelvemo

This issue has been automatically marked as stale because it hasn't had any activity in 90 days. It will be closed in 14 days if no further activity occurs (e.g. changing labels, comments, commits, etc.). Please feel free to tag a maintainer and ask them to remove the label if you think it doesn't apply. Thank you for submitting this issue and helping make Garden a better product!

stale[bot] avatar Apr 16 '22 17:04 stale[bot]

Hello. Anything new in this matter? I again met this problem with different use case so it would be really helpful to fix this.

kszymans avatar May 31 '22 08:05 kszymans

This issue has been automatically marked as stale because it hasn't had any activity in 90 days. It will be closed in 14 days if no further activity occurs (e.g. changing labels, comments, commits, etc.). Please feel free to tag a maintainer and ask them to remove the label if you think it doesn't apply. Thank you for submitting this issue and helping make Garden a better product!

stale[bot] avatar Sep 21 '22 03:09 stale[bot]

@kszymans I just ran into this issue myself when deploying the CockroachDB chart, which makes significant use of Kubernetes Jobs. Templating the chart out to a rendered manifest (mostly) worked though I need to remove the cockroachdb-test pod definition for Garden to deploy without error.

worldofgeese avatar Dec 12 '22 22:12 worldofgeese

After looking at this closer with @worldofgeese we discovered the following things:

  • Garden gets stuck waiting for helm install
  • Garden passes the flag --atomic to helm install, which implies the install option --wait. This is the culprit
  • We were able to reproduce the issue with helm install --atomic getting stuck without garden as well.

We discovered the following Workaround, add atomicInstall: false to the Garden helm module

# garden.yaml
---
kind: Module
type: helm
name: test
atomicInstall: false # this is the workaround
releaseName: test-relese
description: test
chart: test-it
chartPath: chart/

TLDR: This is a helm issue. TODO: report upstream in github.com/helm/helm

My open questions:

  • Why do we use the --atomic flag in the helm module? Maybe the default could be atomicInstall: false

stefreak avatar Dec 14 '22 11:12 stefreak

@kszymans does that workaround work for you?

stefreak avatar Dec 14 '22 13:12 stefreak

Adding triage/accepted because this is reproducable and we need to consider wether we keep using --atomic by default for helm install

stefreak avatar Dec 14 '22 13:12 stefreak

This issue has been automatically marked as stale because it hasn't had any activity in 90 days. It will be closed in 14 days if no further activity occurs (e.g. changing labels, comments, commits, etc.). Please feel free to tag a maintainer and ask them to remove the label if you think it doesn't apply. Thank you for submitting this issue and helping make Garden a better product!

stale[bot] avatar May 22 '23 05:05 stale[bot]

@kszymans @stefreak @worldofgeese the latest Garden versions (both 0.12 and 0.13) use Helm CLI 3.12. Is this still an issue?

vvagaytsev avatar Jul 04 '23 12:07 vvagaytsev

@worldofgeese can you share with us the helm command (and code? or is it a chart?) that reproduces the issue of getting stuck with --atomic?

stefreak avatar Jul 10 '23 08:07 stefreak

@stefreak should be at https://github.com/worldofgeese/flask-cockroachdb-devcontainer/blob/20e9223c1d48bcb0b5158b07c19fc5c4d6f0bea7/modules.garden.yml#L65

This chart uses the workaround we used, atomicInstall: false # this is the workaround, but you should be able to remove that line to reproduce.

worldofgeese avatar Jul 11 '23 12:07 worldofgeese

This issue is still occuring: https://discord.com/channels/817392104711651328/1214247266576113674/1214247266576113674.

@vvagaytsev Should we consider disabling atomicInstall by default in Helm?

stefreak avatar Mar 05 '24 13:03 stefreak

@stefreak it looks like the default value of atomic flag was flipped in 2d468cc5. That change was released in 0.13.

@kszymans have you tried Garden 0.13 so far? We recommend to use the latest available release, the current one in 0.13.27.

vvagaytsev avatar Mar 05 '24 13:03 vvagaytsev

Closing this for now. Feel free to reopen if it's still an issue in 0.13.

vvagaytsev avatar Mar 12 '24 12:03 vvagaytsev

A-ha, atomic install option is still enabled by default for Module-based configs, i.e. for 0.12 config syntax. I think we should flip that as well. We can do it in 0.13 only, and leave 0.12 with the previous behavior.

vvagaytsev avatar Apr 24 '24 10:04 vvagaytsev