terraform-provider-helm

dag/walk is waiting for something

Open davhdavh opened this issue 1 year ago • 14 comments

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: v1.4.6
Provider version: 2.9.0
Kubernetes version: 1.27

Affected Resource(s)

  • helm_release

Terraform Configuration Files

provider "helm" {
  kubernetes {
    config_path = var.k8sContext
  }
}

resource "helm_release" "metrics-server" {
  repository = "https://kubernetes-sigs.github.io/metrics-server/"
  chart      = "metrics-server"
  name       = "metrics-server"
  version    = "3.10.0"
}

Debug Output

2023-05-27T18:39:01.800+0700 [INFO]  provider.terraform-provider-helm_v2.7.0_x5: 2023/05/27 18:39:01 [DEBUG] [INFO] GetHelmConfiguration success: timestamp=2023-05-27T18:39:01.800+0700
2023-05-27T18:39:06.174+0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-05-27T18:39:06.174+0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.metrics-server (expand)"
2023-05-27T18:39:06.796+0700 [TRACE] dag/walk: vertex "root" is waiting for "helm_release.metrics-server"
2023-05-27T18:39:11.174+0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.metrics-server (expand)"
2023-05-27T18:39:11.174+0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-05-27T18:39:11.796+0700 [TRACE] dag/walk: vertex "root" is waiting for "helm_release.metrics-server"
2023-05-27T18:39:16.175+0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-05-27T18:39:16.175+0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.metrics-server (expand)"
2023-05-27T18:39:16.797+0700 [TRACE] dag/walk: vertex "root" is waiting for "helm_release.metrics-server"

Steps to Reproduce

  1. Manually install some CRD, e.g. kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
  2. Run terraform plan.
  3. Wait forever.

Expected Behavior

Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "system:metrics-server" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "metrics-server"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "default"

Actual Behavior

Terraform hangs until it times out.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

davhdavh, May 27 '23 11:05

I seem to have found the problem. It would be very helpful if the log actually specified WHY it is waiting, instead of just the useless "dag/walk is waiting". My problem was that the load balancer server was pending a public IP (which is not necessary for a functioning cluster, so Helm shouldn't be waiting for it in the first place).

davhdavh, May 29 '23 03:05

It seems I was wrong: the problem is still there, although that change helped.

It is a race condition of some kind. The hang always happens at the exact same line in the log, but there are lots of small differences between runs, which shows that multiple threads are involved. Sometimes it works fine, and the log shows:

[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [resourceDiff: metrics-server] Start:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [INFO] GetHelmConfiguration start:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] Using kubeconfig: /somewhere/.kube/config-test:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [INFO] Successfully initialized kubernetes config:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [INFO] GetHelmConfiguration success:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [resourceDiff: metrics-server] Got chart:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] Chart dependencies are up to date.:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [resourceDiff: metrics-server] Release validated:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: ---[ values.yaml ]-----------------------------------								   

and sometimes it just waits forever:

[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [resourceDiff: metrics-server] Start:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [INFO] GetHelmConfiguration start:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] Using kubeconfig: /somewhere/.kube/config-test:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [INFO] Successfully initialized kubernetes config:
[INFO]  provider.terraform-provider-helm_v2.9.0_x5: [DEBUG] [INFO] GetHelmConfiguration success:
[TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.metrics-server (expand)"
[TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
[TRACE] dag/walk: vertex "root" is waiting for "helm_release.metrics-server"

Setting -parallelism=1 on terraform also helps: with it, the hang only happens about 50% of the time.

davhdavh, May 29 '23 04:05

After checking all the versions, I determined that the bug was introduced in release 1.3.1. Running v1.3.0 works every single time.

davhdavh, May 29 '23 05:05

Can you please clarify what versions 1.3.1 and 1.3.0 are referring to? Terraform version? Provider version? Chart version?

alexsomesan, May 31 '23 13:05

The version of the Terraform Helm provider. I think the changelog said 1.3.1 is the release that upgrades to Helm 3.3.4.
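
For anyone who wants to test against the last provider release reported as working, a minimal sketch of pinning the provider is shown below. It assumes the versions discussed are hashicorp/helm provider versions, as stated above; whether the pin avoids the hang in other environments is not confirmed in this thread.

```hcl
terraform {
  required_providers {
    helm = {
      source = "hashicorp/helm"
      # Exact pin to the release reported above as working every time.
      # Assumption: 1.3.0 refers to the provider version, not Terraform or the chart.
      version = "1.3.0"
    }
  }
}
```

After changing the constraint, terraform init -upgrade is needed so that the pinned provider version is actually installed.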

davhdavh, Jun 01 '23 03:06

Can you confirm if it behaves the same when you set wait = false on the helm_release resource? https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release#wait
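
For reference, the suggested test would look roughly like the sketch below, which adds the documented wait argument to the resource from the original report. Nothing else is changed.

```hcl
resource "helm_release" "metrics-server" {
  repository = "https://kubernetes-sigs.github.io/metrics-server/"
  chart      = "metrics-server"
  name       = "metrics-server"
  version    = "3.10.0"

  # Do not wait for the release's resources to become ready during apply.
  wait = false
}
```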

alexsomesan, Jun 02 '23 15:06

Yes, it behaves the same. The hang happens during the plan phase, while wait only applies to the apply phase.

davhdavh, Jun 06 '23 17:06

I have the same problem during the plan phase:

 (expand)"
2023-06-20T18:03:12.651Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-06-20T18:03:12.667Z [TRACE] dag/walk: vertex "root" is waiting for "helm_release.gitlab_runner"
2023-06-20T18:03:17.355Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.gitlab_runner (expand)"
2023-06-20T18:03:17.651Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-06-20T18:03:17.668Z [TRACE] dag/walk: vertex "root" is waiting for "helm_release.gitlab_runner"
2023-06-20T18:03:22.355Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.gitlab_runner (expand)"
2023-06-20T18:03:22.652Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-06-20T18:03:22.668Z [TRACE] dag/walk: vertex "root" is waiting for "helm_release.gitlab_runner"
2023-06-20T18:03:27.360Z [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/helm\"] (close)" is waiting for "helm_release.gitlab_runner (expand)"
2023-06-20T18:03:27.652Z [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/helm\"] (close)"
2023-06-20T18:03:27.668Z [TRACE] dag/walk: vertex "root" is waiting for "helm_release.gitlab_runner"

Forcing the provider to "<=1.30" didn't help in my case. Any idea on how I can debug this further?

rdbisme, Jun 20 '23 18:06

Is there any solution to this yet? I am facing the same issue. I have tried many things but still cannot get it to work.

jackliu2006, Jun 26 '23 07:06

Downgrading Terraform to v1.2.9 does not help either. Terraform v1.2.9 on windows_386

  • provider registry.terraform.io/hashicorp/azurerm v3.62.1

jackliu2006, Jun 26 '23 07:06

As a mitigation, downloading the chart locally (using helm pull) and pointing to the local chart seems to work way faster.
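
A minimal sketch of that mitigation, using the metrics-server chart from the original report as the example. The ./charts directory is an assumed location chosen for illustration, not something prescribed in this thread.

```hcl
# Assumed one-time step, run from the module directory before terraform plan:
#   helm pull metrics-server \
#     --repo https://kubernetes-sigs.github.io/metrics-server/ \
#     --version 3.10.0 --untar --untardir ./charts
# This unpacks the chart into ./charts/metrics-server.

resource "helm_release" "metrics-server" {
  name = "metrics-server"

  # Point at the unpacked chart directory instead of the remote repository,
  # so the provider does not have to fetch the chart during plan.
  chart = "${path.module}/charts/metrics-server"
}
```

With a local chart path, the repository and version arguments are dropped, so the chart version is whatever was pulled into the directory.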

rdbisme, Jun 26 '23 14:06

Is there any solution?

wolfdate25, Jul 10 '23 01:07

I could reproduce it in Helm itself, so it is not a Terraform provider issue. https://github.com/helm/helm/issues/12108

davhdavh, Jul 10 '23 03:07

I solved this problem. In my case it occurred when running Terraform under WSL2. See https://github.com/microsoft/WSL/issues/8022#issuecomment-1221617827

wolfdate25, Jul 19 '23 10:07