tofu-controller

error pending plan and plan's name in the secret are not matched

Open github-vincent-miszczak opened this issue 3 years ago • 2 comments

Hello,

I have 3 Terraform resources managed by tf-controller. Two of them are fine; one reports:

rpc error: code = Unknown desc = error pending plan and plan's name in the secret are not matched:  != plan-master-4246be8f0f8c0ffa86ea09b88d0000123a5562f0

When I update the Git code, the updates do happen, but the message comes back with the new commit ID.

Having a quick look, https://github.com/weaveworks/tf-controller/blob/3ae12dd2f2f82a5f89e3280ebb1e0a7b39393ecc/runner/server.go#L774 is an empty string, which looks fine, as there is no change to apply once the plan has been applied successfully.
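For readers following along, here is a minimal sketch of the kind of comparison that would produce this message. It is not the actual tf-controller code: `checkSavedPlan` and its parameter names are hypothetical, with `pendingPlan` standing in for the pending plan ID from the resource status and `savedPlan` for the `savedPlan` annotation on the plan secret.

```go
package main

import "fmt"

// checkSavedPlan is a hypothetical helper, not the real runner code.
// pendingPlan stands in for the pending plan ID recorded in the status;
// savedPlan stands in for the "savedPlan" annotation on the plan secret.
func checkSavedPlan(pendingPlan, savedPlan string) error {
	if pendingPlan != savedPlan {
		return fmt.Errorf(
			"error pending plan and plan's name in the secret are not matched: %s != %s",
			pendingPlan, savedPlan,
		)
	}
	return nil
}

func main() {
	// An empty pending plan compared against a non-empty annotation
	// yields a message with nothing before "!=", as in the report above.
	err := checkSavedPlan("", "plan-master-4246be8f0f8c0ffa86ea09b88d0000123a5562f0")
	fmt.Println(err)
}
```

Under these assumptions, an empty pending plan is what leaves the left-hand side of `!=` blank in the reported error.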

I need some help to fix this.

github-vincent-miszczak avatar Sep 09 '22 16:09 github-vincent-miszczak

Could you provide the secret metadata of the corresponding Terraform plan? Its name should start with tfplan-default-.

tomhuang12 avatar Sep 09 '22 20:09 tomhuang12

metadata:
  annotations:
    encoding: gzip
    savedPlan: plan-master-4246be8f0f8c0ffa86ea09b88d0000123a5562f0
  creationTimestamp: "2022-09-09T16:00:58Z"
  name: tfplan-default-xxx-apne1-xxx
  namespace: production
  ownerReferences:
  - apiVersion: infra.contrib.fluxcd.io/v1alpha1
    kind: Terraform
    name: xxx-apne1-xxx
    uid: c4819818-4300-4041-9142-99db53098c2b
  resourceVersion: "32325387"
  uid: 08d80990-7c48-4d11-9f49-3b63cb6d5692

This is similar to other working resources, example of a working one:

metadata:
  annotations:
    encoding: gzip
    savedPlan: plan-master-4246be8f0f8c0ffa86ea09b88d0000123a5562f0
  creationTimestamp: "2022-09-09T16:00:05Z"
  name: tfplan-default-xxxx-use1-xxx
  namespace: production
  ownerReferences:
  - apiVersion: infra.contrib.fluxcd.io/v1alpha1
    kind: Terraform
    name: xxx-use1-xxx
    uid: ccdcfb78-045e-49cc-93ce-0ec6cc231999
  resourceVersion: "32325065"
  uid: 451a240b-b501-49c3-a53e-7740cd872ec8

github-vincent-miszczak avatar Sep 12 '22 10:09 github-vincent-miszczak

Hello, any update on this issue?

github-amine-kherbouche avatar Dec 21 '22 17:12 github-amine-kherbouche

I am facing a similar issue:

{"level":"error","ts":"2023-01-13T17:48:40.106Z","msg":"Error, requeue job","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Terraform","Terraform":{"name":"external-secrets","namespace":"flux-system"},"namespace":"flux-system","name":"external-secrets","reconcileID":"a78b53d0-919e-4fb1-a563-384c542fd4ce","reconciliation-loop-id":"e91c76ea-47fd-41f1-8a10-ac6678131f16","start-time":"2023-01-13T17:48:15.346Z","function":"TerraformReconciler.finalize","error":"rpc error: code = Unknown desc = error pending plan and plan's name in the secret are not matched:  != plan-secrets-runtime-30944680a13b86b15e0ef34ed73fa89209a0c07a"}

enekofb avatar Jan 13 '23 17:01 enekofb

Sorry for the late response. The value can be empty when Terraform.Status.Plan.Pending is empty even though the plan has actually been created. Did tf-controller restart during a plan? That could have prevented the status from being updated properly.

tomhuang12 avatar Jan 13 '23 18:01 tomhuang12

Did tf-controller restart during a plan that could cause that the status didn't get updated properly?

@tomhuang12

That is likely the context where it was produced ... the scenario was like

Given tf-controller and a provisioned module with permissions errors, when I updated tf-controller to configure AWS IRSA and restarted tf-controller (likely, but I am not 100% sure this happened), then the issue happened.

enekofb avatar Jan 16 '23 09:01 enekofb

@github-vincent-miszczak did you also happen to restart the controller before the bug occurred?

chanwit avatar Jan 16 '23 11:01 chanwit

To fix this issue, we need to clear the pending plan before returning this error, here:

https://github.com/weaveworks/tf-controller/blob/76d1f1f0e13f1b239e911405c3853049284fa223/controllers/tf_controller_apply.go#L60-L67

chanwit avatar Jan 16 '23 11:01 chanwit

Sorry, I can't reproduce the issue. I was able to apply new commits, so I just moved on.

github-vincent-miszczak avatar Jan 16 '23 12:01 github-vincent-miszczak

@chanwit Hello, I am facing the same issue in v0.15.1. How can I solve it? Thank you.

chimisu avatar Aug 29 '23 02:08 chimisu

Have you tried tfctl replan <object>?

It would help clear the pending plan information and trigger re-plan.

chanwit avatar Aug 29 '23 03:08 chanwit

Closing as we believe this issue is resolved since Flux v2 went GA. Feel free to file another issue if you see the same behavior again w/Flux v2.

lasomethingsomething avatar Oct 31 '23 15:10 lasomethingsomething

This may have been a problem related to upgrading from tf-controller v0.14 to v0.15, because the way the plan ID is calculated changed between those versions.

squaremo avatar Oct 31 '23 15:10 squaremo