My Terraform resource is stuck in Unknown status with the message "Initializing"
I think drift-detection-only mode requires a terraform plan; otherwise it cannot compare the live resources against the terraform.tfstate file in OSS. However, the documentation says that in drift-detection-only mode Terraform plan and apply will not be executed, so how does it detect configuration drift?
I created an application on Flamingo and set approvePlan: disable because I only want drift checking. But my "wlx-terraform-share" object has been stuck in the Initializing state ever since. I manually changed the names of resources in the cloud, but no drift was detected, which makes me suspect a logic problem in this area.
The scenario I want to implement: when I manually modify resources in the cloud, Flamingo or tf-controller detects the configuration drift.
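For comparison, outside of tf-controller a drift check is normally just a read-only plan: `terraform plan -detailed-exitcode` exits with 0 when the plan is empty, 1 on error, and 2 when the plan contains changes. The sketch below wraps that in Python. The `check_drift` helper and the `./terraform` working directory are illustrative assumptions, and nothing in this thread confirms that tf-controller performs its drift detection this way internally.

```python
import subprocess

# Exit codes of `terraform plan -detailed-exitcode`.
EXIT_MEANING = {
    0: "no drift: plan succeeded with an empty diff",
    1: "error: plan failed",
    2: "drift: plan succeeded and found changes",
}


def interpret_plan_exit(code: int) -> str:
    """Translate a plan exit code into a drift verdict."""
    return EXIT_MEANING.get(code, f"unexpected exit code {code}")


def check_drift(workdir: str) -> str:
    """Run a read-only plan in `workdir` (hypothetical path) and report drift.

    Nothing is applied; the plan only compares configuration and state
    against the real resources.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit(result.returncode)
```

Calling `check_drift("./terraform")` on an already initialized working directory would return the verdict without applying anything.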
Could you please help me take a look at this issue? I would greatly appreciate it.
Versions: tf-controller v0.15.1, Flamingo v2.8.4, Flux v2.1.2
apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  annotations:
    reconcile.fluxcd.io/requestedAt: "2023-11-27T16:56:30.513899852+08:00"
  creationTimestamp: "2023-11-27T07:07:21Z"
  finalizers:
  - finalizers.tf.contrib.fluxcd.io
  generation: 4
  labels:
    kustomize.toolkit.fluxcd.io/name: wlx-terraform-share
    kustomize.toolkit.fluxcd.io/namespace: wlx-terraform-share
  name: wlx-terraform-share
  namespace: wlx-terraform-share
  resourceVersion: "1658364114"
  uid: e7ecd57a-c808-4049-a964-bf77e4288c63
spec:
  alwaysCleanupRunnerPod: true
  approvePlan: disable
  backendConfig:
    customConfiguration: |
      backend "oss" {
        bucket = "atc-cicd"
        prefix = "terraform/wlx-test-share"
        key = "./terraform.tfstate"
        acl = "private"
        region = "cn-beijing"
        encrypt = "true"
        tablestore_endpoint = "https://atc-cicd.cn-beijing.ots.aliyuncs.com/"
        tablestore_table = "terraform_remote_backend_lock_table_2879cd4b_abfd_567c_48de_e7c4be64bd02"
      }
  destroyResourcesOnDeletion: false
  disableDriftDetection: false
  force: false
  interval: 5m
  parallelism: 0
  path: ./terraform
  refreshBeforeApply: false
  runnerPodTemplate:
    spec:
      env:
      - name: ALICLOUD_ACCESS_KEY
        valueFrom:
          secretKeyRef:
            key: ALICLOUD_ACCESS_KEY
            name: alicloud-atc-terraform-id-key
      - name: ALICLOUD_SECRET_KEY
        valueFrom:
          secretKeyRef:
            key: ALICLOUD_SECRET_KEY
            name: alicloud-atc-terraform-id-key
      image: ghcr.io/weaveworks/tf-runner:v0.15.1
  runnerTerminationGracePeriodSeconds: 30
  serviceAccountName: tf-runner
  sourceRef:
    kind: GitRepository
    name: wlx-terraform-share
  storeReadablePlan: human
  workspace: default
  writeOutputsToSecret:
    name: wlx-terraform-outputs
status:
  conditions:
  - lastTransitionTime: "2023-11-27T08:56:30Z"
    message: Initializing
    reason: Progressing
    status: Unknown
    type: Ready
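The stuck state in the status above can be spotted mechanically: the Ready condition stays at Unknown / Initializing long after its lastTransitionTime. A minimal sketch, assuming the conditions have already been loaded into Python dicts (for example from `kubectl get ... -o json`); the 15-minute threshold (three 5m intervals) and the helper name are assumptions, and `now` is taken from the last log timestamp in this report:

```python
from datetime import datetime, timedelta, timezone


def is_stuck_initializing(conditions, now, max_age):
    """Return True if Ready has been Unknown/Initializing for longer than max_age."""
    for cond in conditions:
        if cond.get("type") != "Ready":
            continue
        if cond.get("status") != "Unknown" or cond.get("message") != "Initializing":
            return False
        since = datetime.strptime(
            cond["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        return now - since > max_age
    # No Ready condition at all: nothing to judge.
    return False


# Condition copied from the status block of the reported object.
conditions = [{
    "lastTransitionTime": "2023-11-27T08:56:30Z",
    "message": "Initializing",
    "reason": "Progressing",
    "status": "Unknown",
    "type": "Ready",
}]

# Timestamp of the last tf-controller log line in this report.
now = datetime(2023, 11, 27, 13, 17, 55, tzinfo=timezone.utc)

print(is_stuck_initializing(conditions, now, timedelta(minutes=15)))  # True
```

Here the condition has not changed for more than four hours while the interval is 5m, so the check reports the object as stuck.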
tf-controller log
{"level":"info","ts":"2023-11-27T13:17:28.979Z","msg":"before lookup runner: checking ready condition","controller":"terraform","controllerGroup":"infra.contrib.fluxcd
{"level":"info","ts":"2023-11-27T13:17:28.979Z","msg":"trigger namespace tls secret generation","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","c
{"level":"info","ts":"2023-11-27T13:17:28.979Z","logger":"cert-rotation","msg":"TLS already generated for ","namespace":"wlx-terraform-share"}
{"level":"info","ts":"2023-11-27T13:17:28.979Z","msg":"show runner pod state: ","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"
{"level":"info","ts":"2023-11-27T13:17:44.009Z","msg":"runner is running","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Terraf
{"level":"info","ts":"2023-11-27T13:17:44.009Z","msg":"setting up terraform","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Ter
{"level":"info","ts":"2023-11-27T13:17:44.023Z","msg":"write backend config: ok","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":
{"level":"info","ts":"2023-11-27T13:17:44.023Z","msg":"new terraform","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Terraform"
{"level":"info","ts":"2023-11-27T13:17:44.028Z","msg":"generate vars from tf: ok","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind"
{"level":"info","ts":"2023-11-27T13:17:44.028Z","msg":"generated var files from spec","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerK
{"level":"info","ts":"2023-11-27T13:17:44.028Z","msg":"generate template: ok","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Te
{"level":"info","ts":"2023-11-27T13:17:44.028Z","msg":"generated template","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Terra
{"level":"info","ts":"2023-11-27T13:17:55.413Z","msg":"init reply: ok","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Terraform
{"level":"info","ts":"2023-11-27T13:17:55.413Z","msg":"tfexec initialized terraform","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKi
{"level":"info","ts":"2023-11-27T13:17:55.414Z","msg":"workspace select reply: ok","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind
{"level":"info","ts":"2023-11-27T13:17:55.414Z","msg":"approve plan disabled","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Te
{"level":"info","ts":"2023-11-27T13:17:55.464Z","msg":"clean up dir: ok","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":"Terrafo
{"level":"info","ts":"2023-11-27T13:17:55.474Z","msg":"Reconciliation completed. Generation: 4","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","c
{"level":"info","ts":"2023-11-27T13:17:55.474Z","msg":"requeue after interval","controller":"terraform","controllerGroup":"infra.contrib.fluxcd.io","controllerKind":
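To make the sequence in the log easier to read, the sketch below pulls the `msg` field out of each JSON log line and checks whether any step other than the "approve plan disabled" notice mentions a plan. The sample lines are trimmed copies of the last entries above (only `level`, `ts`, and `msg` are kept, since the original lines are cut off), and the keyword check is a rough heuristic: what tf-controller logs when a plan actually runs is not shown in this thread.

```python
import json


def reconcile_steps(log_lines):
    """Return the `msg` field of each JSON log line, skipping unparsable lines."""
    steps = []
    for line in log_lines:
        try:
            steps.append(json.loads(line)["msg"])
        except (json.JSONDecodeError, KeyError, TypeError):
            continue
    return steps


def mentions_plan_run(steps):
    """True if any step besides the 'approve plan disabled' notice mentions a plan."""
    return any(
        "plan" in step.lower() and step != "approve plan disabled"
        for step in steps
    )


# Trimmed copies of the final log lines (extra fields removed).
sample = [
    '{"level":"info","ts":"2023-11-27T13:17:55.414Z","msg":"workspace select reply: ok"}',
    '{"level":"info","ts":"2023-11-27T13:17:55.414Z","msg":"approve plan disabled"}',
    '{"level":"info","ts":"2023-11-27T13:17:55.464Z","msg":"clean up dir: ok"}',
    '{"level":"info","ts":"2023-11-27T13:17:55.474Z","msg":"Reconciliation completed. Generation: 4"}',
]

steps = reconcile_steps(sample)
for step in steps:
    print(step)
print(mentions_plan_run(steps))  # False
```

In these lines the reconcile goes straight from "approve plan disabled" to cleanup, with no step in between that mentions a plan.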
I'm seeing very similar behaviour in a similar environment: GKE v1.24.16-gke.500 + Flux v2.1.0 + tf-controller v0.16.0-rc.3. It was resolved when I deleted the tf-controller pod, so that seems to be the workaround I should apply.
Same issue here: GKE 1.28 + Flux 2.2.3 + tofu-controller v0.16.0-rc.4. Rolling the tf-controller pods unstuck it for me, but afterwards I had to manually recover a state lock (I'm using GCS for remote state); my guess is that a pod died non-gracefully. I've seen this a couple of times in the past week, and we're not in prod just yet, so if I can be helpful with repros, let me know.
We have very similar behavior on EKS with tf-controller v0.16.0-rc.4.
It happens when we add a new Terraform custom resource in drift-detection-only mode (approvePlan: disable).
The only workaround for us is to set approvePlan: "", wait for a successful reconcile, and then set it back to disable.
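The two steps of that workaround can be written as JSON merge patches against spec.approvePlan. A minimal sketch that only builds the patch bodies; the helper name is hypothetical, and applying the patches (for example with `kubectl patch ... --type merge -p '<patch>'`) and waiting for the reconcile in between is left to the operator:

```python
import json


def approve_plan_patch(value: str) -> str:
    """Build a JSON merge patch that sets spec.approvePlan to `value`."""
    return json.dumps({"spec": {"approvePlan": value}})


# Step 1: clear approvePlan, as described in the workaround.
step_one = approve_plan_patch("")
# Step 2: once the object has reconciled successfully, switch back to
# drift-detection-only mode.
step_two = approve_plan_patch("disable")

print(step_one)  # {"spec": {"approvePlan": ""}}
print(step_two)  # {"spec": {"approvePlan": "disable"}}
```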