tofu-controller
SIGSEGV: segmentation violation
2022-06-01T15:12:54.107621723Z panic: runtime error: invalid memory address or nil pointer dereference
2022-06-01T15:12:54.107661697Z [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x15c8039]
2022-06-01T15:12:54.107666863Z
2022-06-01T15:12:54.107671687Z goroutine 232 [running]:
2022-06-01T15:12:54.114675389Z github.com/weaveworks/tf-controller/controllers.(*TerraformReconciler).finalize(_, {_, _}, {{{0x1641244, 0x9}, {0xc000485ba0, 0x20}}, {{0xc000570660, 0x1c}, {0x0, ...}, ...}, ...}, ...)
2022-06-01T15:12:54.114703099Z /workspace/controllers/terraform_controller.go:1455 +0x199
2022-06-01T15:12:54.114709530Z github.com/weaveworks/tf-controller/controllers.(*TerraformReconciler).Reconcile(0xc000528000, {0x1c113a8, 0xc00063f140}, {{{0xc000641d58, 0x18d25c0}, {0xc000570660, 0x30}}})
2022-06-01T15:12:54.114714035Z /workspace/controllers/terraform_controller.go:204 +0x7e6
2022-06-01T15:12:54.115162811Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc0001dc000, {0x1c113a8, 0xc00063f0b0}, {{{0xc000641d58, 0x18d25c0}, {0xc000570660, 0x413c54}}})
2022-06-01T15:12:54.115181858Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114 +0x26f
2022-06-01T15:12:54.115187264Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001dc000, {0x1c11300, 0xc000304200}, {0x17f2020, 0xc0002a8080})
2022-06-01T15:12:54.115191724Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311 +0x33e
2022-06-01T15:12:54.115206882Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001dc000, {0x1c11300, 0xc000304200})
2022-06-01T15:12:54.115212605Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x205
2022-06-01T15:12:54.115218826Z sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
2022-06-01T15:12:54.115225109Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x85
2022-06-01T15:12:54.115230928Z created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
2022-06-01T15:12:54.115236355Z /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223 +0x357
2022-06-01T15:12:55.084459176Z Stream closed EOF for flux-system/tf-controller-6d7b4d7c55-cfrgz (tf-controller)
Should be latest:
Containers:
tf-controller:
Image: ghcr.io/weaveworks/tf-controller:v0.9.5
Image ID: ghcr.io/weaveworks/tf-controller@sha256:64c4683034ead58d3cd592d11011ff103cd95356e1952c6841233b57886c418e
It seems your GitRepository object got deleted before the finalization process kicked in, is that correct? This is an interesting behavior, and we need to think carefully about how to cope with it.
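To illustrate the failure mode, here is a minimal sketch of a finalization step that checks for a missing source before dereferencing it. It is not the tf-controller code: `Source`, `SourceStore`, `ErrNotFound`, `Outcome` and this `finalize` function are hypothetical stand-ins, and an in-memory map replaces the Kubernetes client.

```go
package main

import (
	"errors"
	"fmt"
)

// Source is a hypothetical stand-in for a Flux source such as a GitRepository.
type Source struct {
	Name        string
	ArtifactURL string
}

// ErrNotFound mimics the "not found" error an API client would return.
var ErrNotFound = errors.New("source not found")

// SourceStore is a hypothetical in-memory lookup that replaces the Kubernetes client.
type SourceStore map[string]*Source

// Get returns the named source, or ErrNotFound if it has been deleted.
func (s SourceStore) Get(name string) (*Source, error) {
	src, ok := s[name]
	if !ok {
		return nil, ErrNotFound
	}
	return src, nil
}

// Outcome describes what the finalization step decided to do.
type Outcome int

const (
	RunDestroy    Outcome = iota // the source exists, so destroy can be planned from it
	SourceMissing                // the source is gone, so it must not be dereferenced
)

// finalize looks up the source and only touches it when the lookup succeeded.
func finalize(store SourceStore, sourceName string) (Outcome, error) {
	src, err := store.Get(sourceName)
	if errors.Is(err, ErrNotFound) {
		// src is nil here; reading src.ArtifactURL would panic with a
		// nil pointer dereference, like the trace above.
		return SourceMissing, nil
	}
	if err != nil {
		return SourceMissing, err
	}
	fmt.Println("would run destroy using artifact", src.ArtifactURL)
	return RunDestroy, nil
}

func main() {
	store := SourceStore{
		"repo": &Source{Name: "repo", ArtifactURL: "http://example.invalid/repo.tar.gz"},
	}
	outcome, err := finalize(store, "repo")
	fmt.Println(outcome == RunDestroy, err)

	// Simulate the GitRepository being removed by helm uninstall first.
	delete(store, "repo")
	outcome, err = finalize(store, "repo")
	fmt.Println(outcome == SourceMissing, err)
}
```

What the controller should do in the `SourceMissing` case (requeue, skip the destroy, or report an error) is the open design question here; the sketch only shows that the case can be detected without crashing.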
Thank you so much @caspervk
Yes, I am deploying a GitRepository (and tf-runner RoleBinding/ServiceAccount) as part of a helm chart. This issue occurs on helm uninstall.
Relatedly, I am also having issues with helm uninstalling the RoleBinding and ServiceAccount in my namespace before the tf-controller can start a pod to destroy the Terraform resources, causing the Terraform resources to never be deleted. I'm not sure this is an issue with the tf-controller exactly, as the problem would be solved by setting a custom (un)install order for the helm chart, but this is unfortunately not possible. For now I am solving the issue by deploying the RBAC resources manually (well, through flux, of course) instead of as part of my application's chart.
In my particular case, the problem with the GitRepository would probably be solved by a solution to our discussion https://github.com/weaveworks/tf-controller/discussions/238, as I would no longer need to deploy a GitRepository.
Actually, @chanwit, a good solution would maybe be to add some kind of finalizer to the GitRepository (and perhaps the ServiceAccount/RoleBinding)? I envision something not unlike PersistentVolume, from the docs:
A common example of a finalizer is kubernetes.io/pv-protection, which prevents accidental deletion of PersistentVolume objects. When a PersistentVolume object is in use by a Pod, Kubernetes adds the pv-protection finalizer. If you try to delete the PersistentVolume, it enters a Terminating status, but the controller can't delete it because the finalizer exists. When the Pod stops using the PersistentVolume, Kubernetes clears the pv-protection finalizer, and the controller deletes the volume.
Something similar seems reasonable for the relationship between the Terraform resources and the GitRepository objects they use.
A GitRepository is often shared among TF objects, Kustomization objects etc.
Unfortunately we cannot delete it in the finalizer.
I'm honestly not too familiar with Kubernetes finalizers, but does a finalizer necessarily have to delete an object, or is it rather just a blocker for deletion? My model is that each Terraform resource would define a finalizer on the GitRepository that it uses, thereby blocking its deletion. I might very well misunderstand.
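To make that model concrete, here is a toy simulation of the deletion semantics as I understand them from the Kubernetes docs: deleting an object that still has finalizers only marks it as terminating, and it is removed once the last finalizer is cleared. Everything here (`Cluster`, `Object`, the finalizer names) is hypothetical and in-memory; it does not talk to a real API server and is not how tf-controller works.

```go
package main

import (
	"errors"
	"fmt"
)

// Object is a hypothetical stand-in for any Kubernetes object.
type Object struct {
	Name       string
	Finalizers []string
	Deleting   bool // stands in for metadata.deletionTimestamp being set
}

// Cluster is a toy in-memory object store.
type Cluster struct {
	objects map[string]*Object
}

func NewCluster() *Cluster {
	return &Cluster{objects: map[string]*Object{}}
}

func (c *Cluster) Create(name string) {
	c.objects[name] = &Object{Name: name}
}

func (c *Cluster) Exists(name string) bool {
	_, ok := c.objects[name]
	return ok
}

// AddFinalizer registers a blocker. New finalizers are rejected once
// deletion has started.
func (c *Cluster) AddFinalizer(name, finalizer string) error {
	obj, ok := c.objects[name]
	if !ok {
		return errors.New("object not found")
	}
	if obj.Deleting {
		return errors.New("object is being deleted")
	}
	obj.Finalizers = append(obj.Finalizers, finalizer)
	return nil
}

// Delete marks the object as terminating and removes it only if no
// finalizers remain.
func (c *Cluster) Delete(name string) {
	obj, ok := c.objects[name]
	if !ok {
		return
	}
	obj.Deleting = true
	c.removeIfUnblocked(obj)
}

// RemoveFinalizer clears one blocker; the last one to go lets a pending
// deletion complete.
func (c *Cluster) RemoveFinalizer(name, finalizer string) {
	obj, ok := c.objects[name]
	if !ok {
		return
	}
	kept := obj.Finalizers[:0]
	for _, f := range obj.Finalizers {
		if f != finalizer {
			kept = append(kept, f)
		}
	}
	obj.Finalizers = kept
	c.removeIfUnblocked(obj)
}

func (c *Cluster) removeIfUnblocked(obj *Object) {
	if obj.Deleting && len(obj.Finalizers) == 0 {
		delete(c.objects, obj.Name)
	}
}

func main() {
	c := NewCluster()
	c.Create("gitrepo")
	// One hypothetical finalizer per Terraform resource using the repo.
	_ = c.AddFinalizer("gitrepo", "example.com/terraform-a")
	_ = c.AddFinalizer("gitrepo", "example.com/terraform-b")

	c.Delete("gitrepo")
	fmt.Println("after delete:", c.Exists("gitrepo"))

	c.RemoveFinalizer("gitrepo", "example.com/terraform-a")
	fmt.Println("after first Terraform finalized:", c.Exists("gitrepo"))

	c.RemoveFinalizer("gitrepo", "example.com/terraform-b")
	fmt.Println("after second Terraform finalized:", c.Exists("gitrepo"))
}
```

In this model the finalizer never deletes the GitRepository itself, so sharing it among several Terraform objects is fine: each one only holds its own blocker until it has been destroyed.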
After consulting the Flux team, we came up with a mechanism similar to what the Kustomization controller uses to deal with this issue.
I'll elaborate more in a PR.
Looking forward to it!