pulumi-kubernetes-operator
Cannot delete Stack correctly when using an inline Program
What happened?
We are using an inline Program YAML and a Stack in our GitOps repo to create networks and Kubernetes clusters, and it works perfectly. However, when we want to do some cleanup, we found that we cannot delete the Program and the Stack at the same time. Just wondering: should the Stack use ownerReferences to block the child Program from being deleted immediately?
Here's the log from the operator:
{"level":"error","ts":"2023-05-03T05:55:44.685Z","logger":"controller_stack","msg":"Failed to update Stack","Request.Namespace":"infra","Request.Name":"vpc-stack","Stack.Name":"organization/my-network/dev","error":"unable to retrieve program for stack","stacktrace":"github.com/pulumi/pulumi-kubernetes-operator/pkg/controller/stack.(*ReconcileStack).Reconcile\n\t/home/runner/work/pulumi-kubernetes-operator/pulumi-kubernetes-operator/pkg/controller/stack/stack_controller.go:544\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}
Expected Behavior
The Stack should be cleaned up in Kubernetes, but it cannot be, because the operator can no longer find the Program.
Steps to reproduce
cat stack.yaml
apiVersion: pulumi.com/v1alpha1
kind: Stack
metadata:
  name: vpc-stack
spec:
  backend: gs://my-bucket-00f7aae
  stack: organization/my-network/dev
  destroyOnFinalize: true
  # Drift detection
  continueResyncOnCommitMatch: true
  resyncFrequencySeconds: 60
  refresh: true
  programRef:
    name: my-network
  envRefs:
    PULUMI_CONFIG_PASSPHRASE:
      type: Literal
      literal:
        value: ""
  config:
    gcp:project: my-gcp-project
    gcp:region: us-east4
    gcp:zone: us-east4
cat program.yaml
apiVersion: pulumi.com/v1
kind: Program
metadata:
  name: my-network
program:
  resources:
    vpcNetwork3:
      type: gcp:compute:Network
      properties:
        autoCreateSubnetworks: false
        name: my-network3
    vpcNetwork2:
      type: gcp:compute:Network
      properties:
        autoCreateSubnetworks: false
        name: my-network2
    vpcNetwork:
      type: gcp:compute:Network
      properties:
        autoCreateSubnetworks: false
        name: my-network
    network-with-private-secondary-ip-ranges:
      type: gcp:compute:Subnetwork
      properties:
        ipCidrRange: 10.2.0.0/16
        region: us-central1
        network: ${["vpcNetwork"].id}
        secondaryIpRanges:
          - rangeName: tf-test-secondary-range-update1
            ipCidrRange: 192.168.10.0/24
          - rangeName: tf-test-secondary-range-update2
            ipCidrRange: 172.16.10.0/24
    default:
      type: gcp:serviceAccount:Account
      properties:
        accountId: poc-service-account-id
        displayName: Pulumi Service Account
    primary:
      type: gcp:container:Cluster
      properties:
        location: us-central1
        removeDefaultNodePool: true
        initialNodeCount: 1
        network: my-network
        subnetwork: ${["network-with-private-secondary-ip-ranges"].id}
  outputs:
    id: ${vpcNetwork.id}
    endpoint: ${primary.endpoint}
Apply both YAML files:
kubectl apply -f program.yaml and kubectl apply -f stack.yaml
Wait until everything comes up and works.
Start the cleanup; the Stack is not deleted:
kubectl delete -f program.yaml && kubectl delete -f stack.yaml
Output of pulumi about
Additional context
No response
Contributing
Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).
@elvis-cai Thanks for reporting this issue, and apologies for the inconvenience. This does sound like a bug in how we're handling our finalizers, since deletion of the Program should fail while it is still being managed by a Stack.
Where can one look in the code to try to fix this issue?
Placing a finalizer on the Program to prevent deletion while any Stack still references it is a really good idea. One challenge is that multiple Stacks might reference the same Program, so getting that bookkeeping correct could be tricky.
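Not from the thread, but until the operator enforces this itself, one manual stopgap is to put a finalizer on the Program by hand: with a finalizer present, kubectl delete only marks the object for deletion, and it stays around until the finalizer list is cleared. A minimal sketch, assuming the Program from the reproduction above has no other finalizers (the finalizer name is invented for illustration):

# Guard the Program so a kubectl delete only sets a deletion timestamp.
kubectl patch programs.pulumi.com my-network --type=merge \
  -p '{"metadata":{"finalizers":["example.com/referenced-by-stack"]}}'

# Once the Stack has finished destroying, release the Program so the pending delete completes.
kubectl patch programs.pulumi.com my-network --type=merge \
  -p '{"metadata":{"finalizers":[]}}'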
As a workaround, I would suggest that it be handled at a higher level, e.g. by using a dependsOn option if you're creating the objects using a Pulumi program.
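If the objects are managed with plain kubectl rather than a Pulumi program, a rough equivalent is to sequence the deletion yourself. A sketch, using the file names from the reproduction above:

# Delete the Stack first; kubectl delete waits for finalizers by default, so this
# returns only after the operator has run pulumi destroy, which still needs the Program.
kubectl delete -f stack.yaml --wait=true

# Only then remove the Program.
kubectl delete -f program.yaml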
Hello, I'm facing a similar kind of issue without using a Program: my Stack is invalid for some reason and the workspace is crashing (that's okay, I'm in a test phase). However, when I try to clean up with kubectl delete, nothing happens: the Stack keeps waiting for the workspace to become ready (waiting for workspace readiness), even when I force the deletion.
I think the operator should detect when a Stack is being deleted by a user (or by automation) and release the protection.
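Not from the thread, but the usual last-resort escape hatch for a Stack stuck on its finalizer is to clear the finalizer manually. Be aware that this skips the operator's cleanup entirely (no pulumi destroy runs), so any cloud resources the stack created are left behind. A sketch, using the Stack name from the reproduction above as a placeholder:

# Inspect which finalizers are blocking the deletion.
kubectl get stacks.pulumi.com vpc-stack -o jsonpath='{.metadata.finalizers}'

# Clear them so the pending delete can complete (skips pulumi destroy).
kubectl patch stacks.pulumi.com vpc-stack --type=merge \
  -p '{"metadata":{"finalizers":[]}}'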