
Cannot delete stack correctly when using an inline program

elvis-cai opened this issue 2 years ago

What happened?

We are using an inline Program (YAML) and a Stack in our GitOps repo to create networks and Kubernetes clusters, and it works perfectly. However, when we wanted to do some cleanup, we found that we cannot delete the Program and the Stack at the same time. Just wondering: should the Stack use ownerReferences to prevent the child Program from being deleted immediately?
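
For illustration, the kind of ownerReference the question refers to would look roughly like this on the Program (all values are illustrative, and the uid would have to be the Stack object's actual UID; note that an ownerReference drives garbage collection when the owner is deleted, but does not by itself block a direct kubectl delete of the Program):

apiVersion: pulumi.com/v1
kind: Program
metadata:
  name: my-network
  ownerReferences:
    - apiVersion: pulumi.com/v1alpha1
      kind: Stack
      name: vpc-stack
      uid: <uid of the live vpc-stack object>   # must match the Stack's actual UID
      blockOwnerDeletion: true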

Here's the log from the operator:

{"level":"error","ts":"2023-05-03T05:55:44.685Z","logger":"controller_stack","msg":"Failed to update Stack","Request.Namespace":"infra","Request.Name":"vpc-stack","Stack.Name":"organization/my-network/dev","error":"unable to retrieve program for stack","stacktrace":"github.com/pulumi/pulumi-kubernetes-operator/pkg/controller/stack.(*ReconcileStack).Reconcile\n\t/home/runner/work/pulumi-kubernetes-operator/pulumi-kubernetes-operator/pkg/controller/stack/stack_controller.go:544\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214"}

Expected Behavior

The Stack should be cleaned up in Kubernetes, but it cannot be, because the operator can no longer find the Program.

Steps to reproduce

cat stack.yaml
apiVersion: pulumi.com/v1alpha1
kind: Stack
metadata:
  name: vpc-stack
spec:
  backend: gs://my-bucket-00f7aae
  stack: organization/my-network/dev
  destroyOnFinalize: true
  # Drift detection
  continueResyncOnCommitMatch: true
  resyncFrequencySeconds: 60
  refresh: true
  programRef:
    name: my-network
  envRefs:
    PULUMI_CONFIG_PASSPHRASE:
      type: Literal
      literal:
        value: "" 
  config:
    gcp:project: my-gcp-project
    gcp:region: us-east4
    gcp:zone: us-east4

cat program.yaml
apiVersion: pulumi.com/v1
kind: Program
metadata:
  name: my-network
program:
  resources:
    vpcNetwork3:
      type: gcp:compute:Network
      properties:
        autoCreateSubnetworks: false
        name: my-network3
    vpcNetwork2:
      type: gcp:compute:Network
      properties:
        autoCreateSubnetworks: false
        name: my-network2
    vpcNetwork:
      type: gcp:compute:Network
      properties:
        autoCreateSubnetworks: false
        name: my-network
    network-with-private-secondary-ip-ranges:
      type: gcp:compute:Subnetwork
      properties:
        ipCidrRange: 10.2.0.0/16
        region: us-central1
        network: ${["vpcNetwork"].id}
        secondaryIpRanges:
          - rangeName: tf-test-secondary-range-update1
            ipCidrRange: 192.168.10.0/24
          - rangeName: tf-test-secondary-range-update2
            ipCidrRange: 172.16.10.0/24
    default:
      type: gcp:serviceAccount:Account
      properties:
        accountId: poc-service-account-id
        displayName: Pulumi Service Account
    primary:
      type: gcp:container:Cluster
      properties:
        location: us-central1
        removeDefaultNodePool: true
        initialNodeCount: 1
        network: my-network
        subnetwork: ${["network-with-private-secondary-ip-ranges"].id}
        
  outputs:
    id: ${vpcNetwork.id}
    endpoint: ${primary.endpoint}

Apply both YAML files with kubectl apply -f program.yaml and kubectl apply -f stack.yaml, then wait until the stack update succeeds.
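
To make "wait until it succeeds" concrete, one way to block until the Stack reports a successful update (assuming the Stack status exposes .status.lastUpdate.state, as recent operator versions do) is:

kubectl wait --for=jsonpath='{.status.lastUpdate.state}'=succeeded stacks.pulumi.com/vpc-stack --timeout=15m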

Start the cleanup with kubectl delete -f program.yaml && kubectl delete -f stack.yaml; the Stack does not get deleted.
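
Note: since the error above is "unable to retrieve program for stack", deleting in the reverse order and letting the Stack finish destroying before removing the Program should presumably avoid it, e.g.:

kubectl delete -f stack.yaml
kubectl wait --for=delete stacks.pulumi.com/vpc-stack --timeout=15m
kubectl delete -f program.yaml

But ideally the operator would handle either order.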

Output of pulumi about

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

elvis-cai · May 03 '23 07:05

@elvis-cai Thanks for reporting this issue, and apologies for the inconvenience you are facing. This does sound like a bug in how we're handling our finalizers, as the deletion of the Program should fail while it is being managed by a Stack.

rquitales · May 05 '23 18:05

Where can one look in the code to try to fix this issue?

terekete · May 14 '24 01:05

Placing a finalizer on the Program to prevent deletion while any Stack has a reference to it is a really good idea. One challenge is that multiple Stacks might hold a reference to the same Program, so correctness might be tricky.
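
For illustration, a sketch of what that could look like on the Program object (the finalizer name here is hypothetical; the operator would add it while any Stack's programRef points at the Program and remove it once none does):

apiVersion: pulumi.com/v1
kind: Program
metadata:
  name: my-network
  finalizers:
    - pulumi.com/program-in-use   # hypothetical finalizer managed by the operator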

As a workaround, I would suggest that it be handled at a higher level, e.g. by using a dependsOn option if you're creating the objects using a Pulumi program.
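
For example, a minimal sketch of that approach in TypeScript, assuming both custom resources are created with @pulumi/kubernetes (resource names and the trimmed-down bodies are illustrative):

import * as k8s from "@pulumi/kubernetes";

// Program CR, equivalent to program.yaml (body trimmed for brevity).
const program = new k8s.apiextensions.CustomResource("my-network", {
    apiVersion: "pulumi.com/v1",
    kind: "Program",
    metadata: { name: "my-network" },
    program: { resources: { /* ...as in program.yaml... */ } },
});

// Stack CR, equivalent to stack.yaml (spec trimmed for brevity).
// dependsOn makes the Stack depend on the Program, so on destroy the Stack
// is deleted (and its destroy finalizer runs) before the Program is removed.
const stack = new k8s.apiextensions.CustomResource("vpc-stack", {
    apiVersion: "pulumi.com/v1alpha1",
    kind: "Stack",
    metadata: { name: "vpc-stack" },
    spec: {
        programRef: { name: "my-network" },
        stack: "organization/my-network/dev",
        destroyOnFinalize: true,
        // ...remaining fields from stack.yaml...
    },
}, { dependsOn: [program] });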

EronWright · Oct 29 '24 18:10

Hello, I'm facing a similar kind of issue without using a Program: my Stack is invalid for some reason and the workspace is crashing (that's okay, I'm in a test phase). However, when I try to clean up with kubectl delete, nothing happens: the Stack keeps waiting for the workspace to be up (waiting for workspace readiness), even when forcing the deletion.

I think the operator should be able to detect when a Stack is being deleted by a user (or an automation) and release the protection.
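
For anyone hitting the same thing: a last-resort escape hatch, not specific to this operator and one that may orphan whatever the stack created, is to clear the stuck Stack's finalizers manually so the delete can complete:

kubectl patch stacks.pulumi.com <stack-name> --type=merge -p '{"metadata":{"finalizers":null}}'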

robinlioret · Nov 17 '24 09:11