
[BUG] Contour pod is not deleted when disabled by user

hoyhbx opened this issue 2 years ago · 1 comment

Describe the bug

We are using Knative-Operator to configure our Knative-Serving instance.

We observed that when we try to disable Contour by setting spec.ingress.contour.enabled to false, the net-contour-controller deployment is neither deleted nor modified.

Expected behavior

We expect net-contour-controller to be deleted when spec.ingress.contour.enabled is set to false.

To Reproduce

  1. Deploy Knative-Operator v1.6.0 and create the knative-serving namespace.

  2. Apply the following CustomResource to deploy Knative-Serving with both Contour and Kourier enabled.

    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: test-cluster
      namespace: knative-serving
    spec:
      config:
        network:
          ingress-class: kourier.ingress.networking.knative.dev
      ingress:
        contour:
          enabled: true
        kourier:
          enabled: true
      version: 1.6.0
    
  3. Execute kubectl get deployment.apps/net-contour-controller -n knative-serving to observe that the contour controller has been created.

    $ kubectl get deployment.apps/net-contour-controller -n knative-serving 
    NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
    net-contour-controller   1/1     1            1           87s
    
  4. Apply the following CustomResource with Contour disabled to update the current Knative-Serving instance.

    apiVersion: operator.knative.dev/v1beta1
    kind: KnativeServing
    metadata:
      name: test-cluster
      namespace: knative-serving
    spec:
      config:
        network:
          ingress-class: kourier.ingress.networking.knative.dev
      ingress:
        contour:
          enabled: false
        kourier:
          enabled: true
      version: 1.6.0
    
  5. Repeat step 3 and observe that the contour controller is not deleted.

    $ kubectl get deployment.apps/net-contour-controller -n knative-serving 
    NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
    net-contour-controller   1/1     1            1           3m25s
    

Knative release version

v1.6.0

Additional context

  • Kubernetes Version
      Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
      Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.9", GitCommit:"6df4433e288edc9c40c2e344eb336f63fad45cd2", GitTreeState:"clean", BuildDate:"2022-05-19T19:53:08Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
    

Suspected Root Cause

Assuming that deleting the Contour controller when it is disabled is indeed the expected behavior, we suspect the bug is caused by the operator only filtering out, but never deleting, the disabled ingress.

During reconciliation, the operator uses the function Reconciler.filterDisabledIngresses to filter out the disabled ingress manifests. However, the disabled manifests are only excluded from the manifest to be applied; resources previously created from them are never deleted, leading to a resource leak.

Suggested Fix

In the function Reconciler.filterDisabledIngresses, the operator should check whether a disabled ingress was previously installed; if so, the operator should delete its resources from Kubernetes.

hoyhbx avatar Aug 10 '22 04:08 hoyhbx

@hoyhbx This is working as expected. We do not remove the existing ingress plugins' resources.

houshengbo avatar Aug 11 '22 14:08 houshengbo