oci-cloud-controller-manager

OCI load balancer doesn't receive WorkRequest to be terminated

Open eaglejack85 opened this issue 4 years ago • 1 comments

We create 2 OCI load balancers with the following values in one of our Helm charts:

service:
  enabled: true
  type: LoadBalancer
  port: 443
  annotations:
    service.beta.kubernetes.io/oci-load-balancer-shape: "${wtss_shape}"
    service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"
    service.beta.kubernetes.io/oci-load-balancer-backend-protocol: "HTTP"
  sslEnabled: true

and one internal load balancer with these values:

service:
  enabled: true
  type: LoadBalancer
  port: 80
  annotations:
    service.beta.kubernetes.io/oci-load-balancer-internal: "true"
    service.beta.kubernetes.io/oci-load-balancer-shape: "${wtss_shape}"
    service.beta.kubernetes.io/oci-load-balancer-security-list-management-mode: "None"

Environments are provisioned with terraform 0.12.19 and the following set of terraform providers: oci 3.74.0, helm 0.10.2, kubernetes 1.9.0, tls 2.1.1.

Creation works without issues in both the dev and production OCI tenancies. The problem arises randomly when destroying the environment in the production OCI tenancy: one of the 3 load balancers is not destroyed, which prevents the load balancer subnet from being destroyed:

Error: Service error:Conflict. The Subnet ocid1.subnet.oc1.iad.aaaaaaaa6ksuldcwn7i52prronlm7hc2mmt2lwxeskjgpyxkq26zxwmiuexq references the VNIC ocid1.vnic.oc1.iad.abuwcljse6jxsl3fbx5vvgtjrud2xty2bmjmu6uwpt4cd7mkklgbpp77fqwq. You must remove the reference to proceed with this operation.. http status code: 409. Opc request id: 54b539a5d9b2a538a89cb3dccd9ce78b/EFF51FBB2571BC628C5383C03DD97940/57E2CAFEFE75FAD2DBE403D52079245C

This happens randomly, and only in the production OCI tenancy.

BUG REPORT

Versions

CCM Version:

Environment:

  • Kubernetes version (use kubectl version): v1.15.7
  • OS (e.g. from /etc/os-release): NAME="Oracle Linux Server" VERSION="7.6" ID="ol" VARIANT="Server" VARIANT_ID="server" VERSION_ID="7.6" PRETTY_NAME="Oracle Linux Server 7.6" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:oracle:linux:7:6:server" HOME_URL="https://linux.oracle.com/" BUG_REPORT_URL="https://bugzilla.oracle.com/" ORACLE_BUGZILLA_PRODUCT="Oracle Linux 7" ORACLE_BUGZILLA_PRODUCT_VERSION=7.6 ORACLE_SUPPORT_PRODUCT="Oracle Linux" ORACLE_SUPPORT_PRODUCT_VERSION=7.6

  • Kernel (e.g. uname -a): Linux vault01 4.14.35-1844.5.3.el7uek.x86_64 #2 SMP Wed May 8 21:50:52 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

What happened?

One of the OCI load balancers in the production tenancy, created by oci-cloud-controller-manager, randomly doesn't receive a work request to be terminated when terraform destroy removes the helm release.

What you expected to happen?

All OCI load balancers to be consistently terminated

How to reproduce it (as minimally and precisely as possible)?

Create a terraform module that deploys a helm release containing a k8s service of type LoadBalancer with service.beta.kubernetes.io/oci-load-balancer-* annotations, so that an OCI load balancer is created, then try destroying the module with terraform.

Anything else we need to know?

eaglejack85, Jun 05 '20 17:06

This looks more like a terraform issue, where terraform tries to delete the subnet before the LB has been deleted completely. Can you verify this with a newer version of oci-cloud-controller-manager and confirm whether the issue still persists?

mrunalpagnis, Aug 03 '22 05:08
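
One way to test the ordering hypothesis from the comment above is to wait until the load balancer is really gone before the subnet is destroyed. The following is only a minimal sketch, not something taken from this thread: `wait_until_absent` and its `list_ids` callback are hypothetical names, and the callback is assumed to return the IDs of the load balancers that currently exist (for example by wrapping an OCI SDK or CLI listing call).

```python
import time
from typing import Callable, Iterable


def wait_until_absent(
    list_ids: Callable[[], Iterable[str]],  # hypothetical callback: IDs of existing LBs
    target_id: str,
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll list_ids() until target_id is no longer reported.

    Returns True if the resource disappeared before the timeout,
    False if it was still present when the timeout expired.
    """
    deadline = clock() + timeout_s
    while True:
        # Check first, so an already-deleted resource returns immediately.
        if target_id not in set(list_ids()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

If this returns False for the leftover load balancer, the termination request likely never reached it; if it returns True after a delay, the 409 on the subnet is more plausibly a destroy-ordering problem on the terraform side.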