gatekeeper-webhook-service wrong certificate after rotation or manifest re-apply

Open greenu opened this issue 3 years ago • 10 comments

What steps did you take and what happened: When refreshCertIfNeeded() has to rotate the gatekeeper-webhook-server-cert secret because its certificates are invalid, empty, or expired, calls to the webhook can fail with this error:

for: "STDIN": Internal error occurred: failed calling webhook "check-ignore-label.gatekeeper.sh": Post https://gatekeeper-webhook-service.gatekeeper-system.svc:443/v1/admitlabel?timeout=3s: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "gatekeeper-ca")

For some reason, Gatekeeper can get a conflict message from the Kubernetes API (the object has been modified; please apply your changes to the latest version and try again). It then retries with ExponentialBackoff until the update succeeds. During that whole time (I saw up to 1 minute), the webhook service keeps allowing connections to pods that serve an invalid certificate.
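To make the retry window concrete, here is a minimal sketch of an update that is retried with exponential backoff after a conflict. It is not Gatekeeper's code: `FakeStore`, `Conflict`, `update_with_backoff`, and every delay parameter are hypothetical stand-ins chosen for illustration.

```python
import time


class Conflict(Exception):
    """Stands in for the API server's "object has been modified" error."""


class FakeStore:
    """Hypothetical in-memory object guarded by a resource version."""

    def __init__(self, value, interfering_writes=0):
        self.value = value
        self.version = 1
        self._interfering_writes = interfering_writes

    def read(self):
        return self.value, self.version

    def update(self, value, based_on_version):
        if self._interfering_writes > 0:
            # Simulate another replica writing first and bumping the version.
            self._interfering_writes -= 1
            self.version += 1
        if based_on_version != self.version:
            raise Conflict("the object has been modified")
        self.value = value
        self.version += 1


def update_with_backoff(store, make_value, base_delay=0.5, factor=2.0,
                        max_attempts=6, sleep=time.sleep):
    """Retry a conflicting update; return the total seconds spent waiting."""
    waited = 0.0
    delay = base_delay
    for attempt in range(max_attempts):
        current, version = store.read()  # always rebuild from the latest version
        try:
            store.update(make_value(current), version)
            return waited
        except Conflict:
            if attempt == max_attempts - 1:
                raise
            sleep(delay)
            waited += delay
            delay *= factor
```

With three conflicts in a row and these made-up parameters, the caller waits 0.5 + 1 + 2 = 3.5 seconds before the write lands. The window described in this report would be the same kind of accumulated delay, only with different real values.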

What did you expect to happen: The readiness probe should prevent connections to pods that serve an invalid certificate.

Environment:

  • Gatekeeper version: v3.2.2
  • Kubernetes version: (use kubectl version): 1.18.9 on EKS
  • 3 replicas

greenu avatar Dec 24 '20 09:12 greenu

All this time (I saw up to 1 minute) webhook-service allows connect to pods with invalid certificate.

The webhook currently has a timeoutSeconds of 3 seconds, which applies when a call to the webhook fails. Is the readiness probe meant to bypass this 3-second timeout so that requests are not blocked?

ritazh avatar Jan 04 '21 19:01 ritazh

is there any progress on this issue? :)

DrackThor avatar Sep 17 '21 09:09 DrackThor

I'm a bit unclear on what the exact issue is.

  • Is the problem that there is a delay in certificate rotation after re-apply of the manifest?
  • Do the contents of gatekeeper-webhook-server-cert secret get wiped on manifest reapply? This should be the only way a cert needs to be re-rotated on manifest-reapply.
  • Is the problem that the liveness probe is succeeding on a webhook with an expired cert (which is what OP implies)?
  • Is the problem that the exponential backoff due to write conflicts is too high?

I guess most salient would be: What is the user-facing problem?

If you do X, what happens and how is that making Gatekeeper harder to use?

There are a couple of ways we can approach this, I think, but knowing the top-level issue will help us figure out which one would be most effective.

maxsmythe avatar Sep 21 '21 03:09 maxsmythe

Hi @maxsmythe, as for me, I was able to overcome the issue by updating to the most recent version of the Gatekeeper chart. During my migration to the most recent version (3.6.0), when I encountered the issue, I had a misconfiguration involving the newly introduced Template version v1 and the old v1beta1; I'm not sure whether this has something to do with the cause. I also did not get an error when I scaled the replicas to 1. I noticed the error message reported by @greenu when I redeployed Gatekeeper and tried to create a new namespace: it failed with that error.

DrackThor avatar Sep 21 '21 05:09 DrackThor

I am having exactly the same issue when using gatekeeper 3.6.0 with helm chart 3.6.0 after upgrading the helm release.

Below is my values.yaml:


  replicas: 2
  disableValidatingWebhook: false
  experimentalEnableMutation: true
  logDenies: true
  resourceQuota: false
  postInstall:
    labelNamespace:
      enabled: false
  image:
    repository: openpolicyagent/gatekeeper
    crdRepository: openpolicyagent/gatekeeper-crds
    release: v3.6.0
  secretAnnotations:
    strategy.spinnaker.io/versioned: 'false'
  controllerManager:
    resources:
      limits:
        cpu: 200m
        memory: 512Mi
      requests:
        cpu: 100m
        memory: 256Mi
  audit:
    resources:
      limits:
        cpu: 200m
        memory: 512Mi
      requests:
        cpu: 50m
        memory: 128Mi

Log output from gatekeeper controller:

[screenshot not preserved]

Log output from K8S apiserver:

[screenshot not preserved]

To find out what causes this issue, I investigated the certificates stored in the validatingwebhookconfiguration and in the gatekeeper secret.
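A quick way to compare certificates without reading full dumps is to fingerprint them. The sketch below uses only the Python standard library; the helper names are made up for illustration, and it assumes each PEM string holds exactly one certificate. Secret data values and the webhook caBundle come back base64-encoded from the API, hence the extra decoding helper.

```python
import base64
import hashlib
import ssl


def pem_fingerprint(pem_text):
    """SHA-256 fingerprint of a PEM string holding a single certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_text.strip())
    return hashlib.sha256(der).hexdigest()


def fingerprint_from_base64_field(b64_value):
    """Fingerprint a base64-encoded PEM value (Secret data or caBundle)."""
    return pem_fingerprint(base64.b64decode(b64_value).decode("ascii"))


def same_certificate(pem_a, pem_b):
    """True when both PEM strings encode byte-identical certificates."""
    return pem_fingerprint(pem_a) == pem_fingerprint(pem_b)
```

The certificate a pod actually serves could be fetched as PEM with `ssl.get_server_certificate((host, port))`, for example through a port-forward to the pod; the host and port to use are left to the reader.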

The CA certificate stored in validatingwebhookconfiguration is as below:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 0 (0x0)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Oct 27 10:19:06 2021 GMT
            Not After : Oct 25 11:19:06 2031 GMT
        Subject: O = gatekeeper, CN = gatekeeper-ca
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:a0:f1:d9:58:ec:b5:f9:b3:81:d4:aa:94:ab:cb:
                    63:b4:4c:3f:12:91:51:9a:2e:55:cc:e7:72:29:26:
                    63:3c:5f:ce:3d:5c:db:20:cb:4f:89:34:dd:76:4d:
                    4d:df:30:35:c7:f8:86:96:27:ec:d7:6d:16:90:60:
                    1d:09:7c:51:cf:3c:9f:f3:5b:c9:05:1d:68:31:63:
                    78:c7:a1:e7:41:5f:c5:85:af:a8:83:5c:18:be:15:
                    7b:fa:b8:87:60:bc:38:22:5a:f9:a2:a1:73:9b:e4:
                    7f:ed:a9:5b:94:d3:85:53:df:a7:88:78:aa:7a:e7:
                    48:53:f4:94:ce:2f:c7:37:1c:2a:9a:70:6d:7c:0c:
                    69:45:b9:b3:40:f9:3c:1f:7e:3c:e1:44:8a:4a:1a:
                    d1:03:64:7d:fd:6b:80:7f:35:85:5c:8a:bf:e3:71:
                    df:bb:95:69:2c:68:1f:09:b7:86:e6:8b:6c:2f:61:
                    4f:fb:cb:aa:94:fd:af:e3:45:de:3d:15:8d:eb:9b:
                    58:74:11:c6:18:31:b5:58:1f:6c:2a:39:f7:27:a2:
                    c5:2f:5c:fc:0d:08:c9:79:d6:9b:a5:c7:f2:99:81:
                    41:22:60:65:8b:d9:6c:c0:e5:93:e6:8f:09:64:4a:
                    02:76:97:4a:f1:c8:48:cd:67:fe:6f:00:9b:84:3c:
                    41:f3
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment, Certificate Sign
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier: 
                29:07:5D:26:D4:CD:A5:A9:F5:30:93:93:49:1D:7D:EA:99:71:76:D6
            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-ca
    Signature Algorithm: sha256WithRSAEncryption
         7b:97:09:3e:9f:89:d8:05:9a:e7:cd:08:cc:11:d1:c6:ea:75:
         59:c2:2c:a6:4a:64:80:40:d5:b8:7c:bd:74:75:91:7a:99:99:
         41:f6:d7:16:99:bb:14:37:0a:1f:19:ab:b8:0c:f6:35:37:8e:
         16:a4:18:56:ed:40:13:98:2a:4f:57:5f:12:04:cc:89:29:42:
         56:51:88:33:c7:ef:de:e1:c4:15:4a:1c:e5:f0:81:91:60:8f:
         ad:58:ee:fc:9f:b7:b1:b5:2d:b1:51:eb:0e:0d:3e:77:a8:a3:
         e1:29:7b:44:37:64:22:3c:72:c7:93:46:6f:73:a5:db:dd:c8:
         13:b5:91:d0:d9:1a:64:c0:e8:dd:8a:db:70:5b:5c:ec:ce:63:
         b6:21:a6:96:ef:9f:da:b3:86:41:69:aa:a1:f8:51:9d:2b:7c:
         d8:50:b9:84:d9:bf:63:63:90:d6:fc:c9:72:a9:93:56:42:20:
         c4:33:61:8a:e7:4d:d4:2a:3d:76:b1:b4:9e:90:63:9b:17:ad:
         a9:35:4d:55:18:af:30:db:9c:96:b1:16:66:09:55:37:11:18:
         04:ba:af:f7:36:b8:09:a4:2d:16:58:5c:01:aa:bd:dd:4e:31:
         7a:53:95:b7:16:98:39:fa:db:3c:8e:fe:f2:f5:b1:3c:01:d9:
         d8:a2:6f:2d
-----BEGIN CERTIFICATE-----
MIIDMTCCAhmgAwIBAgIBADANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMTAyNzEwMTkwNloX
DTMxMTAyNTExMTkwNlowLTETMBEGA1UEChMKZ2F0ZWtlZXBlcjEWMBQGA1UEAxMN
Z2F0ZWtlZXBlci1jYTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKDx
2VjstfmzgdSqlKvLY7RMPxKRUZouVczncikmYzxfzj1c2yDLT4k03XZNTd8wNcf4
hpYn7NdtFpBgHQl8Uc88n/NbyQUdaDFjeMeh50FfxYWvqINcGL4Ve/q4h2C8OCJa
+aKhc5vkf+2pW5TThVPfp4h4qnrnSFP0lM4vxzccKppwbXwMaUW5s0D5PB9+POFE
ikoa0QNkff1rgH81hVyKv+Nx37uVaSxoHwm3huaLbC9hT/vLqpT9r+NF3j0Vjeub
WHQRxhgxtVgfbCo59yeixS9c/A0IyXnWm6XH8pmBQSJgZYvZbMDlk+aPCWRKAnaX
SvHISM1n/m8Am4Q8QfMCAwEAAaNcMFowDgYDVR0PAQH/BAQDAgKkMA8GA1UdEwEB
/wQFMAMBAf8wHQYDVR0OBBYEFCkHXSbUzaWp9TCTk0kdfeqZcXbWMBgGA1UdEQQR
MA+CDWdhdGVrZWVwZXItY2EwDQYJKoZIhvcNAQELBQADggEBAHuXCT6fidgFmufN
CMwR0cbqdVnCLKZKZIBA1bh8vXR1kXqZmUH21xaZuxQ3Ch8Zq7gM9jU3jhakGFbt
QBOYKk9XXxIEzIkpQlZRiDPH797hxBVKHOXwgZFgj61Y7vyft7G1LbFR6w4NPneo
o+Epe0Q3ZCI8cseTRm9zpdvdyBO1kdDZGmTA6N2K23BbXOzOY7Yhppbvn9qzhkFp
qqH4UZ0rfNhQuYTZv2NjkNb8yXKpk1ZCIMQzYYrnTdQqPXaxtJ6QY5sXrak1TVUY
rzDbnJaxFmYJVTcRGAS6r/c2uAmkLRZYXAGqvd1OMXpTlbcWmDn62zyO/vL1sTwB
2diiby0=
-----END CERTIFICATE-----

The CA certificate stored in gatekeeper-webhook-server-cert:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 0 (0x0)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Oct 27 10:19:06 2021 GMT
            Not After : Oct 25 11:19:06 2031 GMT
        Subject: O = gatekeeper, CN = gatekeeper-ca
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:a0:f1:d9:58:ec:b5:f9:b3:81:d4:aa:94:ab:cb:
                    63:b4:4c:3f:12:91:51:9a:2e:55:cc:e7:72:29:26:
                    63:3c:5f:ce:3d:5c:db:20:cb:4f:89:34:dd:76:4d:
                    4d:df:30:35:c7:f8:86:96:27:ec:d7:6d:16:90:60:
                    1d:09:7c:51:cf:3c:9f:f3:5b:c9:05:1d:68:31:63:
                    78:c7:a1:e7:41:5f:c5:85:af:a8:83:5c:18:be:15:
                    7b:fa:b8:87:60:bc:38:22:5a:f9:a2:a1:73:9b:e4:
                    7f:ed:a9:5b:94:d3:85:53:df:a7:88:78:aa:7a:e7:
                    48:53:f4:94:ce:2f:c7:37:1c:2a:9a:70:6d:7c:0c:
                    69:45:b9:b3:40:f9:3c:1f:7e:3c:e1:44:8a:4a:1a:
                    d1:03:64:7d:fd:6b:80:7f:35:85:5c:8a:bf:e3:71:
                    df:bb:95:69:2c:68:1f:09:b7:86:e6:8b:6c:2f:61:
                    4f:fb:cb:aa:94:fd:af:e3:45:de:3d:15:8d:eb:9b:
                    58:74:11:c6:18:31:b5:58:1f:6c:2a:39:f7:27:a2:
                    c5:2f:5c:fc:0d:08:c9:79:d6:9b:a5:c7:f2:99:81:
                    41:22:60:65:8b:d9:6c:c0:e5:93:e6:8f:09:64:4a:
                    02:76:97:4a:f1:c8:48:cd:67:fe:6f:00:9b:84:3c:
                    41:f3
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment, Certificate Sign
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier: 
                29:07:5D:26:D4:CD:A5:A9:F5:30:93:93:49:1D:7D:EA:99:71:76:D6
            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-ca
    Signature Algorithm: sha256WithRSAEncryption
         7b:97:09:3e:9f:89:d8:05:9a:e7:cd:08:cc:11:d1:c6:ea:75:
         59:c2:2c:a6:4a:64:80:40:d5:b8:7c:bd:74:75:91:7a:99:99:
         41:f6:d7:16:99:bb:14:37:0a:1f:19:ab:b8:0c:f6:35:37:8e:
         16:a4:18:56:ed:40:13:98:2a:4f:57:5f:12:04:cc:89:29:42:
         56:51:88:33:c7:ef:de:e1:c4:15:4a:1c:e5:f0:81:91:60:8f:
         ad:58:ee:fc:9f:b7:b1:b5:2d:b1:51:eb:0e:0d:3e:77:a8:a3:
         e1:29:7b:44:37:64:22:3c:72:c7:93:46:6f:73:a5:db:dd:c8:
         13:b5:91:d0:d9:1a:64:c0:e8:dd:8a:db:70:5b:5c:ec:ce:63:
         b6:21:a6:96:ef:9f:da:b3:86:41:69:aa:a1:f8:51:9d:2b:7c:
         d8:50:b9:84:d9:bf:63:63:90:d6:fc:c9:72:a9:93:56:42:20:
         c4:33:61:8a:e7:4d:d4:2a:3d:76:b1:b4:9e:90:63:9b:17:ad:
         a9:35:4d:55:18:af:30:db:9c:96:b1:16:66:09:55:37:11:18:
         04:ba:af:f7:36:b8:09:a4:2d:16:58:5c:01:aa:bd:dd:4e:31:
         7a:53:95:b7:16:98:39:fa:db:3c:8e:fe:f2:f5:b1:3c:01:d9:
         d8:a2:6f:2d
-----BEGIN CERTIFICATE-----
MIIDMTCCAhmgAwIBAgIBADANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMTAyNzEwMTkwNloX
DTMxMTAyNTExMTkwNlowLTETMBEGA1UEChMKZ2F0ZWtlZXBlcjEWMBQGA1UEAxMN
Z2F0ZWtlZXBlci1jYTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAKDx
2VjstfmzgdSqlKvLY7RMPxKRUZouVczncikmYzxfzj1c2yDLT4k03XZNTd8wNcf4
hpYn7NdtFpBgHQl8Uc88n/NbyQUdaDFjeMeh50FfxYWvqINcGL4Ve/q4h2C8OCJa
+aKhc5vkf+2pW5TThVPfp4h4qnrnSFP0lM4vxzccKppwbXwMaUW5s0D5PB9+POFE
ikoa0QNkff1rgH81hVyKv+Nx37uVaSxoHwm3huaLbC9hT/vLqpT9r+NF3j0Vjeub
WHQRxhgxtVgfbCo59yeixS9c/A0IyXnWm6XH8pmBQSJgZYvZbMDlk+aPCWRKAnaX
SvHISM1n/m8Am4Q8QfMCAwEAAaNcMFowDgYDVR0PAQH/BAQDAgKkMA8GA1UdEwEB
/wQFMAMBAf8wHQYDVR0OBBYEFCkHXSbUzaWp9TCTk0kdfeqZcXbWMBgGA1UdEQQR
MA+CDWdhdGVrZWVwZXItY2EwDQYJKoZIhvcNAQELBQADggEBAHuXCT6fidgFmufN
CMwR0cbqdVnCLKZKZIBA1bh8vXR1kXqZmUH21xaZuxQ3Ch8Zq7gM9jU3jhakGFbt
QBOYKk9XXxIEzIkpQlZRiDPH797hxBVKHOXwgZFgj61Y7vyft7G1LbFR6w4NPneo
o+Epe0Q3ZCI8cseTRm9zpdvdyBO1kdDZGmTA6N2K23BbXOzOY7Yhppbvn9qzhkFp
qqH4UZ0rfNhQuYTZv2NjkNb8yXKpk1ZCIMQzYYrnTdQqPXaxtJ6QY5sXrak1TVUY
rzDbnJaxFmYJVTcRGAS6r/c2uAmkLRZYXAGqvd1OMXpTlbcWmDn62zyO/vL1sTwB
2diiby0=
-----END CERTIFICATE-----

The server certificate stored in gatekeeper-webhook-server-cert:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Oct 27 10:19:06 2021 GMT
            Not After : Oct 25 11:19:06 2031 GMT
        Subject: CN = gatekeeper-webhook-service.gatekeeper.svc
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:c2:d0:99:22:0b:01:c4:4b:6f:3b:2c:b0:55:9b:
                    b8:27:c3:94:c9:0c:fc:a5:b4:a8:3a:28:99:d9:26:
                    3c:22:f9:a4:86:3d:f7:46:95:7c:fe:5d:f8:62:78:
                    cb:b1:ff:ba:fb:cb:80:81:48:97:98:8b:24:db:52:
                    45:32:db:48:5e:de:a2:53:01:62:49:7d:d8:ea:17:
                    f7:c3:c9:50:f4:92:04:fa:ab:50:d0:ec:ed:b5:68:
                    38:a4:6b:7f:1e:dd:8c:26:80:b2:c3:f9:d1:6d:fd:
                    b6:ca:03:22:27:7f:bd:f7:e0:62:68:6f:37:d8:c1:
                    99:04:b2:ee:66:1c:d9:37:13:a6:ae:c3:24:5b:3c:
                    1f:d3:a0:50:ee:cb:84:32:15:44:02:2b:7b:6e:ac:
                    ba:92:07:45:89:c2:c5:6b:a2:a8:b0:10:f7:20:5e:
                    da:b4:f9:70:ab:00:86:10:24:d8:ab:d7:e9:e5:46:
                    14:2b:71:48:54:ce:8a:6d:60:a4:a2:a8:b9:14:d5:
                    4f:dc:01:af:89:91:f8:46:ff:a8:4b:06:cd:68:a4:
                    2a:0e:2a:a0:73:06:66:25:ee:27:85:4b:cf:96:a3:
                    82:77:ad:22:f0:5d:d4:c2:6f:2e:38:55:4d:e5:fa:
                    45:70:1c:6c:a8:a0:e2:05:d1:77:0f:62:ba:4b:fe:
                    20:a5
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier: 
                keyid:29:07:5D:26:D4:CD:A5:A9:F5:30:93:93:49:1D:7D:EA:99:71:76:D6

            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-webhook-service.gatekeeper.svc
    Signature Algorithm: sha256WithRSAEncryption
         9e:8c:da:c8:e4:3b:1b:98:28:af:73:e8:da:2b:b1:cb:81:1f:
         4b:f1:8d:fb:cb:cf:a4:3b:83:6d:8f:90:a1:ad:31:b8:e4:d7:
         00:80:91:a1:23:8e:38:91:44:9c:21:7e:b1:3e:6e:aa:9d:f8:
         a6:d6:86:e0:dd:d2:b9:18:34:2d:8b:92:12:5d:75:d2:03:b9:
         1c:ba:3a:a9:d3:c1:d9:f3:cc:55:99:a8:23:3b:bd:77:4c:e0:
         9a:65:d2:7d:9a:05:14:49:4f:a6:7f:c2:ff:fa:85:f6:75:a6:
         65:9d:21:8b:e5:d6:e4:31:95:5c:4a:67:98:3f:59:bf:f0:ff:
         12:7d:ae:15:32:c8:cb:25:d8:27:43:b5:b5:1e:79:e2:34:df:
         59:b7:83:15:c2:38:c9:31:28:d6:c0:3e:18:90:ab:eb:8f:94:
         16:52:fc:12:d7:ca:45:8e:4b:59:c0:1c:8e:67:34:91:1a:ac:
         8b:4c:0c:f1:29:cf:3f:28:54:74:c2:d5:d7:1b:b0:6c:e0:d2:
         b5:38:87:52:79:8c:15:8d:0c:13:73:f2:53:38:ec:c4:46:48:
         5a:07:d1:3f:0e:ce:a4:33:f5:c3:1d:25:f4:04:b3:a4:9d:b7:
         cc:6b:24:15:d2:7f:95:12:95:59:13:08:fc:0a:ba:c0:19:8e:
         a7:8b:56:f3
-----BEGIN CERTIFICATE-----
MIIDajCCAlKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMTAyNzEwMTkwNloX
DTMxMTAyNTExMTkwNlowNDEyMDAGA1UEAxMpZ2F0ZWtlZXBlci13ZWJob29rLXNl
cnZpY2UuZ2F0ZWtlZXBlci5zdmMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQDC0JkiCwHES287LLBVm7gnw5TJDPyltKg6KJnZJjwi+aSGPfdGlXz+Xfhi
eMux/7r7y4CBSJeYiyTbUkUy20he3qJTAWJJfdjqF/fDyVD0kgT6q1DQ7O21aDik
a38e3YwmgLLD+dFt/bbKAyInf7334GJobzfYwZkEsu5mHNk3E6auwyRbPB/ToFDu
y4QyFUQCK3turLqSB0WJwsVroqiwEPcgXtq0+XCrAIYQJNir1+nlRhQrcUhUzopt
YKSiqLkU1U/cAa+JkfhG/6hLBs1opCoOKqBzBmYl7ieFS8+Wo4J3rSLwXdTCby44
VU3l+kVwHGyooOIF0XcPYrpL/iClAgMBAAGjgY0wgYowDgYDVR0PAQH/BAQDAgWg
MBMGA1UdJQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAU
KQddJtTNpan1MJOTSR196plxdtYwNAYDVR0RBC0wK4IpZ2F0ZWtlZXBlci13ZWJo
b29rLXNlcnZpY2UuZ2F0ZWtlZXBlci5zdmMwDQYJKoZIhvcNAQELBQADggEBAJ6M
2sjkOxuYKK9z6NorscuBH0vxjfvLz6Q7g22PkKGtMbjk1wCAkaEjjjiRRJwhfrE+
bqqd+KbWhuDd0rkYNC2LkhJdddIDuRy6OqnTwdnzzFWZqCM7vXdM4Jpl0n2aBRRJ
T6Z/wv/6hfZ1pmWdIYvl1uQxlVxKZ5g/Wb/w/xJ9rhUyyMsl2CdDtbUeeeI031m3
gxXCOMkxKNbAPhiQq+uPlBZS/BLXykWOS1nAHI5nNJEarItMDPEpzz8oVHTC1dcb
sGzg0rU4h1J5jBWNDBNz8lM47MRGSFoH0T8OzqQz9cMdJfQEs6Sdt8xrJBXSf5US
lVkTCPwKusAZjqeLVvM=
-----END CERTIFICATE-----

The actual certificate served by gatekeeper controller:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Sep 14 08:24:38 2021 GMT
            Not After : Sep 12 09:24:38 2031 GMT
        Subject: CN = gatekeeper-webhook-service.gatekeeper.svc
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:d7:33:7c:d5:e5:f7:16:aa:46:18:9d:df:06:fc:
                    52:ec:58:49:ea:aa:36:29:de:8e:dd:58:04:b0:f1:
                    cf:7f:f2:50:c0:db:6a:f8:95:fd:c4:8d:ea:5e:e8:
                    e4:af:99:bc:97:1f:97:c1:e0:af:54:bf:96:19:a9:
                    96:bd:ac:68:a9:78:97:cd:18:37:c2:d8:22:ef:79:
                    2b:52:36:e1:f4:5d:56:14:45:04:0d:00:12:03:42:
                    e6:10:86:3a:30:0b:f4:9b:83:85:a5:5f:e7:86:0a:
                    fd:86:fd:f7:92:ed:75:f8:a0:4b:a3:83:30:7e:62:
                    54:02:0e:57:9c:80:79:75:72:1e:ed:0b:3a:d2:64:
                    81:a3:96:e7:90:ae:03:aa:2a:37:46:8c:05:14:89:
                    f8:ef:10:a1:93:87:08:8f:a2:ff:59:f2:7c:92:8e:
                    c4:3f:8d:0b:a2:dd:95:14:53:0c:21:ae:bb:1a:19:
                    4f:98:f0:6a:83:80:fa:da:c2:8f:b1:ec:be:8d:85:
                    ee:b0:e6:2d:f3:ee:36:4d:75:05:e8:44:e4:d9:b1:
                    80:69:bb:6c:10:d6:eb:8f:7c:9a:f7:58:26:60:31:
                    ee:3d:e6:47:fa:d0:6c:d8:10:64:db:f5:12:32:11:
                    2f:0c:a2:8f:e1:fa:16:30:68:2b:3a:92:57:f0:45:
                    97:91
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier: 
                keyid:E9:BD:70:69:AD:D4:01:55:70:E9:7A:86:49:9D:2A:D3:0E:2A:4D:0A

            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-webhook-service.gatekeeper.svc
    Signature Algorithm: sha256WithRSAEncryption
         2e:cd:8c:a2:fc:dc:5e:2a:37:12:53:12:4e:6f:24:d2:4f:02:
         32:d4:39:5b:0b:9f:bb:af:20:b3:bd:d7:87:00:71:52:37:a7:
         ab:ae:82:26:71:dc:9e:18:9c:64:45:a3:b3:e9:35:f7:2f:35:
         c6:3f:91:a6:5f:9b:8f:19:df:b2:81:da:e4:32:05:f6:7f:c7:
         28:ef:b1:00:b9:c2:a1:31:32:2c:e3:6e:d0:b0:bf:27:0d:78:
         9c:ee:27:c2:51:24:d6:33:a3:c5:ef:8b:d1:f9:8c:48:64:21:
         16:df:df:97:ce:74:56:a8:dc:ed:c2:63:c8:c9:e1:a4:70:b9:
         24:65:4f:1f:51:66:18:19:d7:09:eb:97:b9:52:28:05:eb:9e:
         1c:58:ba:9b:dd:be:f1:d6:88:61:fb:ea:2c:c9:0d:76:3d:90:
         4c:06:d6:eb:88:12:23:78:10:32:9d:63:c9:11:32:07:2b:95:
         bd:85:36:88:ab:3c:1c:51:59:f3:f6:70:0f:fd:76:5c:5b:a9:
         3f:20:03:59:a6:a8:3f:91:11:1a:db:1d:5d:85:78:58:de:8d:
         67:f9:46:ec:4b:c9:b9:57:93:ea:97:5a:64:24:be:10:a9:ae:
         0e:b8:5a:74:7f:47:75:f3:c5:df:77:13:a0:52:a1:8a:28:61:
         65:23:30:51
-----BEGIN CERTIFICATE-----
MIIDajCCAlKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMDkxNDA4MjQzOFoX
DTMxMDkxMjA5MjQzOFowNDEyMDAGA1UEAxMpZ2F0ZWtlZXBlci13ZWJob29rLXNl
cnZpY2UuZ2F0ZWtlZXBlci5zdmMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQDXM3zV5fcWqkYYnd8G/FLsWEnqqjYp3o7dWASw8c9/8lDA22r4lf3Ejepe
6OSvmbyXH5fB4K9Uv5YZqZa9rGipeJfNGDfC2CLveStSNuH0XVYURQQNABIDQuYQ
hjowC/Sbg4WlX+eGCv2G/feS7XX4oEujgzB+YlQCDlecgHl1ch7tCzrSZIGjlueQ
rgOqKjdGjAUUifjvEKGThwiPov9Z8nySjsQ/jQui3ZUUUwwhrrsaGU+Y8GqDgPra
wo+x7L6Nhe6w5i3z7jZNdQXoROTZsYBpu2wQ1uuPfJr3WCZgMe495kf60GzYEGTb
9RIyES8Moo/h+hYwaCs6klfwRZeRAgMBAAGjgY0wgYowDgYDVR0PAQH/BAQDAgWg
MBMGA1UdJQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAU
6b1waa3UAVVw6XqGSZ0q0w4qTQowNAYDVR0RBC0wK4IpZ2F0ZWtlZXBlci13ZWJo
b29rLXNlcnZpY2UuZ2F0ZWtlZXBlci5zdmMwDQYJKoZIhvcNAQELBQADggEBAC7N
jKL83F4qNxJTEk5vJNJPAjLUOVsLn7uvILO914cAcVI3p6uugiZx3J4YnGRFo7Pp
NfcvNcY/kaZfm48Z37KB2uQyBfZ/xyjvsQC5wqExMizjbtCwvycNeJzuJ8JRJNYz
o8Xvi9H5jEhkIRbf35fOdFao3O3CY8jJ4aRwuSRlTx9RZhgZ1wnrl7lSKAXrnhxY
upvdvvHWiGH76izJDXY9kEwG1uuIEiN4EDKdY8kRMgcrlb2FNoirPBxRWfP2cA/9
dlxbqT8gA1mmqD+RERrbHV2FeFjejWf5RuxLyblXk+qXWmQkvhCprg64WnR/R3Xz
xd93E6BSoYooYWUjMFE=
-----END CERTIFICATE-----

This shows that the CA stored in the validatingwebhookconfiguration and the CA in the secret gatekeeper-webhook-server-cert are the same, while the server certificate served by the gatekeeper controller differs from the server certificate stored in the secret. The served certificate's Authority Key Identifier also differs from the CA's Subject Key Identifier, so it was signed by a different CA key.
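The same comparison can be scripted against the text dumps above. This is a rough sketch that reads the key identifiers out of `openssl x509 -text` style output with regular expressions; the function names are made up, and the "keyid:" prefix is treated as optional in case the tool's output format differs. The three snippets are copied from the dumps in this comment.

```python
import re

_KEY_ID = r"((?:[0-9A-F]{2}:)+[0-9A-F]{2})"


def subject_key_id(openssl_text):
    """Return the Subject Key Identifier from an openssl text dump."""
    match = re.search(r"X509v3 Subject Key Identifier:\s*" + _KEY_ID, openssl_text)
    return match.group(1) if match else None


def authority_key_id(openssl_text):
    """Return the Authority Key Identifier; the "keyid:" prefix is optional."""
    match = re.search(
        r"X509v3 Authority Key Identifier:\s*(?:keyid:)?" + _KEY_ID, openssl_text
    )
    return match.group(1) if match else None


def issued_by(ca_text, leaf_text):
    """True when the leaf's authority key ID equals the CA's subject key ID."""
    ski = subject_key_id(ca_text)
    return ski is not None and ski == authority_key_id(leaf_text)


CA_IN_SECRET = """
            X509v3 Subject Key Identifier:
                29:07:5D:26:D4:CD:A5:A9:F5:30:93:93:49:1D:7D:EA:99:71:76:D6
"""
CERT_IN_SECRET = """
            X509v3 Authority Key Identifier:
                keyid:29:07:5D:26:D4:CD:A5:A9:F5:30:93:93:49:1D:7D:EA:99:71:76:D6
"""
CERT_SERVED = """
            X509v3 Authority Key Identifier:
                keyid:E9:BD:70:69:AD:D4:01:55:70:E9:7A:86:49:9D:2A:D3:0E:2A:4D:0A
"""

print(issued_by(CA_IN_SECRET, CERT_IN_SECRET))
print(issued_by(CA_IN_SECRET, CERT_SERVED))
```

The stored server certificate chains to the stored CA, while the served one does not, which is consistent with the x509 "signed by unknown authority" error.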

I also found that all gatekeeper controller pods were started before the newer certificates were generated. It seems they never reloaded the new certificates into memory to serve webhook requests, which caused the issue.
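The validity timestamps in the dumps tell the same story. As a small sketch (the helper name is made up), the standard library's `ssl.cert_time_to_seconds` can parse the "Not Before" values, which shows how much older the served certificate is than the CA currently in the secret:

```python
import ssl


def seconds_between(earlier_not_before, later_not_before):
    """Seconds from one certificate "Not Before" timestamp to another."""
    return (ssl.cert_time_to_seconds(later_not_before)
            - ssl.cert_time_to_seconds(earlier_not_before))


# Values copied from the dumps above.
SERVED_NOT_BEFORE = "Sep 14 08:24:38 2021 GMT"
CA_NOT_BEFORE = "Oct 27 10:19:06 2021 GMT"

gap = seconds_between(SERVED_NOT_BEFORE, CA_NOT_BEFORE)
print(gap > 0, gap // 86400)
```

The served certificate was issued 43 days before the current CA existed, so it cannot have been signed by it.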

ethernoy avatar Nov 15 '21 09:11 ethernoy

I just found that another cluster in my environment has had the same issue for days. Details below:

In this environment, there are two gatekeeper controller manager pods. The values.yaml is the same as above (i.e. 2 replicas, using internal cert rotation). These two pods ran normally for 24 hours after start-up, then ran into the bad certificate issue, which has persisted for 5 days so far. Logs are attached (note that they are sorted in reverse order):

gatekeeper-controller-manager-7cb847b4f5-qf65k:

[two screenshots not preserved]

gatekeeper-controller-manager-7cb847b4f5-s6s87:

[two screenshots not preserved]

To me, this looks like some kind of race condition between the two gatekeeper controller manager pods. I collected the certificates served by the two pods, as well as those stored in the secret and in the validatingwebhookconfiguration:

validatingwebhookconfiguration (updated at "2021-11-10T11:57:53Z")

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 0 (0x0)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Nov 11 10:57:47 2021 GMT
            Not After : Nov  9 11:57:47 2031 GMT
        Subject: O = gatekeeper, CN = gatekeeper-ca
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:e5:db:8a:6f:3a:6f:40:1d:b5:34:71:30:a7:94:
                    8f:12:05:18:de:ea:76:b4:34:81:a1:58:4e:38:dd:
                    70:53:df:31:97:20:71:24:e0:e7:1b:0c:a5:03:1f:
                    4a:27:26:a3:6d:80:1e:73:b5:5a:88:d8:64:89:fa:
                    f9:7c:44:f3:36:da:86:07:86:63:85:b7:2e:bc:54:
                    c3:e4:d9:20:bb:0e:d6:29:11:c2:19:66:58:a0:4c:
                    5c:4e:f7:d6:f7:81:a9:76:d5:4f:2f:64:44:63:85:
                    0a:69:c0:76:ae:71:52:c0:56:0c:7e:90:d1:d4:71:
                    b5:b4:c6:44:d3:d8:11:81:cd:56:ff:cf:63:48:3b:
                    d1:87:f7:30:71:3e:c0:da:d0:d2:36:2a:ec:3d:ed:
                    33:f1:94:aa:5e:f6:c2:c2:c5:80:28:cd:cc:d5:ce:
                    a1:54:23:5f:d2:38:af:68:10:65:95:23:75:73:53:
                    a9:cf:04:3a:c7:d0:3e:2b:da:ae:4e:8c:b7:01:d7:
                    b4:5f:b3:4d:1b:00:00:e4:e6:46:9d:52:1e:a5:25:
                    8c:3c:b0:03:2b:5d:b2:6b:3e:ed:3b:23:26:8b:09:
                    37:63:72:6e:52:0e:8a:fa:7b:2e:6c:62:c8:f8:5b:
                    1f:43:bf:95:04:31:50:da:d0:1f:40:1b:33:e4:4a:
                    6d:8b
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment, Certificate Sign
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier: 
                32:4F:5C:30:07:C9:67:7C:4A:1A:BE:9F:BC:2B:A3:BE:27:40:64:23
            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-ca
    Signature Algorithm: sha256WithRSAEncryption
         41:04:de:a2:be:7c:7c:40:20:ad:a1:d1:e6:9b:5b:ae:92:4b:
         e2:1e:e8:65:21:26:ab:88:39:e2:d7:30:a1:61:46:97:05:b7:
         bf:74:7d:02:c8:a5:19:60:53:fe:15:1b:b8:42:02:1e:b9:07:
         d5:22:bf:13:ec:00:e7:d6:c4:e9:ff:65:0e:9e:88:93:ff:40:
         0d:d3:5b:7f:b3:ca:c4:37:63:a9:0d:71:52:a7:35:0f:e8:cc:
         2b:b5:c0:ba:48:76:b1:95:8c:34:8f:ed:28:f9:29:06:d9:e2:
         25:0e:b4:1e:09:4a:fc:8f:73:1c:1d:f1:14:ca:c3:c0:d2:5d:
         0d:46:79:af:dd:ad:ff:04:2b:0f:81:0a:9f:b5:84:9d:05:47:
         38:fe:bd:24:a6:f8:48:db:ca:dd:a0:2d:32:88:04:81:b0:c4:
         d2:ac:e7:5e:9d:ee:b3:36:03:7a:55:a0:0e:16:6d:74:17:0e:
         6d:75:06:79:61:7e:f4:b0:9c:b7:58:a1:f3:4e:ad:69:26:ad:
         e0:9c:b4:02:9a:1c:a0:2c:82:b7:08:78:64:03:71:91:9c:72:
         c9:5c:2d:b9:9d:8d:36:18:51:b1:54:d4:7a:07:4b:15:b8:c5:
         4b:f1:6f:04:26:67:22:5c:67:6f:4d:82:57:9a:27:dd:7c:82:
         54:d5:e6:19
-----BEGIN CERTIFICATE-----
MIIDMTCCAhmgAwIBAgIBADANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMTExMTEwNTc0N1oX
DTMxMTEwOTExNTc0N1owLTETMBEGA1UEChMKZ2F0ZWtlZXBlcjEWMBQGA1UEAxMN
Z2F0ZWtlZXBlci1jYTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAOXb
im86b0AdtTRxMKeUjxIFGN7qdrQ0gaFYTjjdcFPfMZcgcSTg5xsMpQMfSicmo22A
HnO1WojYZIn6+XxE8zbahgeGY4W3LrxUw+TZILsO1ikRwhlmWKBMXE731veBqXbV
Ty9kRGOFCmnAdq5xUsBWDH6Q0dRxtbTGRNPYEYHNVv/PY0g70Yf3MHE+wNrQ0jYq
7D3tM/GUql72wsLFgCjNzNXOoVQjX9I4r2gQZZUjdXNTqc8EOsfQPivark6MtwHX
tF+zTRsAAOTmRp1SHqUljDywAytdsms+7TsjJosJN2NyblIOivp7LmxiyPhbH0O/
lQQxUNrQH0AbM+RKbYsCAwEAAaNcMFowDgYDVR0PAQH/BAQDAgKkMA8GA1UdEwEB
/wQFMAMBAf8wHQYDVR0OBBYEFDJPXDAHyWd8Shq+n7wro74nQGQjMBgGA1UdEQQR
MA+CDWdhdGVrZWVwZXItY2EwDQYJKoZIhvcNAQELBQADggEBAEEE3qK+fHxAIK2h
0eabW66SS+Ie6GUhJquIOeLXMKFhRpcFt790fQLIpRlgU/4VG7hCAh65B9UivxPs
AOfWxOn/ZQ6eiJP/QA3TW3+zysQ3Y6kNcVKnNQ/ozCu1wLpIdrGVjDSP7Sj5KQbZ
4iUOtB4JSvyPcxwd8RTKw8DSXQ1Gea/drf8EKw+BCp+1hJ0FRzj+vSSm+Ejbyt2g
LTKIBIGwxNKs516d7rM2A3pVoA4WbXQXDm11BnlhfvSwnLdYofNOrWkmreCctAKa
HKAsgrcIeGQDcZGccslcLbmdjTYYUbFU1HoHSxW4xUvxbwQmZyJcZ29NgleaJ918
glTV5hk=
-----END CERTIFICATE-----

gatekeeper-webhook-server-cert (updated at "2021-11-11T11:57:47Z")

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 0 (0x0)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Nov 11 10:57:47 2021 GMT
            Not After : Nov  9 11:57:47 2031 GMT
        Subject: O = gatekeeper, CN = gatekeeper-ca
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:e5:db:8a:6f:3a:6f:40:1d:b5:34:71:30:a7:94:
                    8f:12:05:18:de:ea:76:b4:34:81:a1:58:4e:38:dd:
                    70:53:df:31:97:20:71:24:e0:e7:1b:0c:a5:03:1f:
                    4a:27:26:a3:6d:80:1e:73:b5:5a:88:d8:64:89:fa:
                    f9:7c:44:f3:36:da:86:07:86:63:85:b7:2e:bc:54:
                    c3:e4:d9:20:bb:0e:d6:29:11:c2:19:66:58:a0:4c:
                    5c:4e:f7:d6:f7:81:a9:76:d5:4f:2f:64:44:63:85:
                    0a:69:c0:76:ae:71:52:c0:56:0c:7e:90:d1:d4:71:
                    b5:b4:c6:44:d3:d8:11:81:cd:56:ff:cf:63:48:3b:
                    d1:87:f7:30:71:3e:c0:da:d0:d2:36:2a:ec:3d:ed:
                    33:f1:94:aa:5e:f6:c2:c2:c5:80:28:cd:cc:d5:ce:
                    a1:54:23:5f:d2:38:af:68:10:65:95:23:75:73:53:
                    a9:cf:04:3a:c7:d0:3e:2b:da:ae:4e:8c:b7:01:d7:
                    b4:5f:b3:4d:1b:00:00:e4:e6:46:9d:52:1e:a5:25:
                    8c:3c:b0:03:2b:5d:b2:6b:3e:ed:3b:23:26:8b:09:
                    37:63:72:6e:52:0e:8a:fa:7b:2e:6c:62:c8:f8:5b:
                    1f:43:bf:95:04:31:50:da:d0:1f:40:1b:33:e4:4a:
                    6d:8b
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment, Certificate Sign
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier: 
                32:4F:5C:30:07:C9:67:7C:4A:1A:BE:9F:BC:2B:A3:BE:27:40:64:23
            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-ca
    Signature Algorithm: sha256WithRSAEncryption
         41:04:de:a2:be:7c:7c:40:20:ad:a1:d1:e6:9b:5b:ae:92:4b:
         e2:1e:e8:65:21:26:ab:88:39:e2:d7:30:a1:61:46:97:05:b7:
         bf:74:7d:02:c8:a5:19:60:53:fe:15:1b:b8:42:02:1e:b9:07:
         d5:22:bf:13:ec:00:e7:d6:c4:e9:ff:65:0e:9e:88:93:ff:40:
         0d:d3:5b:7f:b3:ca:c4:37:63:a9:0d:71:52:a7:35:0f:e8:cc:
         2b:b5:c0:ba:48:76:b1:95:8c:34:8f:ed:28:f9:29:06:d9:e2:
         25:0e:b4:1e:09:4a:fc:8f:73:1c:1d:f1:14:ca:c3:c0:d2:5d:
         0d:46:79:af:dd:ad:ff:04:2b:0f:81:0a:9f:b5:84:9d:05:47:
         38:fe:bd:24:a6:f8:48:db:ca:dd:a0:2d:32:88:04:81:b0:c4:
         d2:ac:e7:5e:9d:ee:b3:36:03:7a:55:a0:0e:16:6d:74:17:0e:
         6d:75:06:79:61:7e:f4:b0:9c:b7:58:a1:f3:4e:ad:69:26:ad:
         e0:9c:b4:02:9a:1c:a0:2c:82:b7:08:78:64:03:71:91:9c:72:
         c9:5c:2d:b9:9d:8d:36:18:51:b1:54:d4:7a:07:4b:15:b8:c5:
         4b:f1:6f:04:26:67:22:5c:67:6f:4d:82:57:9a:27:dd:7c:82:
         54:d5:e6:19
-----BEGIN CERTIFICATE-----
MIIDMTCCAhmgAwIBAgIBADANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMTExMTEwNTc0N1oX
DTMxMTEwOTExNTc0N1owLTETMBEGA1UEChMKZ2F0ZWtlZXBlcjEWMBQGA1UEAxMN
Z2F0ZWtlZXBlci1jYTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAOXb
im86b0AdtTRxMKeUjxIFGN7qdrQ0gaFYTjjdcFPfMZcgcSTg5xsMpQMfSicmo22A
HnO1WojYZIn6+XxE8zbahgeGY4W3LrxUw+TZILsO1ikRwhlmWKBMXE731veBqXbV
Ty9kRGOFCmnAdq5xUsBWDH6Q0dRxtbTGRNPYEYHNVv/PY0g70Yf3MHE+wNrQ0jYq
7D3tM/GUql72wsLFgCjNzNXOoVQjX9I4r2gQZZUjdXNTqc8EOsfQPivark6MtwHX
tF+zTRsAAOTmRp1SHqUljDywAytdsms+7TsjJosJN2NyblIOivp7LmxiyPhbH0O/
lQQxUNrQH0AbM+RKbYsCAwEAAaNcMFowDgYDVR0PAQH/BAQDAgKkMA8GA1UdEwEB
/wQFMAMBAf8wHQYDVR0OBBYEFDJPXDAHyWd8Shq+n7wro74nQGQjMBgGA1UdEQQR
MA+CDWdhdGVrZWVwZXItY2EwDQYJKoZIhvcNAQELBQADggEBAEEE3qK+fHxAIK2h
0eabW66SS+Ie6GUhJquIOeLXMKFhRpcFt790fQLIpRlgU/4VG7hCAh65B9UivxPs
AOfWxOn/ZQ6eiJP/QA3TW3+zysQ3Y6kNcVKnNQ/ozCu1wLpIdrGVjDSP7Sj5KQbZ
4iUOtB4JSvyPcxwd8RTKw8DSXQ1Gea/drf8EKw+BCp+1hJ0FRzj+vSSm+Ejbyt2g
LTKIBIGwxNKs516d7rM2A3pVoA4WbXQXDm11BnlhfvSwnLdYofNOrWkmreCctAKa
HKAsgrcIeGQDcZGccslcLbmdjTYYUbFU1HoHSxW4xUvxbwQmZyJcZ29NgleaJ918
glTV5hk=
-----END CERTIFICATE-----

served by gatekeeper-controller-manager-7cb847b4f5-qf65k

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Sep 14 08:24:38 2021 GMT
            Not After : Sep 12 09:24:38 2031 GMT
        Subject: CN = gatekeeper-webhook-service.gatekeeper.svc
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:d7:33:7c:d5:e5:f7:16:aa:46:18:9d:df:06:fc:
                    52:ec:58:49:ea:aa:36:29:de:8e:dd:58:04:b0:f1:
                    cf:7f:f2:50:c0:db:6a:f8:95:fd:c4:8d:ea:5e:e8:
                    e4:af:99:bc:97:1f:97:c1:e0:af:54:bf:96:19:a9:
                    96:bd:ac:68:a9:78:97:cd:18:37:c2:d8:22:ef:79:
                    2b:52:36:e1:f4:5d:56:14:45:04:0d:00:12:03:42:
                    e6:10:86:3a:30:0b:f4:9b:83:85:a5:5f:e7:86:0a:
                    fd:86:fd:f7:92:ed:75:f8:a0:4b:a3:83:30:7e:62:
                    54:02:0e:57:9c:80:79:75:72:1e:ed:0b:3a:d2:64:
                    81:a3:96:e7:90:ae:03:aa:2a:37:46:8c:05:14:89:
                    f8:ef:10:a1:93:87:08:8f:a2:ff:59:f2:7c:92:8e:
                    c4:3f:8d:0b:a2:dd:95:14:53:0c:21:ae:bb:1a:19:
                    4f:98:f0:6a:83:80:fa:da:c2:8f:b1:ec:be:8d:85:
                    ee:b0:e6:2d:f3:ee:36:4d:75:05:e8:44:e4:d9:b1:
                    80:69:bb:6c:10:d6:eb:8f:7c:9a:f7:58:26:60:31:
                    ee:3d:e6:47:fa:d0:6c:d8:10:64:db:f5:12:32:11:
                    2f:0c:a2:8f:e1:fa:16:30:68:2b:3a:92:57:f0:45:
                    97:91
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier: 
                keyid:E9:BD:70:69:AD:D4:01:55:70:E9:7A:86:49:9D:2A:D3:0E:2A:4D:0A

            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-webhook-service.gatekeeper.svc
    Signature Algorithm: sha256WithRSAEncryption
         2e:cd:8c:a2:fc:dc:5e:2a:37:12:53:12:4e:6f:24:d2:4f:02:
         32:d4:39:5b:0b:9f:bb:af:20:b3:bd:d7:87:00:71:52:37:a7:
         ab:ae:82:26:71:dc:9e:18:9c:64:45:a3:b3:e9:35:f7:2f:35:
         c6:3f:91:a6:5f:9b:8f:19:df:b2:81:da:e4:32:05:f6:7f:c7:
         28:ef:b1:00:b9:c2:a1:31:32:2c:e3:6e:d0:b0:bf:27:0d:78:
         9c:ee:27:c2:51:24:d6:33:a3:c5:ef:8b:d1:f9:8c:48:64:21:
         16:df:df:97:ce:74:56:a8:dc:ed:c2:63:c8:c9:e1:a4:70:b9:
         24:65:4f:1f:51:66:18:19:d7:09:eb:97:b9:52:28:05:eb:9e:
         1c:58:ba:9b:dd:be:f1:d6:88:61:fb:ea:2c:c9:0d:76:3d:90:
         4c:06:d6:eb:88:12:23:78:10:32:9d:63:c9:11:32:07:2b:95:
         bd:85:36:88:ab:3c:1c:51:59:f3:f6:70:0f:fd:76:5c:5b:a9:
         3f:20:03:59:a6:a8:3f:91:11:1a:db:1d:5d:85:78:58:de:8d:
         67:f9:46:ec:4b:c9:b9:57:93:ea:97:5a:64:24:be:10:a9:ae:
         0e:b8:5a:74:7f:47:75:f3:c5:df:77:13:a0:52:a1:8a:28:61:
         65:23:30:51
-----BEGIN CERTIFICATE-----
MIIDajCCAlKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMDkxNDA4MjQzOFoX
DTMxMDkxMjA5MjQzOFowNDEyMDAGA1UEAxMpZ2F0ZWtlZXBlci13ZWJob29rLXNl
cnZpY2UuZ2F0ZWtlZXBlci5zdmMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQDXM3zV5fcWqkYYnd8G/FLsWEnqqjYp3o7dWASw8c9/8lDA22r4lf3Ejepe
6OSvmbyXH5fB4K9Uv5YZqZa9rGipeJfNGDfC2CLveStSNuH0XVYURQQNABIDQuYQ
hjowC/Sbg4WlX+eGCv2G/feS7XX4oEujgzB+YlQCDlecgHl1ch7tCzrSZIGjlueQ
rgOqKjdGjAUUifjvEKGThwiPov9Z8nySjsQ/jQui3ZUUUwwhrrsaGU+Y8GqDgPra
wo+x7L6Nhe6w5i3z7jZNdQXoROTZsYBpu2wQ1uuPfJr3WCZgMe495kf60GzYEGTb
9RIyES8Moo/h+hYwaCs6klfwRZeRAgMBAAGjgY0wgYowDgYDVR0PAQH/BAQDAgWg
MBMGA1UdJQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAU
6b1waa3UAVVw6XqGSZ0q0w4qTQowNAYDVR0RBC0wK4IpZ2F0ZWtlZXBlci13ZWJo
b29rLXNlcnZpY2UuZ2F0ZWtlZXBlci5zdmMwDQYJKoZIhvcNAQELBQADggEBAC7N
jKL83F4qNxJTEk5vJNJPAjLUOVsLn7uvILO914cAcVI3p6uugiZx3J4YnGRFo7Pp
NfcvNcY/kaZfm48Z37KB2uQyBfZ/xyjvsQC5wqExMizjbtCwvycNeJzuJ8JRJNYz
o8Xvi9H5jEhkIRbf35fOdFao3O3CY8jJ4aRwuSRlTx9RZhgZ1wnrl7lSKAXrnhxY
upvdvvHWiGH76izJDXY9kEwG1uuIEiN4EDKdY8kRMgcrlb2FNoirPBxRWfP2cA/9
dlxbqT8gA1mmqD+RERrbHV2FeFjejWf5RuxLyblXk+qXWmQkvhCprg64WnR/R3Xz
xd93E6BSoYooYWUjMFE=
-----END CERTIFICATE-----

served by gatekeeper-controller-manager-7cb847b4f5-s6s87

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O = gatekeeper, CN = gatekeeper-ca
        Validity
            Not Before: Sep 14 08:24:38 2021 GMT
            Not After : Sep 12 09:24:38 2031 GMT
        Subject: CN = gatekeeper-webhook-service.gatekeeper.svc
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (2048 bit)
                Modulus:
                    00:d7:33:7c:d5:e5:f7:16:aa:46:18:9d:df:06:fc:
                    52:ec:58:49:ea:aa:36:29:de:8e:dd:58:04:b0:f1:
                    cf:7f:f2:50:c0:db:6a:f8:95:fd:c4:8d:ea:5e:e8:
                    e4:af:99:bc:97:1f:97:c1:e0:af:54:bf:96:19:a9:
                    96:bd:ac:68:a9:78:97:cd:18:37:c2:d8:22:ef:79:
                    2b:52:36:e1:f4:5d:56:14:45:04:0d:00:12:03:42:
                    e6:10:86:3a:30:0b:f4:9b:83:85:a5:5f:e7:86:0a:
                    fd:86:fd:f7:92:ed:75:f8:a0:4b:a3:83:30:7e:62:
                    54:02:0e:57:9c:80:79:75:72:1e:ed:0b:3a:d2:64:
                    81:a3:96:e7:90:ae:03:aa:2a:37:46:8c:05:14:89:
                    f8:ef:10:a1:93:87:08:8f:a2:ff:59:f2:7c:92:8e:
                    c4:3f:8d:0b:a2:dd:95:14:53:0c:21:ae:bb:1a:19:
                    4f:98:f0:6a:83:80:fa:da:c2:8f:b1:ec:be:8d:85:
                    ee:b0:e6:2d:f3:ee:36:4d:75:05:e8:44:e4:d9:b1:
                    80:69:bb:6c:10:d6:eb:8f:7c:9a:f7:58:26:60:31:
                    ee:3d:e6:47:fa:d0:6c:d8:10:64:db:f5:12:32:11:
                    2f:0c:a2:8f:e1:fa:16:30:68:2b:3a:92:57:f0:45:
                    97:91
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier: 
                keyid:E9:BD:70:69:AD:D4:01:55:70:E9:7A:86:49:9D:2A:D3:0E:2A:4D:0A

            X509v3 Subject Alternative Name: 
                DNS:gatekeeper-webhook-service.gatekeeper.svc
    Signature Algorithm: sha256WithRSAEncryption
         2e:cd:8c:a2:fc:dc:5e:2a:37:12:53:12:4e:6f:24:d2:4f:02:
         32:d4:39:5b:0b:9f:bb:af:20:b3:bd:d7:87:00:71:52:37:a7:
         ab:ae:82:26:71:dc:9e:18:9c:64:45:a3:b3:e9:35:f7:2f:35:
         c6:3f:91:a6:5f:9b:8f:19:df:b2:81:da:e4:32:05:f6:7f:c7:
         28:ef:b1:00:b9:c2:a1:31:32:2c:e3:6e:d0:b0:bf:27:0d:78:
         9c:ee:27:c2:51:24:d6:33:a3:c5:ef:8b:d1:f9:8c:48:64:21:
         16:df:df:97:ce:74:56:a8:dc:ed:c2:63:c8:c9:e1:a4:70:b9:
         24:65:4f:1f:51:66:18:19:d7:09:eb:97:b9:52:28:05:eb:9e:
         1c:58:ba:9b:dd:be:f1:d6:88:61:fb:ea:2c:c9:0d:76:3d:90:
         4c:06:d6:eb:88:12:23:78:10:32:9d:63:c9:11:32:07:2b:95:
         bd:85:36:88:ab:3c:1c:51:59:f3:f6:70:0f:fd:76:5c:5b:a9:
         3f:20:03:59:a6:a8:3f:91:11:1a:db:1d:5d:85:78:58:de:8d:
         67:f9:46:ec:4b:c9:b9:57:93:ea:97:5a:64:24:be:10:a9:ae:
         0e:b8:5a:74:7f:47:75:f3:c5:df:77:13:a0:52:a1:8a:28:61:
         65:23:30:51
-----BEGIN CERTIFICATE-----
MIIDajCCAlKgAwIBAgIBATANBgkqhkiG9w0BAQsFADAtMRMwEQYDVQQKEwpnYXRl
a2VlcGVyMRYwFAYDVQQDEw1nYXRla2VlcGVyLWNhMB4XDTIxMDkxNDA4MjQzOFoX
DTMxMDkxMjA5MjQzOFowNDEyMDAGA1UEAxMpZ2F0ZWtlZXBlci13ZWJob29rLXNl
cnZpY2UuZ2F0ZWtlZXBlci5zdmMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQDXM3zV5fcWqkYYnd8G/FLsWEnqqjYp3o7dWASw8c9/8lDA22r4lf3Ejepe
6OSvmbyXH5fB4K9Uv5YZqZa9rGipeJfNGDfC2CLveStSNuH0XVYURQQNABIDQuYQ
hjowC/Sbg4WlX+eGCv2G/feS7XX4oEujgzB+YlQCDlecgHl1ch7tCzrSZIGjlueQ
rgOqKjdGjAUUifjvEKGThwiPov9Z8nySjsQ/jQui3ZUUUwwhrrsaGU+Y8GqDgPra
wo+x7L6Nhe6w5i3z7jZNdQXoROTZsYBpu2wQ1uuPfJr3WCZgMe495kf60GzYEGTb
9RIyES8Moo/h+hYwaCs6klfwRZeRAgMBAAGjgY0wgYowDgYDVR0PAQH/BAQDAgWg
MBMGA1UdJQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwHwYDVR0jBBgwFoAU
6b1waa3UAVVw6XqGSZ0q0w4qTQowNAYDVR0RBC0wK4IpZ2F0ZWtlZXBlci13ZWJo
b29rLXNlcnZpY2UuZ2F0ZWtlZXBlci5zdmMwDQYJKoZIhvcNAQELBQADggEBAC7N
jKL83F4qNxJTEk5vJNJPAjLUOVsLn7uvILO914cAcVI3p6uugiZx3J4YnGRFo7Pp
NfcvNcY/kaZfm48Z37KB2uQyBfZ/xyjvsQC5wqExMizjbtCwvycNeJzuJ8JRJNYz
o8Xvi9H5jEhkIRbf35fOdFao3O3CY8jJ4aRwuSRlTx9RZhgZ1wnrl7lSKAXrnhxY
upvdvvHWiGH76izJDXY9kEwG1uuIEiN4EDKdY8kRMgcrlb2FNoirPBxRWfP2cA/9
dlxbqT8gA1mmqD+RERrbHV2FeFjejWf5RuxLyblXk+qXWmQkvhCprg64WnR/R3Xz
xd93E6BSoYooYWUjMFE=
-----END CERTIFICATE-----

Similar to the last occurrence, the .crt stored in the secret and the one in the validatingwebhookconfiguration are the same, while both differ from the one served by the gatekeeper pods.
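For anyone who wants to repeat this comparison without reading the dumps by eye, here is a minimal sketch in Python (standard library only). The function names `cert_fingerprint` and `find_mismatches` are made up for this illustration, and it assumes each input is a single PEM certificate already extracted as text (for example from the secret and from each pod).

```python
import hashlib
import ssl
from typing import Dict, List


def cert_fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint (hex) of one PEM-encoded certificate."""
    # PEM_cert_to_DER_cert expects the text to start with the BEGIN marker,
    # so surrounding whitespace is stripped first.
    der = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    return hashlib.sha256(der).hexdigest()


def find_mismatches(expected_pem: str, served: Dict[str, str]) -> List[str]:
    """Return the labels whose certificate differs from the expected one.

    `served` maps a label (for example a pod name) to the PEM text it serves.
    """
    expected = cert_fingerprint(expected_pem)
    return sorted(
        name for name, pem in served.items()
        if cert_fingerprint(pem) != expected
    )
```

Calling `find_mismatches(secret_pem, {"pod-a": pem_a, "pod-b": pem_b})` returns the pods whose served certificate is not byte-identical to the one in the secret. It only checks equality; it does not validate the chain against the CA.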

It seems that, besides the possible race condition between the two pods, there is another issue: the gatekeeper pod is not able to serve webhook requests using the certificate stored in the K8S secret. It could be due to:

  • the secret cannot be loaded to /certs (less likely)
  • the certs stored in /certs are not loaded into the gatekeeper runtime (more likely)

Next, I will try to disable internal cert rotation and use cert-manager with a short cert validity to test.

Any hint?

ethernoy avatar Nov 16 '21 03:11 ethernoy

Attached log file: sit-troubleshoot-logs.xlsx

ethernoy avatar Nov 16 '21 05:11 ethernoy

Thanks for digging in to this!

Some latency is expected, but certainly not days. We rely on the core controller-runtime library to load/watch certs. The library that does the heavy lifting is called certwatcher:

https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/certwatcher/certwatcher.go

Which then provides the certs to use for responding to HTTPS requests:

https://github.com/kubernetes-sigs/controller-runtime/blob/4d10a0615b11507451ecb58bfd59f0f6ef313a29/pkg/webhook/server.go#L219-L239

It looks like cert watcher relies on fsnotify to receive the events that trigger a cert reload:

https://github.com/kubernetes-sigs/controller-runtime/blob/4d10a0615b11507451ecb58bfd59f0f6ef313a29/pkg/certwatcher/certwatcher.go#L37

Any reason to think fsnotify wouldn't work with your file system?

https://github.com/fsnotify/fsnotify

maxsmythe avatar Nov 16 '21 06:11 maxsmythe

I believe fsnotify works on my file system, since I can find multiple past events where gatekeeper successfully updated the TLS certificate:

image

I just did a little experiment: I started a third pod in the problematic gatekeeper controller manager deployment (two pods are in the malfunctioning state described above). The third pod works normally, while the other two are still serving the invalid certificate. This suggests that the original two pods did not correctly mount the updated certificate, for reasons still unknown, which is quite interesting.

Update: Another interesting finding: after the bad certificate issue occurred, the logger "cert-rotation" (from gatekeeper) still prints logs about "Ensuring CA cert" and "no cert refresh needed", but the logger "certwatcher" (from controller-runtime) no longer prints any log. It seems the certwatcher goroutine is no longer active after the issue occurred, so the cert secret is still being refreshed but is no longer loaded into the runtime. Notice this log:

{"level":"error","ts":1636596258.3283987,"logger":"controller-runtime.certwatcher","msg":"error re-watching file","error":"no such file or directory","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/open-policy-agent/gatekeeper/vendor/github.com/go-logr/zapr/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).Error\n\t/go/src/github.com/open-policy-agent/gatekeeper/vendor/sigs.k8s.io/controller-runtime/pkg/log/deleg.go:144\nsigs.k8s.io/controller-runtime/pkg/webhook/internal/certwatcher.(*CertWatcher).handleEvent\n\t/go/src/github.com/open-policy-agent/gatekeeper/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/internal/certwatcher/certwatcher.go:144\nsigs.k8s.io/controller-runtime/pkg/webhook/internal/certwatcher.(*CertWatcher).Watch\n\t/go/src/github.com/open-policy-agent/gatekeeper/vendor/sigs.k8s.io/controller-runtime/pkg/webhook/internal/certwatcher/certwatcher.go:102"}
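The "error re-watching file" message suggests the watch was lost when the file disappeared. Purely as an illustration of the failure mode (this is Python, not the Go code in controller-runtime, and `PollingCertReloader` is a hypothetical name), a reloader that polls file content instead of depending on file events keeps working after the file vanishes and comes back:

```python
import hashlib
from typing import Optional


class PollingCertReloader:
    """Reload a certificate file by polling its content, not by file events."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.current: Optional[bytes] = None  # last successfully loaded content
        self._digest: Optional[str] = None    # SHA-256 of `current`

    def poll(self) -> bool:
        """Check the file once; return True if new content was loaded."""
        try:
            with open(self.path, "rb") as f:
                data = f.read()
        except FileNotFoundError:
            # The file can be missing while it is being replaced or wiped.
            # Keep the last good content and try again on the next poll.
            return False
        if not data:
            # An emptied file is treated the same way as a missing one.
            return False
        digest = hashlib.sha256(data).hexdigest()
        if digest == self._digest:
            return False  # nothing changed since the last load
        self.current, self._digest = data, digest
        return True
```

Calling `poll()` on a timer would pick up the refreshed certificate on the first poll after it reappears, at the cost of some reload latency compared with event-based watching.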

Update 2: One more finding on secret update history:

managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:artifact.spinnaker.io/location: {}
          f:artifact.spinnaker.io/name: {}
          f:artifact.spinnaker.io/type: {}
          f:artifact.spinnaker.io/version: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
          f:moniker.spinnaker.io/application: {}
          f:moniker.spinnaker.io/cluster: {}
          f:strategy.spinnaker.io/versioned: {}
        f:labels:
          .: {}
          f:app: {}
          f:app.kubernetes.io/managed-by: {}
          f:app.kubernetes.io/name: {}
          f:chart: {}
          f:gatekeeper.sh/system: {}
          f:heritage: {}
          f:release: {}
      f:type: {}
    manager: kubectl
    operation: Update
    time: "2021-11-11T02:03:22Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:ca.crt: {}
        f:ca.key: {}
        f:tls.crt: {}
        f:tls.key: {}
    manager: gatekeeper
    operation: Update
    time: "2021-11-11T11:57:47Z"

It seems my CI/CD tool re-applied the manifests of the gatekeeper helm chart, which wiped all data fields in the existing secret. This killed the fsnotify goroutine, which prevented the existing gatekeeper pods from loading new certificates from the secret. However, the cert-rotator was still working, and it refreshed the certificate stored in the secret and the validatingwebhookconfiguration about 10 hours later. After that, the K8S apiserver started to use the new CA cert for TLS connections to the gatekeeper webhook server, which still used the original certificate from the previous secret. This should be the root cause of the issue. I will make a timeline chart later to illustrate this more clearly.
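The gap between the two writes can be read straight from the managedFields timestamps. A small sketch in Python (the helper names `parse_time` and `update_timeline` are invented for this example, and the input is a trimmed copy of the two entries shown above, written as a Python list):

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def parse_time(ts: str) -> datetime:
    """Parse a Kubernetes-style UTC timestamp such as 2021-11-11T02:03:22Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def update_timeline(managed_fields: List[Dict]) -> List[Tuple[datetime, str, List[str]]]:
    """Return (time, manager, top-level fields) rows, oldest first."""
    rows = []
    for entry in managed_fields:
        # Top-level keys look like "f:data"; drop the "f:" prefix.
        fields = sorted(k[2:] for k in entry.get("fieldsV1", {}) if k.startswith("f:"))
        rows.append((parse_time(entry["time"]), entry["manager"], fields))
    return sorted(rows, key=lambda row: row[0])


# Trimmed copy of the two managedFields entries from the secret.
MANAGED_FIELDS = [
    {"manager": "gatekeeper", "time": "2021-11-11T11:57:47Z",
     "fieldsV1": {"f:data": {"f:ca.crt": {}, "f:ca.key": {},
                             "f:tls.crt": {}, "f:tls.key": {}}}},
    {"manager": "kubectl", "time": "2021-11-11T02:03:22Z",
     "fieldsV1": {"f:metadata": {"f:annotations": {}, "f:labels": {}},
                  "f:type": {}}},
]

timeline = update_timeline(MANAGED_FIELDS)
for when, manager, fields in timeline:
    print(when.isoformat(), manager, fields)

# Time between the kubectl re-apply and the gatekeeper cert refresh.
gap = timeline[1][0] - timeline[0][0]
print("gap:", gap)  # gap: 9:54:25
```

So the "10 hours later" above is 9 hours 54 minutes between the kubectl update and the gatekeeper update, going by these two timestamps alone.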

If allowed, I am happy to propose a solution and open a PR for this project :)

ethernoy avatar Nov 16 '21 07:11 ethernoy

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 23 '22 03:07 stale[bot]