redpanda
redpanda copied to clipboard
Failure in k8s-operator endpoint-template (PostStart hook)
https://buildkite.com/redpanda/redpanda/builds/18840#01849373-8425-495a-95e6-7c462a832601
--- FAIL: kuttl (2273.54s)
--- FAIL: kuttl/harness (0.00s)
--- PASS: kuttl/harness/resources-redpanda (9.51s)
--- PASS: kuttl/harness/additional-configuration (218.28s)
--- PASS: kuttl/harness/redpanda-schema-registry-sasl (401.49s)
--- PASS: kuttl/harness/produce-tls (12.22s)
--- PASS: kuttl/harness/pandaproxy-produce-consume-tls-client (10.17s)
--- PASS: kuttl/harness/pandaproxy-produce-consume-sasl (46.31s)
--- PASS: kuttl/harness/pandaproxy-produce-consume (79.35s)
--- PASS: kuttl/harness/node-select-tolerations (6.38s)
--- PASS: kuttl/harness/external-connectivity (9.23s)
--- PASS: kuttl/harness/explicit-ports-tls (10.15s)
--- PASS: kuttl/harness/redpanda-schema-registry (649.27s)
--- PASS: kuttl/harness/create-topic-with-client-auth (11.27s)
--- FAIL: kuttl/harness/endpoint-template (300.70s)
--- PASS: kuttl/harness/create-topic-given-issuer (47.35s)
--- PASS: kuttl/harness/create-topic-given-issuer-with-client-auth (55.63s)
--- PASS: kuttl/harness/create-topic-given-cert-secret (52.29s)
--- PASS: kuttl/harness/create-topic-given-cm-secret-client-auth (56.34s)
--- PASS: kuttl/harness/console-admin-api (164.97s)
--- PASS: kuttl/harness/console-kafka-mtls (172.62s)
--- PASS: kuttl/harness/admin-api-tls-client-auth (11.22s)
--- PASS: kuttl/harness/admin-api-tls (12.21s)
--- PASS: kuttl/harness/admin-api (10.09s)
--- PASS: kuttl/harness/centralized-configuration-bootstrap (43.42s)
--- PASS: kuttl/harness/confluent-schema-registry (60.37s)
--- PASS: kuttl/harness/console (164.46s)
--- PASS: kuttl/harness/centralized-configuration-upgrade (117.20s)
--- PASS: kuttl/harness/centralized-configuration-tls (218.49s)
--- PASS: kuttl/harness/update-image-and-node-port (294.65s)
--- PASS: kuttl/harness/update-image-tls-client-auth (291.80s)
--- PASS: kuttl/harness/update-image-tls (238.90s)
--- PASS: kuttl/harness/update (85.01s)
--- PASS: kuttl/harness/centralized-configuration-drift (96.50s)
--- PASS: kuttl/harness/superusers-prefix (194.81s)
FAIL
This is miss leading. The amount of logs generated by kuttl from unstable k8s test hides the true problem in the main test suite.
Taken from raw output.
--- FAIL: kuttl (2273.54s)
--- FAIL: kuttl/harness (0.00s)
--- PASS: kuttl/harness/resources-redpanda (9.51s)
--- PASS: kuttl/harness/additional-configuration (218.28s)
--- PASS: kuttl/harness/redpanda-schema-registry-sasl (401.49s)
--- PASS: kuttl/harness/produce-tls (12.22s)
--- PASS: kuttl/harness/pandaproxy-produce-consume-tls-client (10.17s)
--- PASS: kuttl/harness/pandaproxy-produce-consume-sasl (46.31s)
--- PASS: kuttl/harness/pandaproxy-produce-consume (79.35s)
--- PASS: kuttl/harness/node-select-tolerations (6.38s)
--- PASS: kuttl/harness/external-connectivity (9.23s)
--- PASS: kuttl/harness/explicit-ports-tls (10.15s)
--- PASS: kuttl/harness/redpanda-schema-registry (649.27s)
--- PASS: kuttl/harness/create-topic-with-client-auth (11.27s)
--- FAIL: kuttl/harness/endpoint-template (300.70s)
--- PASS: kuttl/harness/create-topic-given-issuer (47.35s)
--- PASS: kuttl/harness/create-topic-given-issuer-with-client-auth (55.63s)
--- PASS: kuttl/harness/create-topic-given-cert-secret (52.29s)
--- PASS: kuttl/harness/create-topic-given-cm-secret-client-auth (56.34s)
--- PASS: kuttl/harness/console-admin-api (164.97s)
--- PASS: kuttl/harness/console-kafka-mtls (172.62s)
--- PASS: kuttl/harness/admin-api-tls-client-auth (11.22s)
--- PASS: kuttl/harness/admin-api-tls (12.21s)
--- PASS: kuttl/harness/admin-api (10.09s)
--- PASS: kuttl/harness/centralized-configuration-bootstrap (43.42s)
--- PASS: kuttl/harness/confluent-schema-registry (60.37s)
--- PASS: kuttl/harness/console (164.46s)
--- PASS: kuttl/harness/centralized-configuration-upgrade (117.20s)
--- PASS: kuttl/harness/centralized-configuration-tls (218.49s)
--- PASS: kuttl/harness/update-image-and-node-port (294.65s)
--- PASS: kuttl/harness/update-image-tls-client-auth (291.80s)
--- PASS: kuttl/harness/update-image-tls (238.90s)
--- PASS: kuttl/harness/update (85.01s)
--- PASS: kuttl/harness/centralized-configuration-drift (96.50s)
--- PASS: kuttl/harness/superusers-prefix (194.81s)
FAIL
Kuttl returned exit code 1. Creating artifacts tarball
Events from kuttle output doesn't show the root cause of pod-0 not being scheduled
logger.go:42: 06:03:12 | endpoint-template | endpoint-template events from ns kuttl-test-select-louse:
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-admin-client Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-kafka-root-certificate-7wd6c cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-kafka-root-certificate-7wd6c CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-kafka-root-certificate Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-kafka-root-certificate Generated Stored new private key in temporary Secret resource "endpoint-template-kafka-root-certificate-ldcxk"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-kafka-root-certificate Requested Created new CertificateRequest resource "endpoint-template-kafka-root-certificate-7wd6c"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-kafka-root-certificate Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Issuer.cert-manager.io endpoint-template-kafka-root-issuer KeyPairVerified Signing CA verified
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-operator-client Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-node Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-root-certificate Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Warning Issuer.cert-manager.io endpoint-template-proxy-root-issuer ErrGetKeyPair Error getting keypair for CA issuer: secret "endpoint-template-proxy-root-certificate" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Warning Issuer.cert-manager.io endpoint-template-proxy-root-issuer ErrInitIssuer Error initializing issuer: secret "endpoint-template-proxy-root-certificate" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-redpanda Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:11 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-user-client Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-0 WaitForFirstConsumer waiting for first consumer to be created before binding
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-0 ExternalProvisioning waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-0 Provisioning External provisioner is provisioning volume for claim "kuttl-test-select-louse/datadir-endpoint-template-0"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-1 WaitForFirstConsumer waiting for first consumer to be created before binding
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-1 ExternalProvisioning waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-1 Provisioning External provisioner is provisioning volume for claim "kuttl-test-select-louse/datadir-endpoint-template-1"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-2 WaitForFirstConsumer waiting for first consumer to be created before binding
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-2 ExternalProvisioning waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-2 Provisioning External provisioner is provisioning volume for claim "kuttl-test-select-louse/datadir-endpoint-template-2"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-admin-client-vrzps cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-admin-client-vrzps CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-admin-client Generated Stored new private key in temporary Secret resource "endpoint-template-admin-client-pzjjd"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-admin-client Requested Created new CertificateRequest resource "endpoint-template-admin-client-vrzps"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-admin-client Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-operator-client-gtb6j cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-operator-client-gtb6j CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-operator-client Generated Stored new private key in temporary Secret resource "endpoint-template-operator-client-jt6hs"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-operator-client Requested Created new CertificateRequest resource "endpoint-template-operator-client-gtb6j"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-operator-client Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-client Issuing Issuing certificate as Secret does not exist
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-client Generated Stored new private key in temporary Secret resource "endpoint-template-proxy-api-client-xlc78"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-node Generated Stored new private key in temporary Secret resource "endpoint-template-proxy-api-node-l5whb"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-root-certificate-6r8wx cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-root-certificate-6r8wx CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-root-certificate Generated Stored new private key in temporary Secret resource "endpoint-template-proxy-root-certificate-dbbm2"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-root-certificate Requested Created new CertificateRequest resource "endpoint-template-proxy-root-certificate-6r8wx"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-root-certificate Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-redpanda-fwzf7 cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-redpanda-fwzf7 CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-redpanda Generated Stored new private key in temporary Secret resource "endpoint-template-redpanda-5pntk"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-redpanda Requested Created new CertificateRequest resource "endpoint-template-redpanda-fwzf7"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-redpanda Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-user-client Generated Stored new private key in temporary Secret resource "endpoint-template-user-client-hxcdh"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal PodDisruptionBudget.policy endpoint-template NoPods No matching pods found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal StatefulSet.apps endpoint-template SuccessfulCreate create Claim datadir-endpoint-template-0 Pod endpoint-template-0 in StatefulSet endpoint-template success
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal StatefulSet.apps endpoint-template SuccessfulCreate create Pod endpoint-template-0 in StatefulSet endpoint-template successful
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal StatefulSet.apps endpoint-template SuccessfulCreate create Claim datadir-endpoint-template-1 Pod endpoint-template-1 in StatefulSet endpoint-template success
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal StatefulSet.apps endpoint-template SuccessfulCreate create Pod endpoint-template-1 in StatefulSet endpoint-template successful
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal StatefulSet.apps endpoint-template SuccessfulCreate create Claim datadir-endpoint-template-2 Pod endpoint-template-2 in StatefulSet endpoint-template success
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:12 +0000 UTC Normal StatefulSet.apps endpoint-template SuccessfulCreate create Pod endpoint-template-2 in StatefulSet endpoint-template successful
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-api-client-c6tvl cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-api-client-c6tvl IssuerNotReady Referenced issuer does not have a Ready status condition
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-client Requested Created new CertificateRequest resource "endpoint-template-proxy-api-client-c6tvl"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-api-node-nkxz5 cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-api-node-nkxz5 IssuerNotReady Referenced issuer does not have a Ready status condition
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-node Requested Created new CertificateRequest resource "endpoint-template-proxy-api-node-nkxz5"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-user-client-nkxj8 cert-manager.io Certificate request has been approved by cert-manager.io
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-user-client-nkxj8 CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:13 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-user-client Requested Created new CertificateRequest resource "endpoint-template-user-client-nkxj8"
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:14 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-user-client Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-0 ProvisioningSucceeded Successfully provisioned volume pvc-dd6f0158-409c-419b-bfe1-fa1686cfcd87
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-1 ProvisioningSucceeded Successfully provisioned volume pvc-480df756-64fd-4d71-9968-d7ec2088faa6
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Normal PersistentVolumeClaim datadir-endpoint-template-2 ProvisioningSucceeded Successfully provisioned volume pvc-d9ab6a89-740e-4369-8b7d-6962fcc6022e
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Normal Pod endpoint-template-0 Scheduled Successfully assigned kuttl-test-select-louse/endpoint-template-0 to kind-worker
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Warning Pod endpoint-template-0 FailedMount MountVolume.SetUp failed for volume "tlspandaproxyca" : secret "endpoint-template-proxy-api-client" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Warning Pod endpoint-template-0 FailedMount MountVolume.SetUp failed for volume "tlspandaproxycert" : secret "endpoint-template-proxy-api-node" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Normal Pod endpoint-template-1 Scheduled Successfully assigned kuttl-test-select-louse/endpoint-template-1 to kind-worker6
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Warning Pod endpoint-template-1 FailedMount MountVolume.SetUp failed for volume "tlspandaproxycert" : secret "endpoint-template-proxy-api-node" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:15 +0000 UTC Warning Pod endpoint-template-1 FailedMount MountVolume.SetUp failed for volume "tlspandaproxyca" : secret "endpoint-template-proxy-api-client" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Normal Pod endpoint-template-2 Scheduled Successfully assigned kuttl-test-select-louse/endpoint-template-2 to kind-worker8
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Warning Pod endpoint-template-2 FailedMount MountVolume.SetUp failed for volume "tlspandaproxyca" : secret "endpoint-template-proxy-api-client" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Warning Pod endpoint-template-2 FailedMount MountVolume.SetUp failed for volume "tlspandaproxycert" : secret "endpoint-template-proxy-api-node" not found
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-api-client-c6tvl CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-client Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Normal CertificateRequest.cert-manager.io endpoint-template-proxy-api-node-nkxz5 CertificateIssued Certificate fetched from issuer successfully
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Normal Certificate.cert-manager.io endpoint-template-proxy-api-node Issuing The certificate has been successfully issued
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:16 +0000 UTC Normal Issuer.cert-manager.io endpoint-template-proxy-root-issuer KeyPairVerified Signing CA verified
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:18 +0000 UTC Normal Pod endpoint-template-2.spec.initContainers{redpanda-configurator} Pulled Container image "localhost/configurator:dev" already present on machine
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:18 +0000 UTC Normal Pod endpoint-template-2.spec.initContainers{redpanda-configurator} Created Created container redpanda-configurator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:18 +0000 UTC Normal Pod endpoint-template-2.spec.initContainers{redpanda-configurator} Started Started container redpanda-configurator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-0.spec.initContainers{redpanda-configurator} Pulled Container image "localhost/configurator:dev" already present on machine
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-0.spec.initContainers{redpanda-configurator} Created Created container redpanda-configurator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-0.spec.initContainers{redpanda-configurator} Started Started container redpanda-configurator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-1.spec.initContainers{redpanda-configurator} Pulled Container image "localhost/configurator:dev" already present on machine
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-1.spec.initContainers{redpanda-configurator} Created Created container redpanda-configurator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-1.spec.initContainers{redpanda-configurator} Started Started container redpanda-configurator
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-2.spec.containers{redpanda} Pulled Container image "localhost/redpanda:dev" already present on machine
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-2.spec.containers{redpanda} Created Created container redpanda
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:19 +0000 UTC Normal Pod endpoint-template-2.spec.containers{redpanda} Started Started container redpanda
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:20 +0000 UTC Normal Pod endpoint-template-0.spec.containers{redpanda} Pulled Container image "localhost/redpanda:dev" already present on machine
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:20 +0000 UTC Normal Pod endpoint-template-0.spec.containers{redpanda} Created Created container redpanda
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:20 +0000 UTC Normal Pod endpoint-template-0.spec.containers{redpanda} Started Started container redpanda
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:20 +0000 UTC Normal Pod endpoint-template-1.spec.containers{redpanda} Pulled Container image "localhost/redpanda:dev" already present on machine
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:20 +0000 UTC Normal Pod endpoint-template-1.spec.containers{redpanda} Created Created container redpanda
logger.go:42: 06:03:12 | endpoint-template | 2022-11-20 05:58:20 +0000 UTC Normal Pod endpoint-template-1.spec.containers{redpanda} Started Started container redpanda
The kubectl get pod output shows the following problem
...
{
"lastProbeTime": null,
"lastTransitionTime": "2022-11-20T05:58:15Z",
"message": "containers with unready status: [redpanda]",
"reason": "ContainersNotReady",
"status": "False",
"type": "Ready"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2022-11-20T05:58:15Z",
"message": "containers with unready status: [redpanda]",
"reason": "ContainersNotReady",
"status": "False",
"type": "ContainersReady"
}
],
"containerStatuses": [
{
"image": "localhost/redpanda:dev",
"imageID": "",
"lastState": {},
"name": "redpanda",
"ready": false,
"restartCount": 0,
"started": false,
"state": {
"waiting": {
"reason": "PodInitializing"
}
}
}
],
...
Nov 20 06:05:15 kind-worker kubelet[258]: E1120 06:05:15.425494 258 pod_workers.go:190] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"redpanda\" with PostStartHookError: command '/scripts/postStart.sh' exited with 137: : \"Exec lifecycle hook ([/scripts/postStart.sh]) for Container \\\"redpanda\\\" in Pod \\\"endpoint-template-0_kuttl-test-select-louse(35d2ae0a-bd23-41f7-81a4-89ac99f99f4f)\\\" failed - error: command '/scripts/postStart.sh' exited with 137: , message: \\\"\\\"\"" pod="kuttl-test-select-louse/endpoint-template-0" podUID=35d2ae0a-bd23-41f7-81a4-89ac99f99f4f
For what ever reason the postStart script failed.
Second time https://buildkite.com/redpanda/redpanda/builds/18895#01849d6c-9736-4275-b872-dd1116132d33 the same problem
--- FAIL: kuttl/harness/endpoint-template (601.47s)
Third time https://buildkite.com/redpanda/redpanda/builds/18875#01849b55-2a33-426c-9267-ae7b4410d220 the same problem
From this test run one pod named ednpoint-template-1
couldn't finish PostStart hook. The following logs was gathered from the Redpanda container:
2022-11-21T19:11:14.036075263Z stderr F INFO 2022-11-21 19:11:14,035 [shard 0] cluster - cluster_discovery.cc:147 - Requesting node ID for node UUID {6d0bca35-3ff2-4a4e-b3ce-720843bf0261} from {host: endpoint-template-0.endpoint-template.kuttl-test-helped-gator.svc.cluster.local., port: 33145}
2022-11-21T19:11:14.036108999Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query name endpoint-template-0.endpoint-template.kuttl-test-helped-gator.svc.cluster.local. (INET)
2022-11-21T19:11:14.036312853Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query failed: 4
2022-11-21T19:11:14.036370634Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] rpc - transport.cc:308 - RPC Client probes: { requests_sent: 0, requests_pending: 0, requests_completed: 0, request_errors: 0, request_timeouts: 0, in_bytes: 0, out_bytes: 0, connects: 0, connections: 0, connection_errors: 1, read_dispatch_errors: 0, corrupted_headers: 0, server_correlation_errors: 0, client_correlation_errors: 0, requests_blocked_memory: 0 }
2022-11-21T19:11:14.036411964Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] cluster - cluster_discovery.cc:169 - Error registering node UUID {6d0bca35-3ff2-4a4e-b3ce-720843bf0261}, retrying: std::__1::system_error (error C-Ares:4, endpoint-template-0.endpoint-template.kuttl-test-helped-gator.svc.cluster.local.: Not found)
2022-11-21T19:11:14.036415432Z stderr F INFO 2022-11-21 19:11:14,036 [shard 0] cluster - cluster_discovery.cc:147 - Requesting node ID for node UUID {6d0bca35-3ff2-4a4e-b3ce-720843bf0261} from {host: endpoint-template-1.endpoint-template.kuttl-test-helped-gator.svc.cluster.local., port: 33145}
2022-11-21T19:11:14.036420704Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query name endpoint-template-1.endpoint-template.kuttl-test-helped-gator.svc.cluster.local. (INET)
2022-11-21T19:11:14.036604935Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query failed: 4
2022-11-21T19:11:14.036633878Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] rpc - transport.cc:308 - RPC Client probes: { requests_sent: 0, requests_pending: 0, requests_completed: 0, request_errors: 0, request_timeouts: 0, in_bytes: 0, out_bytes: 0, connects: 0, connections: 0, connection_errors: 1, read_dispatch_errors: 0, corrupted_headers: 0, server_correlation_errors: 0, client_correlation_errors: 0, requests_blocked_memory: 0 }
2022-11-21T19:11:14.036663665Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] cluster - cluster_discovery.cc:169 - Error registering node UUID {6d0bca35-3ff2-4a4e-b3ce-720843bf0261}, retrying: std::__1::system_error (error C-Ares:4, endpoint-template-1.endpoint-template.kuttl-test-helped-gator.svc.cluster.local.: Not found)
2022-11-21T19:11:14.036678215Z stderr F INFO 2022-11-21 19:11:14,036 [shard 0] cluster - cluster_discovery.cc:147 - Requesting node ID for node UUID {6d0bca35-3ff2-4a4e-b3ce-720843bf0261} from {host: endpoint-template-2.endpoint-template.kuttl-test-helped-gator.svc.cluster.local., port: 33145}
2022-11-21T19:11:14.036689774Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query name endpoint-template-2.endpoint-template.kuttl-test-helped-gator.svc.cluster.local. (INET)
2022-11-21T19:11:14.036817588Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query failed: 4
2022-11-21T19:11:14.036851303Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] rpc - transport.cc:308 - RPC Client probes: { requests_sent: 0, requests_pending: 0, requests_completed: 0, request_errors: 0, request_timeouts: 0, in_bytes: 0, out_bytes: 0, connects: 0, connections: 0, connection_errors: 1, read_dispatch_errors: 0, corrupted_headers: 0, server_correlation_errors: 0, client_correlation_errors: 0, requests_blocked_memory: 0 }
2022-11-21T19:11:14.03687066Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] cluster - cluster_discovery.cc:169 - Error registering node UUID {6d0bca35-3ff2-4a4e-b3ce-720843bf0261}, retrying: std::__1::system_error (error C-Ares:4, endpoint-template-2.endpoint-template.kuttl-test-helped-gator.svc.cluster.local.: Not found)
The script that tries to remove maintenance mode
#!/usr/bin/env bash
set -e
until NODE_ID=$(curl --silent --fail http://${POD_NAME}.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local:9644/v1/node_config | grep -o '\"node_id\":[^,}]*' | grep -o '[^: ]*$'); do
sleep 0.5
done
echo "Clearing maintenance mode on node ${NODE_ID}"
until [ "${status:-}" = "200" ] || [ "${status:-}" = "400" ]; do
status=$(curl -X DELETE --silent -o /dev/null -w "%{http_code}" http://${POD_NAME}.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local:9644/v1/brokers/${NODE_ID}/maintenance)
sleep 0.5
done
Redpanda config
redpanda:
data_directory: /var/lib/redpanda/data
empty_seed_starts_cluster: false
seed_servers:
- host:
address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 33145
- host:
address: endpoint-template-1.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 33145
- host:
address: endpoint-template-2.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 33145
rpc_server:
address: 0.0.0.0
port: 33145
kafka_api:
- address: 0.0.0.0
port: 9092
name: kafka
authentication_method: none
- address: 0.0.0.0
port: 9093
name: kafka-external
authentication_method: none
kafka_api_tls:
- name: kafka-external
key_file: /etc/tls/certs/tls.key
cert_file: /etc/tls/certs/tls.crt
truststore_file: /etc/tls/certs/ca/ca.crt
enabled: true
require_client_auth: true
admin:
- address: 0.0.0.0
port: 9644
name: admin
advertised_rpc_api:
address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 33145
advertised_kafka_api:
- address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 9092
name: kafka
- address: 0-5c2a3f8f4c-kafka.example.com
port: 30804
name: kafka-external
developer_mode: true
auto_create_topics_enabled: true
cloud_storage_segment_max_upload_interval_sec: 1800
default_topic_replications: 3
enable_rack_awareness: true
fetch_reads_debounce_timeout: 10
group_topic_partitions: 3
id_allocator_replication: 3
log_segment_size: 536870912
storage_min_free_bytes: 10485760
topic_partitions_per_shard: 1000
transaction_coordinator_replication: 3
rpk:
tune_network: true
tune_disk_scheduler: true
tune_disk_nomerges: true
tune_disk_write_cache: true
tune_disk_irq: true
tune_cpu: true
tune_aio_events: true
tune_clocksource: true
tune_swappiness: true
coredump_dir: /var/lib/redpanda/coredump
tune_ballast_file: true
overprovisioned: true
pandaproxy:
pandaproxy_api:
- address: 0.0.0.0
port: 8082
name: proxy
- address: 0.0.0.0
port: 8083
name: proxy-external
pandaproxy_api_tls:
- name: proxy-external
key_file: /etc/tls/certs/pandaproxy/tls.key
cert_file: /etc/tls/certs/pandaproxy/tls.crt
truststore_file: /etc/tls/certs/pandaproxy/ca/ca.crt
enabled: true
require_client_auth: true
advertised_pandaproxy_api:
- address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 8082
name: proxy
- address: 0-5c2a3f8f4c-pandaproxy.example.com
port: 32576
name: proxy-external
pandaproxy_client:
brokers:
- address: endpoint-template-0.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 9092
- address: endpoint-template-1.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 9092
- address: endpoint-template-2.endpoint-template.kuttl-test-composed-jackass.svc.cluster.local.
port: 9092
schema_registry: {}
From Redpanda's perspective, a couple observations:
- this node believes it isn't a cluster founder, meaning it either found that another node has a cluster UUID (only possible if we have previously been able to send RPCs), or it's not configured as a part of the seed_servers list
- it looks like this is a brand new node that doesn't yet have a node ID (the code path hit means that we're trying to get assigned a node ID because we don't have one stored locally yet)
Just curious if that aligns with what we think the test is doing; is it adding a brand new node to a cluster? For a brand new node, is it possible to skip the maintenance mode disabling? More broadly speaking, if the issue is that we can't send any RPCs while initializing, that seems problematic, and maybe we need to expose a different readiness endpoint for these hooks to use.
2022-11-21T19:11:14.036108999Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query name endpoint-template-0.endpoint-template.kuttl-test-helped-gator.svc.cluster.local. (INET)
2022-11-21T19:11:14.036312853Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query failed: 4
...
2022-11-21T19:11:14.036420704Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query name endpoint-template-1.endpoint-template.kuttl-test-helped-gator.svc.cluster.local. (INET)
2022-11-21T19:11:14.036604935Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query failed: 4
...
2022-11-21T19:11:14.036689774Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query name endpoint-template-2.endpoint-template.kuttl-test-helped-gator.svc.cluster.local. (INET)
2022-11-21T19:11:14.036817588Z stderr F DEBUG 2022-11-21 19:11:14,036 [shard 0] dns_resolver - Query failed: 4
It looks like the node is having trouble sending RPCs to any of the seed nodes. Is that something that we should be expecting when initializing a pod?
once more in a feature branch: https://buildkite.com/redpanda/redpanda/builds/18931#0184a033-7ea9-4375-a819-226a353770d7
Another thing I'm confused about is why this is only happening only sometimes, if the issue truly is that DNS setup cannot proceed until the node exposes its admin endpoint.
Just for posterity, exited with 137
means that it was still trying with the pod was deleted.
It looks like the node is having trouble sending RPCs to any of the seed nodes. Is that something that we should be expecting when initializing a pod?
Thanks @andrwng I will drill down with the DNS (core dns) pod if it was causing the problem.
Just curious if that aligns with what we think the test is doing; is it adding a brand new node to a cluster?
This tests is related to setting up the external advertised kafka api DNS using some template mechanism. It happens that this particular test will create 3 Redpanda cluster. Others tries to create only 1 Redpanda cluster.
For a brand new node, is it possible to skip the maintenance mode disabling?
It's not that easy. We would need to implement some hacks around when node is considered formed and when not. If we can not use Admin API reliably then I we might end up with some kind of workaround.
More broadly speaking, if the issue is that we can't send any RPCs while initializing, that seems problematic, and maybe we need to expose a different readiness endpoint for these hooks to use.
We are not using readiness probe to put Redpanda broker out of maintenance mode. It's PostStart hook https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
DNS containers doesn't have any useful logs. I don't want to blame kind cluster or buildkite environment, but I run out of options.
Maybe we have too many kind nodes, but it's a stretch hypothesis.
Another occurrence: https://buildkite.com/redpanda/redpanda/builds/19160#0184bd5a-bcef-427d-86f6-47c3cb00947f/6-168
Another: https://buildkite.com/redpanda/redpanda/builds/19308#0184cb41-789f-4675-95f1-ac85941c890e/1464-1492
One more: https://buildkite.com/redpanda/redpanda/builds/20442#01856bee-53ee-4718-add0-9b42681a2f6c
We need to collect previous logs from pods. All we can see is that pod/endpoint-template-1's initContainer was running at the time we collected logs.
We need to collect previous logs from pods. All we can see is that pod/endpoint-template-1's initContainer was running at the time we collected logs.
can that be done automatically and added into the buildkite artifacts so that we can access them?
That will be the next step when this next has someone working on it, yes.
Based on 1 week of failures there was no failures in endpoint-template. I will close this issue. If it reappears I will re-open.