kic-reference-architectures

bug: observability failures with >= 1.24

Open qdzlug opened this issue 1 year ago • 0 comments

Describe the bug This currently reproduces on both Minikube and K3s, the only platforms tested so far.

When running against Kubernetes 1.24, the observability project fails during deployment. The failures look like:

[2022-08-11T15:50:40.859Z]   pulumi:pulumi:Stack (observability-marajenkmkube32):
[2022-08-11T15:50:40.859Z]     Cluster name: microk8s-cluster
[2022-08-11T15:50:40.859Z]     error: update failed
[2022-08-11T15:50:40.859Z]  
[2022-08-11T15:50:40.859Z]     I0811 15:45:43.178488  150088 request.go:601] Waited for 1.191467706s due to client-side throttling, not priority and fairness, request: GET:https://192.168.49.2:8443/apis/admissionregistration.k8s.io/v1/validatingwebhookconfigurations/opentelemetry-operator-validating-webhook-configuration
[2022-08-11T15:50:40.859Z]  
[2022-08-11T15:50:40.859Z]   kubernetes:core/v1:ServiceAccount (opentelemetry-operator-system/opentelemetry-operator-controller-manager):
[2022-08-11T15:50:40.859Z]     error: 1 error occurred:
[2022-08-11T15:50:40.859Z]     	* resource opentelemetry-operator-system/opentelemetry-operator-controller-manager was successfully created, but the Kubernetes API server reported that it failed to fully initialize or become live: Timeout occurred polling for 'opentelemetry-operator-controller-manager'

In the past, this same error was caused by a version mismatch between cert-manager and the OpenTelemetry operator (the operator depends on cert-manager). It is unclear whether that is the cause here, but the failure does not occur on the 1.23 line.

To Reproduce Steps to reproduce the behavior:

  1. Deploy to Kubernetes 1.24 or greater on Minikube or K3s.
  2. The observability project fails with the error shown above.
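
To tell quickly whether a given cluster falls in the failing range, its reported server version can be compared against 1.24. The sketch below is only an illustration and is not part of the repository: `parse_k8s_version` and `is_affected` are hypothetical names, and it assumes the version string has the `v1.24.3+k3s1` form that the API server reports as its `gitVersion`.

```python
import re
from typing import Tuple


def parse_k8s_version(git_version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'v1.24.3+k3s1'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(git_version: str, first_bad: Tuple[int, int] = (1, 24)) -> bool:
    """Return True when the version is at or above the first failing release.

    The (1, 24) default comes from this report; it is an assumption that
    every later release behaves the same way.
    """
    return parse_k8s_version(git_version) >= first_bad


if __name__ == "__main__":
    for version in ("v1.23.9", "v1.24.3+k3s1", "v1.25.0"):
        print(version, "affected" if is_affected(version) else "not affected")
```

The comparison works on (major, minor) tuples rather than strings, so a release such as 1.100 would still sort after 1.24.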

Expected behavior The observability project should deploy without errors.

Your environment

  • Automation-api branch, but the issue is expected to affect all branches.

Additional context We have already pinned MicroK8s to 1.23, so we are not seeing this there. We should test whether MicroK8s is affected as well.

qdzlug Aug 11 '22 16:08