Thibault Richard

Results 88 comments of Thibault Richard

We fail the test as soon as there is a failure using `t.Errorf` despite our tolerance mechanism. 🤦 https://github.com/elastic/cloud-on-k8s/blob/58b5a7302240500f0fd3d0ab13dde2d1385470b6/test/e2e/test/elasticsearch/steps_mutation.go#L159-L161 https://github.com/elastic/cloud-on-k8s/pull/7358 should fix this.

But not the flaky test: https://buildkite.com/elastic/cloud-on-k8s-operator-nightly/builds/381 `steps_mutation.go:159: ContinuousHealthChecks failures count (52) is above the tolerance (15): 52 x [cluster health red]` 52 seems very high.
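For readers unfamiliar with the tolerance mechanism mentioned in the two comments above, here is a minimal Go sketch of the intended behavior: failures are collected while the check runs, and an error is reported once at the end, only if the count exceeds the tolerance. The helper name `checkTolerance` and its signature are hypothetical and are not taken from `steps_mutation.go`; only the message format follows the log line quoted above.

```go
package main

import "fmt"

// checkTolerance is a hypothetical helper (not the actual ECK e2e code).
// It returns nil while the number of recorded failures stays within the
// tolerance, and a single summarizing error once the tolerance is exceeded.
func checkTolerance(failures []string, tolerance int) error {
	if len(failures) <= tolerance {
		return nil
	}

	// Group identical failure messages, keeping first-seen order.
	counts := map[string]int{}
	order := []string{}
	for _, msg := range failures {
		if counts[msg] == 0 {
			order = append(order, msg)
		}
		counts[msg]++
	}

	summary := ""
	for i, msg := range order {
		if i > 0 {
			summary += ", "
		}
		summary += fmt.Sprintf("%d x [%s]", counts[msg], msg)
	}

	return fmt.Errorf(
		"ContinuousHealthChecks failures count (%d) is above the tolerance (%d): %s",
		len(failures), tolerance, summary,
	)
}
```

In a test, the returned error would be passed to `t.Errorf` once, after all health checks have run, instead of calling `t.Errorf` on every individual failure.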

Similar to:
- #5355

Related to:
- https://github.com/elastic/kibana/issues/131505

This happened again last night, with the same error in the Enterprise Search pod log. I have looked at all occurrences since it started (10 times in 2022) and it only...

I admit that it is a bit ambiguous. The autoscaling policies only apply to nodes that have **at least** the `data` role or exactly the `ml` role. In the example, nodes with roles...

The elastic-agent team has been notified of this failure ([->](https://elastic.slack.com/archives/C01QQ449KE1/p1703262817438749)).

Since the default stack version was updated to `8.12.0` (#7483), the e2e tests `TestFleetKubernetesIntegrationRecipe` and `TestFleetAPMIntegrationRecipe` consistently fail.

> What is the ECK configuration that reproduces this so that we can try doing it locally?

I reproduce by applying [config/recipes/elastic-agent/fleet-kubernetes-integration.yaml](https://github.com/elastic/cloud-on-k8s/blob/main/config/recipes/elastic-agent/fleet-kubernetes-integration.yaml) with the ECK operator installed with the default...

`/var/log` was never configured in the [fleet-kubernetes-integration.yaml](https://github.com/elastic/cloud-on-k8s/blob/main/config/recipes/elastic-agent/fleet-kubernetes-integration.yaml) recipe used in the test. Using the same manifest, just changing the version to `8.11.0`, with the same version of ECK and the...

Great investigation, thank you very much.

> We no longer publish Beat metrics if the Beat isn't actually ingesting data.

I don't think this is strictly a bug in that...