Peter Brachwitz
OK, so this is not the exact same test (it is `TestVersionUpgradeToLatest8x/Elasticsearch_cluster_health_should_not_have_been_red_during_mutation_process`), but I think it might be the same problem. I am trying to summarise what I saw in a...
I took a look today and tried to figure out why this is failing. Just dropping my raw notes here for this failed run https://devops-ci.elastic.co/job/cloud-on-k8s-e2e-tests-gke-k8s-versions/771//testReport * Jul 8, 2022 @...
The request tracing added in https://github.com/elastic/cloud-on-k8s/pull/5869 gives some additional clues. Based on a [recent test failure](https://devops-ci.elastic.co/job/cloud-on-k8s-e2e-tests-kind-k8s-versions/793/testReport/github/com_elastic_cloud-on-k8s_v2_test_e2e_es/Run_tests_on_different_versions_of_vanilla_K8s___1_24_1_IPv6___TestMutationSecondMasterSetDown_Elasticsearch_cluster_health_should_not_have_been_red_during_mutation_process/) ``` Elasticsearch cluster health check failure at 2022-07-12 03:38:17.428219659 +0000 UTC m=+3641.730346691: elasticsearch...
The 400 BAD REQUEST log entries happen if the client cancels the request before the response has been calculated. In Elasticsearch, cancelled tasks that produce HTTP results then default to...
> If I change the operator identifier to something other than default, will I also have to explicitly change the managed-by annotation for every cluster?

Yes, but why would you...
> The user experience of having to change labels or annotations on the ECK manifest and on the Elasticsearch/Kibana/APM manifests seems a bit complex to me

I think that is...
It looks like this feature needs to be turned on explicitly though: https://docs.microsoft.com/en-us/azure/virtual-machines/linux/expand-disks#expand-an-azure-managed-disk
Regarding the excessive PVC updates: we have an unconditional update statement here that should probably first check whether an update is actually necessary: https://github.com/elastic/cloud-on-k8s/blob/9be005d2a4b6e26f98ff9c520ea53ae4d6cc9455/pkg/controller/elasticsearch/driver/pvc_owner.go#L44

See https://github.com/elastic/cloud-on-k8s/issues/5451 to fix this.
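The "check before update" idea could look roughly like the sketch below. It is not the operator's code: `PVC`, `OwnerRef`, `Updater` and `reconcileOwnerRefs` are simplified, hypothetical stand-ins so the example runs without Kubernetes dependencies, and it assumes the field being reconciled is the list of owner references.

```go
package main

import (
	"fmt"
	"reflect"
)

// OwnerRef and PVC are simplified, hypothetical stand-ins for the
// Kubernetes types; only the fields needed for the comparison are kept.
type OwnerRef struct {
	Kind, Name, UID string
}

type PVC struct {
	Name      string
	OwnerRefs []OwnerRef
}

// Updater is a hypothetical abstraction over the API client.
type Updater interface {
	Update(pvc *PVC) error
}

// countingUpdater records how many update calls were issued.
type countingUpdater struct{ calls int }

func (c *countingUpdater) Update(*PVC) error {
	c.calls++
	return nil
}

// sameOwnerRefs treats nil and empty slices as equal, otherwise compares
// element by element (order-sensitive).
func sameOwnerRefs(a, b []OwnerRef) bool {
	if len(a) == 0 && len(b) == 0 {
		return true
	}
	return reflect.DeepEqual(a, b)
}

// reconcileOwnerRefs only calls Update when the desired owner references
// differ from the current ones. It reports whether an update was sent.
func reconcileOwnerRefs(u Updater, pvc *PVC, desired []OwnerRef) (bool, error) {
	if sameOwnerRefs(pvc.OwnerRefs, desired) {
		return false, nil // nothing to do, skip the API call
	}
	pvc.OwnerRefs = desired
	if err := u.Update(pvc); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	owner := []OwnerRef{{Kind: "Elasticsearch", Name: "es", UID: "1"}}
	pvc := &PVC{Name: "data-es-0"}
	u := &countingUpdater{}

	// First pass sets the owner, later passes are no-ops.
	for i := 0; i < 3; i++ {
		changed, err := reconcileOwnerRefs(u, pvc, owner)
		fmt.Println("changed:", changed, "err:", err)
	}
	fmt.Println("update calls:", u.calls)
}
```

With a guard like this, repeated reconciliations of an unchanged PVC produce no further update requests.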
The secret deletions are a bit harder to track. But this seems to be a good candidate: https://github.com/elastic/cloud-on-k8s/blob/16180dd961b596d15e035feadb2bbaf27a28066c/pkg/controller/elasticsearch/driver/nodes.go#L117 Unconditional delete to clean up a legacy secret we don't actually use...
This has not happened in the last 6 months, and we are a couple of minors ahead of 8.3 by now. I am closing this for the moment; we can...