Upgrading from ECK 2.4.0 to latest version may fail
Got this error for one of my Elasticsearch clusters while upgrading from 2.4.0 to main:
400 Bad Request: {Status:400 Error:{CausedBy:{Reason: Type:}
Reason:
Desired nodes with history [d7bf9e8e-47a0-40ad-8156-400ae519eb6c] and version [2] already exists with a different definition
I think (not 💯 sure yet) this is because of https://github.com/elastic/cloud-on-k8s/pull/5950: the Elasticsearch configuration has changed, but the metadata.generation of the Elasticsearch resource, used as the version field for the desired nodes API, is still the same.
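To make the suspected failure mode concrete, here is a toy model in Python. It is only a sketch: `ToyDesiredNodesStore` is a hypothetical in-memory stand-in, not the real Elasticsearch client, the setting `some.new.default` is invented for illustration, and the only behaviour modelled is the rejection reported in the error message above.

```python
class DesiredNodesConflict(Exception):
    """Raised when a version is reused with a different definition."""


class ToyDesiredNodesStore:
    """Hypothetical in-memory stand-in for the desired nodes API."""

    def __init__(self):
        self.history_id = None
        self.version = None
        self.nodes = None

    def put(self, history_id, version, nodes):
        same_slot = history_id == self.history_id and version == self.version
        if same_slot and nodes != self.nodes:
            raise DesiredNodesConflict(
                f"Desired nodes with history [{history_id}] and version "
                f"[{version}] already exists with a different definition"
            )
        self.history_id, self.version, self.nodes = history_id, version, nodes


def reconcile(store, history_id, generation, nodes):
    # The version is derived from metadata.generation, which only changes
    # when the spec of the Elasticsearch resource changes.
    store.put(history_id, generation, nodes)


store = ToyDesiredNodesStore()
old_operator_nodes = [{"settings": {"node.store.allow_mmap": False}}]
# Same spec, but the upgraded operator renders a different configuration.
new_operator_nodes = [
    {"settings": {"node.store.allow_mmap": False, "some.new.default": "x"}}
]

reconcile(store, "d7bf9e8e", 2, old_operator_nodes)  # before the operator upgrade
try:
    reconcile(store, "d7bf9e8e", 2, new_operator_nodes)  # after it, same generation
    conflict = False
except DesiredNodesConflict:
    conflict = True
```

The second call reuses version 2 with a different definition, which is the situation the error message describes.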
I wonder if we should have an "upgrade" e2e pipeline, something like our upgrade-test-harness, that should be run automatically when submitting a PR in order to detect this kind of issue.
Similar problem: https://github.com/elastic/cloud-on-k8s/issues/6027. There, it is a PVC resize that leads to multiple updates with an unchanging spec.
We discussed potential solutions and @barkbay suggested two approaches:
- disabling desired_nodes for the next release
- switching to a conditional PUT after GET approach:
  - `GET` the `_latest` desired nodes topology from Elasticsearch and compare it with the expected desired nodes
  - If the topologies are the same, stop
  - If they differ, take the `version` returned from the `GET` call and increment it
  - `PUT` the new topology
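The conditional PUT after GET flow could look roughly like the Python sketch below. `FakeDesiredNodesClient` and `reconcile_desired_nodes` are hypothetical names, the client is an in-memory fake rather than a real Elasticsearch client, and the sketch assumes the returned topology can be compared directly with the expected one.

```python
class FakeDesiredNodesClient:
    """Hypothetical in-memory fake of the desired nodes API."""

    def __init__(self):
        self.latest = None  # {"version": int, "nodes": list} or None
        self.put_count = 0

    def get_latest(self):
        return self.latest

    def put(self, version, nodes):
        self.latest = {"version": version, "nodes": nodes}
        self.put_count += 1


def reconcile_desired_nodes(client, expected_nodes):
    """Return True if a PUT was issued, False if the topology was already current."""
    latest = client.get_latest()
    # Same topology: nothing to do.
    if latest is not None and latest["nodes"] == expected_nodes:
        return False
    # Different (or missing) topology: bump the version returned by the GET.
    next_version = 1 if latest is None else latest["version"] + 1
    client.put(next_version, expected_nodes)
    return True


client = FakeDesiredNodesClient()
topology = [{"settings": {"node.name": "es-0"}, "memory": "3gb", "storage": "4gb"}]
reconcile_desired_nodes(client, topology)  # first call issues a PUT
reconcile_desired_nodes(client, topology)  # second call is a no-op
```

Because the version comes from Elasticsearch rather than from `metadata.generation`, a changed definition always gets a fresh version in this sketch.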
Comparing the _latest desired nodes returned via the Elasticsearch API with the expected values turns out to be trickier than I thought:
{"service.version": "2.5.0-SNAPSHOT+dff6b534", "iteration": "1", "namespace": "default", "es_name": "autoscaling-sample", "diff": ["slice[0].Settings.map[node].map[store].map[allow_mmap]: string != bool", "slice[0].Settings.map[xpack].map[security].map[http].map[ssl].map[enabled]: string != bool", "slice[0].Settings.map[xpack].map[security].map[authc].map[realms].map[native].map[native1].map[order]: string != int64", "slice[0].Settings.map[xpack].map[security].map[authc].map[realms].map[file].map[file1].map[order]: string != int64", "slice[0].Memory: 3gb != 3221225472b", "slice[0].Storage: 4gb != 4294967296b", "slice[1].Settings.map[xpack].map[security].map[http].map[ssl].map[enabled]: string != bool", "slice[1].Settings.map[xpack].map[security].map[authc].map[realms].map[native].map[native1].map[order]: string != int64", "slice[1].Settings.map[xpack].map[security].map[authc].map[realms].map[file].map[file1].map[order]: string != int64", "slice[1].Settings.map[node].map[store].map[allow_mmap]: string != bool"]}
Elasticsearch transforms the submitted data: it stringifies all booleans and integers, and it converts resource quantities to the largest applicable unit, i.e. instead of bytes of memory it returns gigabytes.
My fear is that implementing a comparison after mirroring the same transformations might be brittle, and I am thinking we should just stick to the current approach of updating at each reconciliation with an incremented version number.
I am thinking about ways to optimise this, but it is quite involved. One idea follows below.
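For illustration, mirroring those transformations might look like the Python sketch below. It is only a sketch: the field names `settings`, `memory` and `storage` are taken from the diff above, the helper names are hypothetical, and the unit table assumes binary multiples (which matches `3gb` vs `3221225472b` in the diff).

```python
import re

# Assumed binary multiples; only whole-number sizes are handled.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4, "pb": 1024**5}


def parse_bytes(value):
    """Convert a size such as '3gb' or '3221225472b' to a number of bytes."""
    match = re.fullmatch(r"(\d+)(b|kb|mb|gb|tb|pb)", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported byte size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def stringify(value):
    """Recursively turn booleans and numbers into strings, as seen in the diff."""
    if isinstance(value, dict):
        return {key: stringify(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify(item) for item in value]
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def normalize_node(node):
    """Bring a submitted or a returned node into one comparable form."""
    return {
        "settings": stringify(node["settings"]),
        "memory": parse_bytes(node["memory"]),
        "storage": parse_bytes(node["storage"]),
    }


submitted = {
    "settings": {"node": {"store": {"allow_mmap": False}}, "order": 1},
    "memory": "3221225472b",
    "storage": "4294967296b",
}
returned = {
    "settings": {"node": {"store": {"allow_mmap": "false"}}, "order": "1"},
    "memory": "3gb",
    "storage": "4gb",
}
same = normalize_node(submitted) == normalize_node(returned)
```

Even this small version has gaps (fractional sizes such as `1.5gb` raise an error, and any other transformation Elasticsearch applies would need its own rule), which is the brittleness concern.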
First iteration:
- `PUT` the desired nodes topology and calculate the hash of the submitted request payload
- Store it in an annotation together with the `version`, for example the orchestration hints annotation

Subsequent iterations:
- `GET` the `_latest` desired nodes topology from Elasticsearch, and the stored version and hash from the annotation
- Calculate the hash of the expected desired nodes and compare hash and version. If the hash and version are the same, stop
- If they differ, take the `version` returned from the `GET` call and increment it
- `PUT` the new topology
- Update the orchestration hint annotation with the new version and hash
This would address the following concerns:
- reduce the number of updates to the Elasticsearch API with identical topologies
- handles the case where a third party changes or deletes the desired nodes, by doing a `GET` before the `PUT`
- in steady state no updates are posted to Elasticsearch (e.g. if reconciliation is triggered by a cache refresh, an operator restart, or spec changes that have no relevance for the desired nodes API)

This comes with the downside of additional complexity and annotation updates to the ES resource.
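A rough Python sketch of the hash-based idea, under stated assumptions: the annotation key `example.invalid/desired-nodes-hint`, the fake client and the function names are all hypothetical, `annotations` is a plain dict standing in for the annotations of the Elasticsearch resource, and the fake restarts at version 1 after a delete, which may not match real behaviour.

```python
import hashlib
import json

ANNOTATION = "example.invalid/desired-nodes-hint"  # hypothetical annotation key


class FakeDesiredNodesClient:
    """Hypothetical in-memory fake of the desired nodes API."""

    def __init__(self):
        self.latest = None  # {"version": int, "nodes": list} or None
        self.put_count = 0

    def get_latest(self):
        return self.latest

    def put(self, version, nodes):
        self.latest = {"version": version, "nodes": nodes}
        self.put_count += 1


def payload_hash(nodes):
    """Hash a canonical JSON encoding of the payload that would be submitted."""
    canonical = json.dumps(nodes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconcile(client, annotations, expected_nodes):
    """Return True if a PUT was issued, False in steady state."""
    latest = client.get_latest()
    hint = json.loads(annotations.get(ANNOTATION, "null"))
    expected_hash = payload_hash(expected_nodes)
    # Steady state: Elasticsearch still holds the version we stored and the
    # payload we would submit hashes to the same value as last time.
    if (
        latest is not None
        and hint is not None
        and hint["version"] == latest["version"]
        and hint["hash"] == expected_hash
    ):
        return False
    # Otherwise increment the version returned by the GET and PUT again.
    next_version = 1 if latest is None else latest["version"] + 1
    client.put(next_version, expected_nodes)
    annotations[ANNOTATION] = json.dumps({"version": next_version, "hash": expected_hash})
    return True
```

Since only the submitted payload is hashed, the transformed data returned by Elasticsearch never has to be compared field by field.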