elasticsearch
elasticsearch copied to clipboard
Investigate severity of upgrading to 8.0.1 despite terminal deprecation of old features
Elasticsearch Version
upgrade from 7.x -> 8.0.0
Installed Plugins
No response
Java Version
bundled
OS Version
Linux
Problem Description
Recently we encountered a user issue report where upgrades to 8.0.1 from 7.17.1 caused an unhealthy cluster because the option index.indexing.slowlog.level
wasn't removed prior to upgrading.
We'll use this issue to understand how this happened and ways we can prevent the upgrade before irreversible changes are done to the Elasticsearch metadata or data.
Steps to Reproduce
- Get Elasticsearch 7.9 cluster with the
index.indexing.slowlog.level
option set. - Upgrade to 7.17.1 ignore the deprecation warning
- Upgrade to 8.0.1
Expected behaviour: the 8.0.1 upgrade should fail to start but not upgrade any metadata or data Current behaviour: the upgrade to 8.0.1 proceeds irreversibly modifying the metadata before options are validated and start is unsuccessful.
Logs (if relevant)
No response
Pinging @elastic/es-core-infra (Team:Core/Infra)
looks like indeed there is a bug.
I reproduced this on 7.17.1
PUT test2
{"settings": {
"index.search.slowlog.threshold.query.warn": "100ms",
"index.search.slowlog.threshold.query.info": "100ms",
"index.search.slowlog.threshold.query.debug": "0ms",
"index.search.slowlog.threshold.query.trace": 0,
"index.indexing.slowlog.threshold.index.trace": 0,
"index.indexing.slowlog.source": 5,
"index.indexing.slowlog.level": "INFO"
}
}
#! [index.indexing.slowlog.level] setting was deprecated in Elasticsearch and will be removed in a future release! See the breaking changes documentation for the next major version.
{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "test2"
}
Deprecation logs do show this warning as CRITICAL.
I can see in codebase it is correctly set as Property.Deprecated - not DepreactedWarning https://github.com/elastic/elasticsearch/blob/bcfbf000748b538884087623113682165abcb92d/server/src/main/java/org/elasticsearch/index/IndexingSlowLog.java#L76
In my view the problem with deprecation logs in upgrade assistant is that it does not look too encouraging to check them. There is a link and I am not sure many people will click it. Most likely people will just hit upgrade
when there is nothing critical on the first page.
However the deprecation info api do not return the setting as critical
GET /_migration/deprecations
"index_settings" : {
"test2" : [
{
"level" : "warning",
"message" : "Setting [index.indexing.slowlog.level] is deprecated",
"url" : "https://ela.st/es-deprecation-7-slowlog-settings",
"details" : "Remove the [index.indexing.slowlog.level] setting. Use the [index.*.slowlog.threshold] settings to set the log levels.",
"resolve_during_rolling_upgrade" : false,
"_meta" : {
"actions" : [
{
"action_type" : "remove_settings",
"objects" : [
"index.indexing.slowlog.level"
]
}
]
}
}
]
},
data:image/s3,"s3://crabby-images/cf1d5/cf1d55942de769f3d3d9b982dfaa3897abcde799" alt="Screenshot 2022-03-17 at 12 48 15"
the problem is that the check uses incorrect level https://github.com/elastic/elasticsearch/blob/eed16a2886b9d3e0f5c21978a8459bb8defbf606/x-pack/plugin/deprecation/src/main/java/org/elasticsearch/xpack/deprecation/IndexDeprecationChecks.java#L603
will look into a fix
have we written an SKI for this, if not, can we please do that?
just a follow up. In 7.17.2-SNAPSHOT it works fine and upgrade assistant presents critical deprecation:
What surprises me is that when I try reproducing this on my 1node cluster, after upgrade index has status yellow and the level setting is archived (archived.index.indexing.slowlog.level
)
However when I try this on cloud from 7.17.1, after upgrade index is red and the level is not archived..
@grcevski do you have any ideas?
Oh, very interesting. Looking at the current code it looks like we should be archiving and it's what happens for me locally too. I wonder if we have fixed a bug since 8.0.1 that resolved this issue?
I think this got resolved, I'll close this for now as we haven't had this happen since.
Reopening this issue, it looks like customers still experience this issue when upgrading to 8.x from versions <7.17. See https://github.com/elastic/sdh-control-plane/issues/1836
However when I try this on cloud from 7.17.1, after upgrade index is red and the level is not archived..
I expect that's because we only run the settings-archival process in a full cluster restart upgrade, and on Cloud it'll be a rolling upgrade. But that's the wrong fix here, we should not be archiving these settings. BwC on index metadata is supposed to be a function of the index created version and not the cluster version, and these settings were acceptable in 7.x-created indices, so we should be accepting them as-is on these indices even in an 8.x cluster. We can reasonably ignore them I think, but we mustn't reject them as we are doing today.
We can reuse this issue to fix this bug.