OpenSearch
[BUG] IndexSettings gets into invalid states on a cluster
Describe the bug
During snapshot restore, and in other scenarios I have not identified, settings are not validated or migrated to be compatible with future versions. This allows clusters to get into states where new nodes cannot join the cluster because of validation checks. I have debugged clusters in this state and have found records indicating that this symptom has occurred many times.
Related component
Storage:Snapshots
To Reproduce
While I've seen clusters with green indices in this state, I don't know how they got there. The following steps reach an invalid state another way:
- Create a snapshot
- Restore snapshot with settings
POST /_snapshot/my-opensearch-repo/my-first-snapshot/_restore
{
  "indices": "opendistro-reports-definitions",
  "ignore_unavailable": true,
  "include_global_state": false,
  "index_settings": {
    "index.mapper.dynamic": true
  }
}
Alternative Repro
- Check out this branch: https://github.com/opensearch-project/OpenSearch/compare/main...peternied:OpenSearch-1:repro-restore-to-invalid-state
- Run ./gradlew :server:internalClusterTest --tests org.opensearch.snapshots.RestoreSnapshotInvalidStateIT
Expected behavior
Indices should not be created in invalid states. It seems there are ways to modify IndexSettings that bypass validation checks but prevent the settings from being reloaded later.
MapperService
In this specific repro, index.mapper.dynamic is not checked for validity during index creation, only when MapperService is constructed. While this is discoverable by looking at failed shard counts, the index still exists in the cluster state.
https://github.com/opensearch-project/OpenSearch/blob/b19e4270f07e80b7d4fcc9473ff46deb61e4719c/server/src/main/java/org/opensearch/index/mapper/MapperService.java#L263-L265
MergeSchedulerConfig
There was a case where max_thread_count was higher than max_merge_count, a combination the check linked below rejects. When IndicesClusterStateService.deleteIndices(...) was called, it threw an exception because it couldn't create the IndexSettings.
https://github.com/opensearch-project/OpenSearch/blob/b19e4270f07e80b7d4fcc9473ff46deb61e4719c/server/src/main/java/org/opensearch/index/MergeSchedulerConfig.java#L139-L143
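Both cases above follow the same pattern: a setting is accepted when the index is written to the cluster state, and only rejected later when a component is constructed from it. As a rough illustration of the expected behavior, here is a minimal sketch of a fail-fast check that could run before a restore is accepted. The class RestoreSettingsCheck and the method findProblems are hypothetical names, not OpenSearch APIs. The sketch assumes the full setting keys are index.merge.scheduler.max_thread_count and index.merge.scheduler.max_merge_count, and that the linked check rejects a thread count greater than the merge count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check for restore settings; not part of OpenSearch. */
class RestoreSettingsCheck {

    // Assumed full keys for the merge scheduler settings discussed above.
    static final String MAX_THREAD_COUNT = "index.merge.scheduler.max_thread_count";
    static final String MAX_MERGE_COUNT = "index.merge.scheduler.max_merge_count";

    /** Returns the problems found; an empty list means both checks passed. */
    static List<String> findProblems(Map<String, String> settings) {
        List<String> problems = new ArrayList<>();

        // Case 1: this setting is only rejected later, when MapperService is built.
        if (settings.containsKey("index.mapper.dynamic")) {
            problems.add("index.mapper.dynamic is rejected when MapperService is constructed");
        }

        // Case 2 (assumed rule): the thread count must not exceed the merge count.
        String threads = settings.get(MAX_THREAD_COUNT);
        String merges = settings.get(MAX_MERGE_COUNT);
        if (threads != null && merges != null
                && Integer.parseInt(threads) > Integer.parseInt(merges)) {
            problems.add(MAX_THREAD_COUNT + " (" + threads + ") exceeds "
                    + MAX_MERGE_COUNT + " (" + merges + ")");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Mirrors the index_settings from the restore request in the repro.
        Map<String, String> requested = Map.of("index.mapper.dynamic", "true");
        List<String> problems = findProblems(requested);
        if (!problems.isEmpty()) {
            // A real implementation would fail the request before touching cluster state.
            System.out.println("Rejecting restore: " + problems);
        }
    }
}
```

The point of the sketch is only the ordering: the same rules that later fail at construction time are evaluated before the index is added to the cluster state, so an invalid index is never created.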
Additional Details
Host/Environment: OpenSearch 2.3 and OpenSearch 2.7
Additional context
- https://github.com/opensearch-project/OpenSearch/pull/11193
- https://github.com/opensearch-project/OpenSearch/issues/3879