[BUG] IndexSettings gets into invalid states on a cluster

Open peternied opened this issue 1 year ago • 0 comments

Describe the bug

During snapshot restore, and in other scenarios not yet identified, settings are not validated or migrated to be compatible with future versions. This allows clusters to get into states where new nodes cannot join the cluster because of validation checks. I have debugged clusters in this state and have found records indicating that this symptom has occurred many times.

Related component

Storage:Snapshots

To Reproduce

While I've seen clusters with green indices in this state, I don't know how that repro was possible.

  1. Create a snapshot
  2. Restore snapshot with settings
POST /_snapshot/my-opensearch-repo/my-first-snapshot/_restore
{
  "indices": "opendistro-reports-definitions",
  "ignore_unavailable": true,
  "include_global_state": false,
  "index_settings": {
     "index.mapper.dynamic": true
  }
}
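
For scripting the repro, the restore request above can be built programmatically. This is a minimal sketch using only the Python standard library; the base URL `http://localhost:9200` (a local cluster without security enabled) and the helper name `build_restore_request` are assumptions for illustration, not part of the original report.

```python
import json
import urllib.request


def build_restore_request(base_url, repository, snapshot, body):
    """Build (but do not send) the POST request for a snapshot restore."""
    url = f"{base_url}/_snapshot/{repository}/{snapshot}/_restore"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Same body as the request in step 2.
restore_body = {
    "indices": "opendistro-reports-definitions",
    "ignore_unavailable": True,
    "include_global_state": False,
    "index_settings": {"index.mapper.dynamic": True},
}

request = build_restore_request(
    "http://localhost:9200",  # assumption: local, unsecured cluster
    "my-opensearch-repo",
    "my-first-snapshot",
    restore_body,
)
# To actually send it against a running cluster:
#     urllib.request.urlopen(request)
```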

Alternative Repro

  1. Checkout this branch https://github.com/opensearch-project/OpenSearch/compare/main...peternied:OpenSearch-1:repro-restore-to-invalid-state
  2. ./gradlew :server:internalClusterTest --tests org.opensearch.snapshots.RestoreSnapshotInvalidStateIT

Expected behavior

Indices should not be created in invalid states. It seems there are ways to modify IndexSettings that bypass checks but prevent reloading in future scenarios.

MapperService

In this specific repro, index.mapper.dynamic is not checked for validity during index creation, only when MapperService is constructed. While this is discoverable by looking at failed shard counts, the index still exists in the cluster state.

https://github.com/opensearch-project/OpenSearch/blob/b19e4270f07e80b7d4fcc9473ff46deb61e4719c/server/src/main/java/org/opensearch/index/mapper/MapperService.java#L263-L265
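
To illustrate the expected behavior, here is a small Python model of validating settings before the index is added to the cluster state. It is not OpenSearch code: `REMOVED_SETTINGS`, `validate_index_settings`, and `restore_index` are hypothetical names, and the cluster state is modeled as a plain dict.

```python
# Hypothetical model, not OpenSearch code: settings that are accepted at
# restore time today but rejected later, when the index is loaded.
REMOVED_SETTINGS = {"index.mapper.dynamic"}


def validate_index_settings(settings):
    """Raise ValueError if any setting would fail when the index is loaded."""
    rejected = sorted(REMOVED_SETTINGS.intersection(settings))
    if rejected:
        raise ValueError(f"unsupported index settings: {', '.join(rejected)}")


def restore_index(cluster_state, name, settings):
    """Validate first, so a rejected restore leaves cluster_state untouched."""
    validate_index_settings(settings)
    cluster_state[name] = dict(settings)
```

With this ordering, a restore that carries `index.mapper.dynamic` fails up front instead of leaving an index in the cluster state that cannot be loaded.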

MergeSchedulerConfig

There was a case where max_thread_count was higher than max_merge_count, which the MergeSchedulerConfig check rejects. When IndicesClusterStateService.deleteIndices(...) was called, it threw an exception because it couldn't create the IndexSettings.

https://github.com/opensearch-project/OpenSearch/blob/b19e4270f07e80b7d4fcc9473ff46deb61e4719c/server/src/main/java/org/opensearch/index/MergeSchedulerConfig.java#L139-L143
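
The constraint involved can be modeled as follows. This is a Python sketch, not the Java implementation, and it assumes the rule is that both counts must be at least 1 and that max_thread_count must not exceed max_merge_count; the function name `validate_merge_scheduler` is hypothetical.

```python
THREAD_KEY = "index.merge.scheduler.max_thread_count"
MERGE_KEY = "index.merge.scheduler.max_merge_count"


def validate_merge_scheduler(settings):
    """Model of the merge scheduler check (assumed rule, see lead-in)."""
    max_thread_count = settings[THREAD_KEY]
    max_merge_count = settings[MERGE_KEY]
    if max_thread_count < 1:
        raise ValueError("max_thread_count should be at least 1")
    if max_merge_count < 1:
        raise ValueError("max_merge_count should be at least 1")
    if max_thread_count > max_merge_count:
        raise ValueError(
            f"max_thread_count (= {max_thread_count}) should be <= "
            f"max_merge_count (= {max_merge_count})"
        )
```

Because any code path that has to build the index settings runs this check, an index persisted with an inconsistent pair fails on every such path, including deletion.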

Additional Details

Host/Environment: OpenSearch 2.3 and OpenSearch 2.7

Additional context

  • https://github.com/opensearch-project/OpenSearch/pull/11193
  • https://github.com/opensearch-project/OpenSearch/issues/3879

peternied • Feb 16 '24 19:02