
[BUG] illegal state: trying to move shard from primary mode to replica mode (Index-type: remote_snapshot)

Open · etgraylog opened this issue 1 year ago · 2 comments

Describe the bug

During a restart, OpenSearch appears to attempt to relocate the primary shard of a remote_snapshot type index and fails.

This might be an instance of the problem mentioned in https://github.com/opensearch-project/OpenSearch/pull/11563#issuecomment-1857798490.

[2024-02-15T15:43:08,808][INFO ][o.o.c.r.a.a.BalancedShardsAllocator] [10.0.1.146] Swap relocation performed for shard [[index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [R], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ]]
[2024-02-15T15:43:09,012][WARN ][o.o.i.c.IndicesClusterStateService] [10.0.1.146] [index_5][0] marking and sending shard failed due to [failed updating shard routing entry]
java.lang.IllegalArgumentException: illegal state: trying to move shard from primary mode to replica mode. Current [index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [P], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ], new [index_5][0], node[ECzLzBEhTYmA58qyuEWNaQ], [R], s[STARTED], a[id=WGd5CZOZTf2-qD411BjkoQ]
	at org.opensearch.index.shard.IndexShard.updateShardState(IndexShard.java:597) ~[opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.indices.cluster.IndicesClusterStateService.updateShard(IndicesClusterStateService.java:710) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:650) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:293) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:606) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:593) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:561) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:484) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:186) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:849) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282) [opensearch-2.11.1.jar:2.11.1]
	at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245) [opensearch-2.11.1.jar:2.11.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]

This appears to leave a replica shard perpetually stuck in the INITIALIZING state:

ubuntu@ip-10-0-252-254:~$ curl -s -XGET "http://******:*******@10.0.1.146:9200/_cat/recovery?active_only=true&v=true"
index          shard time  type stage source_host source_node target_host target_node repository snapshot files files_recovered files_percent files_total bytes bytes_recovered bytes_percent bytes_total translog_ops translog_ops_recovered translog_ops_percent
index_5 0     57.4m peer init  10.0.1.204  10.0.1.204  10.0.1.146  10.0.1.146  n/a        n/a      0     0               0.0%          0           0     0               0.0%          0           -1           0                      -1.0%
ubuntu@ip-10-0-252-254:~$

With no obvious cause as to why:

ubuntu@ip-10-0-252-254:~$ curl -s -XGET "http://******:*******@10.0.1.146:9200/_cat/allocation?v"
shards disk.indices disk.used disk.avail disk.total disk.percent host       ip         node
    16       29.4gb      16gb     32.2gb     48.2gb           33 10.0.1.6   10.0.1.6   10.0.1.6
    15       22.5gb       9gb     39.2gb     48.2gb           18 10.0.1.204 10.0.1.204 10.0.1.204
    17       36.7gb    16.1gb     32.1gb     48.2gb           33 10.0.1.146 10.0.1.146 10.0.1.146
ubuntu@ip-10-0-252-254:~$

ubuntu@ip-10-0-252-254:~$ curl -s -XGET "http://******:*******@10.0.1.146:9200/_cluster/allocation/explain?pretty"
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "unable to find any unassigned shards to explain [ClusterAllocationExplainRequest[useAnyUnassignedShard=true,includeYesDecisions?=false]"
  },
  "status" : 400
}
ubuntu@ip-10-0-252-254:~$
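
Worth noting: the default allocation-explain call only considers unassigned shards, and the stuck replica here is assigned (INITIALIZING), which is presumably why the request above fails. A sketch of asking about that specific shard instead, assuming the index and shard number from the log above (same redacted credentials/host):

ubuntu@ip-10-0-252-254:~$ curl -s -XPOST "http://******:*******@10.0.1.146:9200/_cluster/allocation/explain?pretty" \
  -H 'Content-Type: application/json' \
  -d '{"index": "index_5", "shard": 0, "primary": false}'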

Related component

Storage:Snapshots

To Reproduce

This isn't exactly trivial to reproduce; there seems to be something else involved that triggers the problem. However, here are the steps taken to arrive at the current state:

  1. Create an OpenSearch multi-node cluster.
  2. Index some data into an index.
  3. Set up searchable snapshots and create one for the index from the previous step (a rough sketch of the calls follows this list): https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/snapshots/searchable_snapshot/#create-a-searchable-snapshot-index
  4. Restart the OpenSearch cluster.
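
For reference, a rough sketch of the calls behind step 3, following the linked docs. The repository name, S3 bucket, and use of index_5 are placeholders/assumptions, credentials and host are redacted as in the output above, and the nodes are assumed to already have the search role configured per the docs:

# Register an S3 snapshot repository (repository-s3 plugin is installed); names are placeholders
curl -s -XPUT "http://******:*******@10.0.1.146:9200/_snapshot/my-s3-repo" \
  -H 'Content-Type: application/json' \
  -d '{"type": "s3", "settings": {"bucket": "my-snapshot-bucket", "base_path": "snapshots"}}'

# Take a snapshot of the index from step 2
curl -s -XPUT "http://******:*******@10.0.1.146:9200/_snapshot/my-s3-repo/snapshot-1?wait_for_completion=true" \
  -H 'Content-Type: application/json' \
  -d '{"indices": "index_5"}'

# Restore it as a searchable (remote_snapshot) index; if the original index still exists,
# the restore needs a rename (rename_pattern/rename_replacement) or the original must be removed first
curl -s -XPOST "http://******:*******@10.0.1.146:9200/_snapshot/my-s3-repo/snapshot-1/_restore" \
  -H 'Content-Type: application/json' \
  -d '{"indices": "index_5", "storage_type": "remote_snapshot"}'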

Expected behavior

The expected behavior is for all shards to be successfully recovered upon restart, without operations that leave the cluster in a yellow state (e.g. orphaned replica shards).
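
For reference, a couple of quick checks that can confirm the state after a restart (host and credentials redacted as above); in the expected case these show a green status and all copies of the shard STARTED, rather than the stuck INITIALIZING replica:

curl -s -XGET "http://******:*******@10.0.1.146:9200/_cluster/health?pretty"
curl -s -XGET "http://******:*******@10.0.1.146:9200/_cat/shards/index_5?v"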

Additional Details

Plugins currently enabled:

opensearch-alerting
opensearch-anomaly-detection
opensearch-asynchronous-search
opensearch-cross-cluster-replication
opensearch-custom-codecs
opensearch-geospatial
opensearch-index-management
opensearch-job-scheduler
opensearch-knn
opensearch-ml
opensearch-neural-search
opensearch-notifications
opensearch-notifications-core
opensearch-observability
opensearch-performance-analyzer
opensearch-reports-scheduler
opensearch-security
opensearch-security-analytics
opensearch-sql
prometheus-exporter
repository-s3


Host/Environment:

  • OS: Debian
  • Version: Bullseye
  • OpenSearch: 2.11.1

Additional context: This might be an instance of the problem mentioned in https://github.com/opensearch-project/OpenSearch/pull/11563#issuecomment-1857798490.

etgraylog avatar Feb 15 '24 17:02 etgraylog

Thanks @etgraylog! This does indeed look like the issue fixed by #11563. That fix is included in 2.12, which will be released in the coming week. Will you be able to pick up that release and test this?

andrross avatar Feb 15 '24 17:02 andrross


Thanks @andrross! Certainly, I'll stay tuned 👍

etgraylog avatar Feb 15 '24 19:02 etgraylog


With 2.12.0 I'm not able to reproduce the issue so far; it seems the fix is working, @andrross. Thanks again!

etgraylog avatar Feb 22 '24 03:02 etgraylog