cloud-on-k8s

Operator doesn't restart pod in yellow state cluster even if it has a list for yellow-state

Open ccaillet1974 opened this issue 1 year ago • 2 comments

Bug Report

What I do: I change the manifests to upgrade the version from 8.8.2 to 8.9.0.

What I expect: Pods should restart in sequence to apply the new version, even if the ES cluster is in yellow state, using the pods listed under "if_yellow_only_restart_upgrading_nodes_with_unassigned_replicas".

What is the actual behaviour: The operator waits for the ES cluster to reach green state.

Environment

  • ECK version: 2.9.0+f24ccc37

  • Kubernetes information:

    • On premise: deployed with kubespray 2.22.1 on Debian 11 Linux
      Client Version: v1.24.6
      Kustomize Version: v4.5.4
      Server Version: v1.25.6
  • Logs:
{"log.level":"info","@timestamp":"2023-08-14T11:29:24.894Z","log.logger":"elasticsearch-controller","message":"Starting reconciliation run","service.version":"2.9.0+f24ccc37","service.type":"eck","ecs.version":"1.4.0","iteration":"90408","namespace":"cdn-bigdata","es_name":"sophie-eck"}
{"log.level":"info","@timestamp":"2023-08-14T11:29:24.901Z","log.logger":"elasticsearch-controller","message":"Updating resource","service.version":"2.9.0+f24ccc37","service.type":"eck","ecs.version":"1.4.0","iteration":"90408","namespace":"cdn-bigdata","es_name":"sophie-eck","kind":"Service","namespace":"cdn-bigdata","name":"sophie-eck-es-http"}
{"log.level":"info","@timestamp":"2023-08-14T11:29:25.184Z","log.logger":"elasticsearch-controller","message":"Ensuring no voting exclusions are set","service.version":"2.9.0+f24ccc37","service.type":"eck","ecs.version":"1.4.0","iteration":"90408","namespace":"cdn-bigdata","es_name":"sophie-eck","namespace":"cdn-bigdata","es_name":"sophie-eck"}
{"log.level":"info","@timestamp":"2023-08-14T11:29:25.505Z","log.logger":"elasticsearch-controller","message":"Cannot restart some nodes for upgrade at this time","service.version":"2.9.0+f24ccc37","service.type":"eck","ecs.version":"1.4.0","iteration":"90408","namespace":"cdn-bigdata","es_name":"sophie-eck","namespace":"cdn-bigdata","es_name":"sophie-eck","failed_predicates":{"data_tier_with_higher_priority_must_be_upgraded_first":["sophie-eck-es-data-hot-0","sophie-eck-es-data-hot-1","sophie-eck-es-data-hot-10","sophie-eck-es-data-hot-11","sophie-eck-es-data-hot-2","sophie-eck-es-data-hot-3","sophie-eck-es-data-hot-4","sophie-eck-es-data-hot-5","sophie-eck-es-data-hot-6","sophie-eck-es-data-hot-7","sophie-eck-es-data-hot-8","sophie-eck-es-data-hot-9","sophie-eck-es-data-warm-0","sophie-eck-es-data-warm-1","sophie-eck-es-data-warm-10","sophie-eck-es-data-warm-11","sophie-eck-es-data-warm-12","sophie-eck-es-data-warm-13","sophie-eck-es-data-warm-14","sophie-eck-es-data-warm-15","sophie-eck-es-data-warm-16","sophie-eck-es-data-warm-17","sophie-eck-es-data-warm-2","sophie-eck-es-data-warm-3","sophie-eck-es-data-warm-4","sophie-eck-es-data-warm-5","sophie-eck-es-data-warm-6","sophie-eck-es-data-warm-7","sophie-eck-es-data-warm-8","sophie-eck-es-data-warm-9"],"if_yellow_only_restart_upgrading_nodes_with_unassigned_replicas":["sophie-eck-es-data-cold-4","sophie-eck-es-master-0","sophie-eck-es-master-1","sophie-eck-es-master-2","sophie-eck-es-ml-0","sophie-eck-es-ml-1","sophie-eck-es-transforms-0","sophie-eck-es-transforms-1","sophie-eck-es-transforms-2"]}}
{"log.level":"info","@timestamp":"2023-08-14T11:29:25.506Z","log.logger":"elasticsearch-controller","message":"Ending reconciliation run","service.version":"2.9.0+f24ccc37","service.type":"eck","ecs.version":"1.4.0","iteration":"90408","namespace":"cdn-bigdata","es_name":"sophie-eck","took":0.611674986}
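The "Cannot restart some nodes for upgrade at this time" entry above is hard to read as a single line. A minimal sketch for summarizing it follows, assuming only what the log itself shows: `failed_predicates` is a JSON object mapping a predicate name to a list of pod names. The helper names are hypothetical, and the sample line is a trimmed copy of the entry above with the node lists shortened.

```python
import json
from collections import defaultdict

# Trimmed copy of the operator log entry above (node lists shortened for readability).
LOG_LINE = (
    '{"log.level":"info","message":"Cannot restart some nodes for upgrade at this time",'
    '"es_name":"sophie-eck","failed_predicates":{'
    '"data_tier_with_higher_priority_must_be_upgraded_first":'
    '["sophie-eck-es-data-hot-0","sophie-eck-es-data-warm-0"],'
    '"if_yellow_only_restart_upgrading_nodes_with_unassigned_replicas":'
    '["sophie-eck-es-master-0","sophie-eck-es-ml-0","sophie-eck-es-transforms-0"]}}'
)


def pods_by_predicate(line):
    """Hypothetical helper: map each failed predicate name to its sorted pod list."""
    entry = json.loads(line)
    return {name: sorted(pods) for name, pods in entry.get("failed_predicates", {}).items()}


def predicates_by_pod(line):
    """Hypothetical helper: invert the mapping, pod name -> predicates listing that pod."""
    result = defaultdict(list)
    for name, pods in pods_by_predicate(line).items():
        for pod in pods:
            result[pod].append(name)
    return dict(result)


if __name__ == "__main__":
    for predicate, pods in pods_by_predicate(LOG_LINE).items():
        print(f"{predicate}: {len(pods)} pod(s)")
        for pod in pods:
            print(f"  {pod}")
```

Running the same helpers over the full log line gives a per-predicate pod count, which makes it easier to see which predicate each pod is listed under.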

ccaillet1974 Aug 14 '23 11:08

Is there any relocating or initializing shards?

barkbay Aug 17 '23 07:08

Overall, yes. But, for example, the dedicated ML nodes and the dedicated transform nodes, which have no data roles, could be restarted, yet they are not.

I know that the cluster status is green when the unassigned and initializing shard counts are both 0.
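The check described here, together with the earlier question about relocating or initializing shards, can be sketched as follows. This assumes a dictionary shaped like the response of the Elasticsearch `GET _cluster/health` API, which reports `status`, `relocating_shards`, `initializing_shards` and `unassigned_shards`. The function name and the sample values are hypothetical and are not taken from this cluster.

```python
SHARD_COUNTERS = ("relocating_shards", "initializing_shards", "unassigned_shards")


def pending_shard_activity(health):
    """Hypothetical helper: return the shard counters that are non-zero."""
    return {key: health.get(key, 0) for key in SHARD_COUNTERS if health.get(key, 0) > 0}


if __name__ == "__main__":
    # Illustrative values only, not real output from the cluster in this issue.
    sample = {
        "status": "yellow",
        "relocating_shards": 0,
        "initializing_shards": 2,
        "unassigned_shards": 5,
    }
    print(sample["status"], pending_shard_activity(sample))
```

An empty result means no shards are relocating, initializing or unassigned in the supplied response.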

If I read the message for the "if_yellow_only_restart_upgrading_nodes_with_unassigned_replicas" rule correctly, those nodes have no unassigned shards and could be restarted, or at least sophie-eck-es-ml (dedicated ML, no data role), sophie-eck-es-transforms (dedicated transform role, no data role) and sophie-eck-es-master (dedicated master, no data role) could be.

ccaillet1974 Aug 22 '23 14:08