elasticsearch-operator

zone awareness

Open stevesloka opened this issue 7 years ago • 10 comments

Need to implement zone awareness so that primary & replica shards are not all scheduled into the same zone: https://www.elastic.co/guide/en/elasticsearch/reference/current/allocation-awareness.html

stevesloka avatar Aug 23 '17 13:08 stevesloka

Reference: https://github.com/pires/docker-elasticsearch/pull/36

stevesloka avatar Aug 23 '17 13:08 stevesloka

Yeah, this would be a major plus for ES operator (and something that sort of prevents us from using ES operator over at @sematext). Is this something that you might be adding in the next few weeks?

I see you referenced pires/docker-elasticsearch#36, but note there is also https://github.com/pires/kubernetes-elasticsearch-cluster, which seems to have Pod anti-affinity. Not the same as zone awareness, but conceptually similar.

otisg avatar Nov 11 '17 23:11 otisg

Let me see what I can do. 😀

stevesloka avatar Nov 13 '17 15:11 stevesloka

The definition of what a zone is must come from the user, as it depends on their deployment environment and topology. I would suggest passing the shard awareness settings through from the CRD to Elasticsearch to keep it generic.
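As a minimal sketch of what such a pass-through could look like (not the operator's actual code): the user supplies an attribute name and a per-node value, and these are turned into Elasticsearch settings unchanged. The `node.attr.*` and `cluster.routing.allocation.awareness.*` keys are real Elasticsearch settings, assuming Elasticsearch 5.x or later; the function names and argument shapes are hypothetical.

```python
def awareness_settings(attribute, value, forced_values=None):
    """Build Elasticsearch settings for shard allocation awareness.

    `attribute` is whatever the user calls their failure domain
    (zone, rack, ...) and `value` is this node's value for it.
    Both are passed through as-is, so the caller never needs to
    know what a "zone" means in a given environment.
    """
    if not attribute or not value:
        raise ValueError("attribute and value must both be non-empty")

    settings = {
        # Tag this node with its failure-domain value.
        f"node.attr.{attribute}": value,
        # Tell the cluster to take that attribute into account.
        "cluster.routing.allocation.awareness.attributes": attribute,
    }
    if forced_values:
        # Forced awareness: list every expected value of the attribute.
        key = f"cluster.routing.allocation.awareness.force.{attribute}.values"
        settings[key] = ",".join(forced_values)
    return settings


def to_yaml_lines(settings):
    """Render the flat settings as elasticsearch.yml-style lines."""
    return [f"{key}: {val}" for key, val in settings.items()]


if __name__ == "__main__":
    # Example values are illustrative only.
    for line in to_yaml_lines(awareness_settings("zone", "us-east-1a")):
        print(line)
```

Because the attribute name is just a string, the same code would serve a rack, a host, or a cloud availability zone.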

> I see you referenced pires/docker-elasticsearch#36, but note there is also https://github.com/pires/kubernetes-elasticsearch-cluster, which seems to have Pod anti-affinity. Not the same as zone awareness, but conceptually similar.

Pod anti-affinity is very different from awareness in ES. Awareness applies to shards (it is node agnostic), whereas pod anti-affinity is about not running more than one node on a particular host.

djschny avatar Nov 13 '17 15:11 djschny

That's a good point, @djschny. Right now we have a zone feature in the CRD, but that applies to an AWS environment.

@otisg, are you deploying to AWS? If not, we will need some work to make the storage class piece more generic for non-AWS environments first.

stevesloka avatar Nov 13 '17 15:11 stevesloka

@stevesloka Yes, to AWS (EC2+EBS). Are you saying ES Operator already supports this?

otisg avatar Nov 13 '17 18:11 otisg

Right now it will distribute the data nodes across the zones, but nothing forces Elasticsearch to distribute replicas correctly across those zones. That was the intent of this issue: to give Elasticsearch knowledge of the zones you're deployed to so it can do that extra bit of work.

stevesloka avatar Nov 14 '17 03:11 stevesloka

Is this still an ongoing effort? This feature would be very interesting!

vroudge avatar Feb 06 '19 00:02 vroudge

This is definitely something that should be implemented. A workaround is to set the number of replicas of each shard to n+1, where n is the number of data nodes in each failure domain. This is not ideal: it requires a manual update whenever the cluster is scaled up, it may increase load from rebalancing in the event of a zone failure or partition, and it still leaves some vulnerability, since all copies but one can end up in the same zone. Ideally this should be handled transparently to the user. As the data nodes in each zone are managed by a StatefulSet, it would be relatively simple to pass the zone as the failure domain when starting Elasticsearch.
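The arithmetic behind the workaround can be sketched as follows. The only assumption is that Elasticsearch never places two copies of the same shard on one node, so a zone can hold at most as many copies of a shard as it has data nodes; the function name is made up for illustration.

```python
def copies_guaranteed_outside_zone(replicas, nodes_in_largest_zone):
    """Worst-case number of copies of one shard that must live outside
    the largest zone, given `replicas` replicas per shard.

    A shard has one primary plus `replicas` replicas. Since no node
    holds two copies of the same shard, the largest zone can absorb
    at most `nodes_in_largest_zone` of them.
    """
    if replicas < 0 or nodes_in_largest_zone < 0:
        raise ValueError("counts must be non-negative")
    copies = replicas + 1  # primary plus replicas
    return max(0, copies - nodes_in_largest_zone)


if __name__ == "__main__":
    # Illustrative values: 3 data nodes per zone, varying replica counts.
    for replicas in range(0, 5):
        print(replicas, copies_guaranteed_outside_zone(replicas, 3))
```

Without awareness, a copy is guaranteed to land in another zone only once the number of copies exceeds the node count of the largest zone, which is why the replica count has to be revisited every time a zone gains nodes.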

jacobreid avatar Mar 26 '19 14:03 jacobreid

I noticed that the Docker image already has support for shard allocation awareness (https://github.com/while1eq1/elasticsearch-kubernetes-searchguard/blob/master/run.sh#L39), although it is not implemented in the best way. It uses an environment variable, SHARD_ALLOCATION_AWARENESS_ATTR, which is a path to a file that contains the attribute. That is intended for a system hostname or similar, and is not ideal for setting the attribute at a higher level such as an EC2 availability zone.
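One possible way around that limitation, shown only as a sketch: accept a literal value first and fall back to the existing file-based behavior. `SHARD_ALLOCATION_AWARENESS_ATTR` is the variable described above; `SHARD_ALLOCATION_AWARENESS_VALUE` and the function name are hypothetical and do not exist in the image.

```python
import os


def resolve_awareness_value(env=None):
    """Return the awareness attribute value for this node, or None.

    Order of preference:
      1. SHARD_ALLOCATION_AWARENESS_VALUE (hypothetical): a literal
         value, for example an availability zone name.
      2. SHARD_ALLOCATION_AWARENESS_ATTR: a path to a file whose
         contents are the value, as the thread describes.
    """
    env = os.environ if env is None else env

    literal = env.get("SHARD_ALLOCATION_AWARENESS_VALUE")
    if literal and literal.strip():
        return literal.strip()

    path = env.get("SHARD_ALLOCATION_AWARENESS_ATTR")
    if path:
        with open(path, encoding="utf-8") as fh:
            # Files such as /etc/hostname end with a newline.
            return fh.read().strip()

    return None
```

With something like this, each per-zone StatefulSet could set the literal variable to its zone name, while existing deployments that rely on the file path would keep working.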

jacobreid avatar Mar 26 '19 15:03 jacobreid