
v2 proposal


Motivation

The current definition of objects like nodes, clients and addons (kibana or cerebro) doesn't allow customising specific objects. For instance, the resources field is applied to masters, data nodes and clients alike, yet clients don't need the same amount of CPU and memory as masters; other examples are java-options, volume-size and regions. Besides that, it is not possible to customise labels, affinity, annotations and tolerations.

Benefits

  • Refactoring the clusterSpec definition so that statefulset objects (master and data nodes) and deployments (client nodes, kibana and cerebro) each get their own spec will make it easier for operators to customize objects.
  • Follow the k8s API spec definitions for Affinity, Resources, VolumeClaimTemplate, Tolerations and ImagePullSecrets.

Spec definition


...
import (
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

...
type NodeSpec struct {

	Annotations map[string]string `json:"annotations,omitempty"`

	Affinity *v1.Affinity `json:"affinity,omitempty"`  // use k8s api object

	Image string `json:"image"`

	JavaOptions string `json:"javaOptions"`

	NetworkHost string `json:"networkHost"`

	NodePort int32 `json:"nodePort"`

	NodeSelector map[string]string `json:"nodeSelector,omitempty"`

	Replicas int32 `json:"replicas"`

	Resources v1.ResourceRequirements `json:"resources,omitempty"` // use k8s api object

	ServiceAccountName string `json:"serviceAccountName,omitempty"`

	VolumeClaimTemplate v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty"` // previous object was storage

	Tolerations []v1.Toleration `json:"tolerations,omitempty"`  // use k8s api object

	Zones []string `json:"zones,omitempty"`

}

// ClusterSpec defines cluster options
type ClusterSpec struct {

	Snapshot Snapshot `json:"snapshot"`

	ImagePullSecrets string `json:"imagePullSecrets,omitempty"` // use k8s api object

	Instrumentation Instrumentation `json:"instrumentation"`

	Scheduler Scheduler

	KeepSecretsOnDelete bool `json:"keepSecretsOnDelete"`

	UseSSL bool `json:"useSsl"`

	KibanaSpec NodeSpec `json:"kibanaSpec"` // BEGIN ** use the generic spec for all these objects

	CerebroSpec NodeSpec `json:"cerebroSpec"`

	MasterSpec NodeSpec `json:"masterSpec"`

	ClientSpec NodeSpec `json:"clientSpec"`

	DataSpec NodeSpec `json:"dataSpec"` //END
}
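
For illustration, a cluster manifest using the proposed spec could look roughly like this (a hypothetical sketch: the apiVersion/kind are inferred from the existing CRD names, and all values are examples only):

apiVersion: enterprises.upmc.com/v1
kind: ElasticsearchCluster
metadata:
  name: example-es
spec:
  useSsl: true
  masterSpec:
    replicas: 3
    javaOptions: "-Xms1g -Xmx1g"
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
  clientSpec:
    replicas: 2
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
  dataSpec:
    replicas: 3
    volumeClaimTemplate:
      spec:
        storageClassName: my-ssd-class
        resources:
          requests:
            storage: 30Gi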

Nice to have (not mandatory to launch v2)

Support cluster join between v1 and v2

Create statefulset objects with a new name pattern, using the same service discovery; this allows the new v2 nodes to join the v1 cluster, syncing indexes and shards. It will require a manual action to delete the old statefulsets.

Deploy fluentd daemonset

Create a fluentd daemonset to automatically push logs from nodes to ES. In my clusters I use helm to deploy fluentd, but it would be even better if the operator deployed it.
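
A minimal sketch of such a daemonset (purely illustrative; the image, namespace and the elasticsearch-client service name are assumptions, not part of this proposal):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging                # assumed namespace
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.3-debian-elasticsearch   # assumed image
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: elasticsearch-client   # assumed client service name
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log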

Note: feedback is welcome.

#163

gianrubio avatar Apr 25 '18 17:04 gianrubio

Should we split objects out? So create a master CRD and client CRD? Not sure if that is too verbose, just a pattern I've seen in other projects.

stevesloka avatar May 13 '18 14:05 stevesloka

Yes, it makes sense to me to split them out. Should we split only master and client, or all the objects?

Ex.

Giancarlos-MBPro:Downloads grubio$ kubectl  get crd
NAME                                          AGE
master.elasticsearchclusters.enterprises.upmc.com    41d
client.elasticsearchclusters.enterprises.upmc.com    41d
kibana.elasticsearchclusters.enterprises.upmc.com    41d
cerebro.elasticsearchclusters.enterprises.upmc.com    41d
fluentd.elasticsearchclusters.enterprises.upmc.com    41d

gianrubio avatar May 14 '18 09:05 gianrubio

Hi, would it be possible to split the components external to the Elasticsearch server (Kibana/Cerebro) into separate CRDs, but not split the core Elasticsearch components master/client/data?

My logic here is that an Elasticsearch cluster needs to be treated as a single entity; separate master/data CRDs won't tell much about the cluster.

/cc @jcantrill @portante @richm @ewolinetz

t0ffel avatar May 18 '18 13:05 t0ffel

I think we should not split them out into separate resources but keep them together like you have in the current spec. Potentially we could do that with snapshots, which might make things a bit cleaner, but for now that doesn't buy us much.

  • For NodeSelector & Zones, should we combine those into the same struct? They sort of have a similar effect by spreading work out across nodes.
  • UseSSL I think should be global to the cluster, not tied to a NodeSpec. You shouldn't be able to turn on SSL for master nodes and not client nodes.
  • We should define the default values in the spec so we're clear on what happens if you don't define Affinity or JavaOptions, etc.
  • What is the Annotations field for?
  • I don't think ImagePullSecrets should be an array; would someone define multiple in this spec?
  • Should ServiceAccountName be applied to the cluster, not the nodes?
  • Not sure I understand the VolumeClaimTemplate; the Operator creates the statefulsets, which then generate the claims.

stevesloka avatar May 23 '18 13:05 stevesloka

@t0ffel How does this jive with your proposed operator changes? We should either actively contribute to this proposal now or start pushing the ideas we have elsewhere.

jcantrill avatar May 23 '18 14:05 jcantrill

To be more clear @jcantrill, what I meant by "not splitting out" is having a separate CRD for each component of the cluster.

stevesloka avatar May 23 '18 14:05 stevesloka

@stevesloka I wasn't really questioning your suggestions but more advising @t0ffel that we (the OpenShift team) need to determine if the direction of this operator meets our needs and whether we should participate here or go forth on our own. I don't know if there is enough detail in this issue to make that determination, but we have some aggressive timelines we wish to meet and I'd like to see things progress... faster.

jcantrill avatar May 23 '18 14:05 jcantrill

Sure thing @jcantrill, I just reread my earlier comments and thought they weren't super clear. =)

stevesloka avatar May 23 '18 14:05 stevesloka

For NodeSelector & Zones should we combine those into the same struct? They sort of have a similar effect by spreading work out across nodes

Agree. I'd propose to use k8s nomenclature, so rather than the custom zones name we use NodeSelector, wdyt?
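
For example, instead of zones: ["eu-west-1a"], a NodeSpec could carry the same intent through the standard zone label (a sketch; the exact label key depends on the Kubernetes version and cloud provider, and the zone value is just an example):

  nodeSelector:
    failure-domain.beta.kubernetes.io/zone: eu-west-1a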

UseSSL I think should be global to the cluster, not tied to a NodeSpec. You shouldn't be able to turn on SSL for master nodes and not client nodes.

Agree, in my proposal it's part of ClusterSpec.

We should define the default values in the spec so we're clear on what happens if you don't define Affinity or JavaOptions, etc

Agree, I'm going to write down what the defaults are.

What is the Annotations field for?

For instance, when you use kube-iam you must specify annotations for the pod to assume IAM roles; it's required for master and data nodes but not for client nodes. The cronjob for snapshots also requires this annotation, while kibana and cerebro don't need it.
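
For example, the per-NodeSpec annotations would end up on the pod template like this (the annotation key is the one used by kube2iam/kiam; the role name is hypothetical):

  annotations:
    iam.amazonaws.com/role: es-snapshot-role   # hypothetical IAM role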

ImagePullSecrets I don't think should be an array, would someone define multiple in this spec?

Agree, it's fixed in the proposal

Should ServiceAccountName be applied to the cluster, not the nodes?

No, it's the same case as the annotations. Masters, data nodes and clients need to connect to the k8s API for the discovery service, but kibana and cerebro shouldn't be allowed to talk to the k8s API.

Not sure I understand the VolumeClaimTemplate, the Operator creates the statefulsets which then generate the claims.

It's an elegant way to use a k8s object to specify the volume and to allow users to customise the volume per NodeSpec.

The current volume definition is:

....
  data-volume-size: 10Gi
  storage:
    type: gp2
    storage-class-provisioner: kubernetes.io/aws-ebs 

With the new proposal, users will have something like this:

  storage:
    volumeClaimTemplate:
      spec:
        storageClassName:  my-ssd-class
        resources:
          requests:
            storage: 30Gi

https://v1-6.docs.kubernetes.io/docs/api-reference/v1.6/#persistentvolumeclaim-v1-core
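
The operator would then copy that template into the volumeClaimTemplates of the statefulset it generates, roughly like this (illustrative rendering, not actual operator output; the claim name es-data is hypothetical):

apiVersion: apps/v1
kind: StatefulSet
spec:
  ...
  volumeClaimTemplates:
  - metadata:
      name: es-data                  # hypothetical claim name
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: my-ssd-class
      resources:
        requests:
          storage: 30Gi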

gianrubio avatar Jun 01 '18 12:06 gianrubio

From my point of view the data statefulset should be per zone. Combined with cluster.routing.allocation.awareness we could upgrade one zone to a newer version and roll back without losing data.
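
For reference, that would mean each node advertises its zone as a node attribute and the cluster enables awareness on it, roughly like this in elasticsearch.yml (a sketch; the attribute name zone and its value are examples):

node.attr.zone: eu-west-1a
cluster.routing.allocation.awareness.attributes: zone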

runningman84 avatar Nov 21 '18 10:11 runningman84

Need VolumeClaimTemplate for bare metal. Is any progress available?

dimm0 avatar Feb 02 '19 00:02 dimm0

How can I help implementing this?

splisson-altair avatar Feb 18 '19 23:02 splisson-altair

Also happy to help. Is there a branch for v2 we can start developing on? Not being able to use kube-iam is bad for me.

cortopy avatar Feb 28 '19 13:02 cortopy

I think we should get a design down in a google doc, then figure out where to go from there. I made one here (https://docs.google.com/document/d/1eigzZ3pcNElItsSk6xoWcDDRHvJzseMMcX8r_YT6XNk/edit?usp=sharing). If someone wants to draft up the design started above, I think we can then lock the doc and review together with comments.

// cc @while1eq1

stevesloka avatar Mar 01 '19 15:03 stevesloka

I started filling in the doc using elements from the discussion that @gianrubio had posted here.

splisson-altair avatar Mar 04 '19 19:03 splisson-altair

OpenShift is close to releasing our first version of an elasticsearch-operator [1] for our pending 4.0 drop. It is currently coupled specifically to OpenShift cluster logging use cases but is intended to become more general purpose.

[1] https://github.com/openshift/elasticsearch-operator

jcantrill avatar Mar 05 '19 16:03 jcantrill