
[Help Needed] OpenSearch/Dashboards in Kubernetes Operator

Open peterzhuamazon opened this issue 2 years ago • 17 comments

[Help Needed] OpenSearch/Dashboards in Kubernetes Operator. We invite the community to help us add more distributions of OpenSearch and OpenSearch Dashboards. Helm charts for OpenSearch recently became available with the community's help, and we would love more community contributions.

peterzhuamazon avatar Jul 23 '21 17:07 peterzhuamazon

@peterzhuamazon Same point as #14

What operator? My interpretation is this means someone develops an operator that interacts with the Kubernetes API to deploy an instance of Opensearch.

Is this what we mean here?

DandyDeveloper avatar Jul 26 '21 07:07 DandyDeveloper

I guess @peterzhuamazon means Kubernetes Operator Support here. If that is the case we should change the issue title in my opinion.

TheAlgo avatar Jul 27 '21 07:07 TheAlgo

@TheAlgo What's the Kubernetes Operator? I don't know what that is. An operator is a piece of software that interacts with the Kubernetes API to perform CRUD operations on resources (maybe based on a CRD).

Can you link the Kubernetes Operator?

DandyDeveloper avatar Jul 27 '21 07:07 DandyDeveloper

@DandyDeveloper Never mind, I guess I misinterpreted your previous comment 😅. "Operator" is the same as "Kubernetes Operator"; it is just what some people call it, nothing different. Example. You can get more details about the discussion about the operator for OpenSearch here.

TheAlgo avatar Jul 27 '21 10:07 TheAlgo

@TheAlgo Your link on the second part is wrong

DandyDeveloper avatar Jul 27 '21 11:07 DandyDeveloper

@TheAlgo Your link on the second part is wrong

Updated the link 😄

TheAlgo avatar Jul 27 '21 13:07 TheAlgo

@TheAlgo Yeah, I agree with everything discussed in that thread. Operators are a fair bit of work though.

That being said, you could define a CRD and have an operator act on it, and the operator itself could be backed by a Helm chart, similar to how istioctl deploys Istio via the Istio Operator.

So effectively, you have an image with Helm baked into it, along with the charts we're deploying and the operator binary, and the values for the Helm chart are populated from the custom resource people define.

E.g.:

kind: OpensearchCluster
apiVersion: xyz
spec:
  clusterName: abc
  nodes:
  - name: data-1 
    type: data
  - name: master-1
    type: master
  - name: master-2
    type: master

Which could do something like:

helm template ./opensearch --set node.data.replicas=2 | kubectl create -f -

helm template ./os-dashboard --set abc=123 | kubectl create -f -

Not the perfect example, because you'd also want some form of state management / mechanism to follow resource deployment and state.

But this could be a good skeleton for what we want. CC: @peterzhuamazon @dblock
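
To make the CRD-to-Helm idea above a bit more concrete, here is a minimal Python sketch of the translation step only: it counts the nodes of each type in an OpensearchCluster-style resource and builds the matching `helm template` arguments. The `node.<type>.replicas` keys follow the `node.data.replicas` example above; the `clusterName` value key and both function names are assumptions made up for illustration, not something an existing chart or operator is known to expose.

```python
from __future__ import annotations

from collections import Counter


def helm_set_args(cluster: dict) -> list[str]:
    """Turn an OpensearchCluster-style resource into `--set` arguments."""
    spec = cluster["spec"]
    # Count how many nodes of each type (data, master, ...) were requested.
    counts = Counter(node["type"] for node in spec.get("nodes", []))
    # `clusterName` as a chart value is a hypothetical key.
    args = ["--set", f"clusterName={spec['clusterName']}"]
    for node_type in sorted(counts):
        args += ["--set", f"node.{node_type}.replicas={counts[node_type]}"]
    return args


def helm_template_command(chart_path: str, cluster: dict) -> list[str]:
    """Build the argv for `helm template`; running it is left to the caller."""
    return ["helm", "template", chart_path, *helm_set_args(cluster)]


cluster = {
    "kind": "OpensearchCluster",
    "apiVersion": "xyz",
    "spec": {
        "clusterName": "abc",
        "nodes": [
            {"name": "data-1", "type": "data"},
            {"name": "master-1", "type": "master"},
            {"name": "master-2", "type": "master"},
        ],
    },
}

command = helm_template_command("./opensearch", cluster)
print(" ".join(command))
```

The sketch stops at building the command on purpose. Piping the rendered manifests into `kubectl` and tracking what happens afterwards is the state-management part that still needs a real design.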

DandyDeveloper avatar Jul 27 '21 13:07 DandyDeveloper

Yes @DandyDeveloper, it is about Kubernetes Operators, as @TheAlgo pointed out.

peterzhuamazon avatar Jul 27 '21 20:07 peterzhuamazon

as pointed out in the forum, the best solution would probably be if zalando would move zalando-incubator/es-operator to the Apache 2.0 license; then we could build on that. this is already a full-fledged, working operator. it's just not compatible with OpenSearch. @jkowall had contacted them in zalando-incubator/es-operator#169 but it seems that they haven't replied since May. i've just updated the ticket again. note that the zalando operator has one big caveat which we'd have to solve if we build on it:

"The operator does not manage Elasticsearch master nodes." obviously we also want the operator to manage the master nodes!

elastic also has ECK (obviously under the wrong license; and this has never been Apache 2.0 in the past, so we can't just fork that): elastic/cloud-on-k8s, which also works very well (but is of course only compatible with ES with x-pack).

i have no experience with implementing an operator, but i'm not sure if "just" using helm charts in the background is enough. an operator must handle many things, incl. running rolling upgrades (while respecting things like PDBs, handling shard allocation, etc.), managing the health of nodes, handling TLS certificates (i believe ECK acts as an intermediate CA and issues certificates for the nodes so that each node has its own set of certificates), handling the configuration, etc. and of course it'd be great if it could also handle dynamic load-based runtime scaling (this is e.g. what the zalando operator can do; i'm not aware of ECK having this feature).
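
As a rough illustration of the rolling-upgrade concern above, here is a toy Python sketch of the gating logic: restart at most `max_unavailable` nodes at a time, and stop as soon as a health check fails. Everything in it is an assumption for illustration. In a real operator the restart would go through the Kubernetes API (which is where PDBs are enforced) and the health check would query the cluster itself; here both are plain callbacks.

```python
from typing import Callable, Iterable, List


def rolling_restart(
    nodes: Iterable[str],
    max_unavailable: int,
    cluster_is_healthy: Callable[[], bool],
    restart: Callable[[str], None],
) -> List[List[str]]:
    """Restart nodes in batches of at most `max_unavailable`.

    The health check runs before every batch; if it fails, the rollout
    stops and the remaining nodes are left untouched.
    Returns the batches that were actually restarted.
    """
    if max_unavailable < 1:
        raise ValueError("max_unavailable must be at least 1")
    pending = list(nodes)
    done: List[List[str]] = []
    while pending:
        if not cluster_is_healthy():
            break
        batch, pending = pending[:max_unavailable], pending[max_unavailable:]
        for node in batch:
            restart(node)
        done.append(batch)
    return done


restarted: List[str] = []
batches = rolling_restart(
    ["data-1", "data-2", "data-3"],
    max_unavailable=1,
    # Stand-in health check: pretend the cluster degrades after two restarts.
    cluster_is_healthy=lambda: len(restarted) < 2,
    restart=restarted.append,
)
print(batches)
```

In this run the third node is never restarted because the stand-in health check fails first, which is the behaviour you would want from a rollout that notices the cluster is no longer healthy.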

rursprung avatar Jul 28 '21 07:07 rursprung

but i'm not sure if "just" using helm charts in the background is enough

@rursprung It's not enough, it's nowhere near enough, but it's a basis for having versioned releases and then within the operator you can manage state and resource status.

Everything you've mentioned is great but also I'd argue that's the "final" look of the operator. You don't need all of that to start using the operator. Having a v1 beta that does "most" of the deployment and lifecycle of resources would be a good starting point.

But yes, you're definitely right. If Zalando opened up the license so that the operator could be forked (or developed alongside?) and then adjusted to suit OpenSearch, it'd be a massive time saver.
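
For what "manage state and resource status" could look like in a first version, here is a minimal Python sketch of a reconcile step: compare the nodes a resource asks for with the nodes that currently exist and emit the actions needed to close the gap. The data shapes (node name mapped to node type) and the action names are hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Tuple


def reconcile(desired: Dict[str, str], observed: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (action, node_name) pairs that move `observed` toward `desired`.

    Both arguments map a node name to its type, e.g. {"data-1": "data"}.
    """
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != desired[name]:
            # Same name but a different type: treat it as a replacement.
            actions.append(("replace", name))
    for name in sorted(observed):
        if name not in desired:
            actions.append(("delete", name))
    return actions


desired = {"data-1": "data", "master-1": "master", "master-2": "master"}
observed = {"data-1": "data", "master-1": "data", "old-1": "data"}

actions = reconcile(desired, observed)
print(actions)
```

Calling `reconcile` again once the actions have been applied returns an empty list, which is the property that lets an operator run the same loop repeatedly without side effects.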

DandyDeveloper avatar Jul 28 '21 08:07 DandyDeveloper

May I ask why the MIT license isn't OK for starting an operator based on Zalando's one? Why is Apache 2.0 mandatory?

madalinignisca avatar Dec 21 '21 22:12 madalinignisca

There's also one in https://github.com/openshift/elasticsearch-operator

davidkarlsen avatar Dec 22 '21 19:12 davidkarlsen

There is also a new operator as mentioned here: https://discuss.opendistrocommunity.dev/t/kubernetes-operator-support-for-the-fork/5267/26

I think this is the operator: https://github.com/Opster/opensearch-k8s-operator

rpahli avatar Dec 23 '21 08:12 rpahli

and it seems that there's the next one: https://discuss.opendistrocommunity.dev/t/opensearch-kubernetes-operator/8106 which can be found here: https://github.com/rancher-sandbox/opni-opensearch-operator

rursprung avatar Dec 23 '21 15:12 rursprung

Sorry for the lack of updates. The Zalando team did talk to me about it and were working on changing the license. They got the approval. I set up a conversation between the AWS PMs and the folks at Zalando, and the AWS team is working on how we would bring this in. The Opster operator is also in its early stages. The Zalando operator is more advanced IMO, but still missing a lot of what we would build into an operator.

Opni looks basic too, but it is not Apache licensed.

jkowall avatar Dec 29 '21 01:12 jkowall

May I ask why the MIT license isn't OK for starting an operator based on Zalando's one? Why is Apache 2.0 mandatory?

it'd be good if all projects around opensearch would have the same license - this'll lessen the headaches when dealing with the whole license discussion (you only need to have it once, PR templates look the same, etc.). for some people it might also be ideological. (most of us probably don't have any issue using an MIT licensed software)

The Zalando operator is more advanced IMO, but still missing a lot of what we would build into an operator.

the most important lacking feature is probably the fact that it doesn't support managing master nodes

Opni looks basic too, but it is not Apache licensed.

are you sure? their LICENSE is clearly an Apache 2.0 license.

rursprung avatar Dec 29 '21 08:12 rursprung

We have been working closely with Opster team regarding k8s operator for OpenSearch. https://github.com/Opster/opensearch-k8s-operator/

Helm Chart link for the operator: https://artifacthub.io/packages/helm/opensearch-operator/opensearch-operator

Please try the operator and share your feedback. The operator now supports both the 1.x and 2.x versions of OpenSearch (OS) and OpenSearch Dashboards (OSD). Thank you.

@idanl21 @dbason @swoehrl-mw @segalziv

prudhvigodithi avatar Aug 02 '22 19:08 prudhvigodithi