[Feature][Registry] Remove dependency on ZooKeeper when deployed in K8S

Open EricGao888 opened this issue 2 years ago • 4 comments

Search before asking

  • [X] I had searched in the issues and found no similar feature requirement.

Description

  • Add a new registry plugin for DS so that DS no longer depends on ZooKeeper / MySQL / etcd for service discovery and distributed locking when deployed in K8S.
  • Service Events - K8S Watch - We may use the Fabric8 Kubernetes Java Client Informer API for better stability and less pressure on the K8S API server.
  • HeartBeat - Instead of making services (masters / workers) persist their heartbeats, we could try to use kubectl top to query the related metrics. However, DS uses the heartbeat to transmit some server runtime data besides metrics, so a better way to transmit it may be to put the heartbeat data into pod environment variables / files. Alternatively, we could create a ConfigMap for each master / worker to maintain the heartbeat data. Reading a file from a pod with the Fabric8 client looks like this:
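// Read the heartbeat file /msg from the pod; both streams are closed by try-with-resources.
try (InputStream is = client.pods().inNamespace(currentNamespace).withName(pod1.getMetadata().getName()).file("/msg").read();
     BufferedReader reader = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
  String result = reader.lines().collect(Collectors.joining("\n"));
}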
  • Service Discovery - K8S Pod Selector / K8S Endpoints Controller
  • Master Fault Tolerance (Distributed Lock) - K8S Resource Annotation - Each master attempts to write its pod IP into the metadata.annotations of a resource, and the one that succeeds becomes the leader and performs failover operations. - Follow-up: The Fabric8 Kubernetes Java client already implements leader election APIs based on a resource lock (Lease, ConfigMap, etc.), so we do not need to care much about the implementation details of this distributed lock. We would make it configurable so that users can choose between Lease and ConfigMap for the resource lock.
  • Two Java K8S client examples of leader election: one with kubernetes-client and one with the Fabric8 kubernetes-client.
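
To make the annotation-based lock idea above concrete, here is a minimal in-memory sketch of why only one master can win. It relies on the fact that Kubernetes rejects an update made with a stale resourceVersion (a 409 Conflict). Everything in it is hypothetical and for illustration only: AnnotatedResource stands in for a real K8S resource, LEADER_KEY is a made-up annotation key, and none of this is the Fabric8 API, whose built-in leader election would likely be used in the actual plugin.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Immutable view of a resource as a master would see it after a GET (hypothetical). */
class Snapshot {
    final long version;
    final Map<String, String> annotations;

    Snapshot(long version, Map<String, String> annotations) {
        this.version = version;
        this.annotations = annotations;
    }
}

/** In-memory stand-in for a K8S resource with metadata.annotations (hypothetical). */
class AnnotatedResource {
    private long resourceVersion = 0;
    private final Map<String, String> annotations = new HashMap<>();

    /** Returns a copy of the current state together with its version. */
    synchronized Snapshot read() {
        return new Snapshot(resourceVersion, new HashMap<>(annotations));
    }

    /** Applies the write only if the caller saw the latest version; a stale write is rejected. */
    synchronized boolean replace(long expectedVersion, Map<String, String> newAnnotations) {
        if (expectedVersion != resourceVersion) {
            return false; // models a 409 Conflict from the API server
        }
        annotations.clear();
        annotations.putAll(newAnnotations);
        resourceVersion++;
        return true;
    }
}

class LeaderElectionSketch {
    // Hypothetical annotation key, not defined by DolphinScheduler or Kubernetes.
    static final String LEADER_KEY = "dolphinscheduler.example/leader-ip";

    /** Tries to record podIp as leader; returns true only for the master whose write lands first. */
    static boolean tryBecomeLeader(AnnotatedResource resource, String podIp) {
        Snapshot seen = resource.read();
        if (seen.annotations.containsKey(LEADER_KEY)) {
            return false; // another master already holds the lock
        }
        Map<String, String> updated = new HashMap<>(seen.annotations);
        updated.put(LEADER_KEY, podIp);
        return resource.replace(seen.version, updated);
    }

    /** Returns the pod IP of the current leader, if any. */
    static Optional<String> currentLeader(AnnotatedResource resource) {
        return Optional.ofNullable(resource.read().annotations.get(LEADER_KEY));
    }
}
```

Calling LeaderElectionSketch.tryBecomeLeader(resource, podIp) from several masters leaves exactly one of them as leader. The sketch deliberately omits lease renewal and expiry, which a real lock needs so that a crashed leader can be replaced; that is the part the Lease / ConfigMap resource locks mentioned above take care of.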

Design

image

Security

  • Use Kubernetes RBAC for security. Users can control which Kubernetes APIs the workers are permitted to access.

Use case

Already described above.

Related issues

None

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

EricGao888 • Jan 28 '23 07:01