
Better support exposing the Alluxio service outside the K8s cluster

Open jiacheliu3 opened this issue 2 years ago • 6 comments

Is your feature request related to a problem? Please describe. It's becoming more common that users want to host the Alluxio service on K8s while some external applications need to access the Alluxio cluster from outside the K8s cluster.

In the current state, the users need to:

  1. Change the master K8s Services to be accessible from outside the K8s cluster, typically by switching the Service type to NodePort or adding an Ingress (see the sketch after this list).
  2. Somehow expose the worker pods externally. This is much harder than 1 because worker pods are dynamic and do not have associated Services. One way is to set hostNetwork=true for all workers, so clients then talk to the worker nodes directly.
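
For step 1, a minimal sketch of a NodePort Service for the master RPC port; the Service name and selector labels are assumptions and must match how the master pods are actually labeled (e.g. by the Alluxio Helm chart):

apiVersion: v1
kind: Service
metadata:
  name: alluxio-master-external
spec:
  type: NodePort
  selector:
    # assumed labels; match whatever labels the master pods actually carry
    app: alluxio
    role: alluxio-master
  ports:
  - name: rpc
    protocol: TCP
    port: 19998        # Alluxio master RPC port (default 19998)
    targetPort: 19998
    nodePort: 31998    # any free port in the cluster's NodePort range (30000-32767 by default)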

Describe the solution you'd like We need one solution for:

  1. Enabling master pods to be accessible from outside
  2. Enabling worker pods to be accessible from outside
  3. Ideally, use only one switch to control all of the above (a purely hypothetical sketch of such a switch follows this list).
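
For illustration only, the single switch could be a Helm values toggle. None of the keys below exist in the current chart; they are assumptions about what such a toggle might look like:

# hypothetical values.yaml fragment -- these keys are NOT in the current chart
externalAccess:
  enabled: true          # one switch that exposes both masters and workers
  serviceType: NodePort  # how the master/worker Services are exposed externally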

The biggest challenge is the worker pods. A combination of StatefulSet-deployed workers plus a per-worker NodePort Service with externalTrafficPolicy: Local can be a solution. Each Service selects its worker pod by name, which becomes deterministic because the workers are now deployed with a StatefulSet. For example:

apiVersion: v1
kind: Service
metadata:
  name: worker-0
spec:
  # NodePort exposes the Service on a port of every node; externalTrafficPolicy: Local
  # only routes traffic that arrives on the node actually running worker-0
  type: NodePort
  externalTrafficPolicy: Local
  selector:
    # label added automatically by the StatefulSet controller, so the mapping is deterministic
    statefulset.kubernetes.io/pod-name: worker-0
  ports:
  - protocol: TCP
    port: 29999       # Alluxio worker RPC port (default 29999)
    targetPort: 29999

The worker pods now need pod anti-affinity defined, so that no two worker pods land on the same node. For example:
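
A sketch of the corresponding anti-affinity in the worker StatefulSet pod template; the label value is an assumption and should match the labels actually applied to the worker pods:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          role: alluxio-worker       # assumed label; must match the worker pods' labels
      topologyKey: kubernetes.io/hostname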

The master pods can be exposed similarly.

Describe alternatives you've considered Use hostNetwork to deploy all master and worker pods and access the Alluxio pods by node IP. This is the cleanest way as of Alluxio v2.8. The challenge is that hostNetwork requires admin privileges and may cause port collisions with other services on the node.
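
For reference, a minimal sketch of the relevant pod spec fields for this alternative:

spec:
  hostNetwork: true
  # keep in-cluster DNS (Service names) resolvable even when using the host network
  dnsPolicy: ClusterFirstWithHostNet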

Urgency MEDIUM. There are existing use cases for this setup.


jiacheliu3 avatar Jun 28 '22 10:06 jiacheliu3

@ZhuTopher @ssz1997 for visibility

jiacheliu3 avatar Jun 28 '22 10:06 jiacheliu3

I agree with the proposal to switch Alluxio workers to use a StatefulSet. It seems that they are not as stateless/idempotent as we'd thought. Furthermore, the solution we use to expose Masters can then be leveraged to expose Workers as well.

The main difficulty we currently have with exposing Alluxio to clients outside of k8s is that Workers register to the Master using their local hostname, which may not be resolvable to clients outside of k8s. I had previously proposed elsewhere that a possible solution would be through CoreDNS plugins:

  • CoreDNS is configurable via a k8s ConfigMap
  • Add the k8s_external plugin to allow assigning external IPs to k8s Services, and point external clients to CoreDNS as the authoritative nameserver for that domain (a Corefile sketch follows this list)
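
A rough sketch of what that could look like in the CoreDNS ConfigMap; the alluxio.example.org zone is a placeholder and the rest mirrors a typical default CoreDNS config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        # resolve Service external IPs under this zone for clients outside the cluster
        k8s_external alluxio.example.org
        forward . /etc/resolv.conf
        cache 30
    }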

ZhuTopher avatar Jul 15 '22 00:07 ZhuTopher

@ZhuTopher The worker has alluxio.worker.hostname and alluxio.worker.container.hostname just to pass both the pod and node IP to the master (and then to the client). So the client will connect to the pod IP if there is one. I'm just trying to say that the worker already reports those IPs to the master as much as it can, so you may change the client logic as you need. Just a thought before I look into your proposal, hope that helps.

However, if the worker pod is not visible via either the pod or host IP (no host port opened for it), then we need a new mechanism.

jiacheliu3 avatar Jul 15 '22 02:07 jiacheliu3

Oh that's right, I don't remember if this is already the case or not but we'd want to set the following in our helm chart: alluxio.worker.hostname=status.hostIP and alluxio.worker.container.hostname=status.podIP

  • K8s doc ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#podstatus-v1-core
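
A sketch of one way to wire this up in the worker pod template, using the Downward API and passing the values through ALLUXIO_JAVA_OPTS; the env var names and the exact plumbing here are assumptions, not the current chart's behavior:

env:
  - name: HOST_IP                  # node IP via the Downward API
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: POD_IP                   # pod IP via the Downward API
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
  - name: ALLUXIO_JAVA_OPTS        # extra JVM options picked up by the Alluxio processes
    value: >-
      -Dalluxio.worker.hostname=$(HOST_IP)
      -Dalluxio.worker.container.hostname=$(POD_IP)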

That works for the worker addresses only if the worker Pod(s) use hostPort to bind to the same port on the nodes. Also I don't recall if the master has that mechanism of supplying the "container" hostname?

ZhuTopher avatar Jul 15 '22 20:07 ZhuTopher

No, the masters don't have that equivalent because we use a Service to handle the name resolution. Clients talk to Services, so there's no need to know the pod names. But yeah, we currently don't have a unified definition of which hostnames map to which use cases (internal/external to the k8s cluster, etc.). The existing configs are more on-demand. If there's a chance to unify all those, I'm totally in :)

jiacheliu3 avatar Jul 16 '22 02:07 jiacheliu3

The solution should be independent of whether hostNetwork is enabled or not.

An init container for the workers should collect metadata and use an init script to talk to the master and register themselves:

          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace

and

            - register-worker
            - --ip
            - $(POD_IP)
            - --k8s-namespace
            - $(POD_NAMESPACE)

Something like above.
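
Putting the fragments above together, a hypothetical init-container sketch; register-worker is the proposed init script, not an existing Alluxio command, and the image tag is a placeholder:

      initContainers:
        - name: register-worker
          image: alluxio/alluxio:2.8.0          # placeholder image
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          command:
            - register-worker                   # hypothetical script, as proposed above
            - --ip
            - $(POD_IP)
            - --k8s-namespace
            - $(POD_NAMESPACE)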

This will work regardless of whether I enabled hostNetwork or not.

      hostNetwork: {{ $hostNetwork }}
      hostPID: {{ $hostPID }}
      dnsPolicy: {{ .Values.worker.dnsPolicy | default ($hostNetwork | ternary "ClusterFirstWithHostNet" "ClusterFirst") }}

nirav-chotai avatar Jul 26 '22 02:07 nirav-chotai

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jan 31 '23 15:01 github-actions[bot]
