
Add Support For Hybrid Windows/Linux Clusters

Open cartyc opened this issue 5 years ago • 24 comments

Feature Request

Linkerd to support hybrid kubernetes environments (Windows/Linux)

What problem are you trying to solve?

It would be great to be able to use Linkerd in hybrid cluster environments and have windows deployments as part of the mesh.

How should the problem be solved?

It would be great if linkerd proxy worked in a windows env.

Any alternatives you've considered?

I have not tried anything else at the moment.

How would users interact with this feature?

I would imagine it would remain the same as the current experience. At least I do not expect a major change; maybe add a Windows flag to the installer to specify the environment.

cartyc avatar Aug 21 '19 14:08 cartyc

@grampelberg anyone we can tag to give us the status of this effort?

wmorgan avatar Aug 21 '19 16:08 wmorgan

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Nov 19 '19 16:11 stale[bot]

I just wondered how this is progressing - I'm also looking to deploy linkerd on a mixed mode cluster. What is the status of getting this to work on Windows nodes?

adamcarter81 avatar Dec 23 '19 16:12 adamcarter81

@adamcarter81 Windows still doesn't support it. Hopefully 1H 2020. There's great work going on, just kinda blocked on the actual networking support.

grampelberg avatar Dec 23 '19 16:12 grampelberg

What is the status of the hybrid cluster implementation? May I offer help?

k3daevin avatar Aug 08 '20 07:08 k3daevin

Any updates on the progress here? Or is it even planned? Now that Envoy support for windows is in GA, wondering what's the plan for Linkerd on this avenue.

beingamarnath avatar Jun 30 '21 05:06 beingamarnath

Any updates on the progress here? Or is it even planned? Now that Envoy support for windows is in GA, wondering what's the plan for Linkerd on this avenue.

LinkerD doesn't use Envoy, so that's kind of moot

RichiCoder1 avatar Jun 30 '21 15:06 RichiCoder1

LinkerD doesn't use Envoy, so that's kind of moot

Yes, I know LinkerD uses linkerd proxy. I only brought up Envoy as a progress comparison. 😝

beingamarnath avatar Jun 30 '21 16:06 beingamarnath

What needs to be done to add Windows support to the LinkerD proxy?

Type1J avatar Dec 07 '21 18:12 Type1J

AIUI the issue is not running Linkerd2-proxy on Windows (which is not difficult) but the fact that Windows networking does not support the TCP redirecting that Linkerd's init-container requires. I heard a rumor that Microsoft was adding such support in 2021, but I haven't heard anything about it landing. We would need that support in order to make progress.

Note this is not Linkerd-specific. Any service mesh that uses iptables-style TCP redirecting has this same limitation on Windows today.
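
For readers unfamiliar with what "iptables-style TCP redirecting" looks like in practice, here is a minimal sketch of the kind of nat-table rules a sidecar init container might install on Linux. It only builds the command lines and does not run them. The chain names (MESH_INBOUND, MESH_OUTBOUND) are made up for illustration, and the ports and UID passed in at the bottom are assumed values; this is not claimed to match what Linkerd's init container actually does.

```python
from typing import List


def redirect_rules(inbound_port: int, outbound_port: int, proxy_uid: int) -> List[List[str]]:
    """Build iptables argument lists for sidecar-style TCP redirection.

    Nothing is executed; the function only returns the commands.
    Chain names are illustrative, not taken from any real project.
    """
    nat = ["iptables", "-t", "nat"]
    return [
        # Inbound: send every TCP connection entering the pod to the proxy.
        nat + ["-N", "MESH_INBOUND"],
        nat + ["-A", "MESH_INBOUND", "-p", "tcp", "-j", "REDIRECT",
               "--to-ports", str(inbound_port)],
        nat + ["-A", "PREROUTING", "-j", "MESH_INBOUND"],
        # Outbound: leave the proxy's own traffic and loopback traffic alone,
        # then send all remaining TCP connections to the proxy.
        nat + ["-N", "MESH_OUTBOUND"],
        nat + ["-A", "MESH_OUTBOUND", "-m", "owner", "--uid-owner", str(proxy_uid),
               "-j", "RETURN"],
        nat + ["-A", "MESH_OUTBOUND", "-o", "lo", "-j", "RETURN"],
        nat + ["-A", "MESH_OUTBOUND", "-p", "tcp", "-j", "REDIRECT",
               "--to-ports", str(outbound_port)],
        nat + ["-A", "OUTPUT", "-j", "MESH_OUTBOUND"],
    ]


if __name__ == "__main__":
    # Assumed example values for the proxy ports and the proxy's user ID.
    for rule in redirect_rules(inbound_port=4143, outbound_port=4140, proxy_uid=2102):
        print(" ".join(rule))
```

The point is that every rule relies on the Linux nat table (REDIRECT target, owner match). Windows would need an equivalent hook in its own networking stack.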

wmorgan avatar Dec 07 '21 19:12 wmorgan

Does anybody know of a link to where the progress of Windows TCP redirecting may be?

Type1J avatar Feb 10 '22 01:02 Type1J

I'm not really sure I'm understanding this correctly, but is the https://github.com/Microsoft/ebpf-for-windows XDP support the right sort of direction to be looking for the needed features? I guess it doesn't currently have the necessary hooks though.

TBBle avatar Feb 10 '22 10:02 TBBle

is the https://github.com/Microsoft/ebpf-for-windows XDP support the right sort of direction to be looking for the needed features?

Maybe? I'm not really sure how iptables is implemented in Linux, but Linux does use eBPF in the kernel.

For those who don't know, eBPF is a virtual machine (like the JVM, but more like WASM) targeted by a language like C or Rust that allows network traffic to be "filtered" or controlled in some way. It's a VM to allow less-than-kernel-trusted code to run in isolation performing network filtering tasks.

I guess it doesn't currently have the necessary hooks though.

Does anybody know what would be needed here, or if this is even the right road to travel?

Type1J avatar Feb 10 '22 14:02 Type1J

While iptables isn't implemented using eBPF in Linux, you can implement iptables-equivalent functionality using eBPF in Linux, e.g. https://github.com/mbertrone/bpf-iptables. Cilium provides an eBPF-based replacement for kube-proxy; kube-proxy itself is implemented on Linux using iptables (or the legacy usermode approach of 'bind the relevant socket and forward those packets').

The thing I'm not sure of is whether the eBPF in Windows is sufficient, i.e., even if sufficient support exists in the Windows networking stack, is it exposed via eBPF on Windows yet? I assume the functionality needed is what's done by https://github.com/linkerd/linkerd2-proxy-init/blob/master/iptables/iptables.go.


I also noticed that kube-proxy supports a Windows Kernel feature "VFP" (it also has usermode support using netsh portproxy), but I'm assuming linkerd2-proxy actually needs more than this offers, since that's been in kube-proxy since 2017, so predates the comments about Windows lacking necessary features.

Or maybe VFP (and/or WFP) would be sufficient if they could operate on the relevant network flows, but the flows linkerd needs to redirect are not visible to these platforms as they are relatively internal compared to what kube-proxy manages.

The VFP/WFP API for containers is visible at https://github.com/microsoft/hcsshim/blob/master/hcn/hcnpolicy.go for reference.

Since kube-proxy and CNI both run on the host on Windows, I assume that same setup would be needed for linkerd, i.e. it's really linkerd-cni, not linkerd-proxy-init, that needs to work on Windows, since we can't really spawn a new host-side process for every Pod.

TBBle avatar Feb 11 '22 14:02 TBBle

Hey folks, MSFT person here. I just tried the tutorial on a mixed AKS cluster (Windows and Linux) with a sample IIS app that is currently served by an nginx ingress, and the step-by-step fails when launching proxy-init. Here's the status:

linkerd-data-plane
------------------
√ data plane namespace exists
× data plane proxies are ready
    pod "iis-app-routing-58ff54f66b-lgdwm" status is Pending
    see https://linkerd.io/2.12/checks/#l5d-data-plane-ready for hints

Looking at the pods on the app namespace:

vinicius [ ~ ]$ kubectl get pods -n iissampleapp
NAME                                        READY   STATUS                  RESTARTS   AGE
iis-app-routing-58ff54f66b-lgdwm            0/2     Init:ImagePullBackOff   0          25m
iis-app-routing-5f85cb4b47-b9rn6            1/1     Running                 0          48m
keyvault-iis-app-routing-557c7bf745-87976   2/2     Running                 0          25m

Then, looking at the faulty pod:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  25m                  default-scheduler  Successfully assigned iissampleapp/iis-app-routing-58ff54f66b-lgdwm to akswspool000001
  Normal   Pulling    23m (x4 over 25m)    kubelet            Pulling image "cr.l5d.io/linkerd/proxy-init:v2.0.0"
  Warning  Failed     23m (x4 over 25m)    kubelet            Error: ErrImagePull
  Normal   BackOff    26s (x108 over 25m)  kubelet            Back-off pulling image "cr.l5d.io/linkerd/proxy-init:v2.0.0"

Any ideas on when this will be fixed for Windows? We have customers using AKS with Windows pods and ingress for HTTPS traffic who are looking at service mesh options, and LinkerD would be a nice fit.
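
In the meantime, one possible stopgap is to keep Windows pods out of the mesh so they at least start. The sketch below is only an illustration: it takes a pod template as a plain dict, checks for an explicit nodeSelector on the well-known kubernetes.io/os label, and sets the linkerd.io/inject annotation to disabled. The helper names are hypothetical, and it assumes Windows workloads are pinned with a nodeSelector; pods steered by node affinity or taints would not be detected.

```python
import copy
from typing import Any, Dict

OS_LABEL = "kubernetes.io/os"            # well-known Kubernetes node label
INJECT_ANNOTATION = "linkerd.io/inject"  # Linkerd's proxy injection annotation


def targets_windows(pod_template: Dict[str, Any]) -> bool:
    """True if the template has an explicit nodeSelector for Windows nodes."""
    selector = pod_template.get("spec", {}).get("nodeSelector") or {}
    return selector.get(OS_LABEL) == "windows"


def opt_out_of_injection(pod_template: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the template, with injection disabled for Windows pods.

    Templates that do not target Windows are returned unchanged (as a copy).
    """
    patched = copy.deepcopy(pod_template)
    if targets_windows(patched):
        annotations = patched.setdefault("metadata", {}).setdefault("annotations", {})
        annotations[INJECT_ANNOTATION] = "disabled"
    return patched


if __name__ == "__main__":
    template = {
        "metadata": {"labels": {"app": "iis-app-routing"}},
        "spec": {"nodeSelector": {OS_LABEL: "windows"}},
    }
    print(opt_out_of_injection(template)["metadata"]["annotations"])
```

This does not give Windows pods any mesh features. It only avoids scheduling an init container that the Windows node cannot pull.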

vrapolinario avatar Mar 21 '23 22:03 vrapolinario

As far as I knew a year ago, implementing linkerd (or any service mesh) on Windows depends on being able to inject things (container outgoing packet redirection like the nat table in Linux iptables) into the network stack that may not be exposed on Windows at this time.

TBBle avatar Mar 22 '23 02:03 TBBle

There are quite a few people wanting service meshes (LinkerD, Istio, or anything else), but the Windows container runtimes (and maybe Windows itself) haven't exposed a way to do the networking manipulation needed for a service mesh to work. Istio's ambient mode looks more promising since the proxy is node-wide, not per-Pod. In theory, one could pair a Windows node with a Linux node that proxies the traffic, but that mode of operation isn't yet supported. I'm hoping LinkerD, in an effort to compete with ambient, does something similar soon (reducing load on the nodes due to service mesh activity), and fixes this issue for Windows in the process.

Type1J avatar Apr 27 '23 15:04 Type1J

I'm not sure that ambient mode in particular would help. My understanding is that Windows does (or did) not expose the fundamental network operations needed to redirect traffic to the proxy. If the limitation were simply that we can't have the proxy in the same pod as the service, or that we can't do the network setup from inside a container, then a linkerd-cni implementation on Windows could solve this, as CNI already runs on the host (or in a Host Process pod, which is equivalent) on Windows.

See also https://linkerd.io/2022/12/28/service-mesh-2022-recap-ebpf-gateway-api/

TBBle avatar Apr 28 '23 02:04 TBBle

See the "Redirection Policy Comparison" (iptables on Linux, HNS policy on Windows) at 07:40 in this talk:

Service Mesh using Envoy on Windows - S. Nanopoulos, P. Balasubramanian, K. Subramanian, N Jackson https://www.youtube.com/watch?v=ggvaAbjx4jo

And hcnproxyctrl (Host Container Networking Proxy Controller), a high-level library and executable that allows users to program layer-4 proxy policies on Windows through the Host Networking Service (HNS). It is intended to be used as part of a service mesh to redirect all traffic in a given network compartment through a sidecar proxy. https://github.com/microsoft/hcnproxyctrl
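
To make the idea of a layer-4 proxy policy concrete, here is a small conceptual model of what such a policy decides per connection. It is only a sketch: the Connection and L4ProxyPolicy types and their fields are hypothetical and are not the hcnproxyctrl or HNS API. It also assumes the proxy's own traffic must be exempted (to avoid a redirect loop), as the uid-owner match does with iptables on Linux.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    protocol: str   # "tcp" or "udp"
    owner: str      # identity of the process that opened the connection
    dest_port: int  # port the application originally asked for


@dataclass(frozen=True)
class L4ProxyPolicy:
    """Hypothetical policy model, not the real hcnproxyctrl type."""
    proxy_port: int   # port the sidecar proxy listens on
    proxy_owner: str  # traffic from this identity bypasses redirection


def effective_port(conn: Connection, policy: L4ProxyPolicy) -> int:
    """Return the port a connection ends up on once the policy is applied."""
    if conn.protocol != "tcp":
        return conn.dest_port    # only TCP is redirected in this sketch
    if conn.owner == policy.proxy_owner:
        return conn.dest_port    # the proxy's own traffic passes through
    return policy.proxy_port     # everything else goes to the proxy


if __name__ == "__main__":
    policy = L4ProxyPolicy(proxy_port=4140, proxy_owner="proxy")
    print(effective_port(Connection("tcp", "app", 443), policy))
    print(effective_port(Connection("tcp", "proxy", 443), policy))
```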

AndreaPQ avatar Jul 03 '23 09:07 AndreaPQ

Thanks @AndreaPQ! That's a great find.

wmorgan avatar Jul 06 '23 20:07 wmorgan

Thanks @AndreaPQ! That's a great find.

Is there a planned roadmap?

AndreaPQ avatar Aug 24 '23 18:08 AndreaPQ

It is likely we are going to start looking at the effort involved in this sometime next year. Feel free to ping me on the Linkerd slack for more details.

wmorgan avatar Oct 02 '23 22:10 wmorgan

@wmorgan any update on this?

Kmdkca avatar Feb 07 '24 11:02 Kmdkca

We have done some initial groundwork for this feature. Support for Windows nodes continues to be on the roadmap for this year.

wmorgan avatar Apr 15 '24 14:04 wmorgan