FR: Tailscale as Kubernetes CNI
What are you trying to do?
It would enable use cases like cross-region/cross-cloud Kubernetes clusters, as well as "bursting into cloud" functionality for those running on-prem without worrying about firewalls and VPC peering.
On-prem clusters would also benefit by getting e2e encrypted tunnels between pods and nodes.
I think at the limit, even replacing node<->apiserver communication with Tailscale would be interesting as that would remove some of the complexities involved with PKI management.
cc @kris-nova @mauilion
How should we solve this?
No response
What is the impact of not solving this?
No response
Anything else?
No response
No more mTLS required!!! We're close to paradise! :) (Add eBPF and we'd have a fully featured replacement for service meshes.)
Note that you can already (as I do) run a multi-cloud cluster over Tailscale :)
I enable the Tailscale network on the host, then run flannel with `--flannel-iface=tailscale0`.
No doubt a CNI would make the setup easier and more reliable, though.
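For anyone who wants to reproduce that setup, here's a rough sketch of what I mean at the host level. It assumes k3s (where `--flannel-iface` is a built-in flag); the auth key, token, and server address are placeholders, so adapt them to your own environment:

```sh
# Join each node to the tailnet first (TS_AUTHKEY is a placeholder).
tailscale up --authkey "${TS_AUTHKEY}"

# Grab the node's Tailscale IPv4 address.
TS_IP="$(tailscale ip -4)"

# Start the k3s server, pinning flannel and the node address to tailscale0.
k3s server --flannel-iface=tailscale0 --node-ip="${TS_IP}"

# On worker nodes, join the same way (SERVER_TS_IP and K3S_TOKEN are placeholders):
# k3s agent --server "https://${SERVER_TS_IP}:6443" --token "${K3S_TOKEN}" \
#   --flannel-iface=tailscale0 --node-ip="$(tailscale ip -4)"
```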
I use this in my k8s cluster, but it doesn't work properly with ingress-nginx (not accessible externally). Have you encountered this problem?
Related, the Kubernetes Operator is available: https://github.com/tailscale/tailscale/issues/502#issuecomment-1414522795
There is other prior art (from two years ago) in:
K3s now has experimental integration with Tailscale, which automatically advertises the pod CIDR as a subnet route. By default, server (master) nodes must still be reachable by each other directly via their private IPs in an HA setup with embedded etcd, though adding a `--flannel-iface=tailscale0` option could further route server <-> server traffic over Tailscale, which K3s does not recommend because of the performance impact.
I didn't get it to work with `--flannel-iface=tailscale0 --flannel-backend=host-gw` and a manually set up subnet route to the pod IPs in K3s. It seemed that Flannel would try to add the same route as Tailscale did, which caused a conflict, so VXLAN-over-Tailscale was still needed in that case. The K3s integration with Tailscale uses a custom Tailscale extension backend, and no `flannel.1` VTEP interface is created. Per my understanding it should be similar to `host-gw`, though the routing table is managed by Tailscale instead of Flannel, and the VXLAN overhead is eliminated.
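For reference, a minimal sketch of what that experimental K3s + Tailscale integration looks like on the command line, based on the config shared later in this thread; the preauth key, token, and CIDRs are placeholders:

```sh
# Experimental K3s Tailscale integration (flags mirror the config.yaml below);
# TS_PREAUTH_KEY is a placeholder.
k3s server \
  --vpn-auth="name=tailscale,joinKey=${TS_PREAUTH_KEY}" \
  --cluster-cidr=10.142.0.0/16 \
  --service-cidr=10.143.0.0/16

# Agents join the same way:
# k3s agent --server "https://<server>:6443" --token "${K3S_TOKEN}" \
#   --vpn-auth="name=tailscale,joinKey=${TS_PREAUTH_KEY}"
```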
@koying would you be able to provide instructions to get this working with Flannel? I have tried Flannel using this blog but end up with IP conflicts in the Pod overlay network. Very similar issues with Calico and Weave-Net.
@ky-bd using VXLAN actually makes things a lot slower and (strangely) increases the number of hops for some of my hosts to more than 10, while usually `traceroute` shows only one hop through `tailscale0`. If you could give more details on how you got around the pod CIDR and separated it from your tailnet, that would be great.
In any case, having a Tailscale CNI would be hugely beneficial.
Here's part of my k3s config, just following what's described above; I have no idea how to integrate Tailscale with other k8s distros or CNIs, except for Flannel + VXLAN:
```yaml
flannel-iface: "tailscale0"
vpn-auth: "name=tailscale,controlServerURL=SERVER_URL,joinKey=SOME_PREAUTH_KEY"
cluster-cidr: "10.142.0.0/16"
service-cidr: "10.143.0.0/16"
```
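In case it isn't obvious where that goes, a sketch assuming k3s' default config path (`/etc/rancher/k3s/config.yaml`):

```sh
# Assuming the default k3s config location; put the YAML above there and restart.
sudo install -d /etc/rancher/k3s
sudoedit /etc/rancher/k3s/config.yaml   # paste the settings above
sudo systemctl restart k3s              # on agent nodes: k3s-agent
```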
Also, if you are using ACLs, you'll need some additional ACL entries for k3s, something like:
```yaml
acls:
  - action: accept
    src:
      - "tag:cluster-nodes"
      - "10.142.0.0/16"
    dst:
      - "tag:cluster-nodes:*"
      - "10.142.0.0/16:*"

autoApprovers:
  routes:
    10.142.0.0/16:
      - "tag:cluster-nodes"
```
I'm using Headscale, so this is in YAML; it should work in its equivalent Tailscale JSON form.
I'm not exactly sure what you mean by 'separate'. The pod and service CIDRs don't need to be inside the 100.64.0.0/10 block that Tailscale uses, so in that sense they are 'separated'. However, the pod CIDRs are advertised as subnet routes, so if machine A in the tailnet has access to machine B running k3s, then the pod CIDRs on machine B will be added to A's routing table; in that sense they are not separated. This in fact applies to all subnet routers, and currently there's no way to selectively accept or block individual subnet routes in Tailscale, as far as I understand.
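To illustrate that last point, the only knob I'm aware of is per device and all-or-nothing; a sketch, run on the machine that should (or shouldn't) see the pod CIDRs:

```sh
# A device either installs every subnet route it is allowed to see, or none.
# (Newer clients also accept `tailscale set --accept-routes=...`.)
tailscale up --accept-routes=true    # install all advertised + approved routes
tailscale up --accept-routes=false   # install none of them
```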
Has anyone ever tried using Cilium over a Tailscale network? Did you get it to work?
When I use Cilium's VXLAN over Tailscale, pod-to-pod connections work, but a pod cannot connect to a different node's external IP, and thus controller pods can't reach the kube-apiserver.
Edit: I got it to work by making sure that Tailscale is forced to use iptables, but it needs to use `iptables-nft`, not `iptables-legacy`. Tailscale containers include both, but by default `/sbin/iptables` is linked to `xtables-legacy-multi`.
For some reason, setting Tailscale to use nftables directly results in Cilium being unable to start because it fails to create firewall rules, probably because Cilium internally uses `iptables-nft`. After overwriting the link to point at `xtables-nft-multi`, everything seems to work.
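In case it saves someone time, here's roughly what that boils down to, assuming the stock Tailscale container image; the `TS_DEBUG_FIREWALL_MODE` variable is my assumption for how the firewall mode gets forced, so verify against your image:

```sh
# Force Tailscale to program iptables rather than talking to nftables directly
# (TS_DEBUG_FIREWALL_MODE=iptables is an assumption; check your containerboot docs).
export TS_DEBUG_FIREWALL_MODE=iptables

# Repoint /sbin/iptables from the legacy backend to the nft backend so that
# Tailscale's rules end up where Cilium (which uses iptables-nft) expects them.
ln -sf /sbin/xtables-nft-multi /sbin/iptables
ln -sf /sbin/xtables-nft-multi /sbin/ip6tables   # assumption: keep IPv6 consistent

# Verify which backend is active; it should report "(nf_tables)".
iptables --version
```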