FR: Additional configuration for k8s-operator spawned pods
What are you trying to do?
The k8s-operator creates a new StatefulSet per exposed service. The operator allows configuration of some parameters via env vars, e.g. `PROXY_IMAGE`, but only a limited subset of options is configurable. So far, I've had to handle these in a mutating webhook:
- tolerations
- node affinity
- requests/limits
- env vars, e.g. `TS_EXTRA_ARGS` (need to add `--netfilter-mode=off` to avoid tailscale blackholing routes)
- container `postStart` lifecycle hook (for `iptables` rules related to the above netfilter argument)

This is not an exhaustive list of the pod spec/StatefulSet spec fields that should be exposed and configurable.
How should we solve this?
Support for this could be added by:
- exposing additional env vars (this can get out of hand pretty quickly),
- providing the operator a StatefulSet template in a ConfigMap that it then patches, or
- providing a JSON patch in a ConfigMap that the operator applies to the StatefulSet spec it generates (a sketch of this option follows).
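For the last option, a hypothetical sketch of what that could look like. The ConfigMap name, the `patch.json` key, and the idea that the operator would read it are all assumptions for illustration, not existing behavior:

```yaml
# Hypothetical: a ConfigMap carrying an RFC 6902 JSON patch that the
# operator would apply to each StatefulSet it generates. Nothing in the
# operator reads such a ConfigMap today; all names here are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ts-proxy-statefulset-patch
  namespace: tailscale
data:
  patch.json: |
    [
      {
        "op": "add",
        "path": "/spec/template/spec/tolerations",
        "value": [
          {
            "key": "node-role.kubernetes.io/control-plane",
            "operator": "Exists",
            "effect": "NoSchedule"
          }
        ]
      }
    ]
```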
What is the impact of not solving this?
Without a solution for this, end users will either have to deploy StatefulSets manually (which largely defeats the utility of the operator) or use mutating webhooks.
Anything else?
No response
Hi @ChandonPierre, thanks for opening the issue.
This is high on our priority list; there is already an issue for it (https://github.com/tailscale/tailscale/issues/10709) and I am currently working on a design. I might open it for public comments once it's ready, as user feedback would be very valuable for us on this one.
Custom user-defined labels and other customizations can now (v1.60) be applied via a `ProxyClass` custom resource: https://tailscale.com/kb/1236/kubernetes-operator#cluster-resource-customization-using-proxyclass-custom-resource.
We are planning to eventually add more customization options to `ProxyClass`. Feedback on whether it fits your workflow + what other fields you might need to configure (and why) is very welcome.
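For illustration, a minimal sketch of a `ProxyClass` that adds tolerations, assuming the v1alpha1 schema described in the linked docs (check the CRD shipped with your operator version for the exact field names):

```yaml
# A minimal ProxyClass sketch (v1alpha1 schema as per the linked KB page).
# The tolerations let proxy pods schedule onto tainted nodes.
apiVersion: tailscale.com/v1alpha1
kind: ProxyClass
metadata:
  name: control-plane-tolerated
spec:
  statefulSet:
    pod:
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
```

A proxy opts in by labelling its Service with `tailscale.com/proxy-class: control-plane-tolerated`.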
I can confirm that `ProxyClass` solved my particular use case (adding tolerations to run proxy pods on control-plane nodes). I think the current solution would greatly benefit from a cluster-scoped (i.e. non-namespaced) `ClusterProxyClass`.
The tailscale operator is typically deployed once in the `tailscale` namespace by the administrator, and can then be used in many other namespaces. As it currently stands, if a particular cluster has a requirement that applies to all TS proxies created, then each namespace needs to duplicate the appropriate `ProxyClass` config. Not the end of the world of course, but it increases the maintenance headache...
Hi @samcday, thank you for confirming that it solves your use case!
> would greatly benefit from a cluster-scoped (i.e. non-namespaced) `ClusterProxyClass`
`ProxyClass` is cluster-scoped. It should be possible to create one `ProxyClass` and refer to it from any proxy in any namespace - are you seeing issues with that?
We did not name it `ClusterProxyClass` because we were not planning on making a namespace-scoped `ProxyClass`, so the `Cluster-` prefix seemed redundant.
We should probably clarify in the docs/CRD comments that it is cluster-scoped.
Haha! I'm not sure why I assumed it wasn't cluster scoped. My mistake :)
Hey @irbekrm - the `ProxyClass` CR is great and addresses a lot of the concerns outlined in the original issue! Unfortunately, since the entire pod spec is not exposed, we still can't inject the container `postStart` hook:
> container `postStart` lifecycle hook (for `iptables` rules related to the above netfilter argument)
We still mutate this `postStart` lifecycle hook onto proxy pods in flight:

```sh
while ! ip -f inet addr show tailscale0 &> /dev/null; do sleep 1; done; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```
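For context, this is roughly how the injected hook lands in the proxy pod spec (standard Kubernetes `lifecycle` fields; the mutating webhook itself is ours, not part of the operator):

```yaml
# Injected into the tailscale container by our mutating webhook.
# Waits for the tailscale0 interface to come up, then adds a MASQUERADE
# rule, since --netfilter-mode=off stops tailscaled managing iptables.
lifecycle:
  postStart:
    exec:
      command:
        - /bin/sh
        - -c
        - |
          while ! ip -f inet addr show tailscale0 > /dev/null 2>&1; do sleep 1; done
          iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```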
Hi @ChandonPierre, we did not expose the whole `Pod` spec via `ProxyClass` because, if we did, we would inherit all of the upstream validation rules (which would prevent folks from setting just select fields, as the result would not pass as a valid `Pod`), and also because of the size of the CRD (~4000 lines). Instead we opted for adding the fields folks need.
> We still mutate this lifecycle postStart on proxy pods in flight: `while ! ip -f inet addr show tailscale0 &> /dev/null; do sleep 1; done; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE`

Why do you need this?
We overlap the RFC6598 range; this is the reason for #12115.
Makes sense, I'm generally happy for us to add a `Lifecycle` field to the tailscale container. If you would like to create a PR for this, please go ahead; else I will put something in our backlog.
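Purely as a strawman (this field does not exist yet; the exact shape and naming would be decided in the PR), it might surface like:

```yaml
# Hypothetical future ProxyClass field - not implemented at the time of
# writing; shown only to illustrate where a Lifecycle hook could live.
apiVersion: tailscale.com/v1alpha1
kind: ProxyClass
metadata:
  name: with-poststart
spec:
  statefulSet:
    pod:
      tailscaleContainer:
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE"]
```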