kilo icon indicating copy to clipboard operation
kilo copied to clipboard

Request: Add feature to specify source ip address for all egress

Open tdondich opened this issue 2 years ago • 7 comments

We have nodes that have multiple public ip addresses that utilize same gateway (cloud provider is Scaleway, uses managed kilo k8s clusters for multi-cloud).

We want all outbound traffic for pods on these nodes to utilize a specific ip address. Unfortunately, that ip address is not in the same network as the gateway (route next-hop), therefore the usage of MASQUERADE in the iptables rule created by kilo will choose the non-preferred ip address, even if it's not the primary ip address of the interface.

Force adding a SNAT rule before the MASQUERADE rule in the KILO-NAT ruleset makes the behavior work as desired.

Ideally, we can specify an annotation for kilo's daemonset to specify SNAT over MASQUERADE and some sort of deterministic matching or behavior for what source ip address to utilize based on configuration.

Not sure on ideal implementation since each ip would be specific to the node. Opening it up for discussion.

tdondich avatar Aug 01 '23 19:08 tdondich

Hi @tdondich, this sounds very reasonable to me! I envision defining a new annotation that can be added to each individual node to define the egress IP addresses to use when routing packets off of the cluster. It would need to be a data structure that accepts one ipv4 address and/or one ipv6 address so that we can SNAT both kinds of packets.

It's a pretty explicit and as a result kind of tedious for the user however it's also the most terse and easy to implement.

For example, we could define the annotation kilo.squat.ai/egress-ips and allow up to two IPs to be used, where one must be ipv4 and the other ipv6. What do you think?

squat avatar Aug 01 '23 23:08 squat

First off, I just want to commend you for such a fast and positive response. You are a great example why the open source community is just amazing.

I do think that being explicit on ipv4 and ipv6 is preferred vs trying to have both in one annotation. May I recommend kilo.squat.ai/egress-ipv4-address and kilo.squat.ai/egress-ipv6-address ?

So to be clear then, the logic will have some evaluation to determine if annotations exist at say https://github.com/squat/kilo/blob/main/pkg/mesh/routes.go#L372 and will add a SNAT rule vs MASQUERADE rule?

And the annotation will be on the node yes? How often does the kilo cni refresh? I noticed that when I removed the MASQUERADE rule, it came back shortly after. So I think we need to handle the situation where the annotation is added after kilo has been added and been running so if the annotation exists, and the MASQUERADE rule exists, the MASQUERADE rule is removed and then SNAT added.

Thoughts? How else can I help?

tdondich avatar Aug 02 '23 00:08 tdondich

Thanks for the kind words @tdondich <3

The Kilo daemon (kg) is constantly watching and reconciling the node's state with the node's configuration. So no matter when a node is annotated, Kilo will receive an update event, read the new egress configuration, generate the new iptables rules, and apply them immediately to the node.

I don't love super long the names for the annotations, though this is mainly am aesthetics argument. The main reason I proposed to have both IPs in a list in one single annotation to keep in line with the way that you configure dual stack CIDR ranges in the Kubernetes apiserver, controller-manager, and kube-proxy: https://kubernetes.io/docs/concepts/services-networking/dual-stack/#configure-ipv4-ipv6-dual-stack

If you're keen to contribute, I'd happily review any PR :) the way to start would be:

  1. define a new annotation here, e.g. https://github.com/squat/kilo/blob/main/pkg/k8s/backend.go#L64
  2. add a new field to the mesh.Node type e.g. https://github.com/squat/kilo/blob/main/pkg/mesh/backend.go#L77
  3. translate the annotation to the new field e.g. https://github.com/squat/kilo/blob/main/pkg/k8s/backend.go#L322-L330
  4. Add an egress IPs field to the Topology type e.g. https://github.com/squat/kilo/blob/main/pkg/mesh/topology.go#L53-L54
  5. Set the egress IPs on the Topology during instantiation of the Topology e.g. https://github.com/squat/kilo/blob/main/pkg/mesh/topology.go#L136
  6. Override the iptables rules to SNAT using the egress IPs if present, here https://github.com/squat/kilo/blob/main/pkg/mesh/routes.go#L372-L373

squat avatar Aug 02 '23 00:08 squat

Now that you referred to the dual stack examples, yes, having a single annotation makes more sense then. Just, those examples are for CIDR ranges where this is for a specific ip address. I mean, we can still use /32 if really needed, just that seems weird if we accidentally allow the user to put in something other than a /32. And, there COULD be a situation where someone wants SNAT for ipv4 but not for ipv6 and vice versa. How would you like to go about that?

I'd love to attempt to contribute via my terrible knowledge of golang, but I'd love to take this time to attempt a PR. How best do you find it to test these particular scenarios locally? Would love to get your thoughts on a local test lab, if it can be virtualized or not.

tdondich avatar Aug 02 '23 00:08 tdondich

@tdondich I imagine handling the different cases for egress IPs similar to how the Kubernetes apiserver/controller-manager do:

  • accept either one ipv4 or one ipv6 or one ipv4 and one ipv6
  • both can be given in either order or just one can be given
  • if multiple ipv4 are given, only the first is used, same with ipv6

Additionally, I think that the value of the annotation should be an IP and not a network range, ie no /X at the end.

Finally, looking again at the Kubernetes configuration for dual stack, the flag names are all singular so rather than kilo.squat.ai/egress-ips, the annotation should be kilo.squat.ai/egress-ip.

squat avatar Aug 02 '23 16:08 squat

Regarding tests, the package that will require the most complicated logic is probably the parsing and proper interpretation of the annotations to produce an egress IP data structure on the node type. Luckily, this package is very well tested and the unit tests should give us very high confidence in the proper functioning of the code:

  • Unit tests for validating that the new annotation produces the correct data structure for the mesh.Node: https://github.com/squat/kilo/blob/main/pkg/k8s/backend_test.go#L199-L229

We would probably be well served to add new unit tests to https://github.com/squat/kilo/blob/main/pkg/mesh/routes_test.go to test the correct generation of iptables rules now that the logic is slightly more complicated. I think that between these two tests I would be very confident in the code.

Finally, we could add new e2e test scenarios to prove that egress works as expected. The e2e test suite runs on kind, which runs Kubernetes clusters in Docker "nodes". However, there might be some subtleties around how Docker itself NATs/masquerades outbound packets that might make it hard to check if the egress IP is correctly being used as the source. I'm not sure if Docker does do this and if so, which packets would be affected.

squat avatar Aug 02 '23 16:08 squat

hi @tdondich how is this going? If you're stuck or don't have time to finish that's fine! Let me know and I'd be happy to take over and release this feature. I think it would be a welcome addition for the project.

squat avatar Oct 04 '23 17:10 squat