VRF support for interfaces created by DANM

Open carstenkoester opened this issue 4 years ago • 3 comments

Is this a BUG REPORT or FEATURE REQUEST?: feature

What happened: It'd be cool if DANM had the ability to place an interface within a VRF inside the pod network namespace - in other words, as per https://www.kernel.org/doc/Documentation/networking/vrf.txt, execute the equivalent of

ip link add dev ${vrf_name} type vrf table ${rt_tables}
ip link set dev ${interface_name} master ${vrf_name}

Idea would be to be able to declare a VRF name in the pod annotation, and let DANM create the VRF (for the first interface per VRF per pod), and then move the interface(s) into the VRF.
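The create-once-then-attach flow described above could be sketched roughly as follows. This is only an illustration of the command sequence, not how DANM would implement it: the `attach_to_vrf` helper, the `run` wrapper, and the `DRY_RUN` variable are made-up names, and the script prints the commands by default rather than executing them. The extra `ip link set ... up` step follows the kernel VRF documentation, which brings the VRF device up after creating it.

```shell
#!/bin/sh
# Sketch only: prints the ip(8) commands by default. Set DRY_RUN=0 to execute
# them instead (needs CAP_NET_ADMIN inside the pod network namespace).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: attach_to_vrf <vrf_name> <table_id> <interface>...
# Creates the VRF device once, brings it up, then enslaves each interface.
attach_to_vrf() {
    vrf_name="$1"
    table_id="$2"
    shift 2
    run ip link add dev "$vrf_name" type vrf table "$table_id"
    run ip link set dev "$vrf_name" up
    for ifname in "$@"; do
        run ip link set dev "$ifname" master "$vrf_name"
    done
}
```

For example, `attach_to_vrf vrf-cust 100 eth1 eth2` would print one `ip link add`, one `ip link set ... up`, and one `ip link set ... master` line per interface; the VRF name, table ID, and interface names here are placeholders.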

It seems that https://github.com/vishvananda/netlink supports the netlink calls to create VRFs (https://github.com/vishvananda/netlink/pull/186) so this might be reasonably straightforward to implement?

carstenkoester avatar Apr 26 '20 03:04 carstenkoester

yeah I think it would be reasonable complexity to implement, but what would be the use-case for it? so as far as I understood VRFs are used in practice to provide a form of tenant separation in a shared network namespace. DANM however operates inside one Pod's netns, which is already a very small subset of a K8s namespace which is already a small subset of all network resources. so currently we have the assumption all processes / threads inside the Pod belong to the same "user", as Pod's lifecycle is different from VMs - it is not really a K8s best practice to design long-running, huge, "host-platform" kind of Pods

could you elaborate how would you use this feature?

Levovar avatar Apr 27 '20 08:04 Levovar

I have a number of scenarios I can think of; the one I'm looking at currently is a traffic forwarder pod (think of it like a regular router) that forwards end-user traffic from one interface to another. We're handling that end-user traffic in "raw" format (unencapsulated), which means we (a) need to isolate that traffic and (b) have no control over IP addressing used in that customer traffic.

Yet, I want that pod to have a 3rd interface of its own, where it can replicate some sort of HA state (that has to do with tracking/manipulating that customer traffic) to a second pod.

There's no particular urgency to this feature request -- and I might, in fact, look at that myself sometime and submit it as a PR. I can achieve the exact same thing by using the primitives of interfaces, ip rule list, and policy routes -- just that using VRFs would make the implementation look slightly cleaner.
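For comparison, the rule-based alternative mentioned above might look something like this. It is a rough sketch under assumptions, not the setup actually in use: the `isolate_with_rules` helper, the `run` wrapper, and the `DRY_RUN` variable are invented for illustration, and a real forwarder would likely need more routes per table than a single default route. Commands are printed by default rather than executed.

```shell
#!/bin/sh
# Sketch only: prints the ip(8) commands by default. Set DRY_RUN=0 to execute
# them instead (needs CAP_NET_ADMIN inside the pod network namespace).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper:
#   isolate_with_rules <table_id> <gateway> <egress_if> <ingress_if>...
# Sends traffic arriving on each ingress interface to a dedicated routing
# table, then installs a default route for that table.
isolate_with_rules() {
    table_id="$1"
    gateway="$2"
    egress_if="$3"
    shift 3
    for ifname in "$@"; do
        run ip rule add iif "$ifname" lookup "$table_id"
    done
    run ip route add default via "$gateway" dev "$egress_if" table "$table_id"
}
```

Calling `isolate_with_rules 100 192.0.2.1 eth2 eth1` would print one `ip rule add iif` line per ingress interface followed by the default route for table 100. With a VRF, the per-interface rules are replaced by enslaving the interfaces to the VRF device, which is the "slightly cleaner" part.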

carstenkoester avatar May 03 '20 21:05 carstenkoester

I understand. sounds like a niche use-case but sure, why not :) the API through which this can be controlled needs to be defined first though. couple of thoughts here:

  • I'm not sure this should be application controllable through annotation, after all, the Pod should not be able to say which VRF domain an interface belongs to, right? so for me this sounds more like a new attribute in networks' Spec.Options
  • and why just the first interface? we kinda pride ourselves on universally supporting networking features for any and all interfaces, so again putting a new VRF attribute into the network would make the support universal, and enable multiple VRF domains within the same Pod

we just need to figure out the interworking between the new VRF attribute and the existing routes, Routes6, rt_tables, and proutes parameters when all are defined

Levovar avatar May 05 '20 09:05 Levovar