nomad icon indicating copy to clipboard operation
nomad copied to clipboard

support CNI_ARGS parameters

Open tgross opened this issue 2 years ago • 6 comments

Some CNI plugins support accepting arguments in addition to the plugin config in JSON. (See https://www.cni.dev/docs/conventions/). They typically are passed in the CNI_ARGS environment variable, because passing them in the JSON args field would mean needing separate files for each argument variant. In Nomad (and k8s for that matter), the JSON config isn't typically accessible to job submitters.

Some examples where this is useful:

  • The static plugin for static IP address assignment supports an address for the static IP and gateway, which are useful for cases where IPs are being assigned outside of Nomad.
  • The tc-redirect-tap plugin (used for Firecracker VMs) supports arguments for per-VM jailer users/groups and tap names.
  • This likely helps more complex third-party plugins like Cilium as well, but I'm not as familiar with those use cases.

tgross avatar Feb 16 '23 13:02 tgross

Related: https://github.com/hashicorp/nomad/issues/13520

tgross avatar Feb 17 '23 14:02 tgross

https://github.com/hashicorp/nomad/issues/12120#issuecomment-1287378252 shows an example of how CNI_ARGS might need to be set when using Cilium CNI. From that you can see that the CNI_ARGS value would likely need to be a template of some sort that allows setting values based on the allocation.

EDIT: now that I've looked at the CNI library interface it looks like it accepts a map[string]string so a template doesn't quite work.

deverton-godaddy avatar Oct 03 '23 05:10 deverton-godaddy

is it possible to get at least, the network.hostname parameter passed in CNI_ARGS ?

nathanaelle avatar Nov 06 '23 15:11 nathanaelle

I've done a bit of experimenting with this, and while I was hoping we could include interesting metadata like this across the board without caveat, some plugins reject unknown keys being passed in, e.g. with bridge:

[...] plugin type="bridge" failed (add): ARGS: unknown args ["some_neat_key=some_value"]

Fairly sure that comes from here: https://github.com/containernetworking/cni/blob/v1.0.0/pkg/types/args.go#L117-L120 which does offer a way to disable the strict behavior, by passing a special arg: IgnoreUnknown=true

This patch seems to be effective in my little experiment, with a custom plugin in a CNI config chain after bridge:

--- a/client/allocrunner/networking_cni.go
+++ b/client/allocrunner/networking_cni.go
@@ -102,7 +102,19 @@ func (c *cniNetworkConfigurator) Setup(ctx context.Context, alloc *structs.Alloc
        var res *cni.CNIResult
        for attempt := 1; ; attempt++ {
                var err error
-               if res, err = c.cni.Setup(ctx, alloc.ID, spec.Path, cni.WithCapabilityPortMap(getPortMapping(alloc, c.ignorePortMappingHostIP))); err != nil {
+               if res, err = c.cni.Setup(ctx, alloc.ID, spec.Path,
+                       cni.WithCapabilityPortMap(getPortMapping(alloc, c.ignorePortMappingHostIP)),
+                       // "labels" get turned into CNI_ARGS env for CNI plugins to consume.
+                       cni.WithLabels(map[string]string{
+                               // some plugins (ex bridge) reject unknown fields by default,
+                               // unless this is set.
+                               "IgnoreUnknown": "true",
+                               // nomad-specific info for plugins to use
+                               "nomad_alloc_id":  alloc.ID,
+                               "nomad_namespace": alloc.Namespace,
+                       }),
+               ); err != nil {
                        c.logger.Warn("failed to configure network", "error", err, "attempt", attempt)
                        switch attempt {
                        case 1:

It produces this env var in the CNI plugin execution:

CNI_ARGS=IgnoreUnknown=true;nomad_alloc_id=667314a9-a492-f308-4cbd-6f1bf9e5c9be;nomad_namespace=default

So that's promising, but I'm still somewhat reluctant to make this less-strict behavior the default...

The potential usages do feel pretty compelling to me, though. It could obviate the need for intermediate Nomad->other CNI plugins to hit Nomad API themselves. Or even further downstream, programs that run out of band to reconcile Nomad alloc state with Cilium endpoints, for example.

gulducat avatar Nov 22 '23 20:11 gulducat

Would it make sense to extend the network block configuration to allow specifying additional CNI_ARGS values? I'm thinking something like

network {
  mode = "cni/mynet"
  cni {
	K8S_POD_NAMESPACE = "${NOMAD_NAMESPACE}"
	K8S_POD_NAME      = "${NOMAD_TASK_NAME}"
  }
}

deverton-godaddy avatar Mar 15 '24 01:03 deverton-godaddy

Yeah, I think that's roughly what we're looking at. I've got a branch that should be landing in the next little while that adds the plumbing for CNI args without exposing them to the user yet (to support https://github.com/hashicorp/nomad/issues/10628). And then we'll be able to come back and add a jobspec field pretty easily!

tgross avatar Mar 15 '24 13:03 tgross

This is done! It will be shipped in 1.8.2.

martisah avatar Jul 12 '24 15:07 martisah

Hi @tgross, I have a follow up question if you do not mind. What do you mean by:

because passing them in the JSON args field would mean needing separate files for each argument variant.

in your initial post? As far as I understand this is exactly ment for the orchestrator/container runtime to pass in additional args -- mesos does this as well https://mesos.apache.org/documentation/latest/cni/#passing-network-labels-and-port-mapping-information-to-cni-plugi

EDIT:// And since we are now setting IgnoreUnknown unconditionally (https://github.com/hashicorp/nomad/blob/682c8c0c8123537c42e7534ba792507b26672b85/client/allocrunner/networking_cni.go#L128) can we just add NOMAD_NAMESPACE etc by default (would open a PR). This would allow 3rd party CNI plugins to more easily integrate with nomad.

EDIT2:// Or just like the transparent proxy does for the iptables stuff we could json serialize an object like:

{
"namespace": …
"job":
"group":
"task":
"alloc-id":
}

Passing in via the json args would imo also work, we'd just have to reparse the parsed CNI data and inject the data there.

apollo13 avatar Aug 15 '24 10:08 apollo13

What do you mean by:

because passing them in the JSON args field would mean needing separate files for each argument variant.

With the caveat that I'm trying to remember the intent of my unclear wording from over a year ago... :grinning:

When a job has something like network.mode = "cni/example", Nomad uses the CNI library to load the configuration without doing any of the JSON parsing itself. So to modify config file, we'd need to either have a separate file for plugging in the args fields via interpolation, or we'd need to parse the JSON in Nomad, modify it, serialize it back to JSON, and then pass those bytes into the CNI library again so it could re-parse the JSON. Which is wasteful.

Rather than doing that, we've just used the CNI library's WithArgs method to set the args. The library owns how those args actually get passed into the CNI plugins and Nomad doesn't really care about that (in practice, it's via the CNI_ARGS environment variable, but that's an under-the-hood implementation detail of the official library).

And since we are now setting IgnoreUnknown unconditionally (...) can we just add NOMAD_NAMESPACE etc by default (would open a PR). This would allow 3rd party CNI plugins to more easily integrate with nomad. ... Or just like the transparent proxy does for the iptables stuff we could json serialize an object like:

Are there existing third-party plugins that can react to this value? Or are you talking about custom plugins? I definitely like the idea of having some kind of "Nomad CNI protocol" of expected arguments, or the format of the JSON blob as you suggested (although I guess it could actually be shaped like anything). But we'd want that to be a stable interface for downstream plugin authors to rely on it.

Can I get you to open a new issue to discuss that with any thoughts you have / context on how you might use it? There's a project going on exploring VM networking and I'd like to pull the folks looking at that into the conversation.

tgross avatar Aug 15 '24 13:08 tgross

or we'd need to parse the JSON in Nomad, modify it, serialize it back to JSON, and then pass those bytes into the CNI library again so it could re-parse the JSON. Which is wasteful.

Yeah I was thinking about doing that. Wasteful for sure, but compared to spawning a few process (bridge etc) probably not that bad.

Rather than doing that, we've just used the CNI library's WithArgs method to set the args. The library owns how those args actually get passed into the CNI plugins and Nomad doesn't really care about that (in practice, it's via the CNI_ARGS environment variable, but that's an under-the-hood implementation detail of the official library).

Well, I'd argue it is an explicit contract and not an implementation detail. But okay, works well enough.

Are there existing third-party plugins that can react to this value? Or are you talking about custom plugins? I definitely like the idea of having some kind of "Nomad CNI protocol" of expected arguments, or the format of the JSON blob as you suggested (although I guess it could actually be shaped like anything). But we'd want that to be a stable interface for downstream plugin authors to rely on it.

Currently not, there are CNI plugins reacting to K8S_POD_NAMESPACE and K8S_POD_NAME which is what k8s uses. But since those quite often also trigger integration with the kubeapi I rather not reuse those (especially since the concept of a pod doesn't map exactly to nomad). In my case I wish to add nomad support to calico, but without having nomad providing some useful values I am off to a bad start (Kind of chicken egg problem really, if nomad provided some values then CNI plugins could use that…). In an ideal world nomad would provide enough information there that basic network policies can be applied without calling back to the nomad API.

Can I get you to open a new issue to discuss that with any thoughts you have / context on how you might use it? There's a project going on exploring VM networking and I'd like to pull the folks looking at that into the conversation.

Will do.

apollo13 avatar Aug 15 '24 13:08 apollo13