Proposal / Discussion: gRPC Interface For CNI
I've been following Slack conversations and KubeCon presentations, and a topic that I've seen arise quietly is that of a gRPC interface for CNI. I can't find much beyond a Slack thread from June 2020 and a GitHub issue discussing how to invoke plugins without installing binaries on the host filesystem (https://github.com/containernetworking/cni/issues/284). I've been giving this some thought and trying to weigh the merits of an approach similar to that of CSI plugins vs. sticking with the current approach of exec'ing a binary. I wanted to try to jumpstart a conversation on this topic, gather some thoughts from others, and work out whether it makes sense to invest time in this effort. My thoughts are going to be very Kubernetes-centric; hopefully others will jump in here as well:
Pros:
- A gRPC interface provides a standard way of installing CNI plugins when dealing with a distro like CoreOS or MicroOS that assumes the host filesystem is immutable. IMO this is the most compelling use case for introducing some sort of RPC interface.
- From a June 2020 Slack thread in #cni - "Pros of an RPC would include: possible to ask about the overall state of the network." h/t @bboreham. I'm not sure I fully understand how this enables a CHECK action to provide state for the entire network, but the idea sounds intriguing on the surface.
- Such a framework may require plugins to register themselves with the container orchestrator (i.e. kubelet's plugin registration mechanism), perhaps enabling a more explicit and intuitive way to declare a preferred plugin, i.e. an alternative to kubelet relying on a naming convention for config files in /etc/cni/net.d. This is perhaps a small detail and might just come down to personal preference, but maybe people will appreciate something more explicit.
Cons:
- Requires changes to the container orchestrator to be able to make use of the gRPC interface.
- Each CNI plugin would need to be modified to support a gRPC-style invocation. Plugins like Calico and Cilium run with a daemon, so a scheme that augments them with a gRPC interface might not be too much of a burden. What about plugins that don't run with a daemon and operate strictly off of exec'ing the plugin binary?
- Any mechanism requiring plugins to register themselves adds to the pain of adopting the gRPC interface. Not a show-stopper by any means, but it is yet one more burden placed on plugin maintainers.
Open Questions:
- Has this discussion been had in another forum that I just haven't seen?
- What does a gRPC interface provide that can't be achieved by invoking plugins as containers?
- Is the mechanism of exec'ing a binary in any way limiting? ie if it isn't broken don't fix it... right?
- Would plugins implemented in languages other than go (how many are there??) be incompatible with gRPC style invocation? I don't think so, but I may be missing something.
- What would we need to add to the CNI spec if we officially supported such an interface?
This is meant to seed a discussion on the topic; I have more questions than answers at this point. Please chime in if you have thoughts, I'd love to hear what others think.
@mccv1r0 did a bunch of work last year on a gRPC PoC that allowed plugins to optionally implement gRPC. Are you interested in working on this, and would it be useful to take a look at his code? https://github.com/mccv1r0/cni/tree/gRPC
I am interested in working on this. I was not aware of the work of @mccv1r0, thanks for the pointer. I would like to see the code. Before going too far I was just putting out some feelers to see if this is something that has broad enough interest to warrant some investment of my time.
possible to ask about the overall state of the network
I meant the runtime could make a call at startup saying "is the overall network ready?", and give better diagnostics to the administrator for common failure cases. The current CHECK command requires that you have already called ADD, so you'd be further down the line before you found out something is wrong.
require plugins to register themeslves with the container orchestrator
Can we learn from orchestrator plugin interfaces like CRI and CSI?
* perhaps enabling a more explicit and intuitive way to declare a preferred plugin
@rktidwell I don't think gRPC magically provides this; it's either a Kubernetes-level or runtime-level decision which of the registered plugins to use for a given container. Also, I'm not sure that CNI has anything to say here, because it has supported multiple networks per container from day 1 (see podman, for example). It's just the aversion to network configuration in Kubernetes that causes the issues around a preferred/default plugin.
What does a gRPC interface provide that can't be achieved by invoking plugins as containers? Is the mechanism of exec'ing a binary in any way limiting? ie if it isn't broken don't fix it... right?
Each of the existing commands can operate exactly the same, but we can add new commands, for instance init and shutdown, that speak to the plugin as a whole rather than individual interfaces. So plugins like bridge and portmap can clean up their network devices and iptables rules.
It should be a little more efficient for plugins to hold state in memory rather than rediscovering it on every run, but this has not been reported as a practical issue to my knowledge.
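As a very rough illustration of what "plugin-scoped" calls could look like alongside the per-container ones, here is a hypothetical Go sketch. Every name below is invented for discussion; nothing here is an agreed API.

```go
// Hypothetical sketch only: method names and message shapes are
// placeholders for discussion, not an agreed CNI gRPC API.
package sketch

import "context"

// Placeholder message types; in practice these would be protobuf messages
// carrying the same data as today's stdin config and CNI_* environment.
type (
	NetworkConfig    struct{ Bytes []byte }
	ContainerRequest struct {
		ContainerID, Netns, IfName string
		Config                     NetworkConfig
	}
	AddResult     struct{ Bytes []byte }
	NetworkStatus struct {
		Ready   bool
		Message string
	}
)

// PluginService combines the existing per-container operations with
// plugin-scoped calls that a pure exec model cannot express.
type PluginService interface {
	// Per-container operations, mirroring today's ADD/DEL/CHECK.
	Add(ctx context.Context, req *ContainerRequest) (*AddResult, error)
	Del(ctx context.Context, req *ContainerRequest) error
	Check(ctx context.Context, req *ContainerRequest) error

	// Plugin-scoped lifecycle: Init once when the runtime starts using the
	// plugin, Shutdown when it stops, so plugins like bridge and portmap
	// can create and later clean up their devices and iptables rules.
	Init(ctx context.Context, conf *NetworkConfig) error
	Shutdown(ctx context.Context) error

	// Whole-network readiness, usable before any ADD has happened.
	Status(ctx context.Context) (*NetworkStatus, error)
}
```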
plugins implemented in languages other than go [...] incompatible with gRPC
gRPC supports lots of languages, including C which many more can call.
Open Questions:
* Has this discussion been had in another forum that I just haven't seen?
It's been discussed extensively by the CNI maintainers. We have fairly clear ideas of a start for gRPC, but I guess those aren't written down.
* What does a gRPC interface provide that can't be achieved by invoking plugins as containers?
I don't think "invoking plugins as containers" has anything to do with gRPC. Do you mean invoking plugins as executables for each ADD/DEL/etc operation?
* Is the mechanism of exec'ing a binary in any way limiting? ie if it isn't broken don't fix it... right?
Yes it is limiting. It means plugins have to do more work/code to keep state, they cannot return asynchronous events to the runtime, and they cannot have "initialize/deinitialize yourself" calls that are global to the plugin and not per-container. Also, on Windows, execs are very heavy-weight.
* Would plugins implemented in languages other than go (how many are there??) be incompatible with gRPC style invocation? I don't think so, but I may be missing something.
I presume CNI would provide a Go implementation like libcni already does. That does mean it's harder for things that aren't written in Go to use gRPC, because libcni wouldn't be available for that scaffolding. That said, that's not a new problem.
* What would we need to add to the CNI spec if we officially supported such an interface?
We'd need to define the gRPC interface methods and parameters, signals/events, and the behavior of plugins in various cases. I think that's quite doable and there are lots of ideas floating around.
There's a ton to discuss here, but let's definitely continue that discussion. We should start a shared document where we put these ideas, and then also define what we think is the first step that's actually achievable.
@rktidwell We'd love to have your help with that. Would you be willing?
(Also, I'm local to you (presuming your github location is accurate), not that we can really meet up to brainstorm these days...)
possible to ask about the overall state of the network
I meant the runtime could make a call at startup saying "is the overall network ready?", and give better diagnostics to the administrator for common failure cases. The current CHECK command requires that you have already called ADD, so you'd be further down the line before you found out something is wrong.
Thanks for the clarification, that makes more sense to me.
require plugins to register themeslves with the container orchestrator
Can we learn from orchestrator plugin interfaces like CRI and CSI?
That's what I was getting at. I haven't thought it through, but the design of CRI and CSI may very well be something of an inspiration.
* perhaps enabling a more explicit and intuitive way to declare a preferred plugin
@rktidwell I don't think gRPC magically provides this; it's either a Kubernetes-level or runtime-level decision which of the registered plugins to use for a given container. Also, I'm not sure that CNI has anything to say here, because it has supported multiple networks per container from day 1 (see podman, for example). It's just the aversion to network configuration in Kubernetes that causes the issues around a preferred/default plugin.
No, it doesn't seem like CNI has anything to say here. I'm probably just projecting some thoughts about Kubernetes beyond the boundary of the container orchestrator. Ignore this, I agree that it's not relevant.
What does a gRPC interface provide that can't be achieved by invoking plugins as containers? Is the mechanism of exec'ing a binary in any way limiting? ie if it isn't broken don't fix it... right?
Each of the existing commands can operate exactly the same, but we can add new commands, for instance init and shutdown, that speak to the plugin as a whole rather than individual interfaces. So plugins like bridge and portmap can clean up their network devices and iptables rules.
It should be a little more efficient for plugins to hold state in memory rather than rediscovering it on every run, but this has not been reported as a practical issue to my knowledge.
Thanks for the explanation, init and shutdown scoped to the plugin is an interesting idea. As an example, I seem to recall that earlier versions of Cilium would run an init container that would handle prerequisites like mounting the BPF filesystem. It seems like each plugin handles initialization in its own way, so I can see some value in CNI providing guidance on this. FWIW I have yet to encounter a plugin that really cleans up after itself nicely, but I'm not sure how much of a burning issue that is in a production setting.
plugins implemented in languages other than go [...] incompatible with gRPC
gRPC supports lots of languages, including C which many more can call.
This is a non-issue, forget I mentioned it :wink:
I don't think "invoking plugins as containers" has anything to do with gRPC. Do you mean invoking plugins as executables for each ADD/DEL/etc operation?
Yes, that's what I mean. I mentioned containers only in the context of https://github.com/containernetworking/cni/issues/284, which is one of the bits of information I came across while looking into what progress has been made on a gRPC interface. In that discussion the idea of exec'ing plugins via a container entrypoint was floated, so sorry for introducing some confusion by mixing threads. Delivering everything in a fully containerized way seems orthogonal to developing a gRPC interface, though a gRPC interface does seem like it would make that kind of delivery easier to accomplish.
Is the mechanism of exec'ing a binary in any way limiting? ie if it isn't broken don't fix it... right?
Yes it is limiting. It means plugins have to do more work/code to keep state, they cannot return asynchronous events to the runtime, and they cannot have "initialize/deinitialize yourself" calls that are global to the plugin and not per-container. Also, on Windows, execs are very heavy-weight.
Thanks for that answer. Supporting asynchronous events is something that has been on my mind. One thing this might allow plugins to do is perform proactive health checks and report back to the runtime, rather than have the runtime be burdened with performing CHECK actions to monitor health. Execs do come with some overhead, and I've been wondering whether that's a real issue in practice or a hypothetical one.
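As a thought experiment, assuming a hypothetical server-streaming WatchEvents RPC on the plugin (nothing like this exists today, and every name below is invented), the runtime side might look roughly like this:

```go
// Hypothetical sketch: assumes a server-streaming WatchEvents RPC on the
// plugin side; no such RPC exists in CNI today and all names are invented.
package sketch

import "log"

// PluginEvent is a placeholder for whatever a plugin might report
// asynchronously: interface down, IPAM pool exhausted, datapath unhealthy.
type PluginEvent struct {
	Severity string
	Message  string
}

// EventStream stands in for the client side of a gRPC server stream.
type EventStream interface {
	Recv() (*PluginEvent, error)
}

// watchPluginEvents shows how a runtime could react to plugin-initiated
// health reports instead of polling every container with CHECK.
func watchPluginEvents(stream EventStream) {
	for {
		ev, err := stream.Recv()
		if err != nil {
			log.Printf("plugin event stream closed: %v", err)
			return
		}
		// A real runtime might surface this as a node condition or
		// trigger a targeted re-CHECK of affected containers.
		log.Printf("plugin reported %s: %s", ev.Severity, ev.Message)
	}
}
```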
@dcbw I would love to work with someone on this. I feel like there is some value that can be created by embarking on this effort. Yes, my github location is accurate. If things ever quiet down I'd love to whiteboard in person. For now, a shared doc seems like the best way to get started.
I took a look at some of the existing code linked above, and while it's a good start on the interface itself, the implementation seems to still exec plugins. I've been playing with some different approaches in my sandbox, and I've begun thinking that we may need to update the spec to account for some small additions to the network configuration.
The basic top-level keys are as follows: `cniVersion`, `name`, `type`, `ipam`, `dns`.

For exec'ing binaries, we communicate which binary to exec using the `type` field. We need some way to communicate to the container orchestrator where a gRPC endpoint is. There are a couple of ways of approaching this that come to mind:

- Allow a URI to be passed in the `type` field, which can be interpreted by the container orchestrator to determine whether to exec a binary, make a gRPC call to localhost over TCP, or make a gRPC call to a unix domain socket. This has the benefit of not perturbing the spec too much. On the other hand, calling this a `type` feels a little awkward (it becomes an inaccurate description IMO), and overloading the field in this way feels a little clumsy.
- Introduce an optional `mechanism` field with supported values of `exec` and `grpc`. When `grpc` is passed, the `type` field accepts a URI. `mechanism` would default to `exec` for backward compatibility. This might be an improvement on option 1 as the values accepted in the `type` field are no longer "magical". It also has the upside of maintaining backward compatibility. (A rough sketch of this option follows below.)
- Introduce an optional field called `rpc_uri` (I'm open to other naming suggestions) that becomes mutually exclusive with `type`. `rpc_uri` becomes the way you pass the URI of the gRPC interface should you choose to use it. This is a subtle change to the spec in that `type` is no longer strictly required, but it does maintain backward compatibility and allows for the optional introduction of gRPC-style plugin invocation.
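To make option 2 a bit more concrete, here's a minimal sketch of how the configuration could look and how a runtime might dispatch on it. The field name, values, and socket path here are just my proposal for discussion; nothing in this snippet is part of the spec:

```go
// Sketch of option 2 only; the "mechanism" field is a proposal for
// discussion, not part of the CNI spec.
package main

import (
	"encoding/json"
	"fmt"
)

// netConf adds a hypothetical "mechanism" key next to the existing
// top-level fields; omitting it (or setting "exec") keeps today's behavior.
type netConf struct {
	CNIVersion string `json:"cniVersion"`
	Name       string `json:"name"`
	Type       string `json:"type"`      // binary name, or a URI when mechanism is "grpc"
	Mechanism  string `json:"mechanism"` // "exec" (default) or "grpc"
}

const example = `{
  "cniVersion": "0.4.0",
  "name": "mynet",
  "type": "unix:///run/cni/mynet-plugin.sock",
  "mechanism": "grpc"
}`

func main() {
	var conf netConf
	if err := json.Unmarshal([]byte(example), &conf); err != nil {
		panic(err)
	}
	switch conf.Mechanism {
	case "", "exec":
		fmt.Printf("exec plugin binary %q as today\n", conf.Type)
	case "grpc":
		fmt.Printf("dial gRPC endpoint %q for network %q\n", conf.Type, conf.Name)
	default:
		fmt.Printf("unknown mechanism %q\n", conf.Mechanism)
	}
}
```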
For an initial introduction of gRPC, I'm not seeing anything else to change in the spec at the moment. Of course there are the global setup and tear down actions discussed in this thread, but I don't feel the need to introduce them right away. However, if a single update to the spec that encompasses as much as possible in one fell swoop is desirable I'm fine with that.
These are my initial thoughts, I'll transfer them to a shared doc as soon as I get one started. In the meantime, feel free to poke holes.
For the record, we're definitely considering this. We'd like to cut the current spec (more or less) as 1.0, then start working on CNI 2.0 :-)
@squeed @mikebrow in a hypothetical case where Kubernetes has a concept of a network driver, how can we plug this NRI and CNI 2.0 together?
"hypothetical case that kubernetes have a concept of network driver"
Outside the existing network provider? Let's chat to put together a set of use cases, maybe?
The idea in general is a resource manager plugin registers with a container runtime via NRI for CRUD notifications related to sandboxes(pods)/containers. That resource manager plugin can also integrate with kubelet level resource management apis... Where needed we should/will extend CRI to receive additional types of resource updates/requests.
I'd like to see NRI extended to support network CRUD / state-change notices
We could merge containerd go-cni and cri-o's equivalent into a grpc/ttrpc service hosted by sandboxer/shim/container runtimes; we could extend CRI to include networking as another service (runtime, image, +network); we could do dynamic CRI network resource updates, specify config over CRI, add support for additional pod networks, ..
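Purely as a strawman, and very much not the actual CRI API, a separate network service next to the runtime and image services might have a shape along these lines (all names invented):

```go
// Strawman only: not the real CRI API, just an illustration of what a
// separate "network" service alongside runtime and image could look like.
package sketch

import "context"

// Placeholder request/response types.
type (
	AttachNetworkRequest  struct{ PodSandboxID, NetworkName string }
	AttachNetworkResponse struct{ IPs []string }
	DetachNetworkRequest  struct{ PodSandboxID, NetworkName string }
	NetworkStatusRequest  struct{ PodSandboxID string }
	NetworkStatusResponse struct {
		Ready   bool
		Message string
	}
)

// NetworkService sketches pod-level network operations a kubelet could call
// over gRPC/ttrpc, including attaching additional networks to a sandbox.
type NetworkService interface {
	AttachNetwork(ctx context.Context, req *AttachNetworkRequest) (*AttachNetworkResponse, error)
	DetachNetwork(ctx context.Context, req *DetachNetworkRequest) error
	NetworkStatus(ctx context.Context, req *NetworkStatusRequest) (*NetworkStatusResponse, error)
}
```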
Lots of good options..
Lots of good options..
yeah, and I see a lot of requests coming in on Kubernetes that could make good use of these options. For me the key is to align the different projects and to have a "common set of good options" and not "a lot of duplicate good options".
For example, https://github.com/kubernetes/enhancements/pull/3004 can define a priority for the network on the Pod and will need to pass parameters to the network plugin.
Multinetwork is another good example that could benefit from this.
We could merge containerd go-cni and cri-o's equivalent into a grpc/ttrpc service hosted by sandboxer/shim/container runtimes; we could extend CRI to include networking as another service (runtime, image, +network); we could do dynamic CRI network resource updates, specify config over CRI, add support for additional pod networks, ..
that sounds really nice to have @squeed
@aojea something that might be of interest is https://github.com/containerd/containerd/issues/7751
The idea in general is a resource manager plugin registers with a container runtime via NRI for CRUD notifications related to sandboxes(pods)/containers. That resource manager plugin can also integrate with kubelet level resource management apis... Where needed we should/will extend CRI to receive additional types of resource updates/requests.
@squeed https://github.com/kubernetes/kubernetes/pull/118203 FYI