Linux VRF support
Linux VRFs allow having multiple routing tables and binding sockets to one of them individually. This is similar to `SO_SETFIB` on FreeBSD, which is already implemented in nsd.
In order to bind a socket to a routing table, the `SO_BINDTODEVICE` socket option is used with a VRF device name as its value. nsd already supports this option, but it does not allow specifying the device name; instead, it figures out the device name for a given address by itself. I have already built a proof of concept that extends the `ip-address` directive with an additional `device` keyword argument that allows overriding this device name.
However, to fully support VRFs, it should be possible to specify the device name everywhere a socket is bound to an address. This mainly affects `outgoing-interface`, for selecting the routing domain to use for notifies, and it might also be useful for `control-interface`. As far as I can tell, it's also not possible to use setfib there at the moment.
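For illustration, here is a minimal sketch (not nsd code) of the mechanism described above: an ordinary UDP socket is scoped to a VRF by setting `SO_BINDTODEVICE` to the VRF device name before `bind()`. The device name `vrf-blue` and the address `192.0.2.1` are made up for this example; the option typically requires CAP_NET_RAW.

```c
/* Sketch: scope a UDP socket to a Linux VRF before binding its address. */
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    const char vrf[IFNAMSIZ] = "vrf-blue";   /* hypothetical VRF device */
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    /* SO_BINDTODEVICE with a VRF device name makes the socket use that
     * VRF's routing table (usually needs CAP_NET_RAW). */
    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, vrf, strlen(vrf)) < 0) {
        perror("setsockopt(SO_BINDTODEVICE)");
        close(fd);
        return 1;
    }

    struct sockaddr_in sin = {
        .sin_family = AF_INET,
        .sin_port = htons(53),
    };
    inet_pton(AF_INET, "192.0.2.1", &sin.sin_addr);   /* example address */
    if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        perror("bind");

    close(fd);
    return 0;
}
```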
Before putting more time into creating a PR for this, I'd like to know if there would be interest in this feature and if there's any general feedback so far.
At RIPE NCC, we were looking at using Linux VRFs for our name servers, as an alternative to the policy-based routing we use now. However, we found that in order to use this feature, the software must co-operate. None of the existing DNS servers we use (BIND, Knot DNS, NSD) have this support. So we just dropped the idea for now. However, if there were support for VRFs, we might actually use this feature in the future.
Hi @julianbrost, @anandb-ripencc!
Background:
I'm currently working to get `AF_XDP` sockets supported in NSD, and to do so I'm going to have to shuffle around the way that sockets are configured/opened. This is required because NSD currently opens both a UDP and a TCP socket for every `ip-address` configured. With the addition of XDP this no longer works, because an XDP socket is opened for a netdev/queue combination instead. What I'll be doing is to allow users to specify the socket type with the `ip-address` option, e.g. `ip-address: <address>[@<port>] [xdp|udp|tcp] <options>`. For XDP sockets you'd then be able to specify something like `ip-address: <interface> xdp queue=<queue> servers=<server(s)>`, and you'd specify socket-server mappings as desired.
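To make the netdev/queue point concrete, here is a heavily simplified sketch (not NSD code) of binding an AF_XDP socket: the bind target is an interface index plus queue id rather than an address and port, and a UMEM plus four rings have to be configured before `bind()` is accepted. The interface name, queue id and ring sizes are made up; error handling is mostly omitted, and root privileges (plus, on older kernels, a raised RLIMIT_MEMLOCK) are needed.

```c
/* Sketch: bind an AF_XDP socket to a netdev/queue pair. */
#include <net/if.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if_xdp.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE 2048
#define RING_SIZE  2048

int main(int argc, char **argv)
{
    const char *ifname = argc > 1 ? argv[1] : "eth0";   /* example interface */
    int fd = socket(AF_XDP, SOCK_RAW, 0);
    if (fd < 0) { perror("socket(AF_XDP)"); return 1; }

    /* Register a UMEM: the packet buffer area shared with the kernel. */
    void *area = mmap(NULL, NUM_FRAMES * FRAME_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct xdp_umem_reg umem = {
        .addr = (uintptr_t)area,
        .len = NUM_FRAMES * FRAME_SIZE,
        .chunk_size = FRAME_SIZE,
    };
    setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &umem, sizeof(umem));

    /* Size the fill/completion (UMEM) and rx/tx (socket) rings; all are
     * required before bind(). A real program would also mmap the rings. */
    int sz = RING_SIZE;
    setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &sz, sizeof(sz));
    setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING, &sz, sizeof(sz));
    setsockopt(fd, SOL_XDP, XDP_RX_RING, &sz, sizeof(sz));
    setsockopt(fd, SOL_XDP, XDP_TX_RING, &sz, sizeof(sz));

    /* Unlike UDP/TCP, the bind target is an interface index and queue id. */
    struct sockaddr_xdp sxdp = {
        .sxdp_family = AF_XDP,
        .sxdp_ifindex = if_nametoindex(ifname),
        .sxdp_queue_id = 0,                             /* example queue */
    };
    if (bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp)) < 0)
        perror("bind(AF_XDP)");

    close(fd);
    return 0;
}
```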
I'll have to look into the details of how VRFs work a bit, but I'm guessing the changes will allow for VRF support to be implemented without too much hassle. Since I'm planning to merge XDP support in stages, and the socket configuration changes are up first, I'll have a go at this too.
Thanks for the update Jeroen!
I'd like to make a suggestion here. NSD doesn't really have a stable/development release model. Everything gets committed and released as production, and this has hurt in the past when breaking changes have appeared in new versions.
Are you able to internally discuss the whole versioning thing, and perhaps convince your colleagues to maintain 2 versions of NSD? Make the current 4.4.x branch the stable one. You can also release a 4.5.x branch, which could be a development branch. You would have more freedom to break things there, or rework some ideas if they don't work too well. Eventually, when you feel that it's ready, you can release 4.6.x as the next stable branch. This is how BIND is doing it, and it seems to work well.
If you don't want the even/odd distinction as with BIND, then you could do what Knot DNS does, where they maintain 2 releases, both classed as production. Currently, they have 3.0.x and 3.1.x as supported versions. The 3.0.x branch gets no feature changes. Only bug fixes are applied to it. For 3.1.x, they do add new features, but again, nothing that will break a production system. For really big breaking changes, they will do it in 3.2.x. When 3.2 is released, 3.0 will be abandoned, such that 3.1 becomes the previous stable, and gets only bug fixes, whereas 3.2 can get new features.
All this makes it very easy for operators to maintain stable services, while still being able to play with new things using newer versions on test systems. With NSD this is just not possible. So could you please bring this line of reasoning to your colleagues and see if the NSD versioning can be improved? You could also consider this for Unbound.
@anandb-ripencc, I believe there are two questions here:
- Will the configuration style break current configurations; and
- Can we (NLnet Labs) use a different strategy for maintaining stable branches to avoid breaking changes
As to the first question: specifying the socket type would be optional; it'd just fall back to udp+tcp if nothing is specified. I intend to ensure that we don't break any existing configuration files or introduce different behavior. Writing a couple of test cases is what I'd normally do here.
As to the second question: I see your point and I'll bring it up with the others to see how they feel about it. Depending on the urgency, maybe we should open a separate issue regarding release management? That way we can discuss in more detail and perhaps get others to join in too. If you don't mind though, let's keep focus on VRFs in this particular thread :slightly_smiling_face:
Hi Jeroen,
I appreciate that you wouldn't break existing configuration files with your changes, but that wasn't my point. There will still be new code, with the potential of unexpected behaviour, bugs, etc. And this is why I am so concerned with better versioning and release management. But, I won't comment on it here further. When you discuss it with your colleagues, and wish to have user input, feel free to open a separate issue on GitHub and notify us via the mailing list, so we can discuss that issue separately.
@julianbrost, I read the page you so kindly provided. To provide some background: the `SO_BINDTODEVICE` and `SO_SETFIB` changes were merged to increase throughput, basically to hook a CPU core up to a dedicated NIC to avoid cache misses. For Linux that's achievable through `SO_BINDTODEVICE`, but with FreeBSD that's only possible by specifying a dedicated routing table. That's also the reason it's not implemented for other interfaces, though we could opt to allow a selection of socket options to be specified on a per-interface basis. (`nsd.conf.sample`, included in the repo, provides more detail on performance etc.)
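As a rough illustration of the FreeBSD side mentioned above (not NSD's actual code): `SO_SETFIB` takes a FIB (routing table) number as an integer, in contrast to `SO_BINDTODEVICE`, which takes a device name. The FIB number below is an arbitrary example and must exist on the system (see the `net.fibs` sysctl).

```c
/* Sketch: pin a socket to a dedicated FreeBSD routing table (FIB). */
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    int fib = 1;   /* example FIB number; requires net.fibs > 1 */
    if (setsockopt(fd, SOL_SOCKET, SO_SETFIB, &fib, sizeof(fib)) < 0)
        perror("setsockopt(SO_SETFIB)");

    close(fd);
    return 0;
}
```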
Before I (or you) go off and implement things, @julianbrost, @anandb-ripencc: can you describe the use case a bit? I can imagine using multiple NICs per dedicated CPU, i.e. if you have slower NICs, using several to achieve greater throughput. But that's from a performance point of view; maybe I'm missing some other obvious use case?
For me it's just that I'm playing around with VRFs for other services on a host that happens to also run a nameserver. Nothing that can't be solved without VRFs, but VRF support would be nice to have, as it would allow me to remove some extra policy-based routing workarounds that I have specifically for the nameserver. So I looked around for a nameserver that already supports this, didn't find one, and after looking at the sources found that nsd is probably the easiest one to add it to.
Our use case has nothing to do with performance. We currently use policy routing to separate the management traffic and service traffic on a server. The first interface on a server is used for management (ssh, monitoring, etc). The second interface is connected to Internet exchanges or a host network. It receives DNS queries, and we want the DNS responses to go out the same interface. Policy routing works, so there is no reason to abandon it. But we recently looked at VRFs to see if we could use them. However, none of the name servers currently support them, so we just stuck to policy routing, and it is unlikely that we would use VRFs.