pyroute2 icon indicating copy to clipboard operation
pyroute2 copied to clipboard

How to allow NDB to listen on specific netlink groups

Open cunha opened this issue 2 years ago • 4 comments

We use pyroute2 on software routers with millions of routes. It seems NDB loads a full snapshot of the namespace in its database, which in our case takes a significant time.

Is there a way to control what netlink groups NDB binds to?

If not, we would be willing to contribute changes to allow choosing which netlink groups NDB listens on. I am thinking of a similar change to the one we contributed back in 2016 to support the same functionality on IPDB: https://github.com/svinota/pyroute2/pull/314

Looking at the code, I thought of two ways to achieve this:

  1. Allow users to pass initialized IPRoute or NetNS as NDB sources. The user would be free to bind these IPRoute and NetNS instances to specific netlink groups as desired. (As of right now, it seems that only dictionaries specifying sources are accepted; IPRoute and NetNS are instantiated afterwards.) This is more-or-less what the user interface would look like:

    netns = NetNS('mynamespace')
    netns.bind(groups=desired_netlink_groups)
    src = { 'kind': 'netns', 'instance': netns }
    with NDB(sources=[src]) as ndb:
        ...
    
  2. Allow users to pass desired netlink groups as part of the NDB source specification. Then, the groups could be passed to netlink when binding the socket. This is what the user interface would look like:

    desired_netlink_groups = RTMGRP_DEFAULTS & (~RTMGRP_IPV4_ROUTE) & (~RTMGRP_IPV6_ROUTE)
    src = { 'netns': 'mynamespace', 'netlink_groups': desired_netlink_groups }
    with NDB(sources=[src]) as ndb:
        ...
    

Any preference or feedback on any of these approaches? (It seems to me 2 is easier to use and easier to implement!)

Would listening on a subset of groups violate NDB's assumptions? For example, if one configures NDB to not listen to route messages (like in the example above), then we could understandably raise an exception if the user accesses ndb.routes, but it's unclear to me whether NDB requires listening to all groups (in particular, it would need to support maintaining a partial database).

cunha avatar Aug 13 '22 17:08 cunha

The reason why NDB accepts a parameters dict instead of a netlink socket instance is that NDB tries to restart sources in the case of failures. NDB stores the parameters in the DB and instantiates the source from scratch in the case of the source error.

TBH I'm not sure it's not an architectural mistake. The reason behind the current implementation is that one instance can have many sources, including remote ones, it may be much cheaper to restart one source instead of the whole instance. If you have an opinion — pls share.

Given the current implementation, the second approach from the proposed looks much easier to implement at the moment, and I can include it in the comint release next week.

Keeping in mind that I'm still not sure that NDB uses a proper approach to manage the event sources.

svinota avatar Aug 13 '22 19:08 svinota

I haven't thought of multiple sources or restarting of sources, but these seem like more reasons to go for approach 2. (I can also see the motivation why NDB may want to manage the sources directly, so can't fault the current architecture.) It also seems that any changes to the management of event sources could carry over the functionality to choose netlink groups.

I'll try to get a PR ready in the next few days.

cunha avatar Aug 13 '22 19:08 cunha

You'll meet two weird things that will be refactored later.

  1. A builtin messaging protocol to interconnect NDB instances — is going to be dropped and replaced with 0mq or like that. That code may be safely ignored for now, like pyroute2.ndb.source.SourceProxy or pyroute2.ndb.main.NDB.__mm__().
  2. The way in which the DB management thread communicates with other parts of the system in general, and with the sources in particular. Basically it is a queue-based pipeline, and I plan to keep this architecture, but it will be rewritten in order to drop as many locks as possible. The issue is how to rewrite it without serious performance penalties; one attempt using concurrent.futures.ThreadPoolExecutor already failed due to poor performance in some cases.

svinota avatar Aug 13 '22 20:08 svinota

@cunha a draft fix is pushed: https://github.com/svinota/pyroute2/pull/997

More tests and some docs will follow tomorrow. The implementation as you proposed:

ndb = NDB(sources=[{..., 'netlink_groups': ...}])

https://github.com/svinota/pyroute2/blob/b550a71a8227198df9509be10bd61511f7c521b0/tests/test_linux/test_ndb/test_init.py#L15-L19

svinota avatar Aug 16 '22 01:08 svinota