exanic-software
Support network namespaces in exasock
This aims to introduce partial network namespace support with minimal changes to the code.
The main challenge in implementing netns is that different processes (and sockets) can have different routing tables. Therefore the exasock-dst tables must either be additionally indexed by the netns inum or be held separately for each namespace. Because my aim was to minimize the performance impact of this addition, I opted for separate per-namespace state, which allows us to continue using 32-bit hashing.
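As a rough illustration of the "separate state per namespace" option (a userspace sketch, not the code from this patch): each namespace gets its own small table, so the lookup key stays a single 32-bit address. All names here (`netns_dst_table`, `dst_insert`, `dst_lookup`, the table sizes, the multiplicative hash) are hypothetical and chosen only for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DST_TABLE_BITS 3                      /* hypothetical: 8 slots per table */
#define DST_TABLE_SIZE (1u << DST_TABLE_BITS)
#define MAX_NETNS      4                      /* hypothetical namespace limit */

struct dst_entry {
    uint32_t dst_addr;   /* key: IPv4 destination address */
    uint32_t src_addr;   /* value: source address to use */
    int      used;
};

/* One table per network namespace, identified by its inum. */
struct netns_dst_table {
    unsigned int     inum;
    int              used;
    struct dst_entry entries[DST_TABLE_SIZE];
};

struct netns_dst_table tables[MAX_NETNS];

/* The hash input is still only the 32-bit address, not (inum, address). */
uint32_t dst_hash(uint32_t addr)
{
    return (uint32_t)(addr * 2654435761u) >> (32 - DST_TABLE_BITS);
}

/* Find the table for a namespace, optionally creating it. */
struct netns_dst_table *table_for_netns(unsigned int inum, int create)
{
    size_t i;

    for (i = 0; i < MAX_NETNS; i++)
        if (tables[i].used && tables[i].inum == inum)
            return &tables[i];
    if (!create)
        return NULL;
    for (i = 0; i < MAX_NETNS; i++) {
        if (!tables[i].used) {
            memset(&tables[i], 0, sizeof(tables[i]));
            tables[i].inum = inum;
            tables[i].used = 1;
            return &tables[i];
        }
    }
    return NULL; /* no free namespace slot */
}

/* Insert or update an entry; linear probing within the namespace's table. */
int dst_insert(unsigned int inum, uint32_t dst, uint32_t src)
{
    struct netns_dst_table *t = table_for_netns(inum, 1);
    uint32_t idx, n;

    if (t == NULL)
        return -1;
    idx = dst_hash(dst);
    for (n = 0; n < DST_TABLE_SIZE; n++) {
        struct dst_entry *e = &t->entries[(idx + n) & (DST_TABLE_SIZE - 1)];
        if (!e->used || e->dst_addr == dst) {
            e->dst_addr = dst;
            e->src_addr = src;
            e->used = 1;
            return 0;
        }
    }
    return -1; /* table full */
}

/* Look up a destination in one namespace only. */
const struct dst_entry *dst_lookup(unsigned int inum, uint32_t dst)
{
    struct netns_dst_table *t = table_for_netns(inum, 0);
    uint32_t idx, n;

    if (t == NULL)
        return NULL;
    idx = dst_hash(dst);
    for (n = 0; n < DST_TABLE_SIZE; n++) {
        const struct dst_entry *e = &t->entries[(idx + n) & (DST_TABLE_SIZE - 1)];
        if (!e->used)
            return NULL;
        if (e->dst_addr == dst)
            return e;
    }
    return NULL;
}
```

With this layout, the same destination address inserted under two different inums resolves to two independent entries, and the per-lookup hashing cost is the same as without namespaces. The alternative (one shared table keyed by inum plus address) would need a wider key or a combined hash.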
Most of the required changes consist of various parts of the module now passing network namespace structures around. No userspace changes are needed for now.
The main current limitation is that several places still assume that "socket namespace = process namespace", which is not necessarily true. To fix this we'd need to make userspace aware of the socket's namespace and use the tables accordingly. This would also require changes in several ioctls so that userspace can mmap several destination tables and request exasock_dst_queue for a given netns.
Another small edge case is that we no longer update the destination table when we resend SEQ, because doing so would require tracking the network namespace in TCP requests. IMO that is too much complexity for too little gain.
Overall this is not a complete solution, for the reasons above, but it improves things to the point where it covers most cases (containers, isolated processes, etc.) with an understandable patch and no hot-path changes.
Thank you for the pull request, we are looking at it. Can you comment on what happens when you run this with multiple namespaces? Our concern is that it does something graceful or at least not-too-unexpected in this case.
The same thing that happens now when you try to use exasock with a non-init netns, which is not too graceful. Usually the socket will just go on unaccelerated. However, if the process netns and a (different) socket netns both have an exanic interface with intersecting IP address ranges, strange things can happen (I didn't test that, but I think packets will go through the wrong interface).