bypass4netns
Use `SECCOMP_ADDFD_FLAG_SEND`
To inject it at socket(2) time safely, though, we need to use `SECCOMP_ADDFD_FLAG_SEND` in the addfd call. I added that flag to the kernel due to a race condition you can easily hit otherwise: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.17-rc2&id=0ae71c7720e3ae3aabd2e8a072d27f7bd173d25c.
Originally posted by @rata in https://github.com/rootless-containers/bypass4netns/issues/1#issuecomment-1027948113
Are you planning to switch to the socket syscall, as I suggested, then? Let me know if you have any doubts about the flag or if I can help :)
Thanks, but I guess we should just try adapting the current code to use `SECCOMP_ADDFD_FLAG_SEND` first, and then try hooking socket(2)
I don't think the flag is needed for now. The upstream kernel commit says connect(2) when it should say socket(2). As I explained in the comment you linked here, if you use the "newfd" field when issuing the addfd ioctl, this race won't be a problem. It becomes a problem if you handle socket(2), not if you handle connect(2).
The thing is, if the container receives EINTR after the agent did the addfd but before the agent answered the syscall, the syscall will be retried. If the agent then does the addfd again without setting newfd, a new fd will be allocated. This can happen several times, and the container ends up with N fds instead of just 1. But if you always use the same "newfd" number, then even if you inject the fd several times, the old one is closed each time (it has the same fd number, and that is what addfd does if newfd is currently in use), so there is no leak :)
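To make the two variants discussed above concrete, here is a minimal C sketch of how the addfd request could be filled in each case. The helper names (`make_addfd_fixed`, `make_addfd_send`, `inject_fd`) are hypothetical and not taken from bypass4netns; the struct, ioctl and flags come from `<linux/seccomp.h>`. It assumes that in the connect(2) case the agent pins `newfd` to the fd number the container passed to connect(2), and that kernel headers older than 5.14 may lack `SECCOMP_ADDFD_FLAG_SEND`, hence the fallback define.

```c
#include <linux/seccomp.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Fallback for older headers; the flag itself needs Linux >= 5.14. */
#ifndef SECCOMP_ADDFD_FLAG_SEND
#define SECCOMP_ADDFD_FLAG_SEND (1UL << 1)
#endif

/*
 * Hypothetical helper for the connect(2) case: always install srcfd at the
 * same fd number in the container. If the syscall is retried after EINTR and
 * the agent injects again, the fd already at target_fd is replaced, so no
 * extra fds accumulate.
 */
struct seccomp_notif_addfd make_addfd_fixed(uint64_t notif_id, int srcfd,
                                            int target_fd)
{
    struct seccomp_notif_addfd req;
    memset(&req, 0, sizeof(req));
    req.id = notif_id;                    /* id from the seccomp notification */
    req.flags = SECCOMP_ADDFD_FLAG_SETFD; /* honor the newfd field */
    req.srcfd = (uint32_t)srcfd;          /* fd in the agent's own fd table */
    req.newfd = (uint32_t)target_fd;      /* fd number inside the container */
    return req;
}

/*
 * Hypothetical helper for the socket(2) case: the kernel picks the fd number,
 * and SECCOMP_ADDFD_FLAG_SEND makes "install fd + answer the syscall" one
 * atomic step, so there is no window in which a retry can leak an fd.
 */
struct seccomp_notif_addfd make_addfd_send(uint64_t notif_id, int srcfd)
{
    struct seccomp_notif_addfd req;
    memset(&req, 0, sizeof(req));
    req.id = notif_id;
    req.flags = SECCOMP_ADDFD_FLAG_SEND;  /* addfd and reply in one ioctl */
    req.srcfd = (uint32_t)srcfd;
    return req;
}

/* Issue the request on the seccomp notify fd; returns the installed fd
 * number in the container on success, or -1 with errno set. */
int inject_fd(int notify_fd, struct seccomp_notif_addfd *req)
{
    return ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ADDFD, req);
}
```

With the fixed-`newfd` variant the agent still has to answer the notification separately afterwards; with the `SECCOMP_ADDFD_FLAG_SEND` variant the value returned by the ioctl is also what the container sees as the syscall's return value.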