Support plugin network stack
This commit adds support for using a third-party network stack as a plugin stack for gVisor.
The overall plugin package structure is the following:
- pkg/sentry/socket/plugin: Interfaces for initializing the plugin network stack. They are used during network setup when a sandbox is created.
- pkg/sentry/socket/plugin/stack: Glue layer between the plugin stack's socket and stack ops and the sentry. Importing this package also registers the plugin stack operations.
- pkg/sentry/socket/plugin/cgo: Interfaces defined in C that the plugin network stack must support.
To build the runsc-plugin-stack target, which imports the pkg/sentry/socket/plugin/stack package and enables CGO:
bazel build --config=cgo-enable runsc:runsc-plugin-stack
By using the runsc-plugin-stack binary and setting "--network=plugin" in runtimeArgs, users can use a third-party network stack instead of the netstack embedded in gVisor to get better network performance.
Redis benchmark with the following setup:
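As an illustration of the runtimeArgs setting, the sketch below generates a Docker daemon.json entry for the plugin-stack binary. The runtime name runsc-plugin and the install path /usr/local/bin/runsc-plugin-stack are assumptions made for this example and are not defined by this PR; only "--network=plugin" comes from the description above. Other container managers configure runtime arguments differently.

```python
import json

# Hypothetical install location of the runsc-plugin-stack binary (an assumption).
RUNSC_PLUGIN_PATH = "/usr/local/bin/runsc-plugin-stack"


def docker_runtime_config(path=RUNSC_PLUGIN_PATH, name="runsc-plugin"):
    """Return a daemon.json-style dict that registers the plugin-stack runtime.

    The runtime name is illustrative; "--network=plugin" is the flag
    named in the PR description.
    """
    return {
        "runtimes": {
            name: {
                "path": path,
                "runtimeArgs": ["--network=plugin"],
            }
        }
    }


if __name__ == "__main__":
    # Print the fragment so it can be merged into an existing daemon.json.
    print(json.dumps(docker_runtime_config(), indent=2))
```

The printed fragment would be merged into an existing daemon.json by hand; the sketch does not write any file.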
- KVM platform
- 4 physical cores for target pod
- target pod as redis server
Runc:
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 115207.38 requests per second, p50=0.215 msec
GET: 92336.11 requests per second, p50=0.279 msec
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 113895.21 requests per second, p50=0.247 msec
GET: 96899.23 requests per second, p50=0.271 msec
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 126582.27 requests per second, p50=0.199 msec
GET: 95969.28 requests per second, p50=0.271 msec

Runsc with plugin stack:
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 123915.74 requests per second, p50=0.343 msec
GET: 115473.45 requests per second, p50=0.335 msec
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 120918.98 requests per second, p50=0.351 msec
GET: 117647.05 requests per second, p50=0.351 msec
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 119904.08 requests per second, p50=0.367 msec
GET: 112739.57 requests per second, p50=0.375 msec

Runsc with netstack:
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 59952.04 requests per second, p50=0.759 msec
GET: 61162.08 requests per second, p50=0.631 msec
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 52219.32 requests per second, p50=0.719 msec
GET: 58719.91 requests per second, p50=0.663 msec
$ redis-benchmark -h [target ip] -n 100000 -t get,set -q
SET: 59952.04 requests per second, p50=0.751 msec
GET: 60827.25 requests per second, p50=0.751 msec
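To make the comparison easier to read, the following sketch averages the three runs per setup and computes the throughput ratio of the plugin stack over netstack. The throughput figures are copied from the results above; the setup labels and helper names are chosen for this example only. With three runs per setup, the averages are a rough summary rather than a statistical claim.

```python
from statistics import mean

# Requests per second, copied from the three runs of each setup above.
RESULTS = {
    "runc": {
        "SET": [115207.38, 113895.21, 126582.27],
        "GET": [92336.11, 96899.23, 95969.28],
    },
    "runsc-plugin": {
        "SET": [123915.74, 120918.98, 119904.08],
        "GET": [115473.45, 117647.05, 112739.57],
    },
    "runsc-netstack": {
        "SET": [59952.04, 52219.32, 59952.04],
        "GET": [61162.08, 58719.91, 60827.25],
    },
}


def mean_throughput(results):
    """Average the per-run requests/second for every setup and operation."""
    return {
        setup: {op: mean(runs) for op, runs in ops.items()}
        for setup, ops in results.items()
    }


def speedup(means, setup, baseline):
    """Ratio of mean throughput of `setup` over `baseline`, per operation."""
    return {op: means[setup][op] / means[baseline][op] for op in means[setup]}


if __name__ == "__main__":
    means = mean_throughput(RESULTS)
    for setup, ops in means.items():
        print(f"{setup}: SET {ops['SET']:.2f} req/s, GET {ops['GET']:.2f} req/s")
    ratios = speedup(means, "runsc-plugin", "runsc-netstack")
    print(f"plugin vs netstack: SET {ratios['SET']:.2f}x, GET {ratios['GET']:.2f}x")
```

On these numbers, the plugin stack averages roughly twice the netstack throughput (about 2.1x for SET and 1.9x for GET), while its p50 latencies remain higher than runc's.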
Updates https://github.com/google/gvisor/issues/9266
Co-developed-by: Tianyu Zhou [email protected]
Signed-off-by: Anqi Shen [email protected]
Sorry for my lateness, looking now.
@kevinGC Sorry, I just learned that you're out sick at the moment. There is no rush on reviewing this PR and please ignore the PING. Wish you all the best and get well soon.
Hi Kevin, thanks for reviewing this. Regarding the fsgofer comment, sorry for the confusion: I mistakenly replied under a different comment, which probably made you miss it. Here is the link to the reply: https://github.com/google/gvisor/pull/9551#discussion_r1442789126
Hi @kevinGC, we refactored how the plugin stack library is added as a dependency in WORKSPACE.
- We added a local repository as a placeholder and separated the TLDK dependency out into plugin-stack.BUILD.
- We added a compilation option, plugin-tldk, to indicate that TLDK is used as the plugin stack.
- In plugin-stack.BUILD, we check whether plugin-tldk is set, and clone and compile the library only if it is; otherwise, the TLDK code is not cloned.
We are not sure whether this will solve the issue you mentioned in https://github.com/google/gvisor/pull/9551#discussion_r1625139470. Please inform us if there is anything we could do to make it more suitable. Thanks a lot.
Hi @kevinGC, just coming back to check whether there are any comments on how we currently handle the plugin stack dependency. Again, thanks for any advice :)
I think we'll live with the dependency as-is.
An update here: this is difficult to merge internally because this PR is the first to use the cdeps, clinkopts, and copts arguments of the go_library rule, which works slightly differently from Bazel's rules_go version. Supporting these arguments requires internal work.
There's no action needed from your end, just wanted to comment on why this is taking a while.