firecracker
vhost-net: create vhost based Net backend
Changes
This patch adds a second backend to Net devices. It can be enabled by setting the 'vhost' boolean in the network interface config, like this:
"network-interfaces": [
  {
    "iface_id": "eth0",
    "host_dev_name": "tap0",
    "vhost": true
  }
],
The vhost backend opens the host kernel's /dev/vhost-net interface and performs a setup dance to bind the vhost device to the relevant tap interface. The effect is that the entire data plane runs directly between the host kernel and the guest; the data doesn't pass through the firecracker VMM at all. This drastically reduces packet latency and increases throughput, especially in high-pps scenarios such as UDP and TCP without offloads.
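For readers unfamiliar with the "setup dance", here is a rough sketch of the ioctl sequence a vhost-net user typically issues against /dev/vhost-net. The ioctl names come from the Linux UAPI headers (linux/vhost.h); the ordering shown is the conventional one and is an assumption, not necessarily the exact order used in this patch. The request numbers are derived with the generic `_IOC` encoding (as on x86_64 and aarch64); the helper names `ioc` and `setup_sequence` are made up for illustration.

```rust
// Sketch only: computes vhost-net ioctl request numbers and lists them in a
// typical setup order. It does not open /dev/vhost-net or issue any ioctl.

const IOC_NONE: u32 = 0;
const IOC_WRITE: u32 = 1;
const IOC_READ: u32 = 2;

/// Generic Linux _IOC encoding: dir in bits 30..31, size in bits 16..29,
/// type in bits 8..15, nr in bits 0..7. Some architectures use a different layout.
const fn ioc(dir: u32, ty: u32, nr: u32, size: u32) -> u32 {
    (dir << 30) | (size << 16) | (ty << 8) | nr
}

const VHOST_VIRTIO: u32 = 0xAF;

// Sizes in bytes of the UAPI argument types.
const SZ_U64: u32 = 8; // feature bitmask
const SZ_VHOST_MEMORY: u32 = 8; // header only: nregions + padding
const SZ_VRING_STATE: u32 = 8; // index + num
const SZ_VRING_ADDR: u32 = 40; // index + flags + four u64 addresses
const SZ_VRING_FILE: u32 = 8; // index + fd

/// Hypothetical helper: the conventional order of calls. The vring ioctls
/// (NUM through CALL) are repeated for every queue.
fn setup_sequence() -> [(&'static str, u32); 10] {
    [
        ("VHOST_SET_OWNER", ioc(IOC_NONE, VHOST_VIRTIO, 0x01, 0)),
        ("VHOST_GET_FEATURES", ioc(IOC_READ, VHOST_VIRTIO, 0x00, SZ_U64)),
        ("VHOST_SET_FEATURES", ioc(IOC_WRITE, VHOST_VIRTIO, 0x00, SZ_U64)),
        ("VHOST_SET_MEM_TABLE", ioc(IOC_WRITE, VHOST_VIRTIO, 0x03, SZ_VHOST_MEMORY)),
        ("VHOST_SET_VRING_NUM", ioc(IOC_WRITE, VHOST_VIRTIO, 0x10, SZ_VRING_STATE)),
        ("VHOST_SET_VRING_ADDR", ioc(IOC_WRITE, VHOST_VIRTIO, 0x11, SZ_VRING_ADDR)),
        ("VHOST_SET_VRING_BASE", ioc(IOC_WRITE, VHOST_VIRTIO, 0x12, SZ_VRING_STATE)),
        // Guest -> host notification eventfd.
        ("VHOST_SET_VRING_KICK", ioc(IOC_WRITE, VHOST_VIRTIO, 0x20, SZ_VRING_FILE)),
        // Host -> guest notification eventfd.
        ("VHOST_SET_VRING_CALL", ioc(IOC_WRITE, VHOST_VIRTIO, 0x21, SZ_VRING_FILE)),
        // Attaches the tap fd to a queue; this starts the data plane.
        ("VHOST_NET_SET_BACKEND", ioc(IOC_WRITE, VHOST_VIRTIO, 0x30, SZ_VRING_FILE)),
    ]
}

fn main() {
    for (name, request) in setup_sequence() {
        println!("{:<24} {:#010x}", name, request);
    }
}
```

Once VHOST_NET_SET_BACKEND has been issued for both queues, the kernel moves packets between the tap device and guest memory on its own, which is why the VMM drops out of the data path.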
The control plane is somewhat hacky. Technically, the interrupts from host to guest should go through the firecracker VMM, but this is avoidable by splicing the host eventfd into the guest interruptfd and force-returning VIRTIO_MMIO_INT_VRING in the relevant virtio register.
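To illustrate the force-return trick, here is a minimal sketch. It assumes the "relevant virtio register" is the virtio-mmio InterruptStatus register, where bit 0 (VIRTIO_MMIO_INT_VRING) signals a used-buffer notification. Because the kernel signals the guest directly, the VMM never gets a chance to set that bit, so a read handler has to report it unconditionally or the guest driver would ignore the interrupt. The `InterruptStatus` type and its methods are hypothetical and are not the patch's actual code.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Bit values as defined for virtio-mmio.
const VIRTIO_MMIO_INT_VRING: u32 = 0x01; // used-buffer notification
const VIRTIO_MMIO_INT_CONFIG: u32 = 0x02; // configuration change

/// Hypothetical model of the interrupt status seen by the guest.
struct InterruptStatus {
    bits: AtomicU32,
    /// True when the vhost backend delivers interrupts behind the VMM's back.
    vhost: bool,
}

impl InterruptStatus {
    fn new(vhost: bool) -> Self {
        InterruptStatus { bits: AtomicU32::new(0), vhost }
    }

    /// Called by the VMM when it raises an interrupt itself.
    fn raise(&self, bit: u32) {
        self.bits.fetch_or(bit, Ordering::SeqCst);
    }

    /// Guest read of the status register. With vhost, the vring bit is always
    /// reported because the VMM cannot know when the kernel signalled the guest.
    fn read(&self) -> u32 {
        let tracked = self.bits.load(Ordering::SeqCst);
        if self.vhost {
            tracked | VIRTIO_MMIO_INT_VRING
        } else {
            tracked
        }
    }

    /// Guest write to the acknowledge register clears the tracked bits.
    fn ack(&self, mask: u32) {
        self.bits.fetch_and(!mask, Ordering::SeqCst);
    }
}

fn main() {
    let status = InterruptStatus::new(true);
    println!("vhost status read: {:#x}", status.read());
    status.raise(VIRTIO_MMIO_INT_CONFIG);
    println!("after config change: {:#x}", status.read());
    status.ack(VIRTIO_MMIO_INT_CONFIG | VIRTIO_MMIO_INT_VRING);
    println!("after ack: {:#x}", status.read());
}
```

The cost of this approach is that the guest may occasionally scan its queues when nothing is pending, which is harmless but is part of what makes the control plane "hacky".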
There are a couple of missing features:
- persist (no blockers, just work)
- mmds (no obvious way to do it, perhaps possible with ebpf)
- rate_limiting (no obvious way to implement it, perhaps with ebpf)
- tap/vhost feature negotiation
On the last point, it would be nice to negotiate some more advanced tap/vhost features, like USO (UDP segmentation offload), TCP offloads (flag needed if guest wants to use XDP), and VIRTIO_NET_F_MRG_RXBUF (this might be useful for performance, but benchmarks are needed first). Right now there is no way to express these toggles in the net config, but this can be done in the future.
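As a rough illustration of what such toggles could look like, the sketch below maps a hypothetical set of config booleans onto virtio-net feature bits and intersects them with what the backend offers. The bit positions are those defined by the virtio spec; the `OffloadToggles` struct, its field names, and the `negotiate` helper are invented for this example and are not part of the patch or of the current net config.

```rust
// Feature bit positions from the virtio-net specification.
const VIRTIO_NET_F_GUEST_TSO4: u64 = 1 << 7;
const VIRTIO_NET_F_GUEST_TSO6: u64 = 1 << 8;
const VIRTIO_NET_F_MRG_RXBUF: u64 = 1 << 15;
const VIRTIO_NET_F_GUEST_USO4: u64 = 1 << 54;
const VIRTIO_NET_F_GUEST_USO6: u64 = 1 << 55;

/// Hypothetical per-interface toggles; nothing like this exists in the config yet.
#[derive(Default)]
struct OffloadToggles {
    uso: bool,
    tcp_offloads: bool,
    mrg_rxbuf: bool,
}

/// Translate the toggles into the feature bits we would like to enable.
fn requested(t: &OffloadToggles) -> u64 {
    let mut bits = 0;
    if t.uso {
        bits |= VIRTIO_NET_F_GUEST_USO4 | VIRTIO_NET_F_GUEST_USO6;
    }
    if t.tcp_offloads {
        bits |= VIRTIO_NET_F_GUEST_TSO4 | VIRTIO_NET_F_GUEST_TSO6;
    }
    if t.mrg_rxbuf {
        bits |= VIRTIO_NET_F_MRG_RXBUF;
    }
    bits
}

/// Only features that are both requested and offered by the backend survive.
fn negotiate(offered: u64, t: &OffloadToggles) -> u64 {
    offered & requested(t)
}

fn main() {
    // Pretend the backend offers mergeable buffers and TSO4, but no USO.
    let offered = VIRTIO_NET_F_MRG_RXBUF | VIRTIO_NET_F_GUEST_TSO4;
    let toggles = OffloadToggles { uso: true, tcp_offloads: true, mrg_rxbuf: true };
    println!("negotiated features: {:#x}", negotiate(offered, &toggles));
}
```

In a real implementation the offered set would come from VHOST_GET_FEATURES and the tap device's capabilities, and the result would also have to be reflected in the features advertised to the guest.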
Reason
Discussion #3707
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.
PR Checklist
- [ ] If a specific issue led to this PR, this PR closes the issue.
- [ ] The description of changes is clear and encompassing.
- [ ] Any required documentation changes (code and docs) are included in this PR.
- [ ] API changes follow the Runbook for Firecracker API changes.
- [ ] User-facing changes are mentioned in CHANGELOG.md.
- [ ] All added/changed functionality is tested.
- [ ] New TODOs link to an issue.
- [ ] Commits meet contribution quality standards.
- [ ] This functionality cannot be added in rust-vmm.
Up for discussion: testing; error wrapping, which is simpler here than in block-vhost-user; benchmarks; and regenerating the bindings with bindgen.sh to avoid declaring constants like VIRTIO_NET_F_GUEST_USO4 by hand.