
vhost-net: create vhost based Net backend

Open majek opened this issue 1 year ago • 1 comments

Changes

vhost-net: create vhost based Net backend

This patch adds a second backend to Net devices. It can be enabled with 'vhost' bool on the network interface config, like this:

"network-interfaces": [
    {
        "iface_id": "eth0",
        "host_dev_name": "tap0",
        "vhost": true
    }
],
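
As a rough illustration of how a VMM could act on this flag, the Python sketch below parses a config like the one above and picks a backend per interface. The `select_backend` helper, the backend names, the extra `eth1` entry, and the choice to treat a missing `vhost` key as `false` are all assumptions made for illustration; none of them are defined by this patch.

```python
import json

# Config fragment from the description, wrapped in braces so it parses on its
# own. The second interface is hypothetical and only shows the default case.
CONFIG = """
{
  "network-interfaces": [
    {"iface_id": "eth0", "host_dev_name": "tap0", "vhost": true},
    {"iface_id": "eth1", "host_dev_name": "tap1"}
  ]
}
"""


def select_backend(iface):
    """Return an illustrative backend name for one interface entry."""
    # Assumption: an absent 'vhost' key keeps the existing (non-vhost) backend.
    vhost = iface.get("vhost", False)
    if not isinstance(vhost, bool):
        raise ValueError("'vhost' must be a JSON boolean")
    return "vhost-net" if vhost else "vmm-tap"


backends = {
    iface["iface_id"]: select_backend(iface)
    for iface in json.loads(CONFIG)["network-interfaces"]
}
```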

The vhost backend opens the host kernel's /dev/vhost-net interface and performs the setup sequence that binds the vhost device to the relevant tap interface. As a result, the entire data plane runs directly between the host kernel and the guest; packets do not pass through the Firecracker VMM at all. This drastically reduces packet latency and increases throughput, especially in high-pps scenarios such as UDP, or TCP without offloads.
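
To make the setup sequence a bit more concrete, here is a Python sketch that derives the vhost ioctl request numbers and lists them in the order a kernel vhost-net backend is commonly configured. The numbers are computed from the Linux vhost UAPI definitions using the generic `_IOC` layout (as on x86_64 and aarch64). The step order and the `setup_plan` helper are assumptions for illustration, not necessarily what this patch does, and the sketch never opens /dev/vhost-net.

```python
import struct

VHOST_VIRTIO = 0xAF  # ioctl "type" byte used by all vhost requests
_IOC_NONE, _IOC_WRITE, _IOC_READ = 0, 1, 2


def _ioc(direction, nr, size):
    """Encode an ioctl request number (generic Linux _IOC layout)."""
    return (direction << 30) | (size << 16) | (VHOST_VIRTIO << 8) | nr


# Sizes of the argument structs, spelled out field by field.
U64 = struct.calcsize("=Q")
VRING_STATE = struct.calcsize("=II")     # index, num
VRING_FILE = struct.calcsize("=Ii")      # index, fd
VRING_ADDR = struct.calcsize("=IIQQQQ")  # index, flags, 4 addresses
MEMORY_HDR = struct.calcsize("=II")      # nregions, padding

IOCTLS = {
    "VHOST_SET_OWNER": _ioc(_IOC_NONE, 0x01, 0),
    "VHOST_GET_FEATURES": _ioc(_IOC_READ, 0x00, U64),
    "VHOST_SET_FEATURES": _ioc(_IOC_WRITE, 0x00, U64),
    "VHOST_SET_MEM_TABLE": _ioc(_IOC_WRITE, 0x03, MEMORY_HDR),
    "VHOST_SET_VRING_NUM": _ioc(_IOC_WRITE, 0x10, VRING_STATE),
    "VHOST_SET_VRING_ADDR": _ioc(_IOC_WRITE, 0x11, VRING_ADDR),
    "VHOST_SET_VRING_BASE": _ioc(_IOC_WRITE, 0x12, VRING_STATE),
    "VHOST_SET_VRING_KICK": _ioc(_IOC_WRITE, 0x20, VRING_FILE),
    "VHOST_SET_VRING_CALL": _ioc(_IOC_WRITE, 0x21, VRING_FILE),
    "VHOST_NET_SET_BACKEND": _ioc(_IOC_WRITE, 0x30, VRING_FILE),
}

DEVICE_STEPS = ["VHOST_SET_OWNER", "VHOST_GET_FEATURES",
                "VHOST_SET_FEATURES", "VHOST_SET_MEM_TABLE"]
QUEUE_STEPS = ["VHOST_SET_VRING_NUM", "VHOST_SET_VRING_ADDR",
               "VHOST_SET_VRING_BASE", "VHOST_SET_VRING_KICK",
               "VHOST_SET_VRING_CALL"]


def setup_plan(num_queues=2):
    """Return (ioctl name, queue index or None) pairs in an assumed order."""
    plan = [(name, None) for name in DEVICE_STEPS]
    for q in range(num_queues):
        plan += [(name, q) for name in QUEUE_STEPS]
    # Attaching the tap fd last starts packet processing on each queue.
    plan += [("VHOST_NET_SET_BACKEND", q) for q in range(num_queues)]
    return plan
```

A net device has one RX and one TX queue, hence the default of two. In a real backend each step would be an `ioctl` on the /dev/vhost-net file descriptor, with `VHOST_NET_SET_BACKEND` carrying the tap file descriptor.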

The control plane is somewhat hacky. Technically, interrupts from the host to the guest should go through the Firecracker VMM, but this can be avoided by splicing the host eventfd into the guest interrupt fd and force-returning VIRTIO_MMIO_INT_VRING in the relevant virtio register.
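
The register trick can be pictured with a small Python model. It assumes the register in question is the virtio-mmio InterruptStatus register (offset 0x60, acknowledged through InterruptACK at 0x64), where bit 0 is VIRTIO_MMIO_INT_VRING. Because the host kernel signals the guest directly, the VMM never gets a chance to set that bit itself, so the model always reports it for a vhost-backed device. The `InterruptStatusModel` class is hypothetical and is not the code in this patch.

```python
VIRTIO_MMIO_INTERRUPT_STATUS = 0x60
VIRTIO_MMIO_INTERRUPT_ACK = 0x64
VIRTIO_MMIO_INT_VRING = 0x01   # "used ring updated" notification
VIRTIO_MMIO_INT_CONFIG = 0x02  # "configuration changed" notification


class InterruptStatusModel:
    """Toy model of the interrupt status/ack registers of one device."""

    def __init__(self, vhost):
        self.vhost = vhost
        self.status = 0  # bits raised by the VMM itself

    def raise_config_change(self):
        self.status |= VIRTIO_MMIO_INT_CONFIG

    def read(self, offset):
        if offset != VIRTIO_MMIO_INTERRUPT_STATUS:
            raise ValueError("register not modelled")
        if self.vhost:
            # The VMM does not see vring interrupts, so always report one.
            return self.status | VIRTIO_MMIO_INT_VRING
        return self.status

    def write(self, offset, value):
        if offset != VIRTIO_MMIO_INTERRUPT_ACK:
            raise ValueError("register not modelled")
        # The guest acknowledges the bits it has handled.
        self.status &= ~value
```

The cost of this shortcut is that the guest may occasionally scan a queue that has no new buffers, which is harmless but not free.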

A few features are still missing:

  • persist (no blockers, just work)
  • mmds (no obvious way to do it, perhaps possible with ebpf)
  • rate_limiting (no obvious way to implement it, perhaps with ebpf)
  • tap/vhost feature negotiation

On the last point, it would be nice to negotiate some of the more advanced tap/vhost features, such as USO (UDP segmentation offload), TCP offloads (a flag is needed if the guest wants to use XDP), and VIRTIO_NET_F_MRG_RXBUF (this might help performance, but benchmarks are needed first). Right now there is no way to express these toggles in the net config, but this can be added in the future.
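
One possible shape for such toggles is sketched below in Python. The feature bit numbers come from the Linux virtio UAPI headers; the toggle names (`mrg_rxbuf`, `uso`), the baseline set, and the `offered_features` helper are hypothetical and only show the masking logic: a feature is offered to the guest only if the backend reports it and the config enables it.

```python
VIRTIO_NET_F_MRG_RXBUF = 15
VIRTIO_F_VERSION_1 = 32
VIRTIO_NET_F_GUEST_USO4 = 54
VIRTIO_NET_F_GUEST_USO6 = 55
VIRTIO_NET_F_HOST_USO = 56


def mask(*bits):
    """Build a 64-bit feature mask from bit positions."""
    result = 0
    for bit in bits:
        result |= 1 << bit
    return result


# Hypothetical config toggles mapped to the feature bits they would unlock.
OPTIONAL = {
    "mrg_rxbuf": mask(VIRTIO_NET_F_MRG_RXBUF),
    "uso": mask(VIRTIO_NET_F_GUEST_USO4, VIRTIO_NET_F_GUEST_USO6,
                VIRTIO_NET_F_HOST_USO),
}

# Assumed to be offered regardless of toggles.
BASELINE = mask(VIRTIO_F_VERSION_1)


def offered_features(backend_features, toggles):
    """Intersect what the backend supports with what the config allows."""
    allowed = BASELINE
    for name, enabled in toggles.items():
        if name not in OPTIONAL:
            raise KeyError(f"unknown feature toggle: {name}")
        if enabled:
            allowed |= OPTIONAL[name]
    return backend_features & allowed
```

Here `backend_features` stands for the mask the vhost device reports, so a toggle for something the host kernel lacks simply has no effect.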

Reason

Discussion #3707

License Acceptance

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.

PR Checklist

  • [ ] If a specific issue led to this PR, this PR closes the issue.
  • [ ] The description of changes is clear and encompassing.
  • [ ] Any required documentation changes (code and docs) are included in this PR.
  • [ ] API changes follow the Runbook for Firecracker API changes.
  • [ ] User-facing changes are mentioned in CHANGELOG.md.
  • [ ] All added/changed functionality is tested.
  • [ ] New TODOs link to an issue.
  • [ ] Commits meet contribution quality standards.

  • [ ] This functionality cannot be added in rust-vmm.

majek avatar Feb 19 '24 13:02 majek

Up for discussion: testing; error wrapping, which is simpler here than in block-vhost-user; benchmarks; and regenerating the bindings with bindgen.sh to avoid declaring constants like VIRTIO_NET_F_GUEST_USO4 by hand.

majek avatar Feb 19 '24 13:02 majek