
About 40GbE NIC port aggregation


I want to use Intel XL710 40GbE NICs and aggregate (LACP) the front ports to 120Gbps. Is it better to use NICs with a single 40GbE port, or dual-port 40GbE NICs? The DPDK documentation says that a single PCIe Gen3 slot cannot sustain 80GbE. With multiple PCIe Gen3 NICs, how should I configure the lcores to handle the fact that interfaces on different NUMA nodes sit behind different PCIe Gen3 slots?

ShawnLeung87 commented on Sep 06 '22 06:09

There is no straightforward answer for how to aggregate the NICs to reach 120Gbps of front capacity because you will be dealing with many hardware constraints simultaneously. You are focused on two key constraints, namely PCI bandwidth and NUMA node communication. But you will hit others: overflow of the CPU caches, memory latency, and the bandwidth of the memory buses (each NUMA node has its own memory bus). Not to mention that the hardware and software overhead to glue it all together will take CPU cycles away from processing packets. The problem is not that your server won't work, but that it likely won't reach line speed when packets are small enough.
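
As a back-of-envelope check of the PCI constraint (the XL710 uses a PCIe Gen3 x8 interface; the framing-overhead factor in the sketch is a rough assumption, not a measured number):

```c
#include <stdio.h>

/* Back-of-envelope PCIe Gen3 bandwidth check. The 15% framing
 * overhead factor below is a rough assumption. */
int main(void)
{
	/* PCIe Gen3: 8 GT/s per lane with 128b/130b encoding. */
	double per_lane_gbps = 8.0 * 128.0 / 130.0;	/* ~7.88 Gbps */
	double x8_raw_gbps = per_lane_gbps * 8.0;	/* ~63 Gbps */
	/* TLP/DLLP framing and header overhead eat more of it. */
	double x8_usable_gbps = x8_raw_gbps * 0.85;	/* ~54 Gbps */

	printf("PCIe Gen3 x8: %.1f Gbps raw, ~%.1f Gbps usable\n",
	       x8_raw_gbps, x8_usable_gbps);
	/* Even the raw 63 Gbps is below the 80 Gbps a dual-port 40GbE
	 * XL710 would need, which is what the DPDK docs warn about. */
	return 0;
}
```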

If you can, the better choice would be to load balance across multiple Gatekeeper servers, each with a 40Gbps front NIC. The load balancing must be keyed on the (source IP address, destination IP address) pair of the packets, so that every packet of a given flow reaches the same server. Notice that the flow tables of these servers add horizontally, so you get much more flow capacity with this setup. And, more importantly, you are less likely to hit the hardware constraints.
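
To illustrate the server-selection step of such a load balancer, here is a minimal sketch; `pick_server`, `NUM_SERVERS`, and the hash are illustrative choices, not part of Gatekeeper:

```c
#include <stdint.h>
#include <stdio.h>

/* Hash the (source IP, destination IP) pair so that every packet of
 * a flow lands on the same Gatekeeper server, keeping each server's
 * flow table authoritative for its share of the flows. */
#define NUM_SERVERS 3	/* e.g., 3 servers x 40Gbps = 120Gbps front */

static unsigned int pick_server(uint32_t src_ip, uint32_t dst_ip)
{
	/* Simple multiplicative mix; any hash that mixes both
	 * addresses well would do. */
	uint64_t key = ((uint64_t)src_ip << 32) | dst_ip;
	key *= 0x9E3779B97F4A7C15ULL;	/* 64-bit golden-ratio constant */
	return (unsigned int)(key >> 32) % NUM_SERVERS;
}

int main(void)
{
	/* All packets of flow 192.0.2.1 -> 198.51.100.7 map to the
	 * same server. */
	printf("server %u\n", pick_server(0xC0000201, 0xC6336407));
	return 0;
}
```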

The file lua/main_config.lua already does a good job of allocating the lcores across the NUMA nodes. Beyond that, you'll have to experiment with your system to see if you can find a missed opportunity.
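
If you want to check for such a missed opportunity, a small diagnostic along these lines can flag cross-NUMA pairings. This is a sketch using standard DPDK calls, meant to run after `rte_eal_init()`; `check_numa_affinity` is a made-up helper name, not Gatekeeper code:

```c
#include <stdio.h>

#include <rte_ethdev.h>
#include <rte_lcore.h>

/* Warn when an lcore polls a port that lives on a different NUMA
 * node, since every such RX/TX then crosses the inter-socket link. */
static void check_numa_affinity(uint16_t port_id, unsigned int lcore_id)
{
	int port_socket = rte_eth_dev_socket_id(port_id);
	unsigned int lcore_socket = rte_lcore_to_socket_id(lcore_id);

	if (port_socket >= 0 && (unsigned int)port_socket != lcore_socket)
		printf("warning: port %u (socket %d) polled by lcore %u "
		       "(socket %u); expect cross-NUMA overhead\n",
		       port_id, port_socket, lcore_id, lcore_socket);
}
```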

Finally, the big picture is that most offload features of current NICs are devoted to virtualization. That is, these features are not optimal for helping Gatekeeper process packets faster. This means that, very likely, there is no solution that simultaneously smooths out all the constraints you will be dealing with. Napatech, a manufacturer of SmartNICs, is working to make their NICs tightly coupled with Gatekeeper. I expect their SmartNICs to start working with Gatekeeper v1.2. The beta of this integration should be out by December 2022. Once that integration is out, they intend to develop new features on their NICs specifically to help Gatekeeper better address the hardware constraints discussed above.

AltraMayor commented on Sep 06 '22 12:09

Current Napatech SmartNICs can achieve 100Gbps over a PCIe Gen3 x16 interface, which is line-speed performance. Supported link speeds are 40Gbps and 100Gbps. We are working on a Gen4 x16 SmartNIC that will support 200Gbps over the PCIe interface. Full NUMA control and RSS to up to 128 host ring buffers are supported, so RSS can distribute traffic to multiple instances of a host application, or to multiple threads within a single application. We currently support a wide variety of FPGA-based L2-L4 filters. Today these filters are defined in a native filtering language, but we plan to support BPF through the DPDK BPF library.
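
For reference, spreading flows across multiple RX queues with RSS looks roughly like this in plain DPDK (a sketch using standard ethdev calls with DPDK 21.11+ macro names; `setup_rss` is a made-up helper name and the queue count is up to the application):

```c
#include <rte_ethdev.h>

/* Enable RSS so the NIC spreads flows across nb_rx_queues RX queues,
 * typically one per worker lcore. */
static int setup_rss(uint16_t port_id, uint16_t nb_rx_queues)
{
	struct rte_eth_conf conf = {
		.rxmode = { .mq_mode = RTE_ETH_MQ_RX_RSS },
		.rx_adv_conf.rss_conf = {
			.rss_key = NULL,	  /* driver's default key */
			.rss_hf = RTE_ETH_RSS_IP, /* hash on src/dst IP */
		},
	};

	/* A single TX queue keeps the sketch short; real applications
	 * usually configure one TX queue per lcore as well. */
	return rte_eth_dev_configure(port_id, nb_rx_queues, 1, &conf);
}
```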

psanders240 commented on Sep 06 '22 15:09