Conserve PCIe bandwidth
PCIe bandwidth is scarcer than it used to be, so the host interface has to be designed to use it efficiently. In practice this will probably mean organizing DMA into fewer, longer, contiguous transfers rather than many small scattered ones.
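To see why transfer size matters, here is a minimal back-of-the-envelope sketch in C. The constants are illustrative assumptions, not measurements: a 256-byte PCIe MaxPayloadSize, roughly 24 bytes of per-TLP overhead (framing, sequence number, header, LCRC), and a 16-byte descriptor fetched per transfer. Exact figures vary by platform.

```c
#include <stdio.h>

/* Back-of-the-envelope model of PCIe DMA write efficiency.
 * Assumed figures (platform-dependent, for illustration only):
 *   - max TLP payload:  256 bytes (a common MaxPayloadSize)
 *   - per-TLP overhead:  24 bytes (framing + sequence + header + LCRC)
 *   - per-transfer cost: one 16-byte descriptor, itself carried in a TLP
 */
#define MAX_PAYLOAD 256
#define TLP_OVERHEAD 24
#define DESC_BYTES   16

/* Bytes actually moved over the link for one DMA transfer of `len` bytes. */
static unsigned wire_bytes(unsigned len)
{
    unsigned tlps = (len + MAX_PAYLOAD - 1) / MAX_PAYLOAD;
    return len + tlps * TLP_OVERHEAD + DESC_BYTES + TLP_OVERHEAD;
}

int main(void)
{
    /* Move the same 4096 payload bytes, split into N transfers. */
    unsigned total = 4096;
    unsigned pieces[] = { 1, 4, 16, 64 };
    for (int i = 0; i < 4; i++) {
        unsigned n = pieces[i];
        unsigned wire = n * wire_bytes(total / n);
        printf("%2u transfers of %4u B: %5u wire bytes (%.0f%% efficiency)\n",
               n, total / n, wire, 100.0 * total / wire);
    }
    return 0;
}
```

With these assumptions, moving 4096 bytes as one contiguous transfer costs about 4.5 KB on the wire (~91% efficiency), while 64 scattered 64-byte transfers cost 8 KB (50% efficiency).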
Why has PCIe bandwidth become scarcer? Because Ethernet bandwidth has been increasing in powers of 10 while PCIe bandwidth has been increasing in powers of 2. The basic numerical relationship changed with the transition from 10G/40G to 25G/100G:
| Ethernet bandwidth | Usable PCIe bandwidth | PCIe-to-Ethernet ratio |
|---|---|---|
| 10G | 16G (PCIe 2.0 x4) | 1.6x |
| 40G | 64G (PCIe 3.0 x8) | 1.6x |
| 25G | 32G (PCIe 3.0 x4) | 1.28x |
| 50G | 64G (PCIe 3.0 x8) | 1.28x |
| 100G | 128G (PCIe 3.0 x16) | 1.28x |
| 200G | 256G (PCIe 4.0 x16) | 1.28x |
In the good old days of 10G/40G the PCIe links had 60% extra capacity for overhead such as transferring DMA descriptors and for the PCIe protocol itself. These modern times of 25G/100G are leaner: only 28% headroom is available. This means we have to treat PCIe bandwidth as a scarce resource, because any waste is likely to have a real impact on operational performance.
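For reference, the PCIe figures in the table follow from the per-lane signalling rate times the line-coding efficiency: 8b/10b for PCIe 2.0 and 128b/130b for PCIe 3.0/4.0, with each lane rounded to a whole gigabit per second as the table does. A quick sketch of that arithmetic, just to make the headroom numbers concrete:

```c
#include <stdio.h>

/* Usable per-lane bandwidth (Gbit/s), rounded as in the table:
 *   PCIe 2.0:  5 GT/s * 8/10    =  4
 *   PCIe 3.0:  8 GT/s * 128/130 ~= 8
 *   PCIe 4.0: 16 GT/s * 128/130 ~= 16
 */
struct row { const char *eth; double eth_g; double pcie_g; };

int main(void)
{
    struct row rows[] = {
        { "10G",   10,  4.0 * 4  },  /* PCIe 2.0 x4  */
        { "40G",   40,  8.0 * 8  },  /* PCIe 3.0 x8  */
        { "25G",   25,  8.0 * 4  },  /* PCIe 3.0 x4  */
        { "50G",   50,  8.0 * 8  },  /* PCIe 3.0 x8  */
        { "100G", 100,  8.0 * 16 },  /* PCIe 3.0 x16 */
        { "200G", 200, 16.0 * 16 },  /* PCIe 4.0 x16 */
    };
    for (int i = 0; i < 6; i++) {
        double ratio = rows[i].pcie_g / rows[i].eth_g;
        printf("%-4s Ethernet: %3.0fG PCIe, ratio %.2fx, headroom %2.0f%%\n",
               rows[i].eth, rows[i].pcie_g, ratio, (ratio - 1) * 100);
    }
    return 0;
}
```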
Recommended reading: R. Neugebauer, G. Antichi, J. F. Zazo, Y. Audzevich, S. López-Buedo, A. W. Moore, "Understanding PCIe performance for end host networking", SIGCOMM 2018.
The paper isn't available yet; it will probably appear by early July. The authors have done some really interesting experiments (I read an earlier draft last year), and I'm sure they'd be happy to share a preprint with you if you ask nicely.