PACKET_MMAP version 1/2 improvements
Description
The fdbased LinkEndpoint supports PACKET_MMAP, but is missing a few useful improvements:
- TX: we currently transmit using sendmmsg, but should use PACKET_MMAP buffers instead (see the sketch below).
- RX: the PACKET_MMAP dispatcher seems to return 1-3 packets at a time, while the recvmmsg dispatcher often returns 8. We should investigate this, as PACKET_MMAP should return many packets at once and ultimately be faster than recvmmsg.
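Roughly, the TX side could look something like the following. This is only a sketch, assuming TPACKET_V2 and the golang.org/x/sys/unix wrappers; the frame count, sizes, function names, and payload offset are illustrative, not what fdbased actually uses:

```go
package sketch

import (
	"unsafe"

	"golang.org/x/sys/unix"
)

const (
	frameSize = 2048 // must be a multiple of TPACKET_ALIGNMENT
	frameNR   = 32   // mirrors the dispatcher's current slot count
)

// setupTXRing configures a TPACKET_V2 TX ring on an AF_PACKET socket
// and maps it into userspace.
func setupTXRing(fd int) ([]byte, error) {
	if err := unix.SetsockoptInt(fd, unix.SOL_PACKET, unix.PACKET_VERSION, unix.TPACKET_V2); err != nil {
		return nil, err
	}
	req := unix.TpacketReq{
		Block_size: frameSize * frameNR, // one block; must be a multiple of the page size
		Block_nr:   1,
		Frame_size: frameSize,
		Frame_nr:   frameNR,
	}
	if err := unix.SetsockoptTpacketReq(fd, unix.SOL_PACKET, unix.PACKET_TX_RING, &req); err != nil {
		return nil, err
	}
	return unix.Mmap(fd, 0, int(req.Block_size*req.Block_nr),
		unix.PROT_READ|unix.PROT_WRITE, unix.MAP_SHARED)
}

// queueFrame copies pkt into ring slot i and hands the slot back to
// the kernel. A subsequent sendto(fd, nil, 0, nil) flushes all queued
// slots in one syscall. The payload offset is simplified here; see
// packet(7) for the exact aligned offset.
func queueFrame(ring []byte, i int, pkt []byte) {
	slot := ring[i*frameSize:]
	hdr := (*unix.Tpacket2Hdr)(unsafe.Pointer(&slot[0]))
	copy(slot[unix.SizeofTpacket2Hdr:], pkt)
	hdr.Len = uint32(len(pkt))
	hdr.Status = unix.TP_STATUS_SEND_REQUEST // kernel now owns the slot
}
```

The win over sendmmsg would be that many packets are flushed with a single syscall and no per-packet copy into kernel socket buffers.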
This is specific to PACKET_MMAP TPACKET_V1 and TPACKET_V2, not V3. V3 is unsuitable as a dispatcher because it markedly increases latency (by O(milliseconds)) in order to reduce CPU usage.
Is this feature related to a specific bug?
No
Do you have a specific solution in mind?
I believe we only have 32 slots (tpFrameNR) in the dispatcher. Could that be limiting the number of returned packets?
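If so, a larger ring might be all that's needed. A hypothetical snippet of what that would look like via the x/sys/unix wrappers (function name and all values illustrative):

```go
package sketch

import "golang.org/x/sys/unix"

// growRXRing is hypothetical: it asks for 128 RX slots instead of 32.
// The kernel requires block_size to be a multiple of the page size,
// frame_size to be a multiple of TPACKET_ALIGNMENT, and frames to
// pack evenly into blocks.
func growRXRing(fd int) error {
	req := unix.TpacketReq{
		Block_size: 65536, // 32 frames of 2048 bytes per block
		Block_nr:   4,
		Frame_size: 2048,
		Frame_nr:   128, // (Block_size / Frame_size) * Block_nr
	}
	return unix.SetsockoptTpacketReq(fd, unix.SOL_PACKET, unix.PACKET_RX_RING, &req)
}
```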
Earlier this year I ran experiments across lots of different frame/block sizes with TPACKET_V3. The conclusion I came to was that PACKET_MMAP trades off raw packet throughput for CPU efficiency. This is in line with its stated goal of being a "more efficient way to capture packets", not necessarily to TX/RX packets like we want it to. Waiting for a block to fill up was always slower than getting packets from recvmmsg. Feel free to take a crack at it; it's totally possible I missed something. But just a warning: HOURS_WASTED_HERE=~100.
IIUC that's specific to V3. I am thinking that PACKET_MMAP improvements would be for the existing V2 interface we use.
Yeah, could be. I definitely did less testing on V2 since I thought V3 would be the answer. FWIW, when I tried to increase the number of frames (in V2) I didn't see any effect.
V3 is not going to work. V3 has a hard-coded default block timeout of 8ms and is optimized for bulk packet capture (aka tcpdump). That's the reason I used V2. If you do a simple ping with V3 you will see the latency spike all over the place.
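For reference, the timeout in question is V3's block-retire timeout (Retire_blk_tov in the x/sys/unix wrappers): a block is only delivered to userspace once it fills up or the timeout fires, so light traffic like a ping waits out the timeout. A hypothetical configuration, with all values illustrative:

```go
package sketch

import "golang.org/x/sys/unix"

// configureV3Ring is hypothetical, to show where V3's latency comes
// from: the kernel only hands over a block once it fills up or
// Retire_blk_tov expires, so sparse traffic sits in the block until
// the timeout.
func configureV3Ring(fd int) error {
	if err := unix.SetsockoptInt(fd, unix.SOL_PACKET, unix.PACKET_VERSION, unix.TPACKET_V3); err != nil {
		return err
	}
	req := unix.TpacketReq3{
		Block_size:     65536,
		Block_nr:       4,
		Frame_size:     2048,
		Frame_nr:       128,
		Retire_blk_tov: 0, // ms; 0 lets the kernel pick its default
	}
	return unix.SetsockoptTpacketReq3(fd, unix.SOL_PACKET, unix.PACKET_RX_RING, &req)
}
```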
I've updated the title and description to make it clear that this issue is not related to use of V3, but rather is about improving the dispatcher using V1 or V2.