Add NUMA support and optimizations
Summary: This pull request adds support for Non-Uniform Memory Access (NUMA) to Apache Traffic Server, enhancing performance on NUMA systems.
Key Changes:
- Added CMake options to enable NUMA support and debugging.
- Introduced new configuration options for NUMA optimizations.
- Enhanced thread and memory management to be NUMA-aware.
- Added RamCacheContainer for cache duplication across NUMA nodes.
- Integrated NUMA debugging utilities.
Performance testing has shown increased throughput, reduced CPU usage, improved latency, and decreased UPI bus load.
These changes aim to optimize memory access patterns and reduce latency on NUMA systems.
To build the application with NUMA support, use the following CMake command:
cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_MIMALLOC=ON -DENABLE_NUMA=ON -DENABLE_HWLOC=ON -Dmimalloc_DIR="
Nice! The PR uses the old Debug() statements, which need to be changed to the new Dbg / DbgCtl features:
../src/iocore/net/Server.cc:247:3: error: use of undeclared identifier 'Debug'
Debug("numa", "[Server::listen] Attempting to create socket with family: %d, type: %d, protocol: %d", addr.sa.sa_family,
^
../src/iocore/net/Server.cc:256:3: error: use of undeclared identifier 'Debug'
Debug("numa", "[Server::listen] Attempting to set up fd for listen with non_blocking: %d, options: %d", non_blocking, opt);
^
Hi all, I've addressed the feedback and made the necessary changes. However, the automated tests are still showing build errors on some operating systems. Could someone please help review these issues? Thank you!
It looks like it's still related to builds where NUMA is not available:
../src/iocore/net/QUICPacketHandler.cc: In member function 'virtual void QUICPacketHandlerIn::_recv_packet(int, UDPPacket*)':
../src/iocore/net/QUICPacketHandler.cc:270:65: error: no matching function for call to 'EventProcessor::assign_thread(const int&)'
270 | eth = eventProcessor.assign_thread(ET_NET);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
In file included from ../src/iocore/net/../eventsystem/P_EventSystem.h:46,
from ../src/iocore/net/P_Net.h:92,
from ../src/iocore/net/QUICPacketHandler.cc:25:
../src/iocore/net/../eventsystem/P_UnixEventProcessor.h:54:1: note: candidate: 'EThread* EventProcessor::assign_thread(EventType, int)'
54 | EventProcessor::assign_thread(EventType etype, int numa_node)
| ^~~~~~~~~~~~~~
../src/iocore/net/../eventsystem/P_UnixEventProcessor.h:54:1: note: candidate expects 2 arguments, 1 provided
You might need to add more preprocessor conditions on TS_USE_NUMA to avoid these errors on systems that don't support NUMA.
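For illustration, here is a rough sketch of one way to guard this so that existing one-argument callers such as QUICPacketHandler.cc keep compiling. It is not the PR's actual code: EThread, EventType, ET_NET and EventProcessor below are simplified stand-ins for the real ATS types, ANY_NUMA_NODE and next_round_robin() are hypothetical names, and I'm assuming the build defines TS_USE_NUMA to 0 or 1.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Assumption: the build system defines TS_USE_NUMA to 0 or 1; default to 0 here.
#ifndef TS_USE_NUMA
#define TS_USE_NUMA 0
#endif

// Simplified stand-ins for the real ATS types (not the actual definitions).
using EventType = int;
constexpr EventType ET_NET        = 0;
constexpr int       ANY_NUMA_NODE = -1; // hypothetical "no preference" value

struct EThread {
  int id;
  int numa_node;
};

class EventProcessor
{
public:
  explicit EventProcessor(std::vector<EThread> threads) : threads_(std::move(threads)) {}

  // The one-argument overload exists in every build, so legacy call sites
  // like assign_thread(ET_NET) compile with or without NUMA support.
  EThread *
  assign_thread(EventType etype)
  {
#if TS_USE_NUMA
    return assign_thread(etype, ANY_NUMA_NODE);
#else
    (void)etype;
    return next_round_robin();
#endif
  }

#if TS_USE_NUMA
  // The NUMA-aware overload is only declared when NUMA support is compiled in.
  EThread *
  assign_thread(EventType etype, int numa_node)
  {
    (void)etype;
    // Prefer the next thread pinned to the requested node, if any.
    for (std::size_t i = 0; numa_node != ANY_NUMA_NODE && i < threads_.size(); ++i) {
      std::size_t idx = (next_ + i) % threads_.size();
      if (threads_[idx].numa_node == numa_node) {
        next_ = (idx + 1) % threads_.size();
        return &threads_[idx];
      }
    }
    return next_round_robin(); // no match: fall back to plain round robin
  }
#endif

private:
  EThread *
  next_round_robin()
  {
    if (threads_.empty()) {
      return nullptr;
    }
    EThread *t = &threads_[next_];
    next_      = (next_ + 1) % threads_.size();
    return t;
  }

  std::vector<EThread> threads_;
  std::size_t          next_ = 0;
};
```

With something along these lines, only the call sites that actually care about a node need to sit inside `#if TS_USE_NUMA`, and everything else stays untouched.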
Unfortunately, CMakeLists.txt now has a merge conflict. You can just accept both the new and the old line when resolving it.
@tlichwala Are you planning on continuing to work on this? Feel free to hit me up on Slack to discuss these remaining issues. I think this work is valuable, so I don't want it to get too stale. Thanks!
Hi Chris, the code is ready for merging, but we currently lack the resources to address the FreeBSD test issues.
Bump
This pull request has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.