numaprof
NUMAPROF is a NUMA memory profiler based on Pintool that tracks your remote memory accesses.
I just found numap (https://github.com/numap-library/numap), which is used by NumaMMA (https://github.com/numamma/numamma). Maybe we could also use it in place of Pin as an instrumentation library and redirect the calls to...
Currently we have copied some functions from libnuma into numaprof; we should use libnuma directly. The difficulty is that we cannot easily link to libnuma inside the pintool plugin; this requires some...
When profiling an application using CUDA and OpenCL, we might also be interested in accounting for the NUMA effects of memory transfers. This might be easy, just by capturing the CUDA/OpenCL memory transfer...
To avoid relying on the unpinned metrics, we could consider patching the Linux kernel to provide a signal telling numaprof that a non-bound thread has been moved to another NUMA node...
Even if we cannot get multi-threading performance, it would be nice to also be supported as a Valgrind plugin.
List the meaning of the fields and the links between sections so people can reuse the file more easily in other tools.
Add hook support to start and stop instrumentation to track only a sub-part of the application.
Even if Valgrind is not scalable, it would be cool to have support for it.
Try to compute an estimate of the bandwidth used on each NUMA node over time. Of course the absolute value will be wrong due to overhead, but the relative value...
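As a rough illustration of the bandwidth idea above, here is a minimal Python sketch that buckets memory accesses into fixed time windows and reports bytes per second for each NUMA node, plus each node's relative share within a window. The event format `(timestamp, node, bytes)` and the function names `bandwidth_per_node` and `relative_share` are assumptions made for this example; they are not part of numaprof.

```python
from collections import defaultdict


def bandwidth_per_node(events, window):
    """Estimate per-node bandwidth over time.

    events: iterable of (timestamp_seconds, numa_node, bytes_accessed),
            a hypothetical trace format used only for this sketch.
    window: window length in seconds.
    Returns {window_index: {numa_node: bytes_per_second}}.
    """
    buckets = defaultdict(lambda: defaultdict(int))
    for timestamp, node, size in events:
        # Assign each access to the window containing its timestamp.
        buckets[int(timestamp // window)][node] += size
    return {
        index: {node: total / window for node, total in nodes.items()}
        for index, nodes in sorted(buckets.items())
    }


def relative_share(window_bandwidth):
    """Turn absolute per-node values into fractions of the window total."""
    total = sum(window_bandwidth.values())
    if total == 0:
        return {}
    return {node: value / total for node, value in window_bandwidth.items()}


# Hypothetical trace: three accesses in the first second, one in the second.
events = [(0.1, 0, 64), (0.2, 1, 64), (0.3, 0, 64), (1.5, 1, 128)]
per_window = bandwidth_per_node(events, window=1.0)
shares = relative_share(per_window[0])
```

Because instrumentation overhead stretches the timestamps, the absolute numbers from such a computation would be too low; the per-window shares are less sensitive to that distortion as long as the overhead affects all nodes similarly.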
Remote frees might be a source of NUMA misplacement in many allocators; maybe we can add a counter to extract this information.
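One possible shape for such a counter, sketched in Python for brevity: remember the NUMA node of the thread that allocated each block, then compare it with the node of the thread that frees it. The class `RemoteFreeCounter` and its `on_malloc`/`on_free` hooks are hypothetical names for this example, and the sketch assumes the allocating thread's node is a good enough proxy for where the memory lives, which ignores first-touch placement.

```python
class RemoteFreeCounter:
    """Counts frees issued from a NUMA node other than the allocating one."""

    def __init__(self):
        self._alloc_node = {}   # address -> NUMA node of the allocating thread
        self.local_frees = 0
        self.remote_frees = 0
        self.unknown_frees = 0  # frees of pointers we never saw allocated

    def on_malloc(self, address, node):
        # Hypothetical hook: called after malloc returns `address`
        # on a thread currently running on NUMA node `node`.
        self._alloc_node[address] = node

    def on_free(self, address, node):
        # Hypothetical hook: called before free(address) on node `node`.
        alloc_node = self._alloc_node.pop(address, None)
        if alloc_node is None:
            self.unknown_frees += 1
        elif alloc_node == node:
            self.local_frees += 1
        else:
            self.remote_frees += 1


counter = RemoteFreeCounter()
counter.on_malloc(0x1000, node=0)
counter.on_malloc(0x2000, node=0)
counter.on_free(0x1000, node=0)   # freed on the allocating node
counter.on_free(0x2000, node=1)   # freed from another node
counter.on_free(0x3000, node=1)   # never tracked
```

A high ratio of `remote_frees` to total frees would suggest that blocks are being returned to an allocator arena on a different node than the one that first handed them out.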