NUMA-aware memory allocation and computation

Open mratsim opened this issue 6 years ago • 0 comments

Most HPC systems have more than one socket, which poses quite a problem for many parallel libraries.

Even in OpenMP 4, distributing parallel compute across sockets with proc_bind(spread), and then within each socket across physical cores (so that no hyperthreads are used before all cores are busy), was quite an ordeal: DeepinScreenshot_select-area_20190713005519
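
For reference, a minimal sketch of that two-level binding in C. The function name `two_level_sum` and its `sockets` / `cores_per_socket` parameters are made up for illustration, and the code assumes the caller knows the machine topology. The `_OPENMP` guards only let it compile as plain serial C when OpenMP is disabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: sums x[0..n) with one outer thread per socket
 * (spread) and an inner team packed next to it (close). */
double two_level_sum(const double *x, long n, int sockets, int cores_per_socket)
{
    assert(x != 0 && n >= 0 && sockets > 0 && cores_per_socket > 0);
    (void)sockets; (void)cores_per_socket;  /* unused in a serial build */
    double total = 0.0;
#ifdef _OPENMP
    omp_set_max_active_levels(2);  /* allow the nested region below */
#endif
    /* Outer level: threads are spread as far apart as the places allow. */
    #pragma omp parallel num_threads(sockets) proc_bind(spread) reduction(+:total)
    {
        int s = 0, ns = 1;
#ifdef _OPENMP
        s  = omp_get_thread_num();
        ns = omp_get_num_threads();
#endif
        /* Each outer thread owns one contiguous slice of the input. */
        long lo = n * s / ns;
        long hi = n * (s + 1) / ns;
        double partial = 0.0;
        /* Inner level: threads stay close to their parent thread. */
        #pragma omp parallel for num_threads(cores_per_socket) proc_bind(close) reduction(+:partial)
        for (long i = lo; i < hi; ++i)
            partial += x[i];
        total += partial;
    }
    return total;
}
```

Running with `OMP_PLACES=cores` makes each place a physical core, so hyperthreads are not treated as separate places. Whether the outer threads actually land on distinct sockets still depends on the runtime and on how the places are ordered on the machine.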

OpenMP 5.0 brings NUMA-aware allocators, see https://techdecoded.intel.io/essentials/openmp-5-0-a-story-about-threads-and-tasks/ (35 min in): DeepinScreenshot_select-area_20190713005653
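
A rough sketch of what using the OpenMP 5.0 allocator API could look like in C. The `numa_buf` type and its two functions are hypothetical names for illustration. The `omp_atv_nearest` partition trait asks for memory near the allocating thread, but how (or whether) that maps to NUMA nodes is implementation-defined. The `HAVE_OMP5_ALLOC` guard is an assumption of this sketch: it falls back to plain `malloc` when the compiler does not report OpenMP 5.0 via `_OPENMP`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Assumption: only use the allocator API when OpenMP 5.0 is reported. */
#if defined(_OPENMP) && _OPENMP >= 201811
#define HAVE_OMP5_ALLOC 1
#else
#define HAVE_OMP5_ALLOC 0
#endif

/* Hypothetical buffer type that remembers which allocator owns it. */
typedef struct {
    double *data;
    size_t  len;
#if HAVE_OMP5_ALLOC
    omp_allocator_handle_t allocator;
#endif
} numa_buf;

/* Returns 0 on success, -1 if the allocation failed. */
int numa_buf_init(numa_buf *b, size_t len)
{
    assert(b != NULL && len > 0);
    b->len = len;
#if HAVE_OMP5_ALLOC
    /* Request placement near the thread that performs the allocation. */
    omp_alloctrait_t traits[] = {
        { omp_atk_partition, omp_atv_nearest },
    };
    b->allocator = omp_init_allocator(omp_default_mem_space, 1, traits);
    if (b->allocator == omp_null_allocator)
        b->allocator = omp_default_mem_alloc;  /* traits not supported */
    b->data = omp_alloc(len * sizeof(double), b->allocator);
#else
    b->data = malloc(len * sizeof(double));
#endif
    return b->data != NULL ? 0 : -1;
}

void numa_buf_free(numa_buf *b)
{
    assert(b != NULL);
#if HAVE_OMP5_ALLOC
    omp_free(b->data, b->allocator);
    /* Predefined allocators must not be destroyed. */
    if (b->allocator != omp_default_mem_alloc)
        omp_destroy_allocator(b->allocator);
#else
    free(b->data);
#endif
    b->data = NULL;
    b->len  = 0;
}
```

Calling `numa_buf_init` from the thread that will later work on the buffer is the intended usage, since `omp_atv_nearest` is defined relative to the allocating thread.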

mratsim, Jul 12 '19 22:07