hwloc
Use Linux memory tiers information?
Since Linux 6.1, the kernel may demote memory to slower nodes and promote it back to faster nodes (if /sys/kernel/mm/numa/demotion_enabled is true). This uses tiers defined in /sys/devices/virtual/memory_tiering. On a machine with 4 DRAM nodes and 2 PMEM nodes, you'll see one tier containing the DRAM NUMA nodes and one containing the PMEM NUMA nodes:
/sys/devices/virtual/memory_tiering$ ls
memory_tier22/ memory_tier4/ power/ uevent
/sys/devices/virtual/memory_tiering$ cat memory_tier*/nodelist
4-5
0-3
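To consume these files, a tool needs to turn a nodelist string such as "4-5" into a set of NUMA node numbers. The following is only a minimal sketch of that parsing step, not hwloc code: parse_nodelist() is a hypothetical helper, and it assumes the usual comma-separated ranges format and at most 64 nodes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of hwloc or the kernel).
 * Parses a nodelist such as "0-3", "4-5\n" or "0-3,8" into a bitmask
 * where bit N is set if node N is listed.
 * Assumption: node numbers are below 64.
 * Returns 0 on success, -1 on malformed input. */
static int parse_nodelist(const char *s, unsigned long long *mask)
{
    assert(s != NULL && mask != NULL);
    *mask = 0;
    while (*s != '\0' && *s != '\n') {
        char *end;
        long first = strtol(s, &end, 10);
        long last, n;
        if (end == s)
            return -1;              /* no digits where a number was expected */
        last = first;
        if (*end == '-') {          /* range "first-last" */
            s = end + 1;
            last = strtol(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (first < 0 || last < first || last >= 64)
            return -1;
        for (n = first; n <= last; n++)
            *mask |= 1ULL << n;
        s = end;
        if (*s == ',')              /* skip the separator between entries */
            s++;
    }
    return 0;
}
```

With the listing above, "4-5" yields the mask 0x30 and "0-3" yields 0xF.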
These tiers could help identify PMEM/DRAM/HBM nodes in hwloc, but it's not clear yet whether the kernel uses any information that hwloc doesn't have. As of 6.5, it seems to just use a static "abstract distance": 576 for DRAM (tier4 is 576/128=4) and 2880 for DAX KMEM (576*5; tier22 is 2880/128=22, rounded down). DAX KMEM for HBM should have a different value, but that doesn't seem to be the case yet.
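The tier numbers in the directory names can be reproduced from the abstract distances with an integer division. This is a small sketch of that arithmetic only; the constant and function names are illustrative assumptions, with values taken from the figures quoted above (128 per tier, 576 for DRAM, 5 times that for DAX KMEM).

```c
#include <assert.h>

/* Illustrative names; values are the ones quoted in the text. */
#define ADISTANCE_PER_TIER   128                  /* abstract distance units per tier */
#define ADISTANCE_DRAM       576                  /* default DRAM abstract distance */
#define ADISTANCE_DAX_KMEM   (ADISTANCE_DRAM * 5) /* 2880, default for DAX KMEM */

/* Hypothetical helper: tier number N as in "memory_tierN". */
static int tier_id_from_adistance(int adistance)
{
    assert(adistance >= 0);
    return adistance / ADISTANCE_PER_TIER;        /* integer division rounds down */
}
```

tier_id_from_adistance(ADISTANCE_DRAM) gives 4 and tier_id_from_adistance(ADISTANCE_DAX_KMEM) gives 22, matching memory_tier4 and memory_tier22 in the listing.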
Memory types are registered with init_node_memory_type(). In mm/memory-tiers.c, all nodes are set to the default DRAM type using
__init_node_memory_type(node, default_dram_type);
In drivers/dax/kmem.c, NVM nodes (and likely HBM/SPM nodes too) are assigned to a slow tier:
#define MEMTIER_DEFAULT_DAX_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 5)
...
dax_slowmem_type = alloc_memory_type(MEMTIER_DEFAULT_DAX_ADISTANCE);
...
init_node_memory_type(numa_node, dax_slowmem_type);