hwloc
Selecting default and several nodes with --best-memattr
What version of hwloc are you using?
2.10.0
Which operating system and hardware are you running on?
RHEL 8; Linux 4.18
Details of the problem
Hello,
I am looking to allocate memory to the nodes showing best "attribute" among
$ lstopo --memattrs
Memory attribute #0 name `Capacity' flags 1
NUMANode L#0 = 33613537280
NUMANode L#1 = 33813360640
NUMANode L#2 = 33813364736
NUMANode L#3 = 33800777728
NUMANode L#4 = 33813364736
NUMANode L#5 = 33813360640
NUMANode L#6 = 33767444480
NUMANode L#7 = 33802883072
Memory attribute #1 name `Locality' flags 2
NUMANode L#0 = 32
NUMANode L#1 = 32
NUMANode L#2 = 32
NUMANode L#3 = 32
NUMANode L#4 = 32
NUMANode L#5 = 32
NUMANode L#6 = 32
NUMANode L#7 = 32
Memory attribute #2 name `Bandwidth' flags 5
Memory attribute #4 name `ReadBandwidth' flags 5
Memory attribute #5 name `WriteBandwidth' flags 5
Memory attribute #3 name `Latency' flags 6
Memory attribute #6 name `ReadLatency' flags 6
Memory attribute #7 name `WriteLatency' flags 6
This causes hwloc-calc to report no best memory for these attributes:
# No memory reported when the attribute has no value:
$ hwloc-calc --oo --local-memory --best-memattr Latency machine:0
# Works fine when the attribute has values:
$ hwloc-calc --oo --local-memory --best-memattr Capacity machine:0
NUMANode:2
Whereas I would like it to print the first one (or better, all of them; see below).
Also, when all nodes have the same value for a given attribute, this command only returns the first one.
# Only the first node is returned when all values are equal:
$ hwloc-calc --oo --local-memory --best-memattr Locality socket:0
NUMANode:0
When actually they are all best memory.
I am asking for two things:
- When no attribute value is available, could we have a default, with all nodes treated as having the same value, so that hwloc-calc answers something?
- Could we have a mode or a new flag that makes --best-memattr return a list of nodes whenever they have the same value?
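The second request can be illustrated with a small sketch. This is not hwloc code: `best_nodes` is a hypothetical helper, and its only input is the `Capacity` listing from the `lstopo --memattrs` output in this report. It assumes that a higher capacity is better and that a tie means exactly equal values.

```python
# Hypothetical helper, not part of hwloc: return every node tied for the best value.
def best_nodes(values, higher_first=True):
    """values maps a NUMA node logical index to its attribute value."""
    if not values:
        return []  # no value recorded for this attribute
    best = max(values.values()) if higher_first else min(values.values())
    return sorted(node for node, value in values.items() if value == best)


# Capacity values copied from the lstopo --memattrs output in this report.
capacity = {
    0: 33613537280, 1: 33813360640, 2: 33813364736, 3: 33800777728,
    4: 33813364736, 5: 33813360640, 6: 33767444480, 7: 33802883072,
}

print(best_nodes(capacity))  # [2, 4]
print(best_nodes({}))        # []
```

With these numbers NUMANode:2 and NUMANode:4 share the largest capacity, whereas the hwloc-calc call on machine:0 reported only NUMANode:2.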
Best.
Hello.
In the case where you say "they are all best memory", we're using --local-memory. How many nodes are actually local to "socket:0" here? Only NUMANode:0, or others as well?
Returning a list of nodes is certainly possible. The current calc option is based on the API, which returns a single best node. Extending it is possible, but it would be an additional option such as --multiple-best.
Once we have that, returning all nodes if they have the same non-existing value should be easy too.
Oops, I forgot the topology. It's a dual-socket machine with 4 NUMA nodes per socket.
Here's a proposal for hwloc-calc (there's no change in the API yet, although I initially thought it would be strictly required).
On an SPR+HBM machine in SNC-4, we now return 4 local HBMs when asking for best bandwidth nodes near an entire socket:
$ hwloc-calc --local-memory --best-memattr bandwidth socket:1 --oo --sep " "
NUMANode:9 NUMANode:11 NUMANode:13 NUMANode:15
Previous releases returned nothing, and this behavior can still be obtained by adding a strict parameter, as in --best-memattr bandwidth,strict, which means only return memory targets whose best initiator contains the input one.
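To make the difference between the two modes concrete, here is a toy model, not the hwloc implementation. `local_best_targets` is a hypothetical function, the CPU numbers are made up, and the relaxed rule (any overlap between the input and a target's best initiator is enough) is an assumption; the actual rule in hwloc-calc may differ. Only the strict rule follows the description given here.

```python
# Hypothetical model, not hwloc code: each target node is paired with the CPU set
# of its best initiator. The CPU numbers below are invented for illustration.
def local_best_targets(best_initiator, input_cpus, strict=False):
    selected = []
    for node, initiator_cpus in sorted(best_initiator.items()):
        if strict:
            # Strict: the best initiator must contain the whole input.
            keep = input_cpus <= initiator_cpus
        else:
            # Assumed relaxed rule: any overlap with the input is enough.
            keep = bool(input_cpus & initiator_cpus)
        if keep:
            selected.append(node)
    return selected


# One socket split into four quarters, with one HBM node per quarter.
hbm_best_initiator = {
    9: set(range(0, 4)),
    11: set(range(4, 8)),
    13: set(range(8, 12)),
    15: set(range(12, 16)),
}
whole_socket = set(range(0, 16))

print(local_best_targets(hbm_best_initiator, whole_socket))               # [9, 11, 13, 15]
print(local_best_targets(hbm_best_initiator, whole_socket, strict=True))  # []
```

In this model no quarter contains the whole socket, so the strict mode returns nothing, which matches what previous releases did for a socket-wide input.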
There's also a default flag to return all nodes if no best is found. For instance on my laptop:
$ hwloc-calc --local-memory --best-memattr bandwidth socket:0 --oo
$ hwloc-calc --local-memory --best-memattr bandwidth,default socket:0 --oo
NUMANode:0
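The fallback shown in the laptop example can be sketched as follows. This is only an illustration: `select_nodes` and its `use_default` argument are hypothetical names, and the sketch assumes that higher values are better, as for bandwidth.

```python
# Hypothetical sketch, not hwloc code: pick the best local nodes, optionally
# falling back to all local nodes when the attribute has no value at all.
def select_nodes(local_nodes, values, use_default=False):
    valued = {n: values[n] for n in local_nodes if n in values}
    if not valued:
        # No best can be found: return everything only if the default is requested.
        return list(local_nodes) if use_default else []
    best = max(valued.values())  # assumes higher is better, as for bandwidth
    return [n for n in local_nodes if valued.get(n) == best]


# A machine with a single local node and no bandwidth values, like the laptop case.
print(select_nodes([0], {}))                    # []
print(select_nodes([0], {}, use_default=True))  # [0]
```

When values do exist, the fallback is never used and the result is the same with or without `use_default`.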
If that answers your need, I'll clean up and document all this before preparing a PR.
Looks like it is indeed answering my needs 👍
Tarball should be available for testing at https://ci.inria.fr/hwloc/job/basic/job/PR-657/ soon
Fixed in upcoming 2.11, thanks for the report.
I am posting 2.11rc1 right now with this fix.