Annotation to indicate whether/how a (Guaranteed QoS class) container prefers high memory bandwidth vs. low memory latency.
A container should be able to indicate whether it is more sensitive to memory bandwidth or to memory latency. Currently, CRI-RM implicitly assumes that all containers prefer low memory latency, even if that reduces the maximum memory bandwidth available on the current hardware configuration.
If a container is annotated to prefer more memory bandwidth, CRI-RM running with SNC (Sub-NUMA Clustering) enabled could decide to allocate the container to a full die, a full socket, or even the root of the pool tree.
A good starting point might be to allow a memoryPreference annotation that indicates where the container falls on the scale from full latency optimization to maximum bandwidth. It could take 3 values [or maybe 5 if we want more expressiveness]:
memory-preference: {minimum-latency (default), [latency-weighted], balanced, [bandwidth-weighted], maximum-bandwidth}
Latency sensitivity would force containers as low in the pool tree as possible (which is also the current/default behavior). Bandwidth sensitivity would gradually lift the container in the pool tree closer to the root, so that it gets pinned to an increasing number of memory controllers/NUMA nodes.
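
To make the idea more concrete, here is a rough Go sketch of how the preference could translate into a position in the pool tree. Everything in it is hypothetical and is not existing CRI-RM code: the `MemoryPreference` type, the `Pool` struct, `ParseMemoryPreference`, and `PickPool` are made up for illustration, and the linear mapping from preference to the number of levels lifted is only one possible choice. It assumes a simple root → socket → die → NUMA node hierarchy.

```go
package main

import "fmt"

// MemoryPreference is a hypothetical type for the proposed annotation value.
type MemoryPreference int

const (
	MinimumLatency MemoryPreference = iota // proposed default
	LatencyWeighted
	Balanced
	BandwidthWeighted
	MaximumBandwidth
)

// preferenceByName maps the annotation strings from the proposal to values.
var preferenceByName = map[string]MemoryPreference{
	"minimum-latency":    MinimumLatency,
	"latency-weighted":   LatencyWeighted,
	"balanced":           Balanced,
	"bandwidth-weighted": BandwidthWeighted,
	"maximum-bandwidth":  MaximumBandwidth,
}

// ParseMemoryPreference converts an annotation value to a preference.
// A missing (empty) annotation falls back to the current behavior.
func ParseMemoryPreference(value string) (MemoryPreference, error) {
	if value == "" {
		return MinimumLatency, nil
	}
	pref, ok := preferenceByName[value]
	if !ok {
		return MinimumLatency, fmt.Errorf("unknown memory-preference %q", value)
	}
	return pref, nil
}

// Pool is a hypothetical, stripped-down pool tree node.
type Pool struct {
	Name   string
	Parent *Pool // nil for the root
}

// Depth returns the number of ancestors between this pool and the root.
func (p *Pool) Depth() int {
	depth := 0
	for n := p.Parent; n != nil; n = n.Parent {
		depth++
	}
	return depth
}

// PickPool starts from the leaf that the current (latency-optimized)
// allocation would choose and lifts the container towards the root.
// The lift grows linearly with the preference (rounded down), so only
// maximum-bandwidth is guaranteed to reach the root.
func PickPool(leaf *Pool, pref MemoryPreference) *Pool {
	lift := int(pref) * leaf.Depth() / int(MaximumBandwidth)
	pool := leaf
	for i := 0; i < lift; i++ {
		pool = pool.Parent
	}
	return pool
}

func main() {
	root := &Pool{Name: "root"}
	socket := &Pool{Name: "socket #0", Parent: root}
	die := &Pool{Name: "die #0", Parent: socket}
	node := &Pool{Name: "NUMA node #0", Parent: die}

	for _, value := range []string{"", "balanced", "maximum-bandwidth", "bogus"} {
		pref, err := ParseMemoryPreference(value)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", value, PickPool(node, pref).Name)
	}
}
```

A real implementation would still need to check that the chosen pool has enough free CPU and memory, and to decide how the intermediate values should behave on shallower trees (for example when SNC is disabled), where several preferences may end up mapping to the same level.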