Memory Bandwidth and Utilization in Cluster Trace 2018
Hello,
I have a question about memory usage in cluster trace 2018.
Does mem_gps (normalized memory bandwidth, see the figure below) indicate what percentage of the machine's memory bandwidth is in use? For example, if the total memory bandwidth of a machine is 100 GB/s and mem_gps shows 10%, then the machine is currently using 10 GB/s of memory bandwidth. Is this right?
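To make the interpretation I am asking about concrete, here is a minimal sketch of the arithmetic. It assumes mem_gps is a percentage of a machine's peak memory bandwidth, which is exactly the point I would like confirmed; the function name and the 100 GB/s peak are only illustrative.

```python
def absolute_bandwidth_gbps(peak_gbps: float, mem_gps_percent: float) -> float:
    """Convert a normalized bandwidth percentage to GB/s.

    Assumption (unconfirmed): mem_gps is a percentage of the machine's
    peak memory bandwidth.
    """
    return peak_gbps * mem_gps_percent / 100.0


# Example from the question: 100 GB/s peak, mem_gps = 10%
print(absolute_bandwidth_gbps(100.0, 10.0))  # 10.0
```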
The problem is that memory capacity usage in the cluster trace is quite high (e.g., 80%), while mem_gps is always below 5%. I am not sure whether this is correct. If it is, it would seem that the jobs occupy a large amount of memory but use very little of the memory bandwidth.
Thanks
PS: here is the schema of cluster trace 2018:
Here is an example: one machine usage record that I picked at random from the cluster trace.
In it, memory capacity usage is above 90% (column 4), whereas memory bandwidth usage is only 3% (column 5).
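For anyone who wants to check the same thing on their own copy of the data, here is a rough sketch of how the two columns could be compared. It assumes a comma-separated record in which column 4 is memory capacity usage and column 5 is mem_gps, both in percent, as described above. The sample row is made up for illustration and is not taken from the trace, and the 80% and 5% thresholds are just the figures mentioned in this question.

```python
import csv
import io

# Made-up record for illustration only; NOT real trace data.
# Only columns 4 and 5 (1-based) matter for this check.
SAMPLE = "m_example,1000,35,92,3\n"


def memory_columns(row):
    """Return (capacity_pct, bandwidth_pct) from columns 4 and 5 (1-based)."""
    return float(row[3]), float(row[4])


def looks_mismatched(capacity_pct, bandwidth_pct,
                     cap_threshold=80.0, bw_threshold=5.0):
    """True when capacity usage is high but normalized bandwidth is low."""
    return capacity_pct >= cap_threshold and bandwidth_pct < bw_threshold


for row in csv.reader(io.StringIO(SAMPLE)):
    cap, bw = memory_columns(row)
    print(cap, bw, looks_mismatched(cap, bw))
```

Running this over a real usage file (by replacing the in-memory sample with an opened file) would show how often high capacity usage coincides with low mem_gps.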