
Memory Bandwidth and Utilization in Cluster Trace 2018

Open yanbozyb opened this issue 1 year ago • 1 comment

Hello,

I have a question about memory usage in cluster trace 2018.

Does mem_gps (normalized memory bandwidth, see the figure below) represent the percentage of the machine's memory bandwidth that is in use? For example, if a machine's total memory bandwidth is 100 GB/s and mem_gps shows 10%, is the machine currently using 10 GB/s of bandwidth?
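To make the arithmetic in this question concrete, here is a minimal sketch. It assumes the interpretation being asked about (mem_gps as a percentage of peak bandwidth), which is not confirmed; the function name and the 100 GB/s peak are hypothetical.

```python
def bandwidth_in_use_gbps(mem_gps_percent: float, peak_bandwidth_gbps: float) -> float:
    """Convert a normalized bandwidth value to GB/s.

    ASSUMPTION: mem_gps is a percentage of the machine's peak memory
    bandwidth. This is the interpretation under question, not a
    documented fact about the trace.
    """
    return peak_bandwidth_gbps * mem_gps_percent / 100.0


# Hypothetical machine with 100 GB/s of peak bandwidth and mem_gps = 10.
print(bandwidth_in_use_gbps(10.0, 100.0))
```

If the interpretation holds, a mem_gps below 5 on such a machine would correspond to less than 5 GB/s.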

The problem I see is that memory capacity usage in the cluster trace is quite high (e.g., 80%), while mem_gps is always below 5%. I am not sure whether this is correct. If it is, it seems that the jobs occupy a large amount of memory but use very little of its bandwidth.

Thanks

PS: here is the schema of cluster trace 2018: image

yanbozyb, Dec 20 '24 18:12

Here is an example. I randomly picked one piece of machine usage data from the cluster trace.

As you can see, the memory capacity usage is more than 90% (column 4), while the memory bandwidth usage is only 3% (column 5).

image
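A check like the one in this example could be run over many rows with a short script. This is only a sketch: the column names and their order are my assumption from the schema (with memory utilization in column 4 and mem_gps in column 5), the thresholds are the 80% and 5% figures mentioned above, and the sample rows are made up for illustration, not taken from the trace.

```python
import csv
import io

# ASSUMPTION: column names and order for the machine usage table.
# Verify against the trace documentation before relying on them.
COLUMNS = [
    "machine_id", "time_stamp", "cpu_util_percent", "mem_util_percent",
    "mem_gps", "mkpi", "net_in", "net_out", "disk_io_percent",
]


def high_capacity_low_bandwidth(lines, mem_util_min=80.0, mem_gps_max=5.0):
    """Return (machine_id, mem_util, mem_gps) for rows with high memory
    capacity usage but low normalized memory bandwidth."""
    hits = []
    for row in csv.DictReader(lines, fieldnames=COLUMNS):
        try:
            mem_util = float(row["mem_util_percent"])
            mem_gps = float(row["mem_gps"])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if mem_util >= mem_util_min and mem_gps < mem_gps_max:
            hits.append((row["machine_id"], mem_util, mem_gps))
    return hits


# Made-up rows for illustration only.
sample = io.StringIO(
    "m_1,100,30,92,3.0,,40.1,30.2,5\n"
    "m_2,100,55,60,7.5,,20.0,10.0,3\n"
    "m_3,100,20,85,,,,,\n"
)
print(high_capacity_low_bandwidth(sample))
```

To run it on the real data, pass an open file handle for the machine usage CSV instead of the sample.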

yanbozyb, Dec 20 '24 18:12