About memory usage of machine_usage.csv

Open odingzx opened this issue 6 years ago • 4 comments

After analyzing machine_usage.csv, I found that about 50% of machine memory is used by neither instances nor containers. For example, for machine_id = 'm_2824' at time_stamp = 461830, all instances used about 2 of memory, all instances used about 41, and the machine usage at that timestamp is 91. The same situation appears for machine_id = 'm_1825' at time_stamp = 691000. I wonder whether the machine's system and the management of containers and VMs account for the remaining 50% of memory? Thanks

odingzx, Apr 16 '19 08:04

Hi,

  1. For batch tasks, the resource usage is calculated as sum(sample_usage) / Number_of_samples, i.e. an average over samples, while the machine usage is an instantaneous value. Also, resource usage is not logged for failed tasks.
  2. For online containers, 2 does seem to be abnormal data. Maybe the data is wrong, or something is wrong with the machine.
  3. Caching can also take up a lot of memory on the machine.
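
To illustrate point 1, here is a minimal sketch of how an averaged value can sit far below what an instantaneous reading would show. The sample values are made up for illustration and do not come from the trace; only the formula sum(sample_usage) / Number_of_samples is taken from the description above.

```python
# Hypothetical memory samples (in percent) for one batch instance.
# These numbers are invented; they are NOT taken from the trace.
samples = [5.0, 10.0, 60.0, 5.0]


def average_usage(sample_usage):
    """Average in the style of mem_avg: sum(sample_usage) / Number_of_samples."""
    return sum(sample_usage) / len(sample_usage)


mem_avg = average_usage(samples)  # what an averaged column would report
peak = max(samples)               # what an instantaneous reading could see at the peak
gap = peak - mem_avg              # memory that "disappears" when comparing avg to instantaneous

print(f"mem_avg={mem_avg}, peak={peak}, gap={gap}")
```

If the machine reading happens to be taken near the peak, the averaged batch value understates the batch share by the size of that gap.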

If this is an isolated case, don't worry about it. Hope this helps.

ChangZihao, Apr 23 '19 07:04

Hi, thanks for your explanation!

  1. The machine usage is an instantaneous value, and it is the sum of the resource usage of the batch instances and containers running on it, like this: [image: Drawing 1]
  2. I made a mistake: the memory used by all containers is about 41 and the memory used by all instances is about 2, because I found many 0 and null values in batch_instance.csv, in the columns 'mem_avg' and 'mem_max'.
  3. This is not an isolated phenomenon. Maybe all of these cases are caused by the 0 and null values in batch_instance.csv?

odingzx, Apr 25 '19 04:04

One more question: is caching not included in the task or container consumption? Thanks!

odingzx, Apr 25 '19 04:04

Hi, for your questions 1, 2 and 3, you can find the answer in point 1 of my first reply.

Now that you have corrected your numbers (container: 41, batch: 2), the utilization of batch tasks is the key:

  1. Resource usage is not logged for failed tasks (this is why there are many 0 and null values).
  2. In batch_instance.csv, the columns 'mem_avg' and 'mem_max' are not instantaneous values. Take mem_avg for example: mem_avg = sum(sample_usage) / Number_of_samples. So machine_mem (instantaneous) != container (instantaneous) + batch (avg).
  3. Make sure you are using the correct method to compute the utilization of batch tasks.
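
As a rough sketch of the points above, the batch share at a given timestamp could be summed while counting how many instances have no logged usage. The row layout is an assumption for illustration: only 'mem_avg', machine_id and time_stamp appear in this thread, while 'start_time' and 'end_time' as field names, the helper names, and all row values are hypothetical.

```python
def is_missing(value):
    """Treat None, empty strings, and 0 as 'usage not logged'."""
    if value is None or value == "":
        return True
    return float(value) == 0.0


def batch_memory_at(instances, machine_id, time_stamp):
    """Sum mem_avg over instances active on machine_id at time_stamp.

    Returns (total_mem_avg, skipped), where skipped counts active instances
    whose mem_avg is 0 or null and therefore contributes nothing to the sum.
    """
    total = 0.0
    skipped = 0
    for row in instances:
        if row["machine_id"] != machine_id:
            continue
        if not (row["start_time"] <= time_stamp <= row["end_time"]):
            continue
        if is_missing(row["mem_avg"]):
            skipped += 1
            continue
        total += float(row["mem_avg"])
    return total, skipped


# Made-up rows, NOT taken from batch_instance.csv.
instances = [
    {"machine_id": "m_1", "start_time": 100, "end_time": 200, "mem_avg": "1.5"},
    {"machine_id": "m_1", "start_time": 120, "end_time": 180, "mem_avg": ""},
    {"machine_id": "m_1", "start_time": 120, "end_time": 180, "mem_avg": "0"},
    {"machine_id": "m_1", "start_time": 300, "end_time": 400, "mem_avg": "2.0"},
    {"machine_id": "m_2", "start_time": 100, "end_time": 200, "mem_avg": "3.0"},
]

batch_total, skipped = batch_memory_at(instances, "m_1", 150)

# Hypothetical instantaneous readings for the same machine and timestamp.
machine_mem = 50.0
container_mem = 20.0
unexplained = machine_mem - container_mem - batch_total

print(f"batch={batch_total}, skipped={skipped}, unexplained={unexplained}")
```

A large `skipped` count means the batch sum is a lower bound, and even the counted part is an average, so `unexplained` should be read as an estimate rather than as real overhead.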

FYI :)

ChangZihao, Apr 28 '19 06:04