A question about each sampling record

Open FindHao opened this issue 3 years ago • 2 comments

Using the GetAllSinceLastCall_v2 API, we can obtain more sampling records. I have a question about these records. For example, if we monitor DCGM_FI_PROF_PIPE_FP32_ACTIVE, each record has a timestamp and a value. Is the value the maximum FP32 active ratio over the period from the previous timestamp to the current one, or the average FP32 active ratio?
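To make the question concrete, each record can be modeled as a (timestamp, value) pair, and the period a record covers is the gap to the previous timestamp. This is only a minimal sketch: `Record` and `record_durations_ms` are hypothetical names, not part of the DCGM bindings, the microsecond unit is an assumption, and the values are made up rather than real DCGM output.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    timestamp_us: int  # assumed unit: microseconds
    value: float       # e.g. an FP32 active ratio in [0, 1]


def record_durations_ms(records: List[Record]) -> List[float]:
    """Return the gap in milliseconds between consecutive records."""
    return [
        (curr.timestamp_us - prev.timestamp_us) / 1000.0
        for prev, curr in zip(records, records[1:])
    ]


# Made-up sample data, not real DCGM output.
records = [Record(0, 0.10), Record(1500, 0.40), Record(3000, 0.25)]
durations = record_durations_ms(records)
```

The question is then what each `value` summarizes over its corresponding gap.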

FindHao, Jan 11 '23 14:01

Profiling metrics are the average over the update interval (the updateFrequency parameter).

glowkey, Jan 12 '23 15:01

> Profiling metrics are the average over the update interval (the updateFrequency parameter).

@glowkey Thanks for your reply! I forgot to explain my assumptions. I found that each record covers about 1.5 ms when only DCGM_FI_PROF_PIPE_FP32_ACTIVE is enabled. Does DCGM sample internally at a finer interval than this, or is 1.5 ms the minimum interval? If DCGM samples at a finer interval, then each record I see aggregates many smaller, unexposed samples, and the average is the mean over those samples. If 1.5 ms is the minimum interval, then the FP32 active value is FP32_active_cycle / all_cycle_of_a_duration, that is, the active cycles divided by all cycles in that interval.
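The two interpretations can be written down as a minimal sketch. The cycle counts are made up, the function names are hypothetical, and nothing here reflects how DCGM actually computes the metric:

```python
from typing import List


def mean_of_subsamples(sub_ratios: List[float]) -> float:
    """Interpretation 1: average of finer-grained, unexposed sample ratios."""
    return sum(sub_ratios) / len(sub_ratios)


def cycle_ratio(active_cycles: List[int], total_cycles: List[int]) -> float:
    """Interpretation 2: active cycles divided by all cycles in the interval."""
    return sum(active_cycles) / sum(total_cycles)


# Hypothetical data: one record split into three equal sub-intervals.
total = [1000, 1000, 1000]
active = [100, 400, 250]

sub_ratios = [a / t for a, t in zip(active, total)]
avg = mean_of_subsamples(sub_ratios)
ratio = cycle_ratio(active, total)
```

When the sub-intervals are all the same length, the two quantities are equal, so the distinction only matters if the hidden samples span different numbers of cycles and are averaged without weighting.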

Could you tell me which one is correct if you know?

FindHao, Jan 12 '23 15:01