DCGM
Old data are copied into new data in dcgmGroupSamples.GetAllSinceLastCall
Hi,
I am using the Python bindings of DCGM 3.2.6.
When I run my Python script, which calls GetAllSinceLastCall in DcgmGroupSamples, it seems that old data from previous runs of the script (stored in a cache?) are also copied into values.
This is the output of the first run of the script, which prints the field values:
(xformers) yangyang22@workers-st-p4de-107:/mnt/fsx-home/yangyang22/projects/xformers/xformers/profiler$ python3 dcgm_example.py -p 1158114
Connecting to a standalone hostengine with auto opmode...
time instance|value: 0|0.47234506497546147
time instance|value: 1|0.47234506497546147
time instance|value: 2|0.4753820541153355
time instance|value: 3|0.47537324481855675
time instance|value: 4|0.4753905944019482
time instance|value: 5|0.4753868239508041
time instance|value: 6|0.4754042738288565
When I run the same Python script again, the old field values from the previous run are still there (time instances 1-6 above are the same as time instances 0-5 below):
time instance|value: 0|0.47234506497546147
time instance|value: 1|0.4753820541153355
time instance|value: 2|0.47537324481855675
time instance|value: 3|0.4753905944019482
time instance|value: 4|0.4753868239508041
time instance|value: 5|0.4754042738288565
time instance|value: 6|0.4969533182061546
time instance|value: 7|0.4969533182061546
time instance|value: 8|0.4753617928144854
time instance|value: 9|0.4753946481315652
time instance|value: 10|0.4753674762672508
time instance|value: 11|0.4753788155863566
time instance|value: 12|0.4754120235068984
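The overlap between the two runs can be checked directly from the printed values. A small sketch, with the lists copied from the output above (the names run1, run2 and overlap are only for illustration):

```python
# Values printed by the first run (time instances 0-6).
run1 = [
    0.47234506497546147, 0.47234506497546147, 0.4753820541153355,
    0.47537324481855675, 0.4753905944019482, 0.4753868239508041,
    0.4754042738288565,
]

# Values printed by the second run (time instances 0-12).
run2 = [
    0.47234506497546147, 0.4753820541153355, 0.47537324481855675,
    0.4753905944019482, 0.4753868239508041, 0.4754042738288565,
    0.4969533182061546, 0.4969533182061546, 0.4753617928144854,
    0.4753946481315652, 0.4753674762672508, 0.4753788155863566,
    0.4754120235068984,
]

# Time instances 1-6 of the first run reappear as 0-5 of the second run.
overlap = run1[1:7] == run2[0:6]
print(overlap)
```

For these values the comparison is true, so only time instances 6-12 of the second run look like new samples.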
The code is as follows:
dcgmFieldGroup = pydcgm.DcgmFieldGroup(dcgmHandle, name="Profiling", fieldIds=[1004])
dcgmGroupSamples = pydcgm.DcgmGroupSamples(dcgmHandle, dcgmGroup.GetId(), dcgmGroup)
dcgmGroupSamples.WatchFields(dcgmFieldGroup, 1000000, 3600, 0)
# first call: create the results collection (profiling runs in the background)
profiling_results = dcgmGroupSamples.GetAllSinceLastCall(None, dcgmFieldGroup)
# replace this by the code that should be profiled
time.sleep(5)
# collect profiling results
dcgmGroupSamples.GetAllSinceLastCall(profiling_results, dcgmFieldGroup)
# print profiling results
for gpu_id in profiling_results.values.keys():
    for field_id in profiling_results.values[gpu_id].keys():
        for time_instance, gpu_field_time in enumerate(profiling_results.values[gpu_id][field_id]):
            print(f"time instance|value: {time_instance}|{gpu_field_time.value}")
What is the proper way to get the data just from the current run? Thanks a lot.
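One possible workaround, sketched here under assumptions rather than as the intended DCGM usage: record the time at which the current run starts and drop every sample that is older. The sketch assumes that each sample exposes a ts attribute holding microseconds since the Unix epoch, and that the extra samples come from the hostengine keeping earlier data (the third WatchFields argument above, 3600, appears to be the keep age in seconds). Sample, now_usec and samples_since are hypothetical names for illustration and are not part of the DCGM bindings.

```python
import time
from collections import namedtuple

# Hypothetical stand-in for a DCGM sample. Assumption: real samples expose
# `ts` (microseconds since the Unix epoch) next to `value`.
Sample = namedtuple("Sample", ["ts", "value"])


def now_usec():
    """Current wall-clock time in microseconds since the Unix epoch."""
    return int(time.time() * 1_000_000)


def samples_since(samples, start_usec):
    """Return only the samples whose timestamp is at or after start_usec."""
    return [s for s in samples if s.ts >= start_usec]


# Illustrative data: two samples left over from an earlier run, two new ones.
run_start = 1_000_000
cached = [Sample(ts=500_000, value=0.472), Sample(ts=900_000, value=0.475)]
fresh = [Sample(ts=1_000_000, value=0.497), Sample(ts=2_000_000, value=0.475)]

current = samples_since(cached + fresh, run_start)
print([s.value for s in current])
```

In the script above, run_start would be taken with now_usec() just before the WatchFields call, and samples_since would be applied to each list in profiling_results.values[gpu_id][field_id] before printing. Whether the samples really carry a ts field in these units should be checked against the installed bindings.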