zabbix-nvidia-smi-integration
Steps towards a template for AMD based cards
I've been asked by email how something similar might work for AMD-based cards. It seemed worth developing, so I outline the steps here in case someone wants to try.
I suspect the main body of the XML template would remain the same. The main changes would be to the Zabbix agent configuration, where commands such as the following would have to change:
UserParameter=gpu.temp,nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits -i 0
One forum thread (https://community.amd.com/thread/167544) suggests amdconfig as the AMD counterpart to nvidia-smi, while another (https://askubuntu.com/questions/244577/temperature-and-other-statistics-from-radeon-open-source-drivers) suggests RadeonTop.
The first thread provides several commands, such as:
amdconfig --adapter=$1 --odgt | grep 'Temperature' | cut -d'-' -f2 | cut -d'.' -f1 | tr -d ' '
amdconfig --adapter=$1 --odgc | grep 'GPU load' | cut -f1 -d'%' | cut -f2 -d':'| tr -d ' '
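I believe these can be converted into the equivalent commands for the Zabbix template provided here. Because the commands take the adapter index as $1, the keys need to be declared as flexible user parameters (key[*]); the template's item keys would then pass the index, e.g. gpu.temp[0]:
UserParameter=gpu.temp[*],amdconfig --adapter=$1 --odgt | grep 'Temperature' | cut -d'-' -f2 | cut -d'.' -f1 | tr -d ' '
UserParameter=gpu.utilisation[*],amdconfig --adapter=$1 --odgc | grep 'GPU load' | cut -f1 -d'%' | cut -f2 -d':'| tr -d ' '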
The two lines above use grep and cut to select the correct part of the amdconfig output. In my own commands I deliberately had nvidia-smi limit its output, so there was no need to parse it with text-processing commands afterwards. This was one of the main advances I made over the gist: https://gist.github.com/bhcopeland/b54d3c678a0cb6e87119. Commands such as "grep" and "cut" may lead to selecting the wrong piece of data, for example if "temperature" appeared on multiple lines of the amdconfig output.
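One way to reduce that risk, if amdconfig cannot limit its own output, might be to make the parsing step refuse to guess. The sketch below is only an illustration: parse_single_temperature is a hypothetical helper name, and it assumes the temperature line looks like "Sensor 0: Temperature - 47.00 C", as the cut commands from the forum thread imply. I have not verified this against real amdconfig output.

```shell
# Hypothetical helper (not part of the template): reads amdconfig-style
# output on stdin and prints the integer temperature, but only when
# exactly one line mentions "Temperature".
# Assumed line format: "Sensor 0: Temperature - 47.00 C".
parse_single_temperature() {
    awk -F' - ' '
        /Temperature/ { count++; value = $2 }
        END {
            if (count != 1) exit 1    # zero or several matches: refuse to guess
            sub(/\..*/, "", value)    # drop the fractional part and the unit
            gsub(/ /, "", value)      # strip any remaining spaces
            print value
        }
    '
}
```

It would be used as amdconfig --adapter=0 --odgt | parse_single_temperature. The helper exits with a non-zero status and prints nothing when the output is ambiguous, rather than returning a value from the wrong line. If it is placed in a script file and called from the UserParameter line, the awk field reference $2 does not clash with Zabbix's own $1...$9 substitution.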