ansible-slurm-appliance

No CPU frequency information in Grafana

Open sjpb opened this issue 4 years ago • 4 comments

Ticket: https://stackhpc.atlassian.net/browse/DEV-1017

It looks like cpu and cpufreq are already listed in environments/common/inventory/group_vars/all/prometheus.yml, though.

sjpb avatar Sep 30 '21 09:09 sjpb

Seen on alaska (on arcus), reproduced on smslabs

sjpb avatar Sep 30 '21 09:09 sjpb

Reproduced on arcus. This kernel module doesn't exist:

[root@dev-compute-0 rocky]# ls /lib/modules/$(uname -r)/kernel/arch/x86/kernel/cpu/cpufreq/

Tried installing cpupowerutils, then searching for cpufreq modules and loading acpi-cpufreq:

yum install cpupowerutils
[root@dev-compute-0 rocky]# find /lib/modules -type f -iname "*freq*"
/lib/modules/4.18.0-348.el8.0.2.x86_64/kernel/drivers/cpufreq/acpi-cpufreq.ko.xz
/lib/modules/4.18.0-348.el8.0.2.x86_64/kernel/drivers/cpufreq/amd_freq_sensitivity.ko.xz
/lib/modules/4.18.0-348.23.1.el8_5.x86_64/kernel/drivers/cpufreq/acpi-cpufreq.ko.xz
/lib/modules/4.18.0-348.23.1.el8_5.x86_64/kernel/drivers/cpufreq/amd_freq_sensitivity.ko.xz
[root@dev-compute-0 rocky]# modprobe acpi-cpufreq
modprobe: ERROR: could not insert 'acpi_cpufreq': No such device
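
The "No such device" error suggests the driver found nothing it could bind to on this node. If so, the kernel would not be expected to expose a cpufreq directory under /sys/devices/system/cpu either. A minimal sketch for checking that follows. The helper name has_cpufreq and its optional root argument are hypothetical, added only so the check can be pointed at a test directory.

```shell
#!/bin/sh
# Hypothetical helper: report whether any CPU exposes a cpufreq directory.
# $1 is an optional sysfs CPU root, defaulting to the real one.
has_cpufreq() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # An unmatched glob stays literal, so test that the path exists.
        if [ -d "$dir" ]; then
            return 0
        fi
    done
    return 1
}

if has_cpufreq; then
    echo "cpufreq is exposed in sysfs"
else
    echo "no cpufreq directory in sysfs; no scaling driver appears to be active"
fi
```

If the second message is printed, a missing frequency metric is probably a property of the node (for example a VM with no frequency scaling passed through) rather than of the exporter configuration, though that is an assumption to confirm per site.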

sjpb avatar May 26 '22 10:05 sjpb

The ohpc dashboard uses the node_cpu_scaling_frequency_hertz metric.

The checks below are based on https://superuser.com/questions/1624080/why-there-is-no-cpufreq-under-sys-devices-system-cpu-cpu0

[rocky@cpuinfo-compute-0 ~]$ curl http://localhost:9100/metrics | grep node_cpu_scaling_frequency_hertz
<nothing>
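
The empty result above can be turned into a scriptable check, e.g. to run across several nodes. This is only a sketch: the helper name count_metric is hypothetical. It reads Prometheus text-format output on stdin and counts the sample lines for one metric, skipping the "# HELP" and "# TYPE" comment lines.

```shell
#!/bin/sh
# Hypothetical helper: count sample lines for metric $1 in Prometheus
# text-format output read from stdin. Comment lines start with '#',
# so anchoring the pattern at the start of the line skips them.
count_metric() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # '|| true' keeps the helper's exit status at 0 in that case.
    grep -c "^$1[{ ]" || true
}

# Usage against the node exporter endpoint queried above:
#   curl -s http://localhost:9100/metrics | count_metric node_cpu_scaling_frequency_hertz
```

A count of 0 matches the empty grep output shown above.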

[rocky@cpuinfo-compute-0 ~]$ cat /boot/config-$(uname -r) | grep CONFIG_CPU_FREQ
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

[rocky@cpuinfo-compute-0 ~]$ cat /boot/config-$(uname -r) | grep CONFIG_X86_ACPI_CPUFREQ
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_X86_ACPI_CPUFREQ_CPB=y

[rocky@cpuinfo-compute-0 ~]$ cat /boot/config-$(uname -r) | grep CONFIG_X86_INTEL_PSTATE
CONFIG_X86_INTEL_PSTATE=y
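
In these outputs "=y" means the option is built into the kernel and "=m" means it is built as a loadable module, so both the intel_pstate and acpi-cpufreq drivers are available in this kernel build. The three greps can be folded into one loop, sketched below. The helper name kconfig_state is hypothetical, and its second argument exists only so a different config file can be supplied.

```shell
#!/bin/sh
# Hypothetical helper: print the state of kernel option $1 in config file $2
# (defaults to the running kernel's config, as in the commands above).
kconfig_state() {
    cfg="${2:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "$1: unknown (cannot read $cfg)"
    elif grep -q "^$1=y" "$cfg"; then
        echo "$1: built in"
    elif grep -q "^$1=m" "$cfg"; then
        echo "$1: module"
    else
        echo "$1: not set"
    fi
}

for opt in CONFIG_CPU_FREQ CONFIG_X86_ACPI_CPUFREQ CONFIG_X86_INTEL_PSTATE; do
    kconfig_state "$opt"
done
```

Matching on "^OPTION=" keeps CONFIG_CPU_FREQ from also matching the CONFIG_CPU_FREQ_GOV_* lines.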

sjpb avatar Oct 06 '22 13:10 sjpb

See https://wiki.stackhpc.com/doc/cpu-frequency-RI0IAojfQ7 for more debugging.

sjpb avatar Oct 11 '22 14:10 sjpb