node_exporter
err="no CPU power status has been recorded" on M3
Host operating system: output of uname -a
Darwin MacBook-Pro.local 23.1.0 Darwin Kernel Version 23.1.0: Mon Oct 9 21:32:11 PDT 2023; root:xnu-10002.41.9~7/RELEASE_ARM64_T6030 x86_64
node_exporter version: output of node_exporter --version
node_exporter, version 1.7.0 (branch: HEAD, revision: 7333465abf9efba81876303bb57e6fadb946041b) build user: root@192f292aac5e build date: 20231112-23:56:56 go version: go1.19.12 platform: darwin/amd64 tags: netgo osusergo static_build
node_exporter command line flags
What did you expect to see?
What did you see instead?
The thermal collector should be disabled on M1, M2, and M3 machines (equivalent to passing --no-collector.thermal).
dongjiang@MacBook Pro:prometheus-operator $ pmset -g thermlog
Note: No thermal warning level has been recorded
Note: No performance warning level has been recorded
Note: No CPU power status has been recorded
ref: https://github.com/prometheus/node_exporter/issues/2218
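The pmset output above can also be checked programmatically. The following is a minimal sketch in Go, not part of node_exporter: the helper names `thermlogNotes` and `cpuPowerStatusMissing` are hypothetical, and it assumes the `Note: ...` line format shown in this report.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// thermlogNotes returns the text of every "Note: ..." line found in the
// output of `pmset -g thermlog`. (Hypothetical helper for illustration.)
func thermlogNotes(output string) []string {
	var notes []string
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "Note:") {
			notes = append(notes, strings.TrimSpace(strings.TrimPrefix(line, "Note:")))
		}
	}
	return notes
}

// cpuPowerStatusMissing reports whether pmset said that no CPU power status
// has been recorded, which is the condition described in this issue.
func cpuPowerStatusMissing(output string) bool {
	for _, note := range thermlogNotes(output) {
		if note == "No CPU power status has been recorded" {
			return true
		}
	}
	return false
}

func main() {
	// Sample taken from the pmset output quoted in this issue.
	sample := `Note: No thermal warning level has been recorded
Note: No performance warning level has been recorded
Note: No CPU power status has been recorded
`
	fmt.Println(cpuPowerStatusMissing(sample))
}
```

On a real machine the input would come from running `pmset -g thermlog`; the sketch takes a string so it can be tried on any platform.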
I was able to reproduce this issue on master. However, I don't think this bug is within the scope of node_exporter; it needs to be addressed on the IOPMCopyCPUPowerStatus side (the underlying API the thermal collector relies on for this data).
It's also worth mentioning that the repository follows all the recommended practices in that API's documentation (such as releasing allocated CFDictionaryRef memory and checking returned error codes).
A simplified version of the C-based collection logic shows that while node_exporter is responsible for exposing the data, it's the response from the API that is unexpected in nature.
package main

/*
#cgo LDFLAGS: -framework IOKit -framework CoreFoundation
#include <stdio.h>
#include <CoreFoundation/CoreFoundation.h>
#include <IOKit/IOKitLib.h>
#include <IOKit/pwr_mgt/IOPMLib.h>
#include <IOKit/pwr_mgt/IOPM.h>

struct ref_with_ret {
    CFDictionaryRef ref;
    IOReturn ret;
};

struct ref_with_ret FetchThermal();

struct ref_with_ret FetchThermal() {
    CFDictionaryRef ref;
    IOReturn ret;
    ret = IOPMCopyCPUPowerStatus(&ref);
    struct ref_with_ret result = {
        ref,
        ret,
    };
    return result;
}
*/
import "C"

import "fmt"

func main() {
	result := C.FetchThermal()
	/*
		kIOPMCPUPowerLimitProcessorSpeedKey: 100, // cpu_scheduler_limit_ratio
		kIOPMCPUPowerLimitProcessorCountKey: 4, // cpu_available_cpu
		kIOPMCPUPowerLimitSchedulerTimeKey: 100 // cpu_speed_limit_ratio
	*/
	fmt.Println(result.ref)                         // 0
	fmt.Println(result.ret == C.kIOReturnNotFound)  // true
	fmt.Println(result.ret != C.kIOReturnSuccess)   // true
}
Additionally, ioreg -n IOPMrootDomain -r | grep IOPMCPUPowerLimitProcessorSpeedKey fails as well.