windows_exporter
Thermalzone not working
Hi,
My exporter does not expose any metrics regarding the thermalzone. It is enabled however:
windows_exporter_collector_duration_seconds{collector="thermalzone"} 0.0060052
windows_exporter_collector_success{collector="thermalzone"} 1
windows_exporter_collector_timeout{collector="thermalzone"} 0
[System.Environment]::OSVersion.Version
Major Minor Build Revision
----- ----- ----- --------
10 0 19042 0
Exporter version: Starting windows_exporter (version=0.16.0, branch=master, revision=f316d81d50738eb0410b0748c5dcdc6874afe95a)
I run windows exporter with the following arguments:
"C:\Program Files\windows_exporter\windows_exporter.exe" --log.format logger:eventlog?name=windows_exporter --telemetry.addr :9182 --collectors.enabled cpu,cs,logical_disk,logon,memory,net,os,process,service,system,tcp,time,thermalzone,textfile
I'm a Linux engineer, so I have no clue how to troubleshoot something like this. Please advise, thank you!
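A quick way to confirm the symptom from the exporter side is to scrape the metrics endpoint and look for thermal zone samples. This is a minimal sketch, assuming the exporter listens on localhost:9182 (the port from the arguments above) and that the collector's metrics share the windows_thermalzone_ prefix. Select-MetricLine is a hypothetical helper name, not part of windows_exporter.

```powershell
# Hypothetical helper: return sample lines (not "# HELP"/"# TYPE" comments)
# whose metric name starts with the given prefix.
function Select-MetricLine {
    param([string]$Text, [string]$Prefix)
    $Text -split "`r?`n" | Where-Object { $_.StartsWith($Prefix) }
}

# Assumption: the exporter is reachable on the default port used in this thread.
$body  = (Invoke-WebRequest -Uri 'http://localhost:9182/metrics' -UseBasicParsing).Content
$zones = @(Select-MetricLine -Text $body -Prefix 'windows_thermalzone_')

if ($zones.Count -eq 0) {
    'The collector ran, but no thermal zone samples were exposed.'
} else {
    $zones
}
```

If this prints no samples while windows_exporter_collector_success reports 1, the collector itself is running and the data source is returning nothing.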
Are there any relevant logs in the Event Viewer? windows_exporter will log to Windows Logs -> Application.
I'm getting something similar, with no thermal data. Looking at the Event Viewer and filtering for windows_exporter, everything is informational except two warnings: "No filters specified for process collector. This will generate a very large number of metrics!"
All my entries appear to be logged twice, which is why there are two warnings with the same message. Everything else seems to be working; only thermalzone is affected.
I am also having this issue. I suspect my hardware does not support the thermalzone collector, but I do not know how to validate this.
Checking if the thermalzone perflib metrics are present would be a good first step:
# List Counter Sets (confirm if "Thermal Zone Information" CounterSet is present)
Get-Counter -ListSet * | Sort-Object -Property CounterSetName | Select CounterSetName
# List counters for set
Get-Counter -ListSet 'Thermal Zone Information'
# Get a counter from the set
Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
I get an error ("Get-Counter: Internal performance counter API call failed. Error: 800007d1.") when running the last command, but that may be due to my VM not having access to any hardware temperature sensors.
@breed808 thank you for your reply. I run those commands in PowerShell as Administrator.
PS C:\WINDOWS\system32> Get-Counter -ListSet * | Sort-Object -Property CounterSetName | Select CounterSetName
CounterSetName
--------------
.NET CLR Data
.NET CLR Exceptions
.NET CLR Interop
.NET CLR Jit
.NET CLR Loading
.NET CLR LocksAndThreads
.NET CLR Memory
.NET CLR Networking
.NET CLR Networking 4.0.0.0
.NET CLR Remoting
.NET CLR Security
.NET Data Provider for Oracle
.NET Data Provider for SqlServer
.NET Memory Cache 4.0
{115b92b4-7191-491a-a9b5-93c8e9fb641b}
{7d937e49-cfd5-438f-af4f-b3047d90a5c3}
{f3e82f6e-9df4-425d-a5d5-3a9832005b16}
AppV Client Streamed Data Percentage
Authorization Manager Applications
BitLocker
BITS Net Utilization
Bluetooth Device
Bluetooth Radio
BranchCache
Browser
Cache
Client Side Caching
Database
Database ==> Databases
Database ==> Instances
Database ==> TableClasses
Distributed Routing Table
Distributed Transaction Coordinator
DNS64 Global
Energy Meter
Event Log
Event Tracing for Windows
Event Tracing for Windows Session
Fax Service
FileSystem Disk Activity
Generic IKEv1, AuthIP, and IKEv2
GPU Adapter Memory
GPU Engine
GPU Local Adapter Memory
GPU Non Local Adapter Memory
GPU Process Memory
HTTP Service
HTTP Service Request Queues
HTTP Service Url Groups
Hyper-V Dynamic Memory Integration Service
Hyper-V Hypervisor
Hyper-V Hypervisor Logical Processor
Hyper-V Hypervisor Root Partition
Hyper-V Hypervisor Root Virtual Processor
Hyper-V Virtual Machine Bus Pipes
Hyper-V VM Vid Partition
ICMP
ICMPv6
IPHTTPS Global
IPHTTPS Session
IPsec AuthIP IPv4
IPsec AuthIP IPv6
IPsec Connections
IPsec Driver
IPsec IKEv1 IPv4
IPsec IKEv1 IPv6
IPsec IKEv2 IPv4
IPsec IKEv2 IPv6
IPv4
IPv6
Job Object Details
LogicalDisk
Memory
Microsoft Winsock BSP
MSDTC Bridge 3.0.0.0
MSDTC Bridge 4.0.0.0
NBT Connection
Netlogon
Network Adapter
Network Interface
Network QoS Policy
NUMA Node Memory
Objects
Offline Files
Pacer Flow
Pacer Pipe
PacketDirect EC Utilization
PacketDirect Queue Depth
PacketDirect Receive Counters
PacketDirect Receive Filters
PacketDirect Transmit Counters
Paging File
Peer Name Resolution Protocol
Per Processor Network Activity Cycles
Per Processor Network Interface Card Activity
Physical Network Interface Card Activity
PhysicalDisk
Power Meter
PowerShell Workflow
Print Queue
Process
Processor
Processor Information
RAS
RAS Port
RAS Total
RDMA Activity
ReadyBoost Cache
Redirector
ReFS
RemoteFX Graphics
RemoteFX Network
Search Gatherer
Search Gatherer Projects
Search Indexer
Security Per-Process Statistics
Security System-Wide Statistics
Server
Server Work Queues
ServiceModelEndpoint 3.0.0.0
ServiceModelEndpoint 4.0.0.0
ServiceModelOperation 3.0.0.0
ServiceModelOperation 4.0.0.0
ServiceModelService 3.0.0.0
ServiceModelService 4.0.0.0
SMB Client Shares
SMB Direct Connection
SMB Server
SMB Server Sessions
SMB Server Shares
SMSvcHost 3.0.0.0
SMSvcHost 4.0.0.0
Storage Management WSP Spaces Runtime
Storage Spaces Drt
Storage Spaces Tier
Storage Spaces Virtual Disk
Storage Spaces Write Cache
Synchronization
SynchronizationNuma
System
TCPIP Performance Diagnostics
TCPIP Performance Diagnostics (Per-CPU)
TCPv4
TCPv6
Telephony
Teredo Client
Teredo Relay
Teredo Server
Terminal Services
Terminal Services Session
Thermal Zone Information
Thread
UDPv4
UDPv6
USB
User Input Delay per Process
User Input Delay per Session
WF (System.Workflow) 4.0.0.0
WFP
WFP Classify
WFP Reauthorization
WFPv4
WFPv6
Windows Media Player Metadata
Windows Time Service
Windows Workflow Foundation
WinNAT
WinNAT ICMP
WinNAT Instance
WinNAT TCP
WinNAT UDP
WMI Objects
WorkflowServiceHost 4.0.0.0
WSMan Quota Statistics
XHCI CommonBuffer
XHCI Interrupter
XHCI TransferRing
PS C:\WINDOWS\system32> Get-Counter -ListSet 'Thermal Zone Information'
CounterSetName : Thermal Zone Information
MachineName : .
CounterSetType : SingleInstance
Description : The Thermal Zone Information performance counter set consists of counters that measure aspects of each thermal zone in the system.
Paths : {\Thermal Zone Information(*)\High Precision Temperature, \Thermal Zone Information(*)\Throttle Reasons, \Thermal Zone Information(*)\% Passive Limit, \Thermal Zone Information(*)\Temperature}
PathsWithInstances : {\Thermal Zone Information(*)\High Precision Temperature, \Thermal Zone Information(*)\Throttle Reasons, \Thermal Zone Information(*)\% Passive Limit, \Thermal Zone Information(*)\Temperature}
Counter : {\Thermal Zone Information(*)\High Precision Temperature, \Thermal Zone Information(*)\Throttle Reasons, \Thermal Zone Information(*)\% Passive Limit, \Thermal Zone Information(*)\Temperature}
PS C:\WINDOWS\system32> Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
Get-Counter : The specified instance is not present.
At line:1 char:1
+ Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidResult: (:) [Get-Counter], Exception
+ FullyQualifiedErrorId : CounterApiError,Microsoft.PowerShell.Commands.GetCounterCommand
This is on a physical PC.
Please let me know what other information I can provide.
Strange, some searching indicates that this error is returned when the query is not run as Administrator :confused:
Are you able to query any of the other counters, such as High Precision Temperature? It'd also be worth checking the Performance Monitor to see if any Thermal Zone Information metrics are exposed there.
Hi @breed808. Thanks for the quick reply, appreciate it!
I am unable to get any of the other counters in PowerShell, ran as Administrator.
I am unable to get them in Performance Monitor either. So it seems it's a Windows problem.
So I checked Event Viewer and found 5 Warnings from the source PerfProc:
Unable to open the job object \BaseNamedObjects\WmiProviderSubSystemHostJob for query access. The calling process may not have permission to open this job. The first four bytes (DWORD) of the Data section contains the status code.
I ran Performance monitor again as administrator, hoping it would help, but it didn't. Any suggestions?
EDIT:
I found this article: https://www.tenforums.com/general-support/136109-error-event-1020-perflib-win-10-1903-a.html
It says to run C:\WINDOWS\SysWOW64> Lodctr /R, which I did twice, as the first attempt resulted in an error.
A new event was logged however:
The Open procedure for service ".NETFramework" in DLL "C:\WINDOWS\system32\mscoree.dll" failed with error code The system cannot find the file specified.. Performance data for this service will not be available.
I tried to install https://dotnet.microsoft.com/download/dotnet-framework/net48 as suggested by Google, but it already says that it's installed. So not sure what to install for that specific .dll file, but I think it's related...
I've done some more searching and there's mention of repairing the .NET Framework installation to install the missing mscoree.dll file. Microsoft host a .NET Framework repair tool here: https://www.microsoft.com/en-gb/download/details.aspx?id=30135. I'm not sure how helpful it will be though.
I ran the tool, and tried to run the .NET Framework installer again as said in the tool. Unfortunately it didn't fix the Performance monitor, not even after a reboot.
Stupidly enough, I never checked whether mscoree.dll was there before, but it is now. Unfortunately, no luck.
I also get the same error ("Internal performance counter API call failed. Error: 800007d1.") that @breed808 reported for the final Get-Counter command, but I am not running the commands from a VM:
PS C:\Windows\system32> Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
Get-Counter : Internal performance counter API call failed. Error: 800007d1.
At line:1 char:1
+ Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidResult: (:) [Get-Counter], Exception
+ FullyQualifiedErrorId : CounterApiError,Microsoft.PowerShell.Commands.GetCounterCommand
The first two commands run without error.
I'm seeing similar results, running PowerShell as Administrator:
PS C:\> Get-Counter -ListSet 'Thermal Zone Information'
CounterSetName : Thermal Zone Information
MachineName : .
CounterSetType : SingleInstance
Description : The Thermal Zone Information performance counter set consists of counters that measure aspects of
each thermal zone in the system.
Paths : {\Thermal Zone Information(*)\High Precision Temperature, \Thermal Zone Information(*)\Throttle
Reasons, \Thermal Zone Information(*)\% Passive Limit, \Thermal Zone Information(*)\Temperature}
PathsWithInstances : {\Thermal Zone Information(*)\High Precision Temperature, \Thermal Zone Information(*)\Throttle
Reasons, \Thermal Zone Information(*)\% Passive Limit, \Thermal Zone Information(*)\Temperature}
Counter : {\Thermal Zone Information(*)\High Precision Temperature, \Thermal Zone Information(*)\Throttle
Reasons, \Thermal Zone Information(*)\% Passive Limit, \Thermal Zone Information(*)\Temperature}
PS C:\> Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
Get-Counter : The specified instance is not present.
At line:1 char:1
+ Get-Counter -Counter '\Thermal Zone Information(*)\Temperature'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidResult: (:) [Get-Counter], Exception
+ FullyQualifiedErrorId : CounterApiError,Microsoft.PowerShell.Commands.GetCounterCommand
PS C:\> [System.Environment]::OSVersion.Version
Platform ServicePack Version VersionString
-------- ----------- ------- -------------
Win32NT 10.0.19042.0 Microsoft Windows NT 10.0.19042.0
This is running on Windows Server 20H2, on bare metal with almost nothing else installed or configured. Using windows_exporter v0.16.0.
CPU: AMD Threadripper 3960X
I think a separate yet related issue here is that windows_exporter_collector_success{collector="thermalzone"} 1 should be 0.
Apologies all, I've checked the thermalzone collector to fix the windows_exporter_collector_success metric, and noticed the collector is actually using WMI as the metric source. So the Get-Counter commands may have been a waste of time :disappointed:
Could you run the following and see if any output is returned?
Get-CimInstance -Classname Win32_PerfRawData_Counters_ThermalZoneInformation
I've run this on my testing VM but have received no output or error.
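For anyone who does get instances back, the raw values still need converting. The following is a rough sketch, assuming HighPrecisionTemperature is reported in tenths of a kelvin; ConvertTo-Celsius is a hypothetical helper name used only for illustration.

```powershell
# Hypothetical helper: convert a reading in tenths of a kelvin to degrees Celsius.
function ConvertTo-Celsius {
    param([Parameter(Mandatory)][double]$TenthsKelvin)
    [math]::Round($TenthsKelvin / 10 - 273.15, 2)
}

# Query the WMI class; an empty result means no thermal zone instances are exposed.
$zones = @(Get-CimInstance -ClassName Win32_PerfRawData_Counters_ThermalZoneInformation -ErrorAction SilentlyContinue)

if ($zones.Count -eq 0) {
    'No thermal zone instances returned.'
} else {
    foreach ($zone in $zones) {
        # Assumption: HighPrecisionTemperature is in tenths of a kelvin.
        '{0}: {1} C' -f $zone.Name, (ConvertTo-Celsius -TenthsKelvin $zone.HighPrecisionTemperature)
    }
}
```

An empty result with no error, as seen on the testing VM, would take the first branch.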
Same here @breed808
PS C:\Windows\system32> Get-CimInstance -Classname Win32_PerfRawData_Counters_ThermalZoneInformation
>>
PS C:\Windows\system32> Get-CimInstance -Classname Win32_PerfRawData_Counters_ThermalZoneInformation
>>
PS C:\Windows\system32>
It looks almost like it expects something extra.
@breed808 Any update/suggestions on this? I don't mind joining an IRC or something so we can troubleshoot this faster if you'd like.
@Ramshield I don't mind supporting over IRC, but I'm not sure if I can be of much more help here. I think we need someone with more ThermalZone experience, as there seems to be some prerequisite missing here.
@breed808 Anyone we can mention who might be able to help? :)
It's been a few years since I last looked at this, but from what I recall, the ThermalZone data was very finicky and requires some driver support; we never managed to pin down exactly what was supposed to provide it. The root issue seemed to be that there's no unified API for this, so the conclusion at the time was that implementing it any other way would be a lot of work. If there are suggestions for how to achieve this, though, I think we'd be very happy to replace the current implementation!
Could we at least look at, for example, Open Hardware Monitor for inspiration? Perhaps they are open to a discussion or could offer advice!
There was some work on reusing OHM in #727, but it stalled on a mix of licensing issues and whether it was a good integration pattern.
I am running it on a German system, and it seems it cannot collect data: I have to run the following command to get the relevant data: Get-Counter -ListSet 'Thermozoneninformationen'. Any ideas on how to deal with non-English systems?
Regarding the OHM work in #727: maybe Open Hardware Monitor is still a solution. It exposes its readings to WMI and is licensed under the MPL 2.0.
http://openhardwaremonitor.org/wordpress/wp-content/uploads/2011/04/OpenHardwareMonitor-WMI.pdf
It seems that it can be interfaced with through its DLL.
https://stackoverflow.com/questions/3262603/accessing-cpu-temperature-in-python
I'm facing the same issue as the German report above. In my case the system is in Spanish, and it seems the collector can't get temperature values to pass on: '\Información sobre la zona térmica(*)\Temperatura'
The translated ListSet names don't match the English name in the collector.
From the previous reports I've seen on this issue, not all ListSets have translation problems (or are not translated). It's something we should address at some stage, else we're excluding entire localizations from running the exporter.
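One possible way to handle translated names is to resolve them through the numeric counter index, which is shared between languages. This is a sketch only, assuming the Perflib registry key exposes index/name pairs for English (009) and for the current language; Get-LocalizedCounterName is a hypothetical helper, and this is not how the collector currently works.

```powershell
# Hypothetical helper: map an English counter (set) name to its localized name.
# Both lists alternate: index, name, index, name, ...
function Get-LocalizedCounterName {
    param([string[]]$English, [string[]]$Localized, [string]$Name)
    $index = $null
    for ($i = 1; $i -lt $English.Count; $i += 2) {
        if ($English[$i] -eq $Name) { $index = $English[$i - 1]; break }
    }
    if ($null -eq $index) { return $null }
    for ($i = 0; $i -lt $Localized.Count - 1; $i += 2) {
        if ($Localized[$i] -eq $index) { return $Localized[$i + 1] }
    }
    return $null
}

# Assumption: these registry values are readable on the target system.
$base      = 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Perflib'
$english   = (Get-ItemProperty -Path "$base\009" -Name Counter).Counter
$localized = (Get-ItemProperty -Path "$base\CurrentLanguage" -Name Counter).Counter

Get-LocalizedCounterName -English $english -Localized $localized -Name 'Thermal Zone Information'
```

On a German system this would be expected to return the name used above with Get-Counter -ListSet. If the same name appears more than once in the English list, the sketch returns the first match only.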
I was likely on American English when I tried originally and it wasn't working for me.
Yes, there are two issues with the collector that have been raised in this thread:
1) An unknown dependency preventing the thermalzone collector and Perflib commands from returning metrics.
2) The translated name of the Thermalzone ListSet preventing the collector from working correctly on non-English locales.
Users in this thread are largely experiencing 1), but 2) is also a problem.
adding my "me too" here as well. German installation of MS Windows Server 2019.
Same here (German, empty results set), I think we have a clear pattern
Thermalzone is not working here either. I checked windows_exporter_collector_success{collector="thermalzone"}, which is 0. It is possible that these are vendor-specific classes that aren't available on all systems, so we should enumerate the classes whose names contain "thermal" or "temp".
Doing this in PowerShell shows some more:
Get-CimClass -Namespace root/cimv2 | Where-Object { $_.CimClassName -like "*Temp*" -or $_.CimClassName -like "*Thermal*" -or $_.CimClassName -like "*Cooling*" }
I also found that they are all zero, and querying them returns nothing.
So it is quite possible that this information sits behind vendor-specific classes.
I did some more research: it depends on the hardware. Some hardware isn't supported but comes with vendor monitoring tools that can read CPU temperatures, so my recommendation is to build this as a custom metric. For example, on Dell you can use Dell Command | Monitor and schedule a task that writes the metrics to a textfile as a workaround.
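The textfile workaround could look roughly like the sketch below, run from a scheduled task. Everything specific here is an assumption: Get-VendorTemperature is a hypothetical placeholder for whatever your vendor tool provides, custom_temperature_celsius is a made-up metric name, and the directory must match the one your textfile collector is configured to read.

```powershell
# Hypothetical placeholder: replace with a real call to your vendor's monitoring tool.
function Get-VendorTemperature { 42.5 }

# Render one gauge in the Prometheus text format.
function Format-TemperatureMetric {
    param([double]$Celsius, [string]$Sensor = 'cpu')
    # Invariant culture keeps the decimal point on locales that use a comma.
    $value = $Celsius.ToString([cultureinfo]::InvariantCulture)
    @(
        '# HELP custom_temperature_celsius Temperature reported by a vendor tool.'
        '# TYPE custom_temperature_celsius gauge'
        "custom_temperature_celsius{sensor=`"$Sensor`"} $value"
    ) -join "`n"
}

# Assumption: adjust to the directory your textfile collector reads.
$dir  = 'C:\Program Files\windows_exporter\textfile_inputs'
$tmp  = Join-Path $dir 'temperature.prom.tmp'
$out  = Join-Path $dir 'temperature.prom'
$text = (Format-TemperatureMetric -Celsius (Get-VendorTemperature)) + "`n"

# Write without a BOM, then rename so the collector never reads a partial file.
[System.IO.File]::WriteAllText($tmp, $text, (New-Object System.Text.UTF8Encoding $false))
Move-Item -Path $tmp -Destination $out -Force
```

Writing to a temporary file and renaming it is a design choice to avoid the exporter scraping a half-written file.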
We also plan a collector that allows scraping any perfdata-based counters.