windows_exporter_collector_duration_seconds unexpectedly high on new installation

Open rc5hack opened this issue 2 years ago • 2 comments

I have a Windows Server 2019 Standard machine running windows_exporter 0.23.1. It has been running for months and the collector durations look fine:

windows_exporter_collector_duration_seconds{collector="cpu"} 0.0031461
windows_exporter_collector_duration_seconds{collector="cpu_info"} 0.0078951
windows_exporter_collector_duration_seconds{collector="cs"} 0.0005152
windows_exporter_collector_duration_seconds{collector="logical_disk"} 0.0015612
windows_exporter_collector_duration_seconds{collector="net"} 0.0005152
windows_exporter_collector_duration_seconds{collector="os"} 0.0031461
windows_exporter_collector_duration_seconds{collector="process"} 0.2079785
windows_exporter_collector_duration_seconds{collector="service"} 0.1946591
windows_exporter_collector_duration_seconds{collector="system"} 0.0005152

I have now installed a fresh Windows Server 2019 Standard machine with the same windows_exporter 0.23.1 and the same config. It shows unexpectedly high collector durations for service and process:

windows_exporter_collector_duration_seconds{collector="cpu"} 0
windows_exporter_collector_duration_seconds{collector="cpu_info"} 0.004378
windows_exporter_collector_duration_seconds{collector="cs"} 3.63e-05
windows_exporter_collector_duration_seconds{collector="logical_disk"} 0
windows_exporter_collector_duration_seconds{collector="net"} 3.63e-05
windows_exporter_collector_duration_seconds{collector="os"} 0.0006733
windows_exporter_collector_duration_seconds{collector="process"} 1.1349282
windows_exporter_collector_duration_seconds{collector="service"} 1.1302589
windows_exporter_collector_duration_seconds{collector="system"} 0

Should I be worried about this? What could cause such high collector durations, and how can I debug it?
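To put a number on the difference, the two dumps above can be compared per collector. This is only a minimal sketch: it assumes the metric lines are pasted in as text, and `parse_durations` and `slowdown` are hypothetical helper names, not part of windows_exporter. Only the process and service lines from the two hosts are included here.

```python
import re

# Matches lines such as:
# windows_exporter_collector_duration_seconds{collector="cpu"} 0.0031461
LINE_RE = re.compile(
    r'^windows_exporter_collector_duration_seconds\{collector="([^"]+)"\}\s+(\S+)$'
)


def parse_durations(text):
    """Return {collector: seconds} from Prometheus text-format lines."""
    durations = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            durations[match.group(1)] = float(match.group(2))
    return durations


def slowdown(old, new):
    """Return {collector: new/old}, skipping zero or missing baselines."""
    return {
        name: new[name] / old[name]
        for name in old
        if name in new and old[name] > 0
    }


# Values copied from the two hosts in this issue.
OLD_HOST = """
windows_exporter_collector_duration_seconds{collector="process"} 0.2079785
windows_exporter_collector_duration_seconds{collector="service"} 0.1946591
"""
NEW_HOST = """
windows_exporter_collector_duration_seconds{collector="process"} 1.1349282
windows_exporter_collector_duration_seconds{collector="service"} 1.1302589
"""

ratios = slowdown(parse_durations(OLD_HOST), parse_durations(NEW_HOST))
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.1f}x")
```

With these values, both collectors come out between five and six times slower on the new installation.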

rc5hack, Nov 05 '23 09:11

Both collectors mentioned use WMI as a metric source, so it's possible the collector is spending its time waiting for WMI. You might have some success identifying this with a flamegraph, provided that:

a) the WMI calls are captured by pprof, and
b) the WMI calls are on-CPU (a CPU profile only samples time spent executing, not time spent waiting).

You can gather the profiling data by running the following, changing localhost to the hostname of the machine running the exporter and seconds=20 to your desired duration:

go tool pprof -raw -output=cpu.txt 'http://localhost:9182/debug/pprof/profile?seconds=20'

Then process the result with the FlameGraph tools:

./stackcollapse-go.pl cpu.txt | flamegraph.pl > flame.svg

There may be Windows-specific tools better suited to detecting collector delays.
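Before profiling, it may also help to check whether the slow collectors are consistently slow or only spike occasionally. The sketch below is one possible way to do that, not an official tool: it assumes the exporter serves its metrics at http://localhost:9182/metrics, and `collector_durations`, `slowest`, and `watch` are hypothetical helper names.

```python
import re
import time
import urllib.request

# One match per collector duration line in the Prometheus text output.
DURATION_RE = re.compile(
    r'^windows_exporter_collector_duration_seconds\{collector="([^"]+)"\}'
    r'[ \t]+(\S+)[ \t\r]*$',
    re.MULTILINE,
)


def collector_durations(metrics_text):
    """Return {collector: seconds} parsed from a /metrics response body."""
    return {name: float(value) for name, value in DURATION_RE.findall(metrics_text)}


def slowest(durations, top=3):
    """Return the `top` slowest collectors as (name, seconds), slowest first."""
    return sorted(durations.items(), key=lambda item: item[1], reverse=True)[:top]


def watch(url="http://localhost:9182/metrics", samples=5, interval=15.0):
    """Scrape the exporter several times and print the slowest collectors."""
    for _ in range(samples):
        with urllib.request.urlopen(url, timeout=30) as response:
            text = response.read().decode("utf-8")
        print(time.strftime("%H:%M:%S"), slowest(collector_durations(text)))
        time.sleep(interval)
```

Calling `watch()` on the affected machine, or `watch("http://<hostname>:9182/metrics")` from another host, prints one line per scrape. If process and service are slow on every sample, that points to a steady cost rather than a one-off stall.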

breed808, Jan 09 '24 19:01

This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.

github-actions[bot], Apr 09 '24 02:04