collector process failed - counter not found on Windows Server 2022

Open Mario-Hofstaetter opened this issue 1 year ago • 13 comments

Using version 0.24.0 on a new Windows Server 2022 machine, the exporter service was not starting. Investigating the logs, I found this error:

... caller=prometheus.go:188 level=error msg="collector process failed after 0.000000s" err="counter not found"
... caller=prometheus.go:188 level=error msg="collector process failed after 0.000000s" err="counter not found"
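
When several collectors are enabled, it can help to pull the failing collector name out of the log mechanically. The following is a minimal sketch, assuming the message format is exactly as in the two lines above; the function name `failed_collectors` is made up for illustration and is not part of windows_exporter.

```python
import re

# Assumption: failure lines look like the ones quoted above, i.e.
# msg="collector <name> failed after <seconds>s" err="<text>"
COLLECTOR_FAILED = re.compile(
    r'msg="collector (?P<collector>\S+) failed after (?P<seconds>[\d.]+)s"'
    r'\s+err="(?P<err>[^"]*)"'
)


def failed_collectors(log_text):
    """Return (collector, error) pairs for every matching failure line."""
    return [
        (m.group("collector"), m.group("err"))
        for m in COLLECTOR_FAILED.finditer(log_text)
    ]


sample = (
    '... caller=prometheus.go:188 level=error '
    'msg="collector process failed after 0.000000s" err="counter not found"'
)
print(failed_collectors(sample))
```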

Related issue found:

  • #580

I tried running

lodctr /R

as originally posted by @carlpett in https://github.com/prometheus-community/windows_exporter/issues/580#issuecomment-670947646

But that did not resolve the issue (I have not yet tried rebooting). We use these collectors by default:

enabled: "cpu,cs,logical_disk,memory,net,os,process,service,system,tcp,textfile,time"

Enabling them one by one, only process seems to have an issue.

Looking into perfmon.exe I noticed the following:

On the machine with the error, process performance counters are listed under category Process V2. An ok machine (Windows Server 2019) displays them under Process.

Has there been a breaking change by Microsoft?

[Screenshots: perfmon counter categories on Server 2019 and Server 2022]

Edit: Querying with PowerShell, it seems the Process counter has been renamed?!


# Windows Server 2019 (1809 Build 17763.4974)

PS C:\> Get-Counter -ListSet Proces* | Format-Table

CounterSetName        MachineName CounterSetType Description Paths
--------------        ----------- -------------- ----------- -----
Processor Information .            MultiInstance             {\Processor Information(*)\Performance Limit Flags, \Processor Information(*)\% Performance Limit, \Processor …
Processor             .            MultiInstance             {\Processor(*)\% Processor Time, \Processor(*)\% User Time, \Processor(*)\% Privileged Time, \Processor(*)\Int…
Process               .            MultiInstance             {\Process(*)\% Processor Time, \Process(*)\% User Time, \Process(*)\% Privileged Time, \Process(*)\Virtual Byt…

# Windows Server 2022 (21H2 Build 20348.2031)

PS C:\> Get-Counter -ListSet Proces* | Format-Table

CounterSetName        MachineName CounterSetType Description Paths
--------------        ----------- -------------- ----------- -----
Process V2            .            MultiInstance             {\Process V2(*)\Working Set - Private, \Process V2(*)\IO Other Bytes/sec, \Process V2(*)\IO Data Bytes/sec, …
Processor Information .            MultiInstance             {\Processor Information(*)\Performance Limit Flags, \Processor Information(*)\% Performance Limit, \Processo…
Processor             .            MultiInstance             {\Processor(*)\% Processor Time, \Processor(*)\% User Time, \Processor(*)\% Privileged Time, \Processor(*)\I…

Mario-Hofstaetter • Oct 23 '23 16:10

I assume that one of the required counters is disabled.

Can you try this command (admin mode) and post the result: lodctr.exe /Q | findstr /i "disable"

SupraOva • Oct 24 '23 08:10

PS C:\> lodctr.exe /Q | findstr /i "disable"
[Lsa] Performance Counters (Disabled)
[PerfProc] Performance Counters (Disabled)
PS C:\>

So it seems something is disabled, does this include the required Process counters for the collector?
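
For anyone checking many machines, the disabled providers can be extracted from the `lodctr.exe /Q` output with a few lines of script. This is only a sketch, assuming disabled entries look like the two lines above; the `(Enabled)` line in the sample and the name `disabled_providers` are illustrative assumptions.

```python
import re

# Assumption: a disabled provider is reported as
# "[Name] Performance Counters (Disabled)", as in the output quoted above.
DISABLED_LINE = re.compile(
    r"^\[(?P<name>[^\]]+)\]\s+Performance Counters\s+\(Disabled\)\s*$",
    re.IGNORECASE,
)


def disabled_providers(lodctr_output):
    """Return the provider names that lodctr reports as disabled."""
    names = []
    for line in lodctr_output.splitlines():
        match = DISABLED_LINE.match(line.strip())
        if match:
            names.append(match.group("name"))
    return names


# Illustrative input; only the two "(Disabled)" lines come from this thread.
sample_output = """\
[Lsa] Performance Counters (Disabled)
[PerfOS] Performance Counters (Enabled)
[PerfProc] Performance Counters (Disabled)
"""
print(disabled_providers(sample_output))
```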

Mario-Hofstaetter • Oct 24 '23 08:10

Yes, so now try to activate these counters by running the command below and check if you still have any errors.

lodctr.exe /E:Lsa
lodctr.exe /E:PerfProc
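
If more providers turn out to be disabled, the same `/E:` commands can be generated from a list of names. A hedged sketch follows: `enable_providers` is a hypothetical helper, it only prints the commands by default, and actually running them requires Windows and an elevated prompt.

```python
import subprocess


def enable_providers(names, dry_run=True):
    """Build `lodctr.exe /E:<name>` for each name; run them only if dry_run is False."""
    commands = [["lodctr.exe", f"/E:{name}"] for name in names]
    if not dry_run:
        for argv in commands:
            # Raises CalledProcessError if lodctr exits with a non-zero status.
            subprocess.run(argv, check=True)
    return [" ".join(argv) for argv in commands]


# Dry run: shows the two commands from this comment without executing them.
print(enable_providers(["Lsa", "PerfProc"]))
```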

SupraOva • Oct 24 '23 08:10

Thank you very much ❤️ @SupraOva, that does seem to do the trick. I followed those commands with lodctr.exe /R (not sure if that was necessary), and the process collector is now able to return metrics again.

Where do we go from here? I guess this should at least be put into the process collector docs. I currently don't have time for a quick PR, my apologies.

The setup installer cannot run this automatically, though, because not everybody uses the process collector (for us, it is one of the most important ones).

Mario-Hofstaetter • Oct 24 '23 08:10

Found the information there:

For backwards-compatibility reasons, the "Process" counterset returns non-unique instance names based on the EXE filename. This can cause confusing results, especially when a process with a non-unique name starts up or shuts down, as this will typically result in data glitches due to incorrect matching of instance names between samples. Consumers of the "Process" counterset must be able to tolerate these non-unique instance names and the resulting data glitches. In Windows 11 and later, you can use the Process V2 counterset to avoid this problem.

It might be interesting to use Process V2 if available, as Process is disabled by default, no?
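
To make the idea concrete, the fallback could look roughly like this. It is only an illustration of the selection logic, assuming the list of counter set names has already been obtained (for example from `Get-Counter -ListSet`); `pick_process_counter_set` is a hypothetical name and this is not how windows_exporter currently selects counters.

```python
def pick_process_counter_set(available):
    """Prefer "Process V2" when present, fall back to "Process", else None."""
    for candidate in ("Process V2", "Process"):
        if candidate in available:
            return candidate
    return None


# Counter set names as listed in the Get-Counter output earlier in this thread.
server_2019 = ["Processor Information", "Processor", "Process"]
server_2022 = ["Process V2", "Processor Information", "Processor"]

print(pick_process_counter_set(server_2019))
print(pick_process_counter_set(server_2022))
```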

JDA88 • Jan 11 '24 15:01

The issue here is that windows_exporter is using the Registry API at the moment, which does not support any V2 providers.

See also: https://github.com/prometheus-community/windows_exporter/issues/1350

jkroepke • Jan 16 '24 17:01

Oh, didn't know that, but it answers the question 😅

JDA88 • Jan 16 '24 18:01

This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.

github-actions[bot] • Apr 16 '24 02:04

For me the above workaround didn't fix it. The same setup works on 2019. I came here via Grafana Alloy: https://github.com/grafana/alloy/issues/658

V1TA5 • Apr 24 '24 11:04

I'm currently working on a new collector which uses the Windows performance data libraries. They are able to use V2 counters.

But I expect that it might take weeks to develop the new feature.

jkroepke • Apr 24 '24 12:04

Will this collector be committed to Prometheus, or is it a separate thing? I'm currently in the testing phase for a monitoring setup, so I have some time until I need to decide on a solution.

V1TA5 • May 03 '24 06:05

here - https://github.com/prometheus-community/windows_exporter/pull/1459

The PR is not fully completed yet.

jkroepke • May 03 '24 07:05

Had a similar issue on Windows 10. None of os,cpu,memory,system,logical_disk etc. worked. Another symptom was getting an "Unable to add these counters" message when starting PerfMon.

Rebuilding the counters exactly as instructed here (Administrator prompt, cd to correct directories) solved it for me.

rossi-fi • May 04 '24 22:05