icinga-powershell-framework

ThresholdInterval not working with Framework 1.13.2

Open geotekberlin opened this issue 9 months ago • 1 comments

The -ThresholdInterval option fails for Invoke-IcingaCheckCPU on a fresh installation of PowerShell Framework 1.13.2 with Plugins 1.13 on a Windows Server 2016 host. Only the CheckCPU command fails; all other check commands work as expected.

Image

I have read here that metrics over time are no longer written but monitoring by using the "-ThresholdInterval" argument is still possible.

Test-IcingaForWindows shows the following result:

Image

The shown errors are expected since we are not using the Icinga Service.

Any help would be appreciated.

geotekberlin avatar Mar 14 '25 13:03 geotekberlin

Hello

Thank you for the issue. If the Icinga for Windows service is not installed, you cannot use the Metrics over Time feature, as it requires the Icinga for Windows service to run checks frequently in the background and calculate the averages for the defined time frames.

LordHepipud avatar Mar 27 '25 15:03 LordHepipud

I installed the Icinga for Windows service, but the error remains: [Failed to parse metrics over time with -ThresholdInterval "5m": No data found matching the requested time index. Available indexes: []]

Configuring REST-API doesn't help either, and Icinga for Windows appears to be healthy: Image

I just confirmed this issue on another freshly installed host. Is there any documentation on what is needed to keep Invoke-IcingaCheckCPU working in Framework 1.13.2?

geotekberlin avatar Apr 01 '25 16:04 geotekberlin

Hi @LordHepipud,

I noticed a similar issue when I upgraded from Framework 1.12.3 to 1.13.2. Right after installation, the first check cannot get the threshold interval because the index is missing. We configured the background daemon to run every 5 minutes.

After 5 minutes the indexes are added and the check is in the OK state again. From my point of view, it seems that in previous versions the background daemon executed the checks right away when it started. Since 1.13.2, it seems that you have to wait for the check interval to pass before the first check is executed.

best regards, Brian

bfenda avatar Apr 04 '25 10:04 bfenda

Hello

Thank you for the input. This makes sense, and you are correct: the new implementation waits for the interval to pass before running a check. To resolve this, we will add a "force check" as a fix for the case where no data is present after the daemon starts.
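The idea can be sketched roughly as follows. This is a minimal illustration only, not the framework's actual implementation: `Test-ShouldRunCheck` and its parameters are hypothetical names made up for this example.

```powershell
# Hypothetical helper, NOT part of the Icinga PowerShell Framework.
# Decides whether a background check should run on the current scheduler tick.
function Test-ShouldRunCheck {
    param (
        [hashtable]$CachedMetrics,  # values collected so far, keyed by time index, e.g. @{ '5m' = 12.5 }
        [int]$SecondsSinceLastRun,  # seconds since the check last ran (or since the daemon started)
        [int]$Interval              # configured check interval in seconds
    );

    # Force an immediate run when nothing has been collected yet,
    # e.g. right after the daemon started on a fresh installation.
    if ($null -eq $CachedMetrics -or $CachedMetrics.Count -eq 0) {
        return $TRUE;
    }

    # Otherwise wait until the configured interval has passed.
    return ($SecondsSinceLastRun -ge $Interval);
}

Test-ShouldRunCheck -CachedMetrics @{} -SecondsSinceLastRun 0 -Interval 300;              # True
Test-ShouldRunCheck -CachedMetrics @{ '5m' = 12.5 } -SecondsSinceLastRun 60 -Interval 300;  # False
```

Without the first branch, a freshly started daemon would report no time indexes until the first interval (here 300 seconds) has elapsed, which matches the behaviour described above.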

Thanks for the feedback!

LordHepipud avatar Apr 11 '25 10:04 LordHepipud

Hi @LordHepipud,

Thank you for the contribution. A small question from my side: the milestone for 1.13.3 seems complete. Do you have an estimated date for the release?

bfenda avatar May 05 '25 14:05 bfenda

Heya, we are planning to release the new version this week. We are still running some tests to ensure the new version does not introduce other bugs, but we are confident that everything will go smoothly. If nothing goes wrong, you can expect a release on Thursday at the latest.

LordHepipud avatar May 05 '25 14:05 LordHepipud

With the new plugins and framework the error still persists:

[UNKNOWN] CPU Load [UNKNOWN] Overall Load, Socket #0 (All must be [OK])
\_ [UNKNOWN] Overall Load: [Failed to parse metrics over time with -ThresholdInterval "5m": No data found matching the requested time index. Available indexes: []]

However, Use-Icinga; Invoke-IcingaCheckCPU; executed in a PS window returns meaningful values.

These are the versions involved:

Component    Version   Available
---          ---       ---
agent        2.14.5    2.14.5
framework    1.13.3    1.13.3
plugins      1.13.1    1.13.1
service      1.3.0     1.3.0

Only the CPU check is affected, all other PS-Checks work fine.

Please advise.

geotekberlin avatar May 10 '25 22:05 geotekberlin

Hi @LordHepipud, thanks for the effort here. I have to agree with @geotekberlin: the issue still remains, at least in the way we currently use the background daemon.

Local execution of Invoke-IcingaCheckCPU with -ThresholdInterval does return a correct result. However, as stated, there seems to be a problem in combination with Icinga.

For good measure, this is the relevant part of the command we use.

Exit-IcingaExecutePlugin -Command '''Invoke-IcingaCheckCPU''' ' '-DisableProcessList' '-Warning' '80%' '-Critical' '90%' '-Core' '_Total' '-Verbosity' '2' '-ThresholdInterval' '15m'

This results in the UNKNOWN state because the index is not found. The background daemon is set up like this:

    Register-IcingaBackgroundDaemon -Command 'Start-IcingaServiceCheckDaemon'
    Register-IcingaServiceCheck -CheckCommand 'Invoke-IcingaCheckCpu' -Interval 300 -TimeIndexes 5m, 15m

Maybe it's just a handling issue. If we should switch the way that the background daemon is initialized, you can let me know and I will try to test it out.

best regards, Brian

bfenda avatar May 13 '25 19:05 bfenda

Thank you for the input. Could you please share the background daemon configuration as displayed by Show-Icinga? And how exactly do you execute the plugin afterwards?

So basically, local execution in PowerShell works fine, but once the plugin is executed by Icinga itself the thresholds do not work? Are you using the REST-Api with the ifw-api feature enabled, allowing the Icinga Agent to talk directly to the Icinga for Windows API, or how is the monitoring set up in your environments?
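The requested configuration can be collected roughly like this. It is a sketch that assumes the cmdlets `Get-IcingaBackgroundDaemons` and `Show-IcingaRegisteredServiceChecks` are available in the installed framework version; the exact output differs between releases.

```powershell
# Load the framework into the current PowerShell session
Use-Icinga;

# Overview of the installation, including the background daemon configuration
Show-Icinga;

# Background daemons registered to start with the Icinga for Windows service
Get-IcingaBackgroundDaemons;

# Service checks registered for the service check daemon,
# including their interval and time indexes
Show-IcingaRegisteredServiceChecks;
```

The time indexes listed for the registered check should include the value passed to -ThresholdInterval (for example 5m or 15m).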

LordHepipud avatar May 14 '25 06:05 LordHepipud

Just out of curiosity - after adding a service check to the background daemon, did you run

Restart-IcingaForWindows

Otherwise, the configuration won't be applied. Depending on the system, it can take a while before all daemons are properly started. I did a few tests now and can confirm that checks are executed once the daemon is running and Icinga for Windows is fetching the data properly by using the API.
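A quick way to retest after changing the daemon configuration might look like the following. This is only a sketch: the 60-second wait is an arbitrary assumption, not a documented start-up time, and the thresholds are the ones used earlier in this thread.

```powershell
Use-Icinga;

# Apply the changed background daemon configuration
Restart-IcingaForWindows;

# Give the daemons some time to start and collect a first value
# (assumed value, adjust it to your system)
Start-Sleep -Seconds 60;

# Run the check with a time index that was registered for the daemon
Invoke-IcingaCheckCPU -Warning '80%' -Critical '90%' -ThresholdInterval '5m' -Verbosity 2;
```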

LordHepipud avatar May 14 '25 07:05 LordHepipud

Hello @LordHepipud, Not exactly the same cmdlet. Here is what we do in our case:

    Register-IcingaBackgroundDaemon -Command 'Start-IcingaServiceCheckDaemon'
    Register-IcingaServiceCheck -CheckCommand 'Invoke-IcingaCheckCpu' -Interval 300 -TimeIndexes 5m, 15m
    Disable-IcingaAgentFeature -Feature mainlog
    Disable-IcingaAgentFeature -Feature windowseventlog
    Install-IcingaComponent -Name 'plugins' -Confirm
    Enable-IcingaServiceRecovery
    Register-IcingaBackgroundDaemon -Command 'Start-IcingaWindowsRESTApi' -Arguments @{ '-Port' = 5667; }
    Add-IcingaRESTApiCommand -Command 'Invoke-IcingaCheck*' -Endpoint 'apichecks'
    Enable-IcingaFrameworkApiChecks
    Restart-IcingaService icinga2
    Restart-IcingaWindowsService

I also tried replacing Restart-IcingaWindowsService with Restart-IcingaForWindows, but I see no difference in the behaviour.

bfenda avatar Jun 03 '25 11:06 bfenda

In general it looks fine. The only thing is that the CPU check will only run every 5 minutes.

But with the patch this shouldn't be an issue.

How did you update the Framework to the latest version? Could you please try to update the cache and check afterwards whether the problem still persists?

icinga -RebuildCache { Restart-IcingaForWindows; }

Your command to restart Icinga for Windows is also fine, as it is just an alias.

LordHepipud avatar Jun 03 '25 12:06 LordHepipud

Is there any progress in resolving this issue? Updating Icinga Powershell or reinstalling from scratch breaks cpu-utilization-ps on each and every host, no matter which Windows Server or PC Version is used.

Component    Version   Available
---          ---       ---
agent        2.15.0    2.15.0
framework    1.13.3    1.13.3
plugins      1.13.1    1.13.1
service      1.3.0     1.3.0

Test-IcingaForWindows
[Notice]: Collecting Icinga for Windows environment information
[Passed]: The Icinga Agent service and the Icinga Agent are installed on the system
[Passed]: The Icinga for Windows service is installed on the system
[Passed]: The Icinga for Windows service binary does exist: "C:\Program Files\icinga-framework-service\icinga-service.exe"
[Passed]: Your service installation is not affected by IWKB000009
[Passed]: Your service installation is properly referring to "icinga-powershell-framework.psd1" for module imports.
[Passed]: The Icinga Agent service user "NT AUTHORITY\SYSTEM" is matching the Icinga for Windows service user "NT AUTHORITY\SYSTEM"
[Passed]: The specified user "NT AUTHORITY\SYSTEM" is allowed to run as service
[Passed]: Directory "C:\ProgramData\icinga2\etc" is fully accessible by "NT Authority\SYSTEM"
[Passed]: Directory "C:\ProgramData\icinga2\var" is fully accessible by "NT Authority\SYSTEM"
[Passed]: Directory "C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-framework\cache" is fully accessible by "NT Authority\SYSTEM"
[Passed]: Directory "C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-framework\config" is fully accessible by "NT Authority\SYSTEM"
[Passed]: Directory "C:\Program Files\WindowsPowerShell\Modules\icinga-powershell-framework\certificate" is fully accessible by "NT Authority\SYSTEM"
[Passed]: The Icinga Agent state file is healthy
[Passed]: Icinga Agent configuration is valid
[Passed]: Icinga Agent debug log is disabled
[Failed]: The Icinga for Windows REST-Api is not configured to start with the daemon
[Failed]: The Icinga for Windows REST-Api is not configured to allow API checks
[Passed]: The Icinga for Windows certificate is installed on the system and possibly signed by a valid CA
[Warning]: Icinga for Windows is configured without a JEA-Profile. It is highly recommended to use JEA for advanced security and easier permission handling
[Passed]: The Icinga for Windows service is running
[Failed]: The Icinga for Windows REST-Api responded with an error on this machine: "Die Verbindung mit dem Remoteserver kann nicht hergestellt werden." (English: "Unable to connect to the remote server.")

Icinga2 V.2.15.0-1 (Debian)
Icingaweb2 V.2.12.4
Director V.1.11.4

geotekberlin avatar Jul 02 '25 08:07 geotekberlin

I just updated the Icinga Agent to 2.15.1, but the cpu-utilization check still doesn't work on any client in our environment. This is really frustrating.

geotekberlin avatar Oct 16 '25 17:10 geotekberlin

Hi @LordHepipud, I just double-checked your changes. In my case, ForceExecution will never be true, as Get-IcingaCacheData does not return anything.

Image

bfenda avatar Nov 10 '25 15:11 bfenda

Could you please check if #837 resolves this issue? I just had a customer environment in which this fix resolved everything.

LordHepipud avatar Nov 14 '25 15:11 LordHepipud

I don't really get the problem here. The content is read from the metrics file in case it exists. If it exists, the data is read from there; otherwise the value is empty and should trigger the force flag.

Also, where is Get-IcingaCacheData called in this case? It is only part of Read-IcingaCheckResultStore -CheckCommand $CheckCommand;, populating internal metrics. The force flag only applies if actual data could be read from disk.

LordHepipud avatar Nov 14 '25 15:11 LordHepipud