Since upgrading icinga-powershell-framework to 1.12.3, timeouts occur more often
In response to CVE-2024-49369, we have updated all Windows servers to the latest icinga-powershell-framework (1.12.3) and Icinga client (2.14.3). Now we notice that various servers time out every now and then during the simple check "Invoke-IcingaCheckTimeSync".
We tested two different servers (both Windows 2019, serving the same function).
On one of them the check completes within 2-3 seconds; on the other it takes at least 20 seconds, up to the timeout (30 seconds).
This is what the event log says:
warning/Process: Killing process group 6444 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NoLogo -ExecutionPolicy ByPass -C "try { Use-Icinga -Minimal; } catch { Write-Output 'The Icinga PowerShell Framework is either not installed on the system or not configured properly. Please check https://icinga.com/docs/windows for further details'; Write-Output 'Error:' $($_.Exception.Message)Components:rn$( Get-Module -ListAvailable 'icinga-powershell-*' )rn'Module-Path:'rn$($Env:PSModulePath); exit 3; }; Exit-IcingaExecutePlugin -Command 'Invoke-IcingaCheckTimeSync' " -Server '***.***.***.***' -Warning -1s:1s -Critical -2s:2s -Verbosity 2') after timeout of 30 seconds
Before the update, these problems never occurred. I hope you can find something, because we don't have an approach of our own.
Thank you
Regards,
Sascha
Hi all,
I would like to add to this. Especially since updating to 2.14.3, we see various checks on Icinga agents frequently running into timeouts. The timeout is set to 60 seconds. Normally the checks take about 10 seconds, but, as said, they very often run into a timeout, which leaves the service in a flapping state.
I may have found a related message in the Windows event log.
Thank you,
Best regards, Brian
We're encountering this issue across multiple production environments affecting several of our customers since upgrading IFW. Given the widespread impact on our production systems, we would greatly appreciate any insights or potential solutions. We're happy to provide additional details or assist with debugging if needed. Thank you for looking into this!
Since the update of icinga-powershell-framework to 1.12.3 we observe the same problem! The checks run into timeouts. Can you please take a look at this?
Ever since updating the Icinga PowerShell Framework to version 1.12.3, we've been encountering the same issue: our checks are timing out.
We are also encountering this issue: lots of timeouts since we updated the Icinga PowerShell Framework. The only viable fix/workaround currently is a rollback to the previous version, which we would definitely like to avoid. @LordHepipud are there any other potential fixes or workarounds that we could at least apply manually?
Same here! It might be a coincidence, but at least in our environment only two machines are affected, both running Server 2019 with domain controller services (one host physical and the other virtual on vSphere).
Thank you for all the reports. I have already tried to take a look at this case and can't really reproduce the issue. Based on the reports, all of these events happened after upgrading to v1.12.3 - is this correct?
I assume on all those machines the REST-Api of Icinga for Windows is being used (as mentioned in the logs).
To me it seems weird that the internal threads are being terminated because of a timeout. In general, this would mean the thread hung for more than 3 minutes (the internal Icinga for Windows threshold for determining whether a thread is still working or frozen).
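For readers unfamiliar with this kind of check, the frozen-thread rule can be sketched roughly as below. This is only an illustrative Python sketch, not the framework's actual PowerShell code: the function name `is_thread_frozen` and the heartbeat timestamps are hypothetical, and only the 3-minute threshold comes from the description above.

```python
import time
from typing import Optional

# Assumption: 3 minutes, taken from the threshold mentioned in this comment.
FROZEN_THRESHOLD_SECONDS = 3 * 60


def is_thread_frozen(last_heartbeat: float,
                     now: Optional[float] = None,
                     threshold: float = FROZEN_THRESHOLD_SECONDS) -> bool:
    """Return True if the last sign of life is older than the threshold."""
    if now is None:
        now = time.monotonic()
    return (now - last_heartbeat) > threshold


# A thread that last reported 200 seconds ago counts as frozen,
# one that reported 20 seconds ago does not.
print(is_thread_frozen(last_heartbeat=0.0, now=200.0))  # True
print(is_thread_frozen(last_heartbeat=0.0, now=20.0))   # False
```

Under such a rule, a thread is only killed if it has shown no activity at all for the full threshold, which is why a termination points to a real hang rather than a merely slow check.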
I had a similar occurrence in a customer environment this week, although there no packets were transmitted on the Icinga for Windows socket. Instead of the thread being killed, the socket reader terminated the connection after 5 seconds because no packets were sent.
I'm not sure if these errors are related, but something strange seems to be going on.
Just another question: Which version of Icinga for Windows was working properly previously?
The only real change we made with v1.12 was ensuring that network data is read properly to the end in all cases:
https://github.com/Icinga/icinga-powershell-framework/pull/706
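As a rough illustration of what "reading to the end" means in general, here is a minimal Python sketch. It is not the code from the PR and not how the framework is implemented: the function `read_to_end`, its parameters, and the 4096-byte chunk size are hypothetical. The 5-second idle timeout mirrors the socket reader behaviour mentioned earlier in this comment.

```python
import socket


def read_to_end(sock: socket.socket, expected_length: int,
                idle_timeout: float = 5.0) -> bytes:
    """Read until expected_length bytes arrived or the peer goes quiet."""
    sock.settimeout(idle_timeout)
    chunks = []
    remaining = expected_length
    while remaining > 0:
        try:
            # A single recv() may return only part of the message.
            chunk = sock.recv(min(4096, remaining))
        except socket.timeout:
            raise TimeoutError(f"no data for {idle_timeout} seconds") from None
        if not chunk:
            raise ConnectionError("peer closed before all data arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Demo with a local socket pair: the message arrives in two pieces.
a, b = socket.socketpair()
a.sendall(b"hello ")
a.sendall(b"world")
print(read_to_end(b, 11))  # b'hello world'
a.close()
b.close()
```

The trade-off in a loop like this is that a reader waiting for data that never arrives blocks until the idle timeout fires, so a change of this kind could in principle turn a short read into a long wait.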
The remaining changes, which concern certificate handling, can be ignored, I assume, as we already establish successful connections.
Can someone please try to revert the changes from PR #706 on an affected machine and check the results?
For us, the previously working versions were definitely various versions older than 1.12.3. In some cases, the client had to be completely reinstalled. As I wrote above, the problem only occurred after we had brought everything up to date in response to the CERT advisory. Downgrading PowerShell didn't have any effect here.
Hello, a little more info from us. All servers we have problems with are domain controllers. The last known working version we use for the powershell-plugins is 1.12.0, with agent version 2.14.3 and powershell-framework 1.12.3. The problem does not manifest immediately. We first tried completely reinstalling the agent on the servers, and everything worked fine for around 6-8 hours; after that the problems started again.
Today we installed the new version v1.13.0, which was announced here: https://icinga.com/blog/releasing-icinga-for-windows-v1-13-0/. It seems that this version fixed the timeout problem.
Is this issue still ongoing, or is it resolved for everyone by updating to the latest Icinga for Windows version?
@LordHepipud I will update to 1.13.3 once it is released. Based on the milestone, it shouldn't be too far off. I will give you feedback once I have rolled it out in the pipeline.
Thank you
Currently running 1.13.2 without a problem. No timeouts since upgrading to 1.13.0.
From my side, everything is working like before. Thanks for fixing.