chocolatey-licensed-issues

Chocolatey Agent - Provide ability to control number of retry attempts, or total duration for retry, to connect with CCM

Open gep13 opened this issue 4 years ago • 7 comments

When first attempting to communicate with the Chocolatey Central Management Service, the Chocolatey Agent will use the configured URL in the chocolatey.config file, and attempt to perform a TLS handshake with the service. If this can't be performed, it will be retried. However, after a number of unsuccessful attempts, it will stop trying, and write a FATAL message into the Chocolatey Agent log. This will not stop the Chocolatey Agent service, since there are other portions of the Chocolatey Agent (for example, the background service) which can run without communication to CCM being established. In addition, it is possible that the service URL in the chocolatey.config file is incorrect, so continually re-attempting to connect isn't ideal.

In environments where VPNs are being used, or where network connections are intermittent, it is possible that an initial connection attempt to CCM is not successful, but that everything required to communicate with CCM is in place later on. At that point, however, a restart of the Chocolatey Agent service is required in order to force the re-connection.

It would be good to do one, or more, of the following things:

  1. Provide a configuration value to control the number of initial retry attempts that are made, with a configurable back-off between each attempt
  2. Provide a configuration value to control a period of time in which Chocolatey Agent should continue to try to make a connection with CCM
  3. Provide the ability to continue to attempt re-connection based on an interval, which obviously stops once the connection is made successfully
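
As a rough illustration of how these options could fit together, here is a minimal Python sketch. It is not Chocolatey Agent code: `RetryPolicy`, `connect_with_retry`, and every setting name and default are hypothetical, and `connect` stands in for whatever performs the TLS handshake with CCM.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RetryPolicy:
    # All names and defaults are hypothetical, not real Chocolatey Agent settings.
    max_attempts: Optional[int] = 5       # option 1: cap on attempts (None = no cap)
    initial_delay: float = 1.0            # option 1: first back-off, in seconds
    backoff_factor: float = 2.0           # option 1: multiplier applied after each failure
    max_delay: float = 60.0               # upper bound on any single wait
    max_duration: Optional[float] = None  # option 2: total time budget in seconds (None = no budget)


def connect_with_retry(connect: Callable[[], bool], policy: RetryPolicy,
                       sleep=time.sleep, clock=time.monotonic) -> bool:
    """Call connect() until it succeeds or the policy is exhausted."""
    start = clock()
    delay = policy.initial_delay
    attempt = 0
    while True:
        attempt += 1
        if connect():
            return True
        # Option 1: give up after the configured number of attempts.
        if policy.max_attempts is not None and attempt >= policy.max_attempts:
            return False
        # Option 2: give up if the next wait would exceed the time budget.
        if policy.max_duration is not None and clock() - start + delay > policy.max_duration:
            return False
        sleep(delay)
        delay = min(delay * policy.backoff_factor, policy.max_delay)
```

Under these assumptions, option 3 is the same loop with `max_attempts=None`, `max_duration=None` and `backoff_factor=1.0`, which retries at a fixed interval until the connection succeeds.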


gep13 • Apr 30 '21 13:04

There is a ZenDesk ticket requesting this enhancement. Looking through the agent-log data on the ticket, the entire retry process only lasts around 5 seconds from the first try to the last and the stopping of the service.

ryanrichter94 • Jul 01 '22 13:07

Bumping with this Zendesk ticket experiencing the same issue.

imm0rtalsupp0rt • Jan 24 '23 15:01

I'm seeing this as well and just got a reply from support pointing me back here.

I'm not 100% sure which version of .NET the agent targets, but .NET 4 (and above, possibly even below) does have a native hook for network interface changes. It could be worth resetting that connection timeout counter (or whatever the internal code is that stops the constant retries) once a network interface change is observed: https://learn.microsoft.com/en-us/dotnet/api/system.net.networkinformation.networkchange?view=netframework-4.0
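
To make the idea concrete, here is a small language-neutral sketch in Python (not agent code; `RetryBudget` and its methods are hypothetical names). In .NET, the reset callback would presumably be wired to an event such as `NetworkChange.NetworkAddressChanged`; how the agent actually tracks its retries internally is an assumption here.

```python
import threading


class RetryBudget:
    """Counts failed connection attempts; a network-change callback resets the count."""

    def __init__(self, max_attempts: int):
        self.max_attempts = max_attempts
        self._failures = 0
        # The network-change callback may fire on another thread, so guard the counter.
        self._lock = threading.Lock()

    def record_failure(self) -> bool:
        """Record one failed attempt; return True while retries are still allowed."""
        with self._lock:
            self._failures += 1
            return self._failures < self.max_attempts

    def on_network_changed(self) -> None:
        """Hypothetical handler: a new network (e.g. VPN up) earns a fresh set of attempts."""
        with self._lock:
            self._failures = 0
```

With something like this, an agent that gave up before the VPN came up would start retrying again once the interface change is observed, without needing a service restart.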

jma89 • Mar 09 '23 16:03

We are using Chocolatey in an environment that requires a VPN connection. This issue causes the agent to become unresponsive, which in turn leaves us no way to make sure there is an active connection between a remote client and our CCM server. That prevents us from:

  • running any deployment on the remote machine
  • getting current information from the remote machine

Workarounds like setting up Windows tasks to restart the Chocolatey Agent at static intervals are suboptimal, because we can never be certain the user is running the required VPN connection prior to those restarts.

This ticket has now been open for over 3 years, and it seems the requirement for running this agent on your OWN operating system without creating these unintended situations is too much to ask?

sascha-wi • Dec 13 '23 11:12

There are also other possible situations:

  1. the CCM certificate is set up incorrectly, resulting in failed comms for the entire agent fleet.
  2. the CCM server lost network connectivity.

Imagine having to restart the agent on over 1000 servers after fixing the issue. If the intention is to provide an enterprise grade product, the Chocolatey team should prioritise this issue. It was mentioned by gep13 that continually attempting to reconnect isn't ideal - perhaps the agent could back off after 3 tries and try to connect after a longer wait for subsequent tries.
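
A minimal Python sketch of that schedule, with hypothetical names and delay values (only the "3 tries" figure comes from the suggestion above):

```python
from itertools import islice
from typing import Iterator


def retry_delays(quick_tries: int = 3, quick_delay: float = 5.0,
                 slow_delay: float = 300.0) -> Iterator[float]:
    """Yield the wait before each retry: a few short waits, then a long wait indefinitely."""
    for _ in range(quick_tries):
        yield quick_delay
    # After the quick tries, keep trying forever at the slower pace instead of giving up.
    while True:
        yield slow_delay


# First five waits under the assumed defaults.
print(list(islice(retry_delays(), 5)))  # [5.0, 5.0, 5.0, 300.0, 300.0]
```

The agent never stops trying in this scheme, so a fleet would recover on its own once the certificate or network problem on the CCM side is fixed, while a wrong service URL costs only one cheap attempt per long interval.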

kfng1 • Jun 06 '24 21:06

This is happening with my end devices. Please can this issue be fixed?

godigital1995 • Jul 17 '24 12:07

@godigital1995 We did investigate this very recently with the goal of adding this. We discovered that this is a major change to Chocolatey Agent. We have prioritised working on it, but I haven't got a date I can provide you with.

pauby • Jul 17 '24 13:07