spring-cloud-config-server got blocked during Git call

Open maecval opened this issue 3 years ago • 4 comments

Describe the bug: In our Spring Boot based environment we observed that all calls to the config-server were blocked for minutes during an outage of our Git infrastructure. It seems to have been a very specific error, possibly network related; unfortunately we have not yet been able to reproduce it. If the network is disabled entirely, the config-server works as expected and serves the locally cached Git files. In this specific case, however, none of the config-clients could retrieve configuration data from the config-server, so a restart of any application instance during this period would have failed. Dozens of applications depend on this config-server.

Sample: As explained, we have not (yet) been able to reproduce the error conditions. We think it might have been a very rare kind of network problem.

Proposal: Invoke calls to the remote Git system in separate threads. A possible solution for this with the

Kind regards, Valentin

maecval avatar Sep 14 '21 09:09 maecval

Config clients don't necessarily need to fail if they can't reach the config server, so that should not be a concern.

If you have improvements that you would like to contribute to the project, please submit a PR.

ryanjbaxter avatar Sep 14 '21 16:09 ryanjbaxter

Hi Ryan, thanks for your reply. Maybe it wasn't explained clearly enough: it was the config-server's calls to the Git backend (via the JGit library) that got stuck and blocked for dozens of minutes. So it also makes sense to make the config-server more robust in this area. The config-clients were able to reach the config-server, but the server was blocked, i.e. they just did not get a response. (BTW, the config-client health check was blocked too.)

Unfortunately I haven't yet figured out how to create a PR (or even a branch)... Will try again.

maecval avatar Sep 14 '21 17:09 maecval

I understand the problem. On the client side you should be able to configure the timeout, so if the request takes too long it will time out and not hang.

This might help https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request

ryanjbaxter avatar Sep 14 '21 18:09 ryanjbaxter

Thanks for the information. Making the client time out is not an option, because the Git integration in the config-server supports a local cache, which should allow it to always return a response. If a client times out, it would not be able to start up (in our case).

Anyway I will try to create a PR.

Have a good time!

maecval avatar Sep 15 '21 17:09 maecval

I had a similar problem and was able to reproduce it with the following steps:

  1. Start cloudconfigserver
  2. Edit /etc/hosts on the host running cloud config server to route the hostname of the git-server to 127.0.0.1
  3. Open a port with nc -l <git-port>
  4. Try to load properties through the HTTP API of cloud config and wait forever...
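
A note on why step 3 matters: with a listener on the port, the TCP connection succeeds, but the peer never sends anything, so a read without a timeout blocks indefinitely instead of failing fast with "connection refused". The following is a minimal Java sketch of that behaviour using plain sockets. It does not use JGit or spring-cloud-config, and `SilentServerDemo` and `readTimesOut` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of spring-cloud-config or JGit.
class SilentServerDemo {

    /**
     * Connects to a local port that is listening but never answers
     * (similar to `nc -l <git-port>`) and reports whether the read timed out.
     */
    static boolean readTimesOut(int readTimeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port; the server never accepts or writes.
        try (ServerSocket silentServer = new ServerSocket(0, 50, loopback);
             Socket client = new Socket()) {
            // The connect succeeds: the OS completes the handshake from the listen backlog.
            client.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()), 1000);
            // Without this read timeout (i.e. with the default of 0) read() would block forever.
            client.setSoTimeout(readTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return false; // data or end-of-stream arrived, not expected here
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

If the port were closed instead, the connect call itself would fail immediately, which is presumably why the hang only shows up when something is listening.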

marbon87 avatar Apr 26 '23 12:04 marbon87

Why open the port?

ryanjbaxter avatar May 09 '23 00:05 ryanjbaxter

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

spring-cloud-issues avatar May 16 '23 00:05 spring-cloud-issues

Thanks for the reminder. I plan to create a branch and a PR containing the solution we are currently using. It is a patched version of config-server in which the calls to the Git server are executed in a separate thread (i.e. async). So I'd be happy if we could keep this open for some more time... Cheers!
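
The general idea of running the remote Git call on a separate thread with a bounded wait can be sketched roughly as follows. This is not the patch mentioned above and not spring-cloud-config code: `BoundedRemoteCall`, `callOrFallback`, and the fallback supplier (standing in for "serve the locally cached repository") are hypothetical names, and the sketch only assumes standard `java.util.concurrent` classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not part of spring-cloud-config.
class BoundedRemoteCall {

    // Daemon threads, so a stuck remote call cannot keep the JVM alive on shutdown.
    private final ExecutorService executor = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable, "remote-git-call");
        thread.setDaemon(true);
        return thread;
    });

    /**
     * Runs remoteCall on a worker thread and waits at most the given timeout.
     * On timeout or failure, the result of fallback is returned instead.
     */
    <T> T callOrFallback(Callable<T> remoteCall, Supplier<T> fallback,
                         long timeout, TimeUnit unit) {
        Future<T> future = executor.submit(remoteCall);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException | ExecutionException e) {
            // Best-effort cancellation; a thread blocked in socket I/O may ignore the interrupt.
            future.cancel(true);
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return fallback.get();
        }
    }
}
```

A caller might use it as `caller.callOrFallback(() -> fetchFromRemote(), () -> readLocalCopy(), 5, TimeUnit.SECONDS)`, where both methods are placeholders. One caveat of this design: the HTTP caller is released after the timeout, but a worker thread stuck in a blocking read can stay stuck, so a real implementation would likely need a bounded pool or transport-level timeouts as well.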

maecval avatar May 16 '23 06:05 maecval

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

spring-cloud-issues avatar May 23 '23 09:05 spring-cloud-issues

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open the issue.

spring-cloud-issues avatar May 30 '23 23:05 spring-cloud-issues