consul-template

Consul-template removing few services randomly...

Open santhoshp87 opened this issue 5 years ago • 4 comments

Hi, we have multiple services registered in Consul. consul-template, running on the Nginx machine, reads the Consul services and updates the Nginx configuration whenever a server is added or removed. We found a bug in consul-template v0.16 where it automatically removes a few services from the Nginx configuration and reloads it, with nothing in the logs. We have now upgraded consul-template to v0.24 and still see the same issue. Has anyone else seen this? Am I doing something wrong?

Consul Template version

consul-template -v

consul-template v0.24.1 (c54d0abc)

Expected behavior

A consul-template restart should pull all services from Consul and update the Nginx configuration.

Actual behavior

A consul-template restart removes a few services from the Nginx configuration. This does not happen every time, only once in a while. Another restart fixes everything again.

Steps to reproduce

  1. A few services do not show up in Nginx because consul-template removes them at random.
  2. This does not happen every time, and I could not find anything in the logs.
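One way to catch the problem when it happens might be to compare the service names registered in Consul with the upstreams present in the rendered Nginx configuration. The sketch below is only an illustration: it assumes each service is rendered as an `upstream <service-name> { ... }` block, which the thread does not confirm, and `missing_from_nginx` is a hypothetical helper name. The list of registered names would have to come from Consul, for example from the keys returned by the `/v1/catalog/services` HTTP endpoint.

```python
import re

# Assumption: every service is rendered as "upstream <service-name> { ... }".
UPSTREAM_RE = re.compile(r"^\s*upstream\s+([\w.-]+)\s*\{", re.MULTILINE)


def missing_from_nginx(registered, nginx_conf):
    """Return the registered service names that have no upstream block.

    registered: iterable of service names known to Consul.
    nginx_conf: text of the rendered Nginx configuration.
    """
    rendered = set(UPSTREAM_RE.findall(nginx_conf))
    return sorted(set(registered) - rendered)


if __name__ == "__main__":
    conf = (
        "upstream web {\n  server 10.0.0.1:80;\n}\n"
        "upstream api {\n  server 10.0.0.2:80;\n}\n"
    )
    # "billing" is registered but was dropped from the rendered config.
    print(missing_from_nginx(["web", "api", "billing"], conf))
```

Running a check like this on a schedule and logging the result with a timestamp could help narrow down when the services disappear.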

santhoshp87 • Sep 22 '20 21:09

Hey @santhoshp87, thanks for taking the time to report this.

I haven't seen this behavior, so if you could come up with a way to replicate it that would be a big help.

One thought. Do the times it does this coincide with restarting the local consul agent or the consul servers? I have seen the case where consul responds with partial data while spinning up and, while there is a workaround to be sure this doesn't happen, consul-template doesn't do it at the moment.

eikenb • Sep 22 '20 22:09

Hi @eikenb, sorry, I was wrong when I said it happens only after a consul-template restart. It can happen at any time, even without a restart. As a workaround, to keep consul-template from removing services from the Nginx configuration, we have put a check in our consul script that looks for service removals and, if more than 20 services would be removed, stops the removal. The removals don't coincide with Consul agent or Consul server restarts.
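A guard of this kind could look roughly like the sketch below. It is not the script from this environment: the function names are hypothetical, and it assumes each service appears in the Nginx configuration as an `upstream <service-name> { ... }` block. It compares the configuration currently in use with the newly rendered one and rejects the new one when more than 20 services would disappear.

```python
import re

# Threshold taken from the comment above: block a render that removes
# more than 20 services at once.
MAX_REMOVALS = 20

# Assumption: every service is rendered as "upstream <service-name> { ... }".
UPSTREAM_RE = re.compile(r"^\s*upstream\s+([\w.-]+)\s*\{", re.MULTILINE)


def upstream_names(nginx_conf):
    """Collect the upstream names defined in an Nginx configuration text."""
    return set(UPSTREAM_RE.findall(nginx_conf))


def removed_services(current_conf, candidate_conf):
    """Names present in the current config but absent from the candidate."""
    return upstream_names(current_conf) - upstream_names(candidate_conf)


def should_apply(current_conf, candidate_conf, max_removals=MAX_REMOVALS):
    """Accept the candidate unless it drops more than max_removals services."""
    return len(removed_services(current_conf, candidate_conf)) <= max_removals


if __name__ == "__main__":
    current = "".join(f"upstream svc{i} {{\n}}\n" for i in range(25))
    candidate = "".join(f"upstream svc{i} {{\n}}\n" for i in range(3))
    # 22 services would vanish, so the candidate is rejected.
    print(should_apply(current, candidate))
```

A wrapper around the reload command could call `should_apply` and keep the previous configuration in place whenever it returns `False`, which also leaves the rejected render on disk for later inspection.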

I am updating the "steps to reproduce"

santhoshp87 • Sep 22 '20 22:09

Thanks for the feedback, though the Steps To Reproduce is best as a simple setup where I could reproduce it locally. With the amount of information I have currently, there is not much I can do on this except keep an eye out for it (which I'll do).

Even if you can't create a simple way to reproduce the issue, it would help if you could supply more of the information you do have. The template and config would be a good start. And if you can capture it happening in the logs (the higher the log level the better) that might also help.

Thanks.

eikenb • Oct 02 '20 20:10

Thanks @eikenb for the follow-up. Here are the configurations in our environment: api-gateway.tar.gz. The content of the default_svc_metadata.json file is something like: "health_check" : { "interval" : "2s", "fails" : 1, "passes" : 1, "uri" : "/health" } }

This issue can happen randomly; it's not consistent.

santhoshp87 • Oct 13 '20 23:10