consul-template
Consul-template removing a few services randomly...
Hi, we have multiple services registered in Consul. consul-template, running on the Nginx machine, reads the Consul services and updates the Nginx configuration whenever a server is added or removed. We found what looks like a bug in consul-template v0.16: it automatically removes a few services from the Nginx configuration and reloads Nginx, with nothing in the logs. We have now upgraded consul-template to v0.24 and still see the same issue. Has anyone seen this? Am I doing something wrong?
Consul Template version
consul-template -v
consul-template v0.24.1 (c54d0abc)
Expected behavior
Restarting consul-template should pull all services from Consul and update the Nginx configuration.
Actual behavior
Restarting consul-template removes a few services. This does not happen every time, only once in a while. Another restart fixes everything again.
Steps to reproduce
- A few services do not show up in Nginx because consul-template removes them randomly.
- This does not happen every time, and I could not find anything in the logs.
Hey @santhoshp87, thanks for taking the time to report this.
I haven't seen this behavior, so if you could come up with a way to replicate it that would be a big help.
One thought. Do the times it does this coincide with restarting the local consul agent or the consul servers? I have seen the case where consul responds with partial data while spinning up and, while there is a workaround to be sure this doesn't happen, consul-template doesn't do it at the moment.
Hi @eikenb, sorry, I was wrong when I said it happens only after a consul-template restart. It can happen at any time, even without a restart. As a workaround, to keep consul-template from removing services from the Nginx configuration, we have added a check to our Consul script that looks for service removals and, if more than 20 services would be removed, blocks the removal. The issue doesn't coincide with Consul restarts.
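For readers who want a similar guard, the check described above might look roughly like the following sketch. It is not the script used in this environment. It assumes each service is rendered as its own Nginx `upstream <name> { ... }` block, and the names `upstream_names`, `removal_is_safe`, and `MAX_REMOVALS` are hypothetical. Only the threshold of 20 comes from the description above.

```python
import re

# Threshold from the workaround described above: block the update if more
# than this many services would disappear at once.
MAX_REMOVALS = 20

# Assumption: every service is rendered as one `upstream <name> {` block.
UPSTREAM_RE = re.compile(r"^\s*upstream\s+([^\s{]+)", re.MULTILINE)


def upstream_names(nginx_conf):
    """Return the set of upstream block names found in an Nginx config string."""
    return set(UPSTREAM_RE.findall(nginx_conf))


def removal_is_safe(current_conf, rendered_conf, max_removals=MAX_REMOVALS):
    """Compare the live config with the newly rendered one.

    Returns (is_safe, removed), where `removed` is the set of upstream names
    present in the live config but missing from the rendered one, and
    `is_safe` is False when more than `max_removals` names would be removed.
    """
    removed = upstream_names(current_conf) - upstream_names(rendered_conf)
    return len(removed) <= max_removals, removed
```

A wrapper script could render the template to a temporary file, call `removal_is_safe` with the live and temporary contents, and replace the live file and reload Nginx only when the first return value is true.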
I am updating the "steps to reproduce"
Thanks for the feedback, though the Steps To Reproduce is best as a simple setup where I could reproduce it locally. With the amount of information I have currently, there is not much I can do on this except keep an eye out for it (which I'll do).
Even if you can't create a simple way to reproduce the issue, it would help if you could supply more of the information you do have. The template and config would be a good start. And if you can capture it happening in the logs (the higher the log level the better) that might also help.
Thanks.
Thanks @eikenb for the follow-up. Here are the configurations in our environment: api-gateway.tar.gz. The content of the default_svc_metadata.json file is something like: "health_check" : { "interval" : "2s", "fails" : 1, "passes" : 1, "uri" : "/health" } }
This issue happens randomly; it is not consistent.