nginx-upsync-module

health check index is timeout

Open arrow2012 opened this issue 5 years ago • 6 comments

curl "http://consul-nginx.test.in/v1/kv/upstreams/ai-serv?recurse" --http1.0

[{"LockIndex":0,"Key":"upstreams/ai-serv/192.168.20.144:5555","Flags":0,"Value":"eyJ3ZWlnaHQiOjEsICJtYXhfZmFpbHMiOjIsICJmYWlsX3RpbWVvdXQiOjEwfQ==","CreateIndex":570745,"ModifyIndex":570969},
{"LockIndex":0,"Key":"upstreams/ai-serv/192.168.20.144:5556","Flags":0,"Value":"eyJ3ZWlnaHQiOjEsICJtYXhfZmFpbHMiOjIsICJmYWlsX3RpbWVvdXQiOjEwfQ==","CreateIndex":570976,"ModifyIndex":570976}]

I use nginx as a proxy in front of the Consul server:

upstream consul-server {
    server 192.168.20.103:8500;
}

server {
    #listen 8500;
    listen 80;
    server_name consul-nginx.test.in;
    access_log /data/logs/nginx/consul.log log_json;
    location / {
        proxy_connect_timeout 10s;
        proxy_pass http://consul-server;
        #proxy_set_header Connection "Keep-Alive";
    }
}

But the error log, /data/logs/nginx/error.log, shows:

*10188#10188: 6543164 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.20.110, server: consul-nginx.test.in, request: "GET /v1/kv/upstreams/ai-serv?recurse&index=570976 HTTP/1.0", upstream: "http://192.168.20.103:8500/v1/kv/upstreams/ai-serv?recurse&index=570976", host: "consul-nginx.gkid.in"
2019/04/27 15:54:14 [error] 10189#10189: upsync_consul_parse_init: recv upstream "ai-serv" error; http_status: 504

Then I curl the API myself, and it also returns 504:

curl "http://consul-nginx.test.in/v1/kv/upstreams/ai-serv?recurse&index=570976" --http1.0

504 Gateway Time-out

openresty

arrow2012, Apr 27 '19 07:04

Port 8500 on the Consul server is probably blocked by a firewall and not reachable from the nginx host.

gfrankliu, Apr 27 '19 17:04

No, all the Consul servers are OK.

arrow2012, Apr 29 '19 10:04

The upsync module uses long polling against the Consul server. If there have been no changes in Consul since index 570976, the connection is held open until there is an update in Consul.

I see you have upsync_timeout=6m, so the upsync module will wait up to 6 minutes before timing out. Your nginx proxy_pass, however, times out first, at the default proxy_read_timeout of 1m (60s): http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_read_timeout

You need to increase that if you want to put nginx in front of your Consul servers.
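
A minimal sketch of that change, reusing the server block from the original post. The 7m value is an assumption, chosen only because it is longer than upsync_timeout=6m; any value above the upsync timeout should behave the same way.

```nginx
upstream consul-server {
    server 192.168.20.103:8500;
}

server {
    listen 80;
    server_name consul-nginx.test.in;

    location / {
        proxy_connect_timeout 10s;
        # Assumed value: must exceed upsync_timeout (6m here). With the
        # default of 60s, nginx gives up on the held long-poll request
        # and answers 504 before Consul has anything to send.
        proxy_read_timeout 7m;
        proxy_pass http://consul-server;
    }
}
```

After reloading nginx, the same curl with &index=570976 should wait instead of returning 504 after 60 seconds.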

gfrankliu, Apr 29 '19 16:04

When I change upsync_timeout=6m to upsync_timeout=10s, the Consul server can no longer be connected to:

lsof | grep consul | wc -l
39071

There are too many connections to nginx:

TCP gk-app-prod01:8500->192.168.20.110:46070 (CLOSE_WAIT)
consul 11756 11788 root 942u IPv6 329191708 0t0 TCP gk-app-prod01:8500->192.168.20.110:46084 (CLOSE_WAIT)
consul 11756 11788 root 943u IPv6 329187297 0t0 TCP gk-app-prod01:8500->192.168.20.110:45922 (CLOSE_WAIT)
consul 11756 11788 root 944u IPv6 329184241 0t0 TCP gk-app-prod01:8500->192.168.20.110:45848 (CLOSE_WAIT)
consul 11756 11788 root 945u IPv6 329181173 0t0 TCP gk-app-prod01:8500->192.168.20.110:45866 (CLOSE_WAIT)
consul 11756 11788 root 946u IPv6 329181139 0t0 TCP gk-app-prod01:8500->192.168.20.110:45858 (CLOSE_WAIT)
consul 11756 11788 root 947u IPv6 329189784 0t0 TCP gk-app-prod01:8500->192.168.20.110:46184 (CLOSE_WAIT)
consul 11756 11788 root 948u IPv6 329194574 0t0 TCP gk-app-prod01:8500->192.168.20.110:46074 (CLOSE_WAIT)
consul 11756 11788 root 949u IPv6 329193587 0t0 TCP gk-app-prod01:8500->192.168.20.110:46224 (CLOSE_WAIT)
consul 11756 11788 root 950u IPv6 329193588 0t0 TCP gk-app-prod01:8500->192.168.20.110:46226 (CLOSE_WAIT)
consul 11756 11788 root 951u IPv6 329193474 0t0 TCP gk-app-prod01:8500->192.168.20.110:45880 (CLOSE_WAIT)
consul 11756 11788 root 952u IPv6 329190665 0t0 TCP gk-app-prod01:8500->192.168.20.110:46086 (CLOSE_WAIT)
consul 11756 11788 root 953u IPv6 329194510 0t0 TCP gk-app-prod01:8500->192.168.20.110:45988 (CLOSE_WAIT)
consul 11756 11788 root 954u IPv6 329191714 0t0 TCP gk-app-prod01:8500->192.168.20.110:46110 (CLOSE_WAIT)
consul 11756 11788 root 955u IPv6 329192512 0t0 TCP gk-app-prod01:8500->192.168.20.110:46012 (CLOSE_WAIT)
consul 11756 11788 root 956u IPv6 329194668 0t0 TCP gk-app-prod01:8500->192.168.20.110:46204 (CLOSE_WAIT)
consul 11756 11788 root 957u IPv6 329194552 0t0 TCP gk-app-prod01:8500->192.168.20.110:46044 (CLOSE_WAIT)

arrow2012, Apr 30 '19 05:04

Is this AFTER you increased the nginx proxy_read_timeout? Since your upsync_timeout is 6m, proxy_read_timeout needs to be larger than that. The default proxy_read_timeout is only 1m (60s), which is why you see those timeout errors in the nginx log.

gfrankliu, Apr 30 '19 16:04

upsync_timeout=10s is probably too aggressive and will cause the upsync module to make too many connections.
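
For reference, the timeout is a parameter of the upsync directive inside the upstream block. The thread does not show the original upstream block, so the following is only a sketch in the style of the module's README: the proxy address, the upsync_interval value, and the dump path are all assumptions.

```nginx
upstream ai-serv {
    # Assumed address: Consul reached through the nginx proxy from the
    # original post. upsync_timeout is put back to 6m instead of 10s.
    upsync consul-nginx.test.in:80/v1/kv/upstreams/ai-serv upsync_timeout=6m upsync_interval=500ms upsync_type=consul strong_dependency=off;

    # Hypothetical path; the file must already exist for the include to work.
    upsync_dump_path /usr/local/nginx/conf/servers/servers_ai-serv.conf;
    include /usr/local/nginx/conf/servers/servers_ai-serv.conf;
}
```

With a longer upsync_timeout, each long-poll connection lives longer, so far fewer connections are opened against Consul through the proxy.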

gfrankliu, Apr 30 '19 16:04