nginx-stream-upsync-module

TCP module builds successfully, but reports an error after startup: upsync_del_peer

Open yxs0201 opened this issue 6 years ago • 18 comments

The build succeeds. Build commands:

./configure --with-stream --add-module=/opt/nginx-stream-upsync-module-1.2.0
make -j 4
make install

Nginx error output:

cat error.log
2018/12/19 15:09:41 [notice] 22506#0: using the "epoll" event method
2018/12/19 15:09:41 [notice] 22506#0: nginx/1.12.2
2018/12/19 15:09:41 [notice] 22506#0: built by gcc 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)
2018/12/19 15:09:41 [notice] 22506#0: OS: Linux 4.15.0-42-generic
2018/12/19 15:09:41 [notice] 22506#0: getrlimit(RLIMIT_NOFILE): 1024:1048576
2018/12/19 15:09:41 [notice] 22507#0: start worker processes
2018/12/19 15:09:41 [notice] 22507#0: start worker process 22508
2018/12/19 15:09:42 [error] 22508#0: upsync_del_peer: upstream "testTcp" cannot delete all peers
2018/12/19 15:09:42 [error] 22508#0: upsync_process: upstream del peers failed

I tried other Nginx versions and got the same error.

yxs0201 avatar Dec 19 '18 07:12 yxs0201

Could you post your configuration files?

xiaokai-wang avatar Dec 19 '18 09:12 xiaokai-wang

nginx.conf

worker_processes  1;

error_log  /usr/local/nginx/logs/error.log debug;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;
    sendfile        on;
    keepalive_timeout  65;

    server {
        listen       80;
        server_name  localhost;

        location / {
            root   html;
            index  index.html index.htm;
        }

        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

    }
   include conf.d/*.conf;
}

stream {
   include tcp/*.conf;
}

tcp.conf

upstream testTcp {
    upsync consul.nginx.xxx.com/v1/kv/upstreams/test/ upsync_timeout=6m upsync_interval=500ms upsync_type=consul strong_dependency=off;
    upsync_dump_path /usr/local/nginx/conf/servers/servers_test_tcp.conf;
    include /usr/local/nginx/conf/servers/servers_test_tcp.conf;
}

server {
    error_log    /usr/local/nginx/logs/tcp.error.log debug;
    listen 12345;
    proxy_connect_timeout 1s;
    proxy_timeout 3s;
    proxy_pass testTcp;
    proxy_next_upstream on;
}

servers_test_tcp.conf

cat servers_test_tcp.conf
server 127.0.0.1:80 down;

curl request

curl consul.nginx.xxx.com/v1/kv/upstreams/test?keys
["upstreams/test/10.1.1.1:51668"]

yxs0201 avatar Dec 20 '18 05:12 yxs0201

What's the output of curl consul.nginx.xxx.com/v1/kv/upstreams/test/10.1.1.1:51668?raw

gfrankliu avatar Dec 20 '18 07:12 gfrankliu

Please confirm that the nginx user can write to the file servers_test_tcp.conf.

gfrankliu avatar Dec 20 '18 07:12 gfrankliu

What's the output of curl consul.nginx.xxx.com/v1/kv/upstreams/test/10.1.1.1:51668?raw

# With the raw parameter: nothing is returned
curl consul.nginx.xxx.com/v1/kv/upstreams/test/10.1.1.1:51668?raw
# Without the raw parameter: a normal response is returned
curl consul.nginx.xxx.com/v1/kv/upstreams/test/10.1.1.1:51668
[{"LockIndex":0,"Key":"upstreams/test/10.1.1.1:51668","Flags":0,"Value":null,"CreateIndex":40576,"ModifyIndex":40576}]

yxs0201 avatar Dec 20 '18 09:12 yxs0201
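
The response in the comment above has "Value": null, meaning the key exists in Consul but holds no value, which is consistent with ?raw returning an empty body. Consul's KV API returns stored values base64-encoded. A minimal Python sketch for inspecting such a response, assuming the JSON body has already been fetched (the helper name decode_kv_entries is made up for illustration and is not part of any library):

```python
import base64
import json


def decode_kv_entries(body):
    """Map each Consul KV key to its decoded value, or None if it has no value."""
    result = {}
    for entry in json.loads(body):
        raw = entry.get("Value")
        # Consul base64-encodes stored values; a key written with no body has Value null.
        if raw is None:
            result[entry["Key"]] = None
        else:
            result[entry["Key"]] = base64.b64decode(raw).decode("utf-8")
    return result


# Response body copied from the thread above.
body = (
    '[{"LockIndex":0,"Key":"upstreams/test/10.1.1.1:51668","Flags":0,'
    '"Value":null,"CreateIndex":40576,"ModifyIndex":40576}]'
)
print(decode_kv_entries(body))
```

A None entry here only shows that the key was registered without attributes; whether that matters for the error in this thread is not established.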

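Please confirm that the nginx user can write to the file servers_test_tcp.conf.

Confirmed, the permissions are in place. As the log below shows, the http module dumps normally: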

2018/12/20 17:09:01 [notice] 25065#0: upsync_dump_server: dump conf file /usr/local/nginx/conf/servers/servers_test.conf succeeded, number of servers is 2
2018/12/20 17:09:01 [error] 25065#0: upsync_del_peer: upstream "testTcp" cannot delete all peers
2018/12/20 17:09:01 [error] 25065#0: upsync_process: upstream del peers failed
2018/12/20 17:09:01 [error] 25064#0: upsync_del_peer: upstream "testTcp" cannot delete all peers
2018/12/20 17:09:01 [error] 25064#0: upsync_process: upstream del peers failed

File permissions:

root@bbdops:/usr/local/nginx/conf/servers# ls -l
-rw-r--rw- 1 nobody root  128 12月 20 17:09 servers_test.conf
-rw-r--rw- 1 nobody root    0 12月 20 15:56 servers_test_tcp.conf

User the Nginx service runs as:

root@bbdops:/usr/local/nginx/conf/servers# ps -ef | grep nginx
root     25061  2384  0 17:08 ?        00:00:00 nginx: master process nginx
nobody   25062 25061  0 17:08 ?        00:00:00 nginx: worker process
nobody   25063 25061  0 17:08 ?        00:00:00 nginx: worker process
nobody   25064 25061  0 17:08 ?        00:00:00 nginx: worker process
nobody   25065 25061  0 17:08 ?        00:00:00 nginx: worker process

yxs0201 avatar Dec 20 '18 09:12 yxs0201

Upgrading Nginx from 1.12.2 to 1.14.2 produces the same error.

yxs0201 avatar Dec 20 '18 09:12 yxs0201

Your configuration doesn't mention servers_test.conf, yet the log contains "dump conf file /usr/local/nginx/conf/servers/servers_test.conf succeeded". Which one are you actually using, servers_test.conf or servers_test_tcp.conf?

gfrankliu avatar Dec 20 '18 18:12 gfrankliu

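servers_test.conf was dumped successfully. That file belongs to the http module, which works fine. servers_test_tcp.conf belongs to the TCP module, and the TCP module is what reports the error: upsync_del_peer: upstream "testTcp" cannot delete all peers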

yxs0201 avatar Dec 21 '18 01:12 yxs0201

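You may be running a different nginx. According to your build commands, you only added nginx-stream-upsync-module. Also, the configuration you posted doesn't mention http upsync, so how could the http module's servers_test.conf have been dumped?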

gfrankliu avatar Dec 22 '18 21:12 gfrankliu

You wrote:

File permissions:

root@bbdops:/usr/local/nginx/conf/servers# ls -l
-rw-r--rw- 1 nobody root  128 12月 20 17:09 servers_test.conf
-rw-r--rw- 1 nobody root    0 12月 20 15:56 servers_test_tcp.conf

This shows servers_test_tcp.conf as 0 bytes. But earlier you posted:

cat servers_test_tcp.conf
server 127.0.0.1:80 down;

That is clearly not 0 bytes.

gfrankliu avatar Dec 22 '18 21:12 gfrankliu

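Things did change along the way. The original build was TCP-only. Later, to show that http works, I added the http module to the build along with its configuration. I also tried different Nginx and OS versions to check whether TCP works in other environments (all of them report the same upsync_del_peer error). I'd suggest not dwelling on this discrepancy. Is there any concrete, actionable guidance for locating the root cause of the TCP error?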

yxs0201 avatar Dec 24 '18 04:12 yxs0201

I tried your configs and didn't see the error. Please send the steps to reproduce the error on a clean OS, consul and nginx installation.

gfrankliu avatar Dec 31 '18 21:12 gfrankliu

Someone else at https://github.com/weibocom/nginx-upsync-module/issues/240 had the same upsync_del_peer error. It turned out to be a configuration mistake.

gfrankliu avatar Jan 09 '19 18:01 gfrankliu

I'm running into the same problem.

krainz avatar Jul 11 '19 05:07 krainz

I'm running into the same problem too. Does anyone know how to fix it?

ZLget avatar Nov 13 '19 07:11 ZLget

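I'm running into the same problem too. Does anyone know how to fix it?

I'm not sure what I just changed, but it suddenly started working again. The only explanation I can think of is that, before nginx is started for the first time, the contents of servers_test_tcp.conf must match the contents of the corresponding directory in Consul.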

ZLget avatar Nov 13 '19 08:11 ZLget
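
The hypothesis above (the local file must match Consul before the first start) can be checked by hand. A rough Python sketch, assuming the file contents and the output of the ?keys request have already been read into strings. The function names and the "upstreams/test/" prefix handling are illustrative assumptions, not part of the module:

```python
import json
import re


def servers_in_conf(conf_text):
    """Return the set of addresses named on `server ...;` lines of a servers file."""
    pattern = re.compile(r"^\s*server\s+([^\s;]+)", re.MULTILINE)
    return set(pattern.findall(conf_text))


def servers_in_consul(keys_json, prefix):
    """Return the set of addresses listed under `prefix` in a Consul ?keys response."""
    keys = json.loads(keys_json)
    # Skip the bare directory key itself, which has nothing after the prefix.
    return {k[len(prefix):] for k in keys if k.startswith(prefix) and len(k) > len(prefix)}


# Values copied from the original report in this thread.
conf_text = "server 127.0.0.1:80 down;\n"
keys_json = '["upstreams/test/10.1.1.1:51668"]'

local = servers_in_conf(conf_text)
remote = servers_in_consul(keys_json, "upstreams/test/")
print("only in file:", sorted(local - remote))
print("only in Consul:", sorted(remote - local))
```

With the values from the original report, the two sets do not overlap at all, which fits the mismatch described here.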

I ran into the same problem. The fix is to make sure, before the first start, that the server entries in servers_test_tcp.conf match what is in Consul. For example, if Consul contains:

curl consul.nginx.xxx.com/v1/kv/upstreams/test/10.1.1.1:51668
[{"LockIndex":0,"Key":"upstreams/test/10.1.1.1:51668","Flags":0,"Value":null,"CreateIndex":40576,"ModifyIndex":40576}]

then servers_test_tcp.conf should be pre-populated with:

server 10.1.1.1:51668;

wangwang109 avatar Mar 03 '21 02:03 wangwang109
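
The workaround above can be scripted. A minimal Python sketch that turns the output of the ?keys request shown earlier in the thread into `server` lines, assuming every key under the prefix is a backend address. The function name is made up for illustration, and fetching the keys and writing the file are left out:

```python
import json


def server_lines_from_keys(keys_json, prefix):
    """Build `server host:port;` lines from a Consul ?keys response."""
    lines = []
    for key in json.loads(keys_json):
        # Keep only what follows the upstream prefix; ignore unrelated or bare keys.
        address = key[len(prefix):] if key.startswith(prefix) else ""
        if address:
            lines.append("server %s;" % address)
    return lines


# Key listing copied from the thread above.
keys_json = '["upstreams/test/10.1.1.1:51668"]'
print("\n".join(server_lines_from_keys(keys_json, "upstreams/test/")))
```

The printed lines would be saved to the file named in upsync_dump_path (servers_test_tcp.conf in this thread) before nginx is started for the first time.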