request help: WebSocket connections established through the APISIX proxy are automatically disconnected by the server after about 5 seconds

Open loongzh opened this issue 3 years ago • 26 comments

Issue description

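After configuring the WebSocket proxy in APISIX, the connection established by the client is automatically disconnected after 5 seconds. 1) Local testing confirmed that everything works normally. 2) With the WebSocket proxy enabled in APISIX, the connection sends and receives messages normally. 3) The service itself has no nginx proxy configured, so there is no multi-layer WebSocket proxying.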

Environment

  • apisix version (cmd: apisix version):
  • OS (cmd: uname -a):
  • OpenResty / Nginx version (cmd: nginx -V or openresty -V):
  • etcd version, if have (cmd: run curl http://127.0.0.1:9090/v1/server_info to get the info from server-info API):
  • apisix-dashboard version, if have:
  • the plugin runner version, if the issue is about a plugin runner (cmd: depended on the kind of runner):
  • luarocks version, if the issue is about installation (cmd: luarocks --version):

loongzh avatar Nov 07 '21 09:11 loongzh

Please provide the reproduction steps and environment info.

tzssangglass avatar Nov 08 '21 00:11 tzssangglass

  • apisix version (cmd: apisix version): 2.6
  • OS (cmd: uname -a): Linux 58b4c36fe0ae 3.10.0-1160.11.1.el7.x86_64 #1 SMP Fri Dec 18 16:34:56 UTC 2020 x86_64 Linux (Docker container environment)
  • OpenResty / Nginx version (cmd: nginx -V or openresty -V): nginx version: openresty/1.19.3.1
  • etcd version, if have (cmd: run curl http://127.0.0.1:9090/v1/server_info to get the info from server-info API):
  • apisix-dashboard version, if have:
  • the plugin runner version, if the issue is about a plugin runner (cmd: depended on the kind of runner):
  • luarocks version, if the issue is about installation (cmd: luarocks --version): /usr/local/openresty/luajit/bin/luarocks 3.7.0

loongzh avatar Nov 08 '21 02:11 loongzh

Hi @loongzh, @tzssangglass means that you should provide the specific APISIX configuration and the way you verified the behavior, so that we can reproduce this problem.

shuaijinchao avatar Nov 08 '21 09:11 shuaijinchao

  • Upstream

    {
      "nodes": [
        { "host": "172.19.0.1", "port": 8002, "weight": 1 }
      ],
      "timeout": { "connect": 6, "read": 6, "send": 6 },
      "type": "roundrobin",
      "scheme": "http",
      "pass_host": "pass",
      "name": "lion-metro-server-file-manager",
      "desc": "File management service"
    }

  • Route

    {
      "uris": [ "/api/filemanager/v1/*" ],
      "name": "lion-metro-server-file-manager-router",
      "upstream_id": "374894233917588162",
      "enable_websocket": true,
      "status": 1
    }

I just turned on the WebSocket proxy.

loongzh avatar Nov 08 '21 09:11 loongzh

@loongzh Could you check the behavior of the backend server at the moment the connection is aborted? We need to determine whether the disconnect is initiated by the backend server or by Apache APISIX.

tokers avatar Nov 08 '21 10:11 tokers

When the local client connects to the service directly, without the APISIX proxy, there is no disconnection, even after many tests.

loongzh avatar Nov 08 '21 10:11 loongzh

I mean, are there any logs printed by the backend server when the connection is aborted? We may get some clues from the backend server.

tokers avatar Nov 08 '21 10:11 tokers

  • "timeout": { "connect": 6, "read": 6, "send": 6 },

If you change 6 to 10, what happens?

tzssangglass avatar Nov 08 '21 14:11 tzssangglass

With "timeout": { "connect": 16, "read": 16, "send": 16 }, the connection is disconnected after 16s. In my tests, the timeout can be set to as much as 5 minutes.

loongzh avatar Nov 09 '21 02:11 loongzh

What happens if you don't set the timeout?

tzssangglass avatar Nov 09 '21 06:11 tzssangglass

It will disconnect after 1 minute

loongzh avatar Nov 09 '21 10:11 loongzh

It's strange that the WebSocket connection is subject to the timeout settings. After the connection is upgraded to WebSocket, all timers on it should be removed, and the connection should be aborted only if the upstream or the downstream closes it.

tokers avatar Nov 09 '21 11:11 tokers

@loongzh Could you try configuring the backend or the client so that Ping frames are sent periodically?

tokers avatar Nov 09 '21 11:11 tokers

Then let's check whether the connection is still aborted.

tokers avatar Nov 09 '21 11:11 tokers

The front end now listens for the disconnect event and reconnects, and it sends messages to the server to keep the WebSocket usable.

loongzh avatar Nov 09 '21 15:11 loongzh

@loongzh @tokers @shuaijinchao after testing, I have some conclusions:

  1. The timeout configuration affects WebSocket connections: APISIX closes the connection if the proxied server does not transmit any data within the timeout, which is expected behavior.
  2. The client needs to actively maintain a heartbeat with the server to prevent disconnection.

also see: https://nginx.org/en/docs/http/websocket.html?_ga=2.219808217.1349535884.1636598551-1257987705.1633943387

tzssangglass avatar Nov 11 '21 02:11 tzssangglass
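
To make the two conclusions above concrete, here is a minimal Python sketch of an idle timer. It is a simplified model, not APISIX or Nginx code: it assumes a single timer that starts when the connection is established and restarts whenever data is transmitted. The function name `idle_close_time` and the 4 s heartbeat are made up for illustration; the 6 s timeout and the 25 s heartbeat are values reported in this thread.

```python
from typing import Iterable, Optional


def idle_close_time(timeout_s: float, activity_times_s: Iterable[float],
                    observe_until_s: float) -> Optional[float]:
    """Return when an idle timer would fire, or None if it never fires.

    Simplified model: the timer starts at t=0 and restarts at every
    activity timestamp; it fires once a silent gap reaches timeout_s.
    """
    last = 0.0
    for t in sorted(activity_times_s):
        if t > observe_until_s:
            break
        if t - last >= timeout_s:
            return last + timeout_s
        last = t
    if observe_until_s - last >= timeout_s:
        return last + timeout_s
    return None


# No traffic after the handshake: closed at the 6 s timeout from the report.
print(idle_close_time(6, [], observe_until_s=60))                 # 6.0
# Heartbeat every 4 s (hypothetical interval): the timer never fires.
print(idle_close_time(6, range(4, 61, 4), observe_until_s=60))    # None
# Heartbeat every 25 s against a 6 s timeout: still closed at 6 s.
print(idle_close_time(6, range(25, 61, 25), observe_until_s=60))  # 6.0
```

The model suggests two ways out: send heartbeats more often than the timeout, or raise the timeout above the heartbeat interval.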

The problem has been solved by adding a heartbeat, but this is not an ideal approach. Directly using the Nginx proxy does not cause the server to actively disconnect the client connection.

loongzh avatar Nov 19 '21 06:11 loongzh

@tzssangglass Can we reset the timeout settings once the connection is upgraded to websocket?

tokers avatar Nov 19 '21 09:11 tokers

@loongzh wrote: "Directly using the Nginx proxy does not cause the server to actively disconnect the client connection."

Not true, see: https://nginx.org/en/docs/http/websocket.html?_ga=2.219808217.1349535884.1636598551-1257987705.1633943387

The Nginx documentation says: "By default, the connection will be closed if the proxied server does not transmit any data within 60 seconds. This timeout can be increased with the proxy_read_timeout directive. Alternatively, the proxied server can be configured to periodically send WebSocket ping frames to reset the timeout and check if the connection is still alive."

@tokers asked: "Can we reset the timeout settings once the connection is upgraded to websocket?"

Maybe we could enhance this, but I think the client still has to keep a heartbeat with the server over WebSocket. Otherwise the connection would be closed after the timeout period.

tzssangglass avatar Nov 20 '21 05:11 tzssangglass

Frankly, the WebSocket proxy behavior of APISIX and Nginx should be the same.

tokers avatar Nov 20 '21 06:11 tokers

In fact, I think the WebSocket proxy behavior of APISIX is the same as that of Nginx: in Nginx, the connection is closed if the proxied server does not transmit any data within 60 seconds.

Here are my steps:

  1. Start a WebSocket backend:

     websocat -s 1234

  2. Start OpenResty with this config:

     # nginx requires an events block; the defaults are fine here
     events {}

     http {
         server {
             listen 1980;

             access_log off;
             location / {
                 # forward to the websocat backend from step 1
                 proxy_pass http://127.0.0.1:1234;
                 # the WebSocket handshake needs HTTP/1.1 and the Upgrade headers
                 proxy_http_version 1.1;
                 proxy_set_header Upgrade $http_upgrade;
                 proxy_set_header Connection "upgrade";
             }
         }
     }

  3. Start a WebSocket client and connect to OpenResty:

     websocat ws://127.0.0.1:1980/

After 60s without sending a message, the connection is closed.

tzssangglass avatar Nov 21 '21 16:11 tzssangglass
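
The manual check in the comment above (connect, stay silent, wait for the disconnect) could also be timed with a small script. The following Python sketch uses only the standard library. The function names are hypothetical, and it performs just the opening handshake from RFC 6455 without validating the server's response, so treat it as a rough measuring aid rather than a WebSocket client.

```python
import base64
import os
import socket
import time


def build_upgrade_request(host: str, port: int, path: str = "/") -> bytes:
    """Return a minimal HTTP/1.1 WebSocket upgrade request."""
    # The key is 16 random bytes, base64-encoded, as the handshake requires.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}:{port}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def seconds_until_close(host: str, port: int, path: str = "/",
                        give_up_after: float = 600.0) -> float:
    """Open a connection, stay silent, and time how long the peer keeps it open."""
    with socket.create_connection((host, port), timeout=give_up_after) as sock:
        sock.sendall(build_upgrade_request(host, port, path))
        started = time.monotonic()
        try:
            # recv() returns b"" once the peer closes the connection.
            while sock.recv(4096):
                pass
        except (ConnectionResetError, socket.timeout):
            # A reset also counts as a close; a timeout means we gave up waiting.
            pass
        return time.monotonic() - started
```

Against the OpenResty setup from the comment above, `seconds_until_close("127.0.0.1", 1980)` would be expected to return roughly 60 if the default `proxy_read_timeout` applies. That figure comes from the report above, not from running this script.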

That makes sense. @loongzh I checked the related code in Nginx, and this is expected behavior. If the proxy server gave up the timeout settings, it could be overwhelmed by a large number of WebSocket connections, so from its own point of view the timeout is necessary.

tokers avatar Nov 22 '21 01:11 tokers

Hi, I had the same issue.

For me the connection was being reset because the 6s timeout (send/receive) was too short for the heartbeat interval.

Socket.io (the library I am using) sends heartbeats every 25s, so increasing the send/receive timeout to 30s did the trick.

https://stackoverflow.com/questions/12815231/controlling-the-heartbeat-timeout-from-the-client-in-socket-io

mjsmagalhaes avatar Nov 22 '21 17:11 mjsmagalhaes

If we don't have a better solution now, we may document this.

tokers avatar Nov 23 '21 00:11 tokers

Can we set the timeout in the upstream to a larger value?

tzssangglass avatar Nov 23 '21 01:11 tzssangglass

I encountered the same problem and eventually found that it was caused by the timeout setting. The WebSocket backend sends a heartbeat every 55 seconds, but the APISIX default upstream timeout is 6 seconds. Setting the upstream timeout to a value larger than the backend heartbeat interval fixed it.

iyzrj avatar Sep 07 '22 07:09 iyzrj
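
The fix reported in the last few comments (make the upstream timeout longer than the heartbeat interval) can be sketched as a small Python helper. The helper name `upstream_timeout` and the 5 s safety margin are assumptions for illustration. The `connect`, `read` and `send` fields and the 6 s connect value are taken from the upstream configuration posted earlier in this thread.

```python
import json
import math


def upstream_timeout(heartbeat_interval_s: float, margin_s: float = 5.0) -> dict:
    """Build an upstream "timeout" object that outlasts one heartbeat gap."""
    if heartbeat_interval_s <= 0 or margin_s <= 0:
        raise ValueError("interval and margin must be positive")
    idle = math.ceil(heartbeat_interval_s + margin_s)
    # connect stays at the 6 s value from the thread; only read/send limit
    # how long an established connection may stay silent.
    return {"connect": 6, "read": idle, "send": idle}


# Socket.io heartbeat every 25 s, as reported above.
print(json.dumps({"timeout": upstream_timeout(25)}))
# Backend heartbeat every 55 s, as reported above.
print(json.dumps({"timeout": upstream_timeout(55)}))
```

For the 25 s case this yields the 30 s read/send timeout that was reported to work. The resulting object would replace the `timeout` field of the upstream definition.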

This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Aug 23 '23 10:08 github-actions[bot]

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

github-actions[bot] avatar Sep 06 '23 10:09 github-actions[bot]