
help request: all routes failed at the same time and APISIX did not reload the config afterwards

Open oneto1 opened this issue 2 years ago • 7 comments

Description

I have a 12-node APISIX cluster (Docker install) backed by a 3-node etcd cluster.

Before the routes failed, I used apisix-dashboard to modify some route configs, about 16:00 - 17:00.

At 17:30, when the problem happened, all routes failed (404). I tried to fix the routes, but there are too many of them, so I gave up and rolled back to nginx after 17:34.

APISIX logged no warnings or errors, and neither did etcd.

Jul 25 17:30:09 apisix-host journal: a.b.c.d - - [25/Jul/2022:09:30:06 +0000] xxxxx.com "GET / HTTP/2.0" 200 214 0.002 "-" "Go-http-client/2.0" 127.0.0.1:7480 200 0.002 "http://xxxxx.com"
Jul 25 17:30:11 apisix-host journal: a.b.c.d - - [25/Jul/2022:09:30:08 +0000] xxxxx.com "PUT /prom/310132_test.txt HTTP/1.1" 200 25 0.014 "-" "aws-sdk-cpp/1.8.95/S3/Linux/3.10.0-693.21.1.el7.x86_64 x86_64 GCC/8.3.1" 127.0.0.1:7480 200 0.010 "http://xxxxx.com"
Jul 25 17:30:11 apisix-host journal: a.b.c.d - - [25/Jul/2022:09:30:09 +0000] xxxxx.com "GET /prom/310132_test.txt HTTP/1.1" 404 47 0.000 "-" "aws-sdk-cpp/1.8.95/S3/Linux/3.10.0-693.21.1.el7.x86_64 x86_64 GCC/8.3.1" - - - "http://xxxxx.com"
Jul 25 17:30:19 apisix-host journal: a.b.c.d - - [25/Jul/2022:09:30:16 +0000] xxxxx.com "PUT /prom HTTP/1.1" 404 47 0.000 "-" "aws-sdk-cpp/1.8.95/S3/Linux/3.10.0-693.21.1.el7.x86_64 x86_64 GCC/8.3.1" - - - "http://xxxxx.com"
Jul 25 17:30:19 apisix-host journal: a.b.c.d - - [25/Jul/2022:09:30:16 +0000] xxxxx.com "PUT /prom/1_test.txt HTTP/1.1" 404 72 0.004 "-" "aws-sdk-cpp/1.8.95/S3/Linux/3.10.0-693.21.1.el7.x86_64 x86_64 GCC/8.3.1" - - - "http://xxxxx.com"
Jul 25 17:30:22 apisix-host journal: a.b.c.d - - [25/Jul/2022:09:30:19 +0000] xxxxx.com "GET / HTTP/1.1" 404 47 0.000 "-" "Go-http-client/1.1" - - - "http://xxxxx.com"
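One way to pin down the moment the routes disappeared is to scan the access log for the first 404 that never reached an upstream (the upstream columns show `-`). The sketch below is illustrative only: the regular expression and the field names are inferred from the pasted lines rather than taken from the APISIX log format definition, and `parse_line` and `first_unproxied_404` are hypothetical helper names.

```python
import re

# Field layout inferred from the pasted journald lines (an assumption):
# [time] host "METHOD path proto" status bytes request_time "referer" "agent"
# upstream_addr upstream_status upstream_time "upstream_url"
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] (?P<host>\S+) '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) (?P<request_time>[\d.]+) '
    r'"[^"]*" "[^"]*" '
    r'(?P<upstream_addr>\S+) (?P<upstream_status>\S+) (?P<upstream_time>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    # "-" in the upstream address column means no upstream was contacted,
    # so the response was generated by the gateway itself.
    rec["proxied"] = rec["upstream_addr"] != "-"
    return rec


def first_unproxied_404(lines):
    """Return the first record that is a 404 answered without an upstream."""
    for line in lines:
        rec = parse_line(line)
        if rec and rec["status"] == 404 and not rec["proxied"]:
            return rec
    return None


UA = "aws-sdk-cpp/1.8.95/S3/Linux/3.10.0-693.21.1.el7.x86_64 x86_64 GCC/8.3.1"
PREFIX = "apisix-host journal: a.b.c.d - -"

# Three of the lines from the excerpt above.
SAMPLE = [
    f'Jul 25 17:30:11 {PREFIX} [25/Jul/2022:09:30:08 +0000] xxxxx.com '
    f'"PUT /prom/310132_test.txt HTTP/1.1" 200 25 0.014 "-" "{UA}" '
    f'127.0.0.1:7480 200 0.010 "http://xxxxx.com"',
    f'Jul 25 17:30:11 {PREFIX} [25/Jul/2022:09:30:09 +0000] xxxxx.com '
    f'"GET /prom/310132_test.txt HTTP/1.1" 404 47 0.000 "-" "{UA}" '
    f'- - - "http://xxxxx.com"',
    f'Jul 25 17:30:22 {PREFIX} [25/Jul/2022:09:30:19 +0000] xxxxx.com '
    f'"GET / HTTP/1.1" 404 47 0.000 "-" "Go-http-client/1.1" '
    f'- - - "http://xxxxx.com"',
]

if __name__ == "__main__":
    rec = first_unproxied_404(SAMPLE)
    if rec is not None:
        print(rec["time"], rec["method"], rec["path"], rec["status"])
```

In the excerpt, the PUT at 09:30:08 is still proxied to 127.0.0.1:7480, while the GET for the same object one second later is answered with a 404 and no upstream, which narrows the failure to that one-second window.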

At 17:30 all routes were gone.

After 17:30 the routes were back, but APISIX did not reload the route config and still returned 404 (route not found).

Now I can get all route info via the Admin API or etcdctl. I don't know what happened at 17:30, and I can't reproduce it.
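If the problem recurs, it may help to compare the route keys stored in etcd with whatever the running APISIX workers report as loaded. The sketch below only does the comparison. It assumes the stored keys come from something like `etcdctl get /apisix/routes --prefix --keys-only`, and that a list of loaded route keys can be obtained from the data plane (for example through the control API, if the running version exposes one). How to fetch that second list is an assumption to check against the APISIX documentation, and `route_ids` and `diff_routes` are hypothetical helper names.

```python
def route_ids(keys, prefix="/apisix/routes/"):
    """Extract route IDs from etcd-style keys such as /apisix/routes/42.

    Blank lines, keys under other prefixes, and the bare directory key
    (the prefix itself, if present) are ignored.
    """
    return {
        key[len(prefix):]
        for key in keys
        if key.startswith(prefix) and len(key) > len(prefix)
    }


def diff_routes(stored_keys, loaded_keys):
    """Compare route keys stored in etcd with keys the gateway reports as loaded."""
    stored = route_ids(stored_keys)
    loaded = route_ids(loaded_keys)
    return {
        # In etcd but not loaded: the gateway has not picked these up.
        "missing_in_apisix": sorted(stored - loaded),
        # Loaded but no longer in etcd: the gateway is serving stale routes.
        "stale_in_apisix": sorted(loaded - stored),
    }


if __name__ == "__main__":
    # Made-up keys for illustration only.
    stored = ["/apisix/routes/", "/apisix/routes/1", "", "/apisix/routes/2"]
    loaded = ["/apisix/routes/2"]
    print(diff_routes(stored, loaded))
```

A non-empty `missing_in_apisix` while etcd itself looks healthy would match the symptom described here: the data is present in etcd, but the workers are not serving it.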

Can anyone tell me why the routes disappeared, and why, after they came back, APISIX did not reload the config and kept returning 404 (route not found)?

Environment

  • APISIX version (run apisix version): /usr/local/openresty/luajit/bin/luajit ./apisix/cli/apisix.lua version 2.11.0

  • Operating system (run uname -a): linux 3.10.0-862.el7.x86_64 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • OpenResty / Nginx version (run openresty -V or nginx -V):
    nginx version: openresty/1.19.9.1
    built by gcc 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC)
    built with OpenSSL 1.1.1l 24 Aug 2021
    TLS SNI support enabled
    configure arguments: --prefix=/usr/local/openresty/nginx --with-cc-opt='-O2 -DAPISIX_BASE_VER=1.19.9.1.0 -DNGX_LUA_ABORT_AT_PANIC -I/usr/local/openresty/zlib/include -I/usr/local/openresty/pcre/include -I/usr/local/openresty/openssl111/include' --add-module=../ngx_devel_kit-0.3.1 --add-module=../echo-nginx-module-0.62 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.32 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.08 --add-module=../srcache-nginx-module-0.32 --add-module=../ngx_lua-0.10.20 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.33 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.19 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.7 --add-module=../ngx_stream_lua-0.0.10 --with-ld-opt='-Wl,-rpath,/usr/local/openresty/luajit/lib -Wl,-rpath,/usr/local/openresty/wasmtime-c-api/lib -L/usr/local/openresty/zlib/lib -L/usr/local/openresty/pcre/lib -L/usr/local/openresty/openssl111/lib -Wl,-rpath,/usr/local/openresty/zlib/lib:/usr/local/openresty/pcre/lib:/usr/local/openresty/openssl111/lib' --add-module=/tmp/tmp.jxza47xzEy/openresty-1.19.9.1/../mod_dubbo --add-module=/tmp/tmp.jxza47xzEy/openresty-1.19.9.1/../ngx_multi_upstream_module --add-module=/tmp/tmp.jxza47xzEy/openresty-1.19.9.1/../apisix-nginx-module --add-module=/tmp/tmp.jxza47xzEy/openresty-1.19.9.1/../wasm-nginx-module --add-module=/tmp/tmp.jxza47xzEy/openresty-1.19.9.1/../lua-var-nginx-module --with-poll_module --with-pcre-jit --with-stream --with-stream_ssl_module --with-stream_ssl_preread_module --with-http_v2_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module --with-http_stub_status_module --with-http_realip_module --with-http_addition_module --with-http_auth_request_module --with-http_secure_link_module --with-http_random_index_module --with-http_gzip_static_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-threads --with-compat --with-stream --with-http_ssl_module

  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): {"etcdserver":"3.5.3","etcdcluster":"3.5.0"}

  • APISIX Dashboard version, if relevant: 2.13.0

  • Plugin runner version, for issues related to plugin runners:

  • LuaRocks version, for installation issues (run luarocks --version):

oneto1 • Jul 26 '22 04:07

My APISIX config:

apisix:
  node_listen:               # APISIX listening port
    - port: 8000
  enable_ipv6: false

  allow_admin:                  # http://nginx.org/en/docs/http/ngx_http_access_module.html#allow
    - 0.0.0.0/0              # We need to restrict ip access rules for security. 0.0.0.0/0 is for test.

  admin_key:
    - name: "admin"
      key: bd01be15617a
      role: admin                 # admin: manage all configuration data
                                  # viewer: only can view configuration data
  enable_control: true
  control:
    ip: "0.0.0.0"
    port: 9092
  ssl:
    enable: true
    listen:
      - 443
etcd:
  host:                           # it's possible to define multiple etcd hosts addresses of the same etcd cluster.
    - "http://etcda:20000"
    - "http://etcdb:20000"
    - "http://etcdc:20000"
  prefix: "/apisix"               # apisix configurations prefix
  timeout: 30                     # 30 seconds

plugin_attr:
  prometheus:
    export_addr:
      ip: "0.0.0.0"
      port: 9091

oneto1 • Jul 26 '22 04:07

Before the routes failed, I used apisix-dashboard to modify some route configs about

What does this mean?

From the logs, it does look like APISIX is returning a 404.

Can anyone tell me why the routes disappeared, and why, after they came back, APISIX did not reload the config and kept returning 404 (route not found)?

Are there any related logs about this in error.log?

tzssangglass • Jul 27 '22 02:07

Before the routes failed, I used apisix-dashboard to modify some route configs about

What does this mean?

I performed some operations before the problem happened. They may be related, but they took place earlier.

Are there any related logs about this in error.log?

With the Docker install, all logs are redirected to journald, as in the excerpt I pasted. There are no useful logs.


According to monitoring, the etcd keys did not change at the time of the problem, but the APISIX routes really did fail.

oneto1 • Jul 27 '22 03:07

Are you using the prometheus plugin on global rules?

tzssangglass • Jul 27 '22 08:07

Are you using the prometheus plugin on global rules?

Yes. It has worked fine for about 2 weeks.

oneto1 • Jul 27 '22 08:07

Can you provide the reproduction steps?

tzssangglass • Jul 27 '22 11:07

I will enable info-level logging and test again.

oneto1 • Jul 28 '22 02:07

This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] • Jul 15 '23 10:07

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

github-actions[bot] • Jul 29 '23 10:07