nginx-module-vts

shared memory zone "ngx_http_vhost_traffic_status" was locked

Open takakawa opened this issue 7 years ago • 2 comments

Hello, I ran into the following issue:

2018/10/31 15:07:38 [alert] 11345#0: shared memory zone "ngx_http_vhost_traffic_status" was locked by 12464
2018/10/31 15:07:39 [alert] 11345#0: worker process 12471 exited on signal 11
2018/10/31 15:07:39 [alert] 11345#0: shared memory zone "ngx_http_vhost_traffic_status" was locked by 12471
2018/10/31 15:07:40 [alert] 11345#0: worker process 12479 exited on signal 11
2018/10/31 15:07:40 [alert] 11345#0: shared memory zone "ngx_http_vhost_traffic_status" was locked by 12479
2018/10/31 15:07:41 [alert] 11345#0: worker process 12455 exited on signal 11
2018/10/31 15:07:41 [alert] 11345#0: shared memory zone "ngx_http_vhost_traffic_status" was locked by 12455

My nginx version is:

nginx version: nginx/1.14.0
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)
built with OpenSSL 1.0.2k-fips 26 Jan 2017
TLS SNI support enabled
configure arguments: --prefix=/app/nginx --sbin-path=/app/nginx/sbin/ --conf-path=/app/nginx/etc/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_v2_module --with-stream --with-stream_ssl_module --add-module=../modules/nginx_upstream_check_module-master/ --with-cc-opt='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic' --with-ld-opt=-Wl,-rpath,/usr/lib64 --add-module=../modules/ngx_devel_kit-0.3.0 --add-module=../modules/lua-nginx-module-0.10.9 --add-module=../modules/ngx_postgres-1.0 --add-module=../modules/nginx-sflow-module-release-0.9.11 --add-module=../modules/ngx_txid-f1c197c --add-module=../modules/nginx-module-vts-0.1.18 --add-module=../modules/ngx_dynamic_upstream-0.1.6

takakawa, Oct 31 '18 07:10

There is no core dump file, so I debugged nginx and found:

Program received signal SIGSEGV, Segmentation fault.
ngx_http_vhost_traffic_status_display_set_upstream_group (r=r@entry=0xae6b90, buf=0x7fffd16960ac "",
    buf@entry=0x7fffd168949a "\"abtest_lb_core\":[{\"server\":\"10.191.161.129:18223\",\"requestCounter\":0,\"inBytes\":0,\"outBytes\":0,\"responses\":{\"1xx\":0,\"2xx\":0,\"3xx\":0,\"4xx\":0,\"5xx\":0},\"requestMsecCounter\":0,\"requestMsec\":0,\"requestMsec"...) at ../modules/nginx-module-vts-0.1.18/src/ngx_http_vhost_traffic_status_display_json.c:622
622	                p = ngx_cpymem(p, us[j].addrs->name.data, us[j].addrs->name.len);

I printed some of the local variables, and it seems that the following upstream block may be causing the problem, but I don't know why:

upstream psql_putong_core  {
  postgres_server  127.0.0.1:6432 dbname=test user=test password=test;
}

takakawa, Nov 12 '18 10:11

@takakawa Does it still happen with the latest VTS version and nginx (>1.18.0)?

vozlt, Jan 25 '21 14:01

I have the same problem.

OS: Debian 11
nginx: 1.24.0
mod-vts: 0.2.2
mod-postgres: 1.0

This happens whenever a mod-postgres upstream block is present in the configuration:

http {
    geoip_country /usr/share/GeoIP/GeoIP.dat;
    vhost_traffic_status_zone;
    vhost_traffic_status_dump /var/lib/nginx/vhost-traffic-status/db;
    vhost_traffic_status_filter_by_set_key $geoip_country_code country::*;
  
    upstream pgdb {
        postgres_server 127.0.0.1 dbname=dbname user=dbuser password=dbpw;
    }

    server {
        location /status {
            vhost_traffic_status_bypass_stats on;
            vhost_traffic_status_display;
            vhost_traffic_status_display_format html;
        }
    }

}

With this configuration, every request to /status/format/json fails, and the error log shows the following entries on each request:

2023/05/31 23:29:22 [alert] 2804783#2804783: worker process 2804784 exited on signal 11
2023/05/31 23:29:22 [alert] 2804783#2804783: shared memory zone "ngx_http_vhost_traffic_status" was locked by 2804784
2023/05/31 23:29:22 [alert] 2804783#2804783: worker process 2804805 exited on signal 11
2023/05/31 23:29:22 [alert] 2804783#2804783: shared memory zone "ngx_http_vhost_traffic_status" was locked by 2804805
2023/05/31 23:29:23 [alert] 2804783#2804783: worker process 2804822 exited on signal 11
2023/05/31 23:29:23 [alert] 2804783#2804783: shared memory zone "ngx_http_vhost_traffic_status" was locked by 2804822

With GDB I was able to get some information:

(gdb) backtrace full
#0  memcpy (__len=<error reading variable: Cannot access memory at address 0x1548>, __src=<error reading variable: Cannot access memory at address 0x1550>, __dest=0x557f7d7c2136) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:34
No locals.
#1  ngx_http_vhost_traffic_status_display_set_upstream_group (r=r@entry=0x557f7d7c01c0, buf=0x557f7d7cd55e "", buf@entry=0x557f7d7cd551 "\"zsysadmin\":[") at ././src/ngx_http_vhost_traffic_status_display_json.c:654
        len = <optimized out>
        p = 0x557f7d7c2136 "\376\a\235\363\326\315\b\245|JRY\320\375\313EF\264\303\027$\366\271Z\256\230\266\266\t\222\036\230\230s"
        o = 0x557f7d7cd551 "\"zsysadmin\":["
        s = 0x557f7d7cd55e ""
        hash = <optimized out>
        type = 2
        zone = <optimized out>
        rc = <optimized out>
        key = {len = 94006054450067, data = 0x557f7bf1fd91 <ngx_sprintf+161> "H\213T$\030dH+\024%("}
        dst = {len = 80, data = 0x557f7d7c212c "zsysadmin\037\376\a\235\363\326\315\b\245|JRY\320\375\313EF\264\303\027$\366\271Z\256\230\266\266\t\222\036\230\230s"}
        i = 0
        j = 0
        k = 0
        node = <optimized out>
        us = 0x7ffd05254660
        usn = {name = {len = 94006053300728, data = 0x1 <error: Cannot access memory at address 0x1>}, addrs = 0x1538, naddrs = 9, weight = 94006053300353, max_conns = 9, max_fails = 94006053300368, fail_timeout = 9, slow_start = 94006053300387, down = 0, 
          backup = 0, spare = {0, 0, 0, 0, 0, 0}}
        peer = <optimized out>
        peers = <optimized out>
        uscf = 0x557f7d695090
        uscfp = 0x557f7d718bf8
        umcf = 0x557f7d684168
        ctx = 0x557f7d685f78
        vtsn = <optimized out>
#2  0x00007f4a3cb4086f in ngx_http_vhost_traffic_status_display_set (r=r@entry=0x557f7d7c01c0, buf=0x557f7d7cd551 "\"zsysadmin\":[") at ././src/ngx_http_vhost_traffic_status_display_json.c:867
        o = 0x557f7d7cd540 "\"upstreamZones\":{\"zsysadmin\":["
        s = 0x557f7d7cd551 "\"zsysadmin\":["
        node = 0x7f4a391ae000
        ctx = <optimized out>
        vtscf = 0x557f7d6ce660
#3  0x00007f4a3cb3e255 in ngx_http_vhost_traffic_status_display_handler_default (r=0x557f7d7c01c0) at ././src/ngx_http_vhost_traffic_status_display.c:405
        len = <optimized out>
        size = <optimized out>
        rc = <optimized out>
        b = 0x557f7d7c1a30
        out = {buf = 0x0, next = 0x557f7d6cf978}
        shpool = 0x7f4a391aa000
        o = <optimized out>
        s = <optimized out>
        type = {len = <optimized out>, data = <optimized out>}
        ctx = <optimized out>
        p = <optimized out>
        uri = {len = <optimized out>, data = <optimized out>}
        euri = {len = 94006053490576, data = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff>}
        format = <optimized out>
        vtscf = 0x557f7d6ce660
        len = <optimized out>
        o = <optimized out>
        s = <optimized out>
        p = <optimized out>
        uri = {len = <optimized out>, data = <optimized out>}
        euri = {len = <optimized out>, data = <optimized out>}
        type = {len = <optimized out>, data = <optimized out>}
        size = <optimized out>

And here is the content of uscf:

(gdb) p *uscf
$4 = {peer = {init_upstream = 0x7f4a3cb2aa00 <ngx_postgres_upstream_init>, init = 0x7f4a3cb2a860 <ngx_postgres_upstream_init_peer>, data = 0x0}, srv_conf = 0x557f7d695110, servers = 0x557f7d69e8b0, flags = 319, host = {len = 9, data = 0x557f7d69507f "pgdb"}, 
  file_name = 0x557f7d694f77 "/etc/nginx/conf.d/pgdb.conf", line = 1, port = 0, no_port = 1, shm_zone = 0x0}

Does this help to solve the issue?

zsalab, May 31 '23 23:05
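
A possible reading of the backtrace above, offered as a sketch and not a confirmed diagnosis: the bogus `addrs = 0x1538` in `usn` is 5432 in decimal, the default PostgreSQL port, and `naddrs`, `max_conns` and `fail_timeout` are all 9, the length of "zsysadmin". That pattern would be consistent with mod-postgres storing its own, differently shaped server entries in `uscf->servers`, which VTS then reads as stock upstream server entries. The C sketch below only checks the pointer arithmetic. The struct types are simplified stand-ins written for this note (they are assumptions, not the real nginx or mod-postgres definitions), and the offsets assume an x86-64 Linux build.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for illustration only; NOT the real nginx types. */
typedef struct { size_t len; unsigned char *data; } str_t;   /* shaped like ngx_str_t */
typedef struct {                                             /* shaped like ngx_addr_t */
    void     *sockaddr;
    uint32_t  socklen;
    str_t     name;
} addr_t;

/* Assumed start of the entry VTS expects in uscf->servers. */
typedef struct { str_t name; addr_t *addrs; size_t naddrs; } http_server_t;

/* Assumed start of the entry mod-postgres stores there instead
 * (inferred from the gdb values, not taken from its source). */
typedef struct { addr_t *addrs; size_t naddrs; uint16_t port; } pg_server_t;

/* Address dereferenced for us[j].addrs->name.len when addrs is bogus. */
uintptr_t fault_addr_len(uintptr_t bogus_addrs)
{
    return bogus_addrs + offsetof(addr_t, name) + offsetof(str_t, len);
}

/* Address dereferenced for us[j].addrs->name.data when addrs is bogus. */
uintptr_t fault_addr_data(uintptr_t bogus_addrs)
{
    return bogus_addrs + offsetof(addr_t, name) + offsetof(str_t, data);
}

/* Print the two addresses memcpy would try to read from. */
void print_fault_addresses(uintptr_t bogus_addrs)
{
    printf("name.len  read from %#lx\n", (unsigned long) fault_addr_len(bogus_addrs));
    printf("name.data read from %#lx\n", (unsigned long) fault_addr_data(bogus_addrs));
}
```

Under these assumptions, `offsetof(http_server_t, addrs)` and `offsetof(pg_server_t, port)` are both 16, so the port value lands exactly where VTS looks for the `addrs` pointer. Calling `print_fault_addresses(0x1538)` on x86-64 Linux gives 0x1548 and 0x1550, the same two addresses gdb reports as inaccessible for `__len` and `__src` in frame #0.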