Memory leak when using EventSource ping interval

Open vbogolepov opened this issue 1 year ago • 3 comments

We decided to try the EventSource feature (nchan_eventsource_ping_interval and nchan_eventsource_ping_data parameters) to periodically publish ping messages but found a memory leak.

The issue is present in all versions from v1.2.7 to v1.3.6. We used the following config files for nginx: nginx_config.zip

To subscribe to publications we used SSEClient in Python: eventsubscriber.zip

Usage example:

    python3.8 ./eventsubscriber.py -u http://192.168.1.36:8083/eventing/subscribe

To publish data we used curl, for example:

    while true; do curl -d "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789" -X POST http://192.168.1.36:8083/eventing/publish; sleep 1; done
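Since eventsubscriber.py is only attached as a zip, here is a minimal stand-in sketch of an EventSource subscriber using only the Python standard library. It is not the attached script: the names `parse_sse` and `subscribe` are made up for illustration, and the `Accept: text/event-stream` header is an assumption about how the subscribe location selects the EventSource subscriber type. Unlike a browser, this simplified parser also yields events whose data is empty, so that pings are visible.

```python
import urllib.request
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per Server-Sent Event from an iterable of text lines.

    Simplified on purpose: events with empty data are yielded too.
    """
    event: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event or data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, ignored
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "data":
            data.append(value)
        else:
            event[field] = value


def subscribe(url: str) -> None:
    """Print every event received from url until the connection closes."""
    req = urllib.request.Request(url, headers={"Accept": "text/event-stream"})
    with urllib.request.urlopen(req) as resp:
        for ev in parse_sse(chunk.decode("utf-8") for chunk in resp):
            print(ev)
```

Calling `subscribe("http://192.168.1.36:8083/eventing/subscribe")` would then print each published message and each ping as it arrives.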

For nginx release-1.25.2.tar.gz and nchan-1.3.6.tar.gz, the following configure parameters were used:

    auto/configure --with-cc-opt="-O2 -g --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2" --with-ld-opt="-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,--as-needed" --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-http_mp4_module --with-http_perl_module --with-http_random_index_module --with-http_xslt_module --with-http_geoip_module --with-mail --with-mail_ssl_module --with-http_v2_module --with-threads --with-stream --with-stream_ssl_module --with-http_slice_module --with-debug --with-pcre-jit --add-module="nchan-1.3.6" && make -j8

OS and compiler version:
OS: Ubuntu 20.04.6 LTS
gcc: 9.4.0

The issue also reproduces on MIPS architecture with gcc 7.5.0.

We used the procrank utility to detect the memory leak.

If the EventSource ping interval feature is not used, there is no memory leak. To disable it, remove the following lines from the eventing.conf nginx config file:

    nchan_eventsource_ping_interval 1;
    nchan_eventsource_ping_data '';

vbogolepov, Sep 29 '23 06:09

I'm facing the same issue, but I need this feature to keep clients from being disconnected.

WebSocket ping/pong doesn't work.

hrithwikbharadwaj, Nov 25 '23 06:11

I don't see a memory leak, I see more buffers being used correctly. Procrank shows memory usage, not necessarily leaks.

I would consider this an issue if you are seeing steady increase in memory usage for a fixed number of subscribers after 30 mins or so. Please let me know if this is what you are observing.

slact, Feb 26 '24 19:02

> I don't see a memory leak, I see more buffers being used correctly. Procrank shows memory usage, not necessarily leaks.
>
> I would consider this an issue if you are seeing steady increase in memory usage for a fixed number of subscribers after 30 mins or so. Please let me know if this is what you are observing.

Thanks, we will try to implement a test that will use a fixed number of subscribers for 30 minutes.
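A minimal sketch of how such a test could record memory, assuming a Linux host where `/proc/<pid>/status` exposes `VmRSS`. The function names, the 10-second sampling interval, and the 90% "mostly rising" threshold are all illustrative assumptions, not a criterion agreed on in this thread. Note that it samples RSS, whereas procrank also reports PSS and USS.

```python
import re
import time
from typing import List, Optional


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    m = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process, in kB."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kb(f.read())


def grows_steadily(samples: List[int], min_fraction: float = 0.9) -> bool:
    """Heuristic: the last sample exceeds the first, and at least
    min_fraction of consecutive steps are increases."""
    if len(samples) < 2:
        return False
    steps = list(zip(samples, samples[1:]))
    rising = sum(1 for a, b in steps if b > a)
    return samples[-1] > samples[0] and rising / len(steps) >= min_fraction


def monitor(pid: int, duration_s: int = 1800, interval_s: int = 10) -> List[int]:
    """Sample the RSS of one nginx worker for duration_s seconds."""
    samples: List[int] = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        rss = read_rss_kb(pid)
        if rss is not None:
            samples.append(rss)
        time.sleep(interval_s)
    return samples
```

With the number of subscribers held fixed, `grows_steadily(monitor(worker_pid))` returning `True` after 30 minutes would point toward a leak, while a curve that rises and then plateaus would fit the "more buffers being used" explanation.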

We think the following lines are suspicious: https://github.com/slact/nchan/compare/v1.2.6...v1.2.7#diff-7fda0fab9fa848edb729abc03d7218e1d489812a0a27d0d7f23f6a08a4ed2466R335

For the ping chain, the code sets last_in_chain to 1 and does not explicitly set last_buf to 0 or 1. For es_respond_message, by contrast, last_in_chain is not explicitly set to 0 or 1, while last_buf is explicitly set to 0. So the chains handed to the output may be terminated differently, which could support the theory that data that comes after the ping is not freed.

These chains are nginx types, and we guess nginx decides when to free the buffers allocated on them. Maybe there is a bug in the version of nginx we use, or nchan is just using it incorrectly for the pings. @slact, could you please take a look at the diff described above?

vbogolepov, Apr 10 '24 18:04