HAProxy 2.9.4 Randomly Crashes
Detailed Description of the Problem
We have recently tried to upgrade to HAProxy 2.9.4. Upon doing so, we noticed that HAProxy randomly crashed roughly once per day. We have not noticed this behavior in the version we are upgrading from, which is 2.8.0.
Expected Behavior
HAProxy should not crash.
Steps to Reproduce the Behavior
We cannot reproduce the behavior.
Do you have any idea what may have caused this?
HAProxy 2.9.4 has only been deployed on two hosts in our staging environment. These hosts receive no traffic except for health check requests from a few clients. To serve each health check request, HAProxy queries the spoa to verify whether it is up. From the stack trace, it seems that some spoa-related cleanup is being done, but we are not sure what is causing the crash or whether the cleanup is related.
Do you have an idea how to solve the issue?
No response
What is your configuration?
global
    log ring@log_buffer len 65535 local3
    stats timeout 2m
    close-spread-time 10m
    hard-stop-after 11m
    no-memory-trimming
    set-dumpable
    strict-limits
    maxconn 1000000
    master-worker
    nbthread 64
    chroot ./chroot_empty
    user nobody
    group nobody
    stats socket ./haproxy.sock mode 770 user app group app level admin

program my_spoa
    command /app/sidecar /app/sidecar.config

ring log_buffer
    description "Buffer for logs"
    server s1 127.0.0.1:2222
    format rfc3164
    maxlen 65535
    size 2048000000
    timeout connect 1s
    timeout server 1s

defaults
    timeout connect 66s
    timeout client 65s
    timeout server 60s
    mode http
    option httpchk GET /admin
    default-server alpn h2,http/1.1
    no option logasap
    log global

resolvers systemdns
    accepted_payload_size 65535
    parse-resolv-conf
    timeout resolve 3s

frontend stats
    bind ipv4@:5555,ipv6@:5555
    stats enable
    stats uri /stats
    stats refresh 10s

frontend public
    bind ipv4@:4444,ipv6@:4444
    bind ipv4@:6789,ipv6@:6789 accept-proxy
    bind ipv4@:7777,ipv6@:7777 ssl crt /app/certs alpn http/1.1
    bind ipv4@:8888,ipv6@:8888 accept-proxy ssl crt /app/certs alpn http/1.1
    option forwardfor
    filter spoe engine primary_spoa config /app/my_spoa.conf
    acl is_admin_endpoint path -i -m str /admin
    acl is_req_from_localhost src 127.0.0.1
    use_backend admin-healthcheck if is_admin_endpoint
    default_backend DefaultBackend

backend DefaultBackend
    server server1 127.0.0.1:2345

backend admin-healthcheck
    filter spoe engine my_spoa config /app/my_spoa.conf
    http-request send-spoe-group my_spoa adminhc
    http-request return status 404 content-type text/plain string BAD if !{ var(txn.my_spoa.adminhc.status) -m str GOOD }
    http-request return status 200 content-type text/plain string GOOD if { str(status),map(/app/hc.map) -i -m str GOOD }
    http-request return status 404 content-type text/plain string BAD
Output of haproxy -vv
HAProxy version 2.9.4-efc6b7-1 2024/01/31 - https://haproxy.org/
Status: stable branch - will stop receiving fixes around Q1 2025.
Known bugs: http://www.haproxy.org/bugs/bugs-2.9.4.html
Running on: Linux 5.15.125.1-2.cm2 #1 SMP Fri Aug 11 02:17:58 UTC 2023 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment
OPTIONS = USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
DEBUG = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS
Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_AWSLC -OPENSSL_WOLFSSL -OT +PCRE -PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=64).
Built with OpenSSL version : OpenSSL 1.1.1k 25 Mar 2021
Running on OpenSSL version : OpenSSL 1.1.1k FIPS 25 Mar 2021
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.4.4
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.45 2021-06-15
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 4.8.5 20150623 (Red Hat 4.8.5-44)
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
Available services : none
Available filters :
[BWLIM] bwlim-in
[BWLIM] bwlim-out
[CACHE] cache
[COMP] compression
[FCGI] fcgi-app
[SPOE] spoe
[TRACE] trace
Last Outputs and Backtraces
We do not see any error messages in our logs. Below is the output of `bt full`. Please see the attached file in the next section for the full output of `t a a bt full`.
warning: Can't open file /dev/zero (deleted) during file-backed mapping note processing
[New LWP 3102656]
[New LWP 3102618]
[New LWP 3102619]
[New LWP 3102614]
[New LWP 3102613]
[New LWP 3102643]
[New LWP 3102623]
[New LWP 3102622]
[New LWP 3102615]
[New LWP 3102624]
[New LWP 3102625]
[New LWP 3102626]
[New LWP 3102621]
[New LWP 3102627]
[New LWP 3102628]
[New LWP 3102631]
[New LWP 3102629]
[New LWP 3102620]
[New LWP 3102617]
[New LWP 3102632]
[New LWP 3102633]
[New LWP 3102634]
[New LWP 3102635]
[New LWP 3102637]
[New LWP 3102630]
[New LWP 3102638]
[New LWP 3102640]
[New LWP 3102641]
[New LWP 3102644]
[New LWP 3102645]
[New LWP 3102646]
[New LWP 3102648]
[New LWP 3102647]
[New LWP 3102650]
[New LWP 3102649]
[New LWP 3102651]
[New LWP 3102658]
[New LWP 3102639]
[New LWP 3102653]
[New LWP 3102662]
[New LWP 3102659]
[New LWP 3102654]
[New LWP 3102667]
[New LWP 3102657]
[New LWP 3102655]
[New LWP 3102661]
[New LWP 3102669]
[New LWP 3102670]
[New LWP 3102671]
[New LWP 3102675]
[New LWP 3102672]
[New LWP 3102676]
[New LWP 3102666]
[New LWP 3102616]
[New LWP 3102664]
[New LWP 3102660]
[New LWP 3102674]
[New LWP 3102673]
[New LWP 3102665]
[New LWP 3102663]
[New LWP 3102668]
[New LWP 3102652]
[New LWP 3102642]
[New LWP 3102636]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Program terminated with signal SIGABRT, Aborted.
#0 0x00007bac4e185e6c in __pthread_kill_implementation () from /lib/libc.so.6
[Current thread is 1 (Thread 0x7baba5520640 (LWP 3102656))]
(gdb) bt full
#0 0x00007bac4e185e6c in __pthread_kill_implementation () from /lib/libc.so.6
No symbol table info available.
#1 0x00007bac4e13ad86 in raise () from /lib/libc.so.6
No symbol table info available.
#2 0x00007bac4e1257e5 in abort () from /lib/libc.so.6
No symbol table info available.
#3 0x00000000005b43cd in ha_panic () at src/debug.c:620
old = 0x0
thr = <optimized out>
#4 <signal handler called>
No symbol table info available.
#5 pl_wait_new_long (lock=lock@entry=0x22b3a80, prev=prev@entry=1)
at include/import/plock.h:216
loops = 202 '\312'
m = <optimized out>
curr = 203
#6 0x00000000005d63c1 in pool_flush (pool=0x22b37c0) at src/pool.c:765
next = <optimized out>
temp = <optimized out>
down = <optimized out>
bucket = <optimized out>
#7 0x000000000051fe12 in __task_free (t=<optimized out>)
at include/haproxy/task.h:628
__ptr = <optimized out>
#8 task_destroy (t=0x7baa9005d380) at include/haproxy/task.h:651
No locals.
#9 spoe_release_appctx (appctx=<optimized out>) at src/flt_spoe.c:1300
spoe_appctx = 0x7baa9005d080
agent = 0x11bcdabf0
ctx = <optimized out>
back = <optimized out>
__func__ = "spoe_release_appctx"
#10 0x000000000060f1d1 in appctx_shut (appctx=0x7baa9005d160)
at src/applet.c:361
__FUNCTION__ = "appctx_shut"
#11 0x000000000059272c in sc_app_shut_applet (sc=0x7baa9005d500)
at src/stconn.c:931
ic = <optimized out>
oc = <optimized out>
__func__ = "sc_app_shut_applet"
#12 0x00000000004d8527 in sc_shutdown (sc=0x7baa9005d500)
at include/haproxy/sc_strm.h:448
No locals.
#13 process_stream (t=<optimized out>, context=0x7baa9005d560,
state=<optimized out>) at src/stream.c:2479
srv = <optimized out>
s = 0x7baa9005d560
sess = <optimized out>
scf_flags = 295010
scb_flags = 393251
rqf_last = 8683520
rpf_last = 2147778817
rq_prod_last = 8
rq_cons_last = 10
rp_cons_last = 8
rp_prod_last = 10
req_ana_back = 0
res_ana_back = 0
req = 0x7baa9005d580
res = 0x7baa9005d5c8
scf = <optimized out>
scb = <optimized out>
rate = <optimized out>
resync_stconns = <optimized out>
resync_request = <optimized out>
resync_response = <optimized out>
#14 0x00000000005c05f0 in run_tasks_from_lists (
budgets=budgets@entry=0x7baba54fd330) at src/task.c:632
process = <optimized out>
tl_queues = <optimized out>
t = 0x7baa9005d920
budget_mask = 15 '\017'
profile_entry = 0x0
done = 0
queue = 1
state = 1028
ctx = <optimized out>
__func__ = "run_tasks_from_lists"
#15 0x00000000005c0dd1 in process_runnable_tasks () at src/task.c:876
tt = 0xb7ce80 <ha_thread_ctx+22016>
lrq = <optimized out>
grq = <optimized out>
t = <optimized out>
max = {0, 33, 0, 0}
max_total = <optimized out>
tmp_list = <optimized out>
queue = <optimized out>
max_processed = <optimized out>
lpicked = <optimized out>
gpicked = <optimized out>
heavy_queued = 1
budget = <optimized out>
not_done_yet = <optimized out>
#16 0x000000000058b362 in run_poll_loop () at src/haproxy.c:3049
next = <optimized out>
wake = <optimized out>
__func__ = "run_poll_loop"
#17 0x000000000058b9a6 in run_thread_poll_loop (data=<optimized out>)
at src/haproxy.c:3251
ptaf = <optimized out>
ptif = 0x93bbb0 <per_thread_init_list>
ptdf = <optimized out>
ptff = <optimized out>
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0,
__nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {
__prev = 0x0, __next = 0x0}},
__size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__lock = 883, __futex = 0, __total_seq = 829,
__wakeup_seq = 0, __woken_seq = 0, __mutex = 0x6c,
__nwaiters = 0, __broadcast_seq = 0},
__size = "s\003\000\000\000\000\000\000=\003", '\000' <repeats 22 times>, "l", '\000' <repeats 14 times>, __align = 883}
#18 0x00007bac4e1841d2 in start_thread () from /lib/libc.so.6
No symbol table info available.
#19 0x00007bac4e2060b0 in clone3 () from /lib/libc.so.6
No symbol table info available.
Additional Information
haproxy -vv of the older version that is stable:
HAProxy version 2.8.0-b2537a-17 2023/12/07 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2028.
Known bugs: http://www.haproxy.org/bugs/bugs-2.8.0.html
Running on: Linux 5.15.125.1-2.cm2 #1 SMP Fri Aug 11 02:17:58 UTC 2023 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment
OPTIONS = USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
DEBUG = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS
Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_SPLICE +LINUX_TPROXY +LUA +MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_WOLFSSL -OT +PCRE -PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION -QUIC +RT +SHM_OPEN +SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL -ZLIB
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=64).
Built with OpenSSL version : OpenSSL 1.1.1k 25 Mar 2021
Running on OpenSSL version : OpenSSL 1.1.1k FIPS 25 Mar 2021
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.4.4
Built with network namespace support.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE version : 8.32 2012-11-30
Running on PCRE version : 8.45 2021-06-15
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 4.8.5 20150623 (Red Hat 4.8.5-44)
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
Available services : none
Available filters :
[BWLIM] bwlim-in
[BWLIM] bwlim-out
[CACHE] cache
[COMP] compression
[FCGI] fcgi-app
[SPOE] spoe
[TRACE] trace
We have applied one local patch in 2.9.4. That patch is currently under discussion on the mailing list here.
Finally, we also tried to reproduce the crash by sending the health check requests mentioned above, but that does not trigger the crash.

Attachment: gdb.txt
OK, so apparently the process is crashing because ha_panic() is being called as a result of thread contention.
According to your trace and version, nearly all threads are stuck here (45 out of 64):
#1 0x00000000005d63c1 in pool_flush (pool=0x22b37c0) at src/pool.c:765
next = <optimized out>
temp = <optimized out>
down = <optimized out>
bucket = <optimized out>
#2 0x00000000004d9d00 in __task_free (t=<optimized out>) at include/haproxy/task.h:628
__ptr = <optimized out>
Which corresponds to this:
static inline void __task_free(struct task *t)
{
    if (t == th_ctx->current) {
        th_ctx->current = NULL;
        __ha_barrier_store();
    }
    BUG_ON(task_in_wq(t) || task_in_rq(t));
    BUG_ON((ulong)t->caller & 1);
#ifdef DEBUG_TASK
    HA_ATOMIC_STORE(&t->debug.prev_caller, HA_ATOMIC_LOAD(&t->caller));
#endif
    HA_ATOMIC_STORE(&t->caller, (void*)1); // make sure to crash if used after free
    pool_free(pool_head_task, t);
    th_ctx->nb_tasks--;
    if (unlikely(stopping))
        pool_flush(pool_head_task); /* ======= line 628: STUCK HERE */
}
So this would suggest that the contention occurs during soft-stop. Do you know if your haproxy process is restarted, either manually or from a script, every once in a while? You mentioned that it was crashing roughly once a day, so maybe this coincides with a restart performed once per day?

Now, to explain why the crash occurs in 2.9.4 and not on the last 2.8 version you tested (which exact 2.8 version was it, by the way?): maybe this has something to do with 72c23bd4cd3ee3a79955fbdd601206a9fa7d19eb (the most recent spoe code change), or with the recent changes to the pools code that introduced multiple buckets in 2.9 to reduce contention.

Perhaps in this case some spoe resources are slowly piling up during process runtime, and when soft-stop occurs, all those resources are scheduled for cleanup at once and there is too much for haproxy to keep up with?
We were able to somewhat reproduce the crash by restarting haproxy and observing the crash during shutdown. Not every restart caused the crash, though:
- If I restart again within less than 30 minutes of a previous restart, there is no crash. But if I restart after the process has been running for a couple of hours, it crashes.
- Hosts without the spoa do not seem to crash on restart, but hosts with the spoa did crash on restart after running for a couple of hours. So the pile-up seems related to the spoa being there.

We plan to run this spoa at a much higher request rate, so this issue is quite concerning.
> Perhaps in this case some spoe resources are slowly piling up during process runtime
I am wondering why we would have such an "excessive" pile-up that haproxy cannot keep up with, and what we are doing wrong. Do you have any suggestions for things we could try to understand the pile-up, e.g. any commands we can run before restarting, so that we can apply the right fix?

I am comparing the `show pools` output of two hosts, one without the spoa and one with it; I can see that some pools differ significantly, but I am not sure at what point the build-up can be called "excessive".
There's something really odd here. It's not expected to spend that much time on a single pool entry, particularly in this version, where entries are sharded across buckets precisely to reduce contention.
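For reference, the idea behind that sharding looks roughly like this (a sketch only, not the actual pool code: C11 atomics stand in for the internal macros, and the bucket count and hash are made up here):

#include <stdatomic.h>

#define NB_BUCKETS 64 /* made-up value; the real count is set at build time */

/* One lockless LIFO head per bucket instead of a single shared head. */
struct pool_shard {
    _Atomic(void *) free_list;
};

static struct pool_shard buckets[NB_BUCKETS];

/* Each thread mostly touches "its" bucket, so concurrent pool_free()
 * and pool_get() calls rarely collide on the same atomic head.
 */
static inline struct pool_shard *my_shard(unsigned int thread_id)
{
    return &buckets[thread_id % NB_BUCKETS];
}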
I'm just wondering if we don't have a bug here in pool_flush() when multiple threads call it at once. Indeed, the `do { ... } while ((ret = xchg(BUSY)) == BUSY)` loop may occasionally replace a NULL with a BUSY if the value changes between the test and the xchg() call. But the following test replaces it with NULL only if ret is not NULL, so unless I'm missing something, we may occasionally leave a BUSY in place of a NULL. This has been possible since commit 2a4523f6f ("BUG/MAJOR: pools: fix possible race with free() in the lockless variant"), merged in 2.5.
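To make the suspected pattern concrete, here is a minimal model of it (a sketch only: C11 atomics stand in for the internal plock/atomic macros, and all the real pool logic is omitted):

#include <stdatomic.h>
#include <stddef.h>

#define BUSY ((void *)1) /* sentinel meaning "list locked for flushing" */

static _Atomic(void *) free_list; /* head of a lockless free list */

void flush_model(void)
{
    void *ret;

    if (atomic_load(&free_list) == NULL) /* looks empty: nothing to flush */
        return;

    /* Between the test above and the exchange below, another thread may
     * empty the list. The exchange then stores BUSY over NULL and hands
     * us ret == NULL.
     */
    while ((ret = atomic_exchange(&free_list, BUSY)) == BUSY)
        ; /* spin while another thread is flushing */

    /* The unlock below only happens when ret is non-NULL, so a NULL
     * grabbed above leaves BUSY in place forever: every thread that
     * later reaches the spin loop above (or polls the head on the
     * allocation path) waits indefinitely -- matching the stuck
     * pl_wait_new_long() frames in the backtrace.
     */
    if (ret != NULL) {
        /* ... walk 'ret' and release its items ... */
        atomic_store(&free_list, NULL); /* unlock */
    }
}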
I need to study this more carefully to make sure I'm not missing anything. As Aurélien mentioned, it's possible that a recent change puts a bit more stress on some pool_free() calls and triggers the issue more easily.
I'll likely merge the attached patch; you may want to try applying it.

Attachment: pool.diff.txt
OK, that was it: I could also reproduce it by modifying the code, and I can confirm the fix works. I've merged it now and will backport it. In the meantime, you can work around this by disabling the global pools (they are usually disabled, except on some rare distros where it's unknown whether malloc() is fast). For this, just add `-dMno-global` on the command line and you won't hit that bug anymore.
Hey @wtarreau and @Darlelet, I wanted to confirm that the fix is indeed working, and to thank you for the time and effort you put into it, as well as for the quick response. We really appreciate it.