haproxy
Segfault in Lua / "in_table"
Detailed Description of the Problem
After upgrading from 1.8 to 2.4 we've observed occasional segfaults. After some time we've now finally gotten a coredump.
The crash seems to happen in our Lua code with the call to txn.c:in_table.
I can also reproduce this crash easily with the latest commit on the development branch (2.7-dev5-439be5-83 2022/09/12). With 1.8 I cannot reproduce the crash.
Luckily I was able to come up with a minimal configuration which allows me to reproduce this crash easily (see below).
Expected Behavior
No segmentation fault.
Steps to Reproduce the Behavior
Start HAProxy with the config below and send any request:
$ curl http://localhost:8080
curl: (52) Empty reply from server
Do you have any idea what may have caused this?
No response
Do you have an idea how to solve the issue?
No response
What is your configuration?
# crash.lua:
core.register_action("crash", { "http-req" }, function(txn)
    txn.c:in_table("test")
end);

# haproxy.cfg:
global
    lua-load crash.lua

frontend main
    mode http
    bind *:8080
    stick-table type string len 50 size 1000 expire 1m
    http-request lua.crash
Output of haproxy -vv
Linux name 5.15.0-47-generic #51-Ubuntu SMP Thu Aug 11 07:51:15 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
HAProxy version 2.7-dev5-439be5-83 2022/09/12 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: Linux 5.15.0-47-generic #51-Ubuntu SMP Thu Aug 11 07:51:15 UTC 2022 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -O0 -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment
OPTIONS = USE_LUA=1
DEBUG = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS
Feature list : +EPOLL -KQUEUE +NETFILTER -PCRE -PCRE_JIT -PCRE2 -PCRE2_JIT +POLL +THREAD -PTHREAD_EMULATION +BACKTRACE -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -ENGINE +GETADDRINFO -OPENSSL +LUA +ACCEPT4 -CLOSEFROM -ZLIB +SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER +PRCTL -PROCCTL +THREAD_DUMP -EVPORTS -OT -QUIC -PROMEX -MEMORY_PROFILING
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=8).
Built with Lua version : Lua 5.4.4
Built with network namespace support.
Support for malloc_trim() is enabled.
Built with libslz for stateless compression.
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built without PCRE or PCRE2 support (using libc's regex instead)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 11.2.0
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
Available services : none
Available filters :
[BWLIM] bwlim-in
[BWLIM] bwlim-out
[CACHE] cache
[COMP] compression
[FCGI] fcgi-app
[SPOE] spoe
[TRACE] trace
Last Outputs and Backtraces
# Last output:
[NOTICE] (423503) : haproxy version is 2.7-dev5-439be5-83
[NOTICE] (423503) : path to executable is ./haproxy
[WARNING] (423503) : config : missing timeouts for frontend 'main'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
Segmentation fault (core dumped)
# gdb:
$ gdb haproxy core.haproxy.423503
GNU gdb (Ubuntu 12.0.90-0ubuntu1) 12.0.90
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from haproxy...
[New LWP 423503]
[New LWP 423508]
[New LWP 423505]
[New LWP 423509]
[New LWP 423506]
[New LWP 423507]
[New LWP 423510]
[New LWP 423511]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./haproxy -f haproxy2.cfg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000056551966d714 in sample_convert (sample=0x7fff1fe0c840, req_type=446250792) at include/haproxy/sample.h:70
70 if (!sample_casts[sample->data.type][req_type])
[Current thread is 1 (Thread 0x7f2dd40e2f00 (LWP 423503))]
(gdb) t a a bt full
Thread 8 (Thread 0x7f2dc27fc640 (LWP 423511)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=24, events=0x7f2db0025660, maxevents=200, timeout=369) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=860791628, wake=0) at src/ev_epoll.c:232
timeout = 369
status = -1031813568
fd = -1
count = 22101
updt_idx = 0
wait_time = 369
old_fd = -1
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 860791628
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e4c0 <ha_thread_info+448>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x0
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834513409600, 0, 139834809526352, 140733728215296, -3671767599692512376, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 7 (Thread 0x7f2dc2ffd640 (LWP 423510)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=27, events=0x7f2da4025660, maxevents=200, timeout=60000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=0, wake=0) at src/ev_epoll.c:232
timeout = 60000
status = -1023420864
fd = -1
count = 0
updt_idx = 2
wait_time = 60000
old_fd = 4
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 0
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e480 <ha_thread_info+384>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x0
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834521802304, 0, 139834809526352, 140733728215296, -3671766501791497336, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 6 (Thread 0x7f2dd0f91640 (LWP 423507)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=13, events=0x7f2db4025660, maxevents=200, timeout=60000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=0, wake=0) at src/ev_epoll.c:232
timeout = 60000
status = -788982208
fd = -1
count = 0
updt_idx = 2
wait_time = 60000
old_fd = 4
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 0
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e3c0 <ha_thread_info+192>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x0
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834756240960, 0, 139834809526352, 140733728215296, -3671726874275740792, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 5 (Thread 0x7f2dd1792640 (LWP 423506)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=10, events=0x7f2dbc025660, maxevents=200, timeout=60000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=0, wake=0) at src/ev_epoll.c:232
timeout = 60000
status = -780589504
fd = -1
count = 0
updt_idx = 2
wait_time = 60000
old_fd = 4
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 0
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e380 <ha_thread_info+128>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x0
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834764633664, 0, 139834809526352, 140733728215296, -3671730172273753208, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 4 (Thread 0x7f2dc37fe640 (LWP 423509)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=21, events=0x7f2dac025660, maxevents=200, timeout=60000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=0, wake=0) at src/ev_epoll.c:232
timeout = 60000
status = -1015028160
fd = -1
count = 0
updt_idx = 2
wait_time = 60000
old_fd = 4
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 0
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e440 <ha_thread_info+320>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x0
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834530195008, 0, 139834809526352, 140733728215296, -3671769799789509752, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 3 (Thread 0x7f2dd40d7640 (LWP 423505)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=7, events=0x7f2dcc0266a0, maxevents=200, timeout=60000) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=0, wake=0) at src/ev_epoll.c:232
timeout = 60000
status = -737315264
fd = -1
count = 0
updt_idx = 2
wait_time = 60000
old_fd = 4
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 0
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e340 <ha_thread_info+64>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x7f2dd4262850 <start_thread>
ptff = 0x7fff1fe0d0a0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834807907904, 0, 139834809526352, 140733728215296, -3671719893880142968, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 2 (Thread 0x7f2dc3fff640 (LWP 423508)):
#0 0x00007f2dd42f3fde in epoll_wait (epfd=16, events=0x7f2db8025660, maxevents=200, timeout=369) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
sc_ret = -4
sc_cancel_oldtype = 0
sc_ret = <optimized out>
#1 0x0000565519579b65 in _do_poll (p=0x565519920800 <cur_poller>, exp=860791628, wake=0) at src/ev_epoll.c:232
timeout = 369
status = -1006635456
fd = -1
count = 22101
updt_idx = 0
wait_time = 369
old_fd = -1
#2 0x00005655197467da in run_poll_loop () at src/haproxy.c:2878
next = 860791628
wake = 0
__func__ = <optimized out>
#3 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e400 <ha_thread_info+256>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x0
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#4 0x00007f2dd4262b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140733728214944, 3697673187556459400, 139834538587712, 0, 139834809526352, 140733728215296, -3671768697593527416, -3671719712237550712}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#5 0x00007f2dd42f4a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
Thread 1 (Thread 0x7f2dd40e2f00 (LWP 423503)):
#0 0x000056551966d714 in sample_convert (sample=0x7fff1fe0c840, req_type=446250792) at include/haproxy/sample.h:70
No locals.
#1 0x0000565519670261 in smp_to_stkey (smp=0x7fff1fe0c840, t=0x56551a993e80) at src/stick_table.c:1028
No locals.
#2 0x00005655196707cc in sample_conv_in_table (arg_p=0x7fff1fe0c8e0, smp=0x7fff1fe0c840, private=0x0) at src/stick_table.c:1236
t = 0x56551a993e80
key = 0x56551a993e80
ts = 0x7fff1fe0c840
#3 0x0000565519588475 in hlua_run_sample_conv (L=0x56551ab05058) at src/hlua.c:4282
hsmp = 0x56551ab05718
conv = 0x565519910680 <sample_conv_kws+16>
args = {{type = 11 '\v', unresolved = 0 '\000', type_flags = 0 '\000', data = {sint = 94923518459520, str = {size = 94923518459520, area = 0x0, data = 0, head = 0}, ipv4 = {s_addr = 446250624}, ipv6 = {__in6_u = {__u6_addr8 = "\200>\231\032UV\000\000\000\000\000\000\000\000\000", __u6_addr16 = {16000, 6809, 22101, 0, 0, 0, 0, 0}, __u6_addr32 = {446250624, 22101, 0, 0}}}, prx = 0x56551a993e80, srv = 0x56551a993e80, t = 0x56551a993e80, usr = 0x56551a993e80, map = 0x56551a993e80, reg = 0x56551a993e80, fid = {ids = 0x56551a993e80, sz = 0}, var = {name_hash = 94923518459520, scope = SCOPE_SESS}, ptr = 0x56551a993e80}}, {type = 0 '\000', unresolved = 0 '\000', type_flags = 0 '\000', data = {sint = 0, str = {size = 0, area = 0x0, data = 0, head = 0}, ipv4 = {s_addr = 0}, ipv6 = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, prx = 0x0, srv = 0x0, t = 0x0, usr = 0x0, map = 0x0, reg = 0x0, fid = {ids = 0x0, sz = 0}, var = {name_hash = 0, scope = SCOPE_SESS}, ptr = 0x0}} <repeats 12 times>}
i = 0
smp = {flags = 128, data = {type = 6, u = {sint = 5, ipv4 = {s_addr = 5}, ipv6 = {__in6_u = {__u6_addr8 = "\005\000\000\000\000\000\000\000\330\"\231\032UV\000", __u6_addr16 = {5, 0, 0, 0, 8920, 6809, 22101, 0}, __u6_addr32 = {5, 0, 446243544, 22101}}}, str = {size = 5, area = 0x56551a9922d8 "test", data = 4, head = 0}, meth = {meth = HTTP_METH_DELETE, str = {size = 94923518452440, area = 0x4 <error: Cannot access memory at address 0x4>, data = 0, head = 0}}}}, ctx = {p = 0x0, i = 0, ll = 0, d = 0, a = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, px = 0x56551a993e80, sess = 0x56551ab04680, strm = 0x56551ab049b0, opt = 0}
#4 0x00007f2dd4406b95 in ?? () from /lib/x86_64-linux-gnu/liblua5.4.so.0
No symbol table info available.
#5 0x00007f2dd4413702 in ?? () from /lib/x86_64-linux-gnu/liblua5.4.so.0
No symbol table info available.
#6 0x00007f2dd44029a3 in ?? () from /lib/x86_64-linux-gnu/liblua5.4.so.0
No symbol table info available.
#7 0x00007f2dd4403e00 in lua_resume () from /lib/x86_64-linux-gnu/liblua5.4.so.0
No symbol table info available.
#8 0x0000565519581e21 in hlua_ctx_resume (lua=0x56551ab04fa0, yield_allowed=1) at src/hlua.c:1456
nres = 447760816
ret = 22101
msg = 0x56551ab05058 "pƠ\032UV"
trace = 0x7fff1fe0d588 "\225\362\340\037\377\177"
__func__ = <optimized out>
#9 0x0000565519595ad7 in hlua_action (rule=0x56551a991da0, px=0x56551a993e80, sess=0x56551ab04680, s=0x56551ab049b0, flags=2) at src/hlua.c:9155
arg = 0x56551a991e80
hflags = 32
dir = 0
act_ret = 0
error = 0x1000000010000 <error: Cannot access memory at address 0x1000000010000>
#10 0x000056551964ded5 in http_req_get_intercept_rule (px=0x56551a993e80, def_rules=0x0, rules=0x56551a993ec8, s=0x56551ab049b0) at src/http_ana.c:2840
sess = 0x56551ab04680
txn = 0x56551ab04eb0
rule = 0x56551a991da0
rule_ret = HTTP_RULE_RES_CONT
act_opts = 2
#11 0x0000565519646fb5 in http_process_req_common (s=0x56551ab049b0, req=0x56551ab049d0, an_bit=16, px=0x56551a993e80) at src/http_ana.c:379
def_rules = 0x0
rules = 0x56551a993ec8
sess = 0x56551ab04680
txn = 0x56551ab04eb0
msg = 0x56551ab04ec0
htx = 0x56551a9ffef0
rule = 0x56551a9fbe40
verdict = 22101
conn = 0x7f2db0026080
#12 0x0000565519628210 in process_stream (t=0x56551ab045d0, context=0x56551ab049b0, state=260) at src/stream.c:1998
max_loops = 199
ana_list = 48
ana_back = 48
flags = 1078984706
srv = 0x0
s = 0x56551ab049b0
sess = 0x56551ab04680
rqf_last = 1077936128
rpf_last = 2147483648
rq_prod_last = 8
rq_cons_last = 0
rp_cons_last = 8
rp_prod_last = 0
req_ana_back = 0
req = 0x56551ab049d0
res = 0x56551ab04a30
scf = 0x56551aae2d80
scb = 0x56551ab04df0
rate = 0
#13 0x00005655197995c2 in run_tasks_from_lists (budgets=0x7fff1fe0d250) at src/task.c:626
process = 0x565519626e9c <process_stream>
tl_queues = 0x565519b82770 <ha_thread_ctx+112>
t = 0x56551ab045d0
budget_mask = 15 '\017'
profile_entry = 0x0
done = 0
queue = 1
state = 260
ctx = 0x56551ab049b0
__func__ = <optimized out>
#14 0x0000565519799f85 in process_runnable_tasks () at src/task.c:855
tt = 0x565519b82700 <ha_thread_ctx>
lrq = 0x0
grq = 0x0
t = 0x56551ab045d0
default_weights = {64, 48, 16, 1}
max = {0, 90, 0, 0}
max_total = 48
tmp_list = 0x0
queue = 4
max_processed = 91
lpicked = 1
gpicked = 0
heavy_queued = 1
budget = 91
#15 0x0000565519746400 in run_poll_loop () at src/haproxy.c:2807
next = 860791628
wake = 0
__func__ = <optimized out>
#16 0x0000565519746ab1 in run_thread_poll_loop (data=0x565519b7e300 <ha_thread_info>) at src/haproxy.c:2996
ptaf = 0x56551991d540 <per_thread_alloc_list>
ptif = 0x56551991d550 <per_thread_init_list>
ptdf = 0x7fff1fe0d310
ptff = 0x0
init_left = 0
init_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}
init_cond = {__data = {__wseq = {__value64 = 31, __value32 = {__low = 31, __high = 0}}, __g1_start = {__value64 = 27, __value32 = {__low = 27, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 8, __wrefs = 0, __g_signals = {0, 0}}, __size = "\037\000\000\000\000\000\000\000\033", '\000' <repeats 23 times>, "\b", '\000' <repeats 14 times>, __align = 31}
#17 0x0000565519748329 in main (argc=3, argv=0x7fff1fe0d588) at src/haproxy.c:3645
err = 0
retry = 200
limit = {rlim_cur = 1048575, rlim_max = 1048576}
pidfd = -1
intovf = 4
(gdb)
Additional Information
No response
Hi,
OK, there's definitely a problem here. The two patches below should address it. Any chance you can test them?
Thanks!
0001-BUG-MEDIUM-lua-Get-the-arguments-for-running-sample_.txt
0002-BUG-MEDIUM-lua-handle-stick-table-implicit-arguments.txt
Wow, you guys are amazing! So far I've never waited more than 24 hours for a patch :smile:
This definitely fixes my local test setup.
Could you send me patches which apply to the latest 2.4? In this case I can do more extensive tests in our development environment, where we are currently running 2.4.18. It's much easier for me to get a patched 2.4 version running there.
In your example lua script, shouldn't txn.c:in_table() take two arguments? First one being input data and second one being the table where to search for the data?
Applying 0001-BUG-MEDIUM-lua-Get-the-arguments-for-running-sample_.txt to the 2.4 series breaks this behavior (the input argument is no longer accepted).
In your example lua script, shouldn't txn.c:in_table() take two arguments? First one being input data and second one being the table where to search for the data?
Yes, you are correct! The second argument was lost during the creation of the example it seems. But I can also reproduce the Segmentation Fault with two arguments (without the patch of course).
But I can also reproduce the Segmentation Fault with two arguments (without the patch of course).
No, sorry. I spoke too soon. With 2 arguments I cannot reproduce the crash.
Yes, so it was actually a bug in our Lua code where we had a call to in_table with one argument, which should have had two arguments. We did not notice this, because the code path was not in use anymore. We just noticed the segfault after the upgrade to 2.4, because there were still some requests coming in, which still triggered this code path to run, although it was obsolete.
This is great news! Regarding the crash: even if only one argument is provided to in_table, it should not cause the main process to crash. It seems that implicit stick-table arguments have been supported for a while now.
But for this to work, a table must be declared within the proxy that calls the Lua function on http-req.
@cognet's patch 0002-BUG-MEDIUM-lua-handle-stick-table-implicit-arguments.txt seems to be required for this to work on the latest HAProxy code, because of a change in the stick-table code that he explains in the commit message.
But if no table is declared within the proxy that triggers the Lua code, we still have a crash in sample_conv_in_table() (stick_table.c).
The following patch adds an extra check in sample_conv_in_table() to prevent a crash when in_table() is called without a table argument and no table exists within the calling proxy: 0001-BUG-MEDIUM-stick-table-in_table_check_for_empty_sticktable.txt
I believe both 0002-BUG-MEDIUM-lua-handle-stick-table-implicit-arguments.txt from @cognet and 0001-BUG-MEDIUM-stick-table-in_table_check_for_empty_sticktable.txt are required to make this call safe without breaking existing behavior.
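To illustrate the kind of guard being discussed, here is a minimal, self-contained C sketch. It is not HAProxy's code and not the content of either patch: `struct table`, `cast_ok`, `NUM_TYPES` and `in_table_checked()` are hypothetical stand-ins. It only shows a lookup that refuses to proceed when no table is available, or when a type value would index past the end of a cast matrix, which is the kind of access that faults in frame #0 of the backtrace (`sample_casts[sample->data.type][req_type]` with `req_type=446250792`).

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration, not HAProxy's real definitions. */
enum { NUM_TYPES = 3 };

struct table {
    unsigned int type;   /* key type expected by the table */
};

/* cast_ok[from][to] is non-zero when a sample of type 'from' can be cast to 'to'. */
static const int cast_ok[NUM_TYPES][NUM_TYPES] = {
    { 1, 1, 0 },
    { 1, 1, 1 },
    { 0, 1, 1 },
};

/* Returns 1 when the sample type can be cast to the table's key type.
 * Returns 0 instead of dereferencing or indexing when the table is missing
 * or when either type is outside the cast matrix.
 */
static int in_table_checked(const struct table *t, unsigned int sample_type)
{
    if (t == NULL)
        return 0;                      /* no table: nothing to look up */
    if (sample_type >= NUM_TYPES || t->type >= NUM_TYPES)
        return 0;                      /* would read outside cast_ok */
    return cast_ok[sample_type][t->type] != 0;
}
```

With these definitions, a call such as `in_table_checked(NULL, 2)` returns 0 rather than crashing, and so does a table whose `type` field holds a garbage value like the one seen in the core dump.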
Yes, so it was actually a bug in our Lua code where we had a call to in_table with one argument, which should have had two arguments. We did not notice this, because the code path was not in use anymore. We just noticed the segfault after the upgrade to 2.4, because there were still some requests coming in, which still triggered this code path to run, although it was obsolete.
Let me clarify my previous message; I was not totally right. Depending on your usage of in_table, it might not be a bug in your Lua code:
It seems that calling in_table with an explicit table name and calling it without a table name (in which case the table is inherited from the calling proxy, if there is one) are both valid use cases.
But here we witness a regression only affecting the case where you call in_table with one argument. The proposed patches should fix the regression.
So you're left with two options:
- try to apply 0002-BUG-MEDIUM-lua-handle-stick-table-implicit-arguments.txt + 0001-BUG-MEDIUM-stick-table-in_table_check_for_empty_sticktable.txt and check if your Lua code works as expected in 2.4
- edit your Lua code to provide explicit table name in every call to in_table() (not very convenient, but it should work) while waiting for a definitive patch to be submitted and backported to 2.4.
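The implicit-table behaviour described in this comment can be sketched in C as follows. This is only an illustration under stated assumptions, not the logic of the attached patches: `struct table`, `struct proxy` and `resolve_table()` are hypothetical names. An explicit table name is looked up among the known tables; without a name, the calling proxy's own table is used, and the result is NULL when that proxy declares none, so the caller has to check before using it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration, not HAProxy's real definitions. */
struct table {
    const char *name;
};

struct proxy {
    const char *name;
    struct table *table;   /* NULL when the proxy declares no stick-table */
};

/* Pick the table a lookup should use.
 * - name != NULL: search 'known' for a table with that name;
 * - name == NULL: fall back to the calling proxy's own table.
 * Returns NULL when nothing suitable exists; callers must check for that.
 */
static struct table *resolve_table(const char *name, const struct proxy *caller,
                                   struct table *const *known, size_t nknown)
{
    if (name != NULL) {
        for (size_t i = 0; i < nknown; i++)
            if (strcmp(known[i]->name, name) == 0)
                return known[i];
        return NULL;                   /* explicit name, but no such table */
    }
    return caller != NULL ? caller->table : NULL;
}
```

The design point is that the "no table anywhere" case is an ordinary NULL return rather than an undefined pointer, which leaves room for a single check at the call site.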
Let me clarify my previous message; I was not totally right. Depending on your usage of in_table, it might not be a bug in your Lua code:
It seems that calling in_table with an explicit table name and calling it without a table name (in which case the table is inherited from the calling proxy, if there is one) are both valid use cases.
Thanks for clarification! We've already solved our problem by simply removing the obsolete code :)
So we don't need the patch right now and everything works for us currently. Thank you!
Happy to learn that you solved it! :)
Please leave the issue open: we'll wait for @cognet to come back so we can fix the bug for good.
Guys, please, I got lost here. What's the status of this issue now? Are patches still needed or not, and if so, which ones?
I think those new, attached patches should be merged.
0001-BUG-MEDIUM-lua-Get-the-arguments-for-running-sample_.txt 0002-BUG-MEDIUM-lua-handle-stick-table-implicit-arguments.txt
Thank you for your update Olivier.
This 3rd patch is required to make 0002 handle the case where no table is declared within the proxy: 0003-BUG-MEDIUM-hlua-hlua_lua2arg_check-handle-NULL-stick.patch.txt
(Or maybe @cognet could amend his 0002 patch to make backporting easier, I'm fine with that)
Also, it seems that the commit message of 0001 was not properly amended ("lua: Get the arguments for running sample_conv right." is not relevant anymore).
Oops nice catch. Yeah your patch makes sense, I'll merge it with mine.
Updated patches 0001-BUG-MEDIUM-lua-Don-t-crash-in-hlua_lua2arg_check-on-.txt 0002-BUG-MEDIUM-lua-handle-stick-table-implicit-arguments.txt
Thank you guys, now merged!