dnsdist segfaults when using a luafunction which doesn't exist
- [x] This is not a support question, I have read about opensource and will send support questions to the IRC channel, GitHub Discussions or the mailing list.
- [x] I have read and understood the 'out in the open' support policy
- Program: dnsdist (haven't tested if it happens in any of the others)
- Issue type: Bug report
Short description
When using a non-existing function in a LuaAction(), dnsdist segfaults
Environment
- Operating system: Linux
- Software version: 1.9.8
- Software source: PowerDNS repository
Steps to reproduce
- Load the following as
crash_config.conf:
function debug(dq)
local mystring = "world"
errorlog(string.format("Hello %s", mystring))
return DNSAction.None -- None, so process next rule
end
newServer({address="9.9.9.9"})
addAction(AllRule(), LuaAction(debug))
(Note the typo: errorlog() doesn't exist. errlog() does.)
- Run
dnsdist -C crash_config.conf - Run any dnsquery against it (eg:
dig github.com @localhost)
Expected behaviour
Not crash
Actual behaviour
Segfaults
I don't think we should crash on this kind of configuration error, and I can indeed reproduce the issue:
(gdb) bt
#0 lua_rawgeti (L=0x7ffff7f9e380, idx=<optimized out>, n=<optimized out>) at /usr/src/debug/luajit/LuaJIT-a4f56a459a588ae768801074b46ba0adcfb49eb1/src/lj_api.c:844
#1 0x000055555580a4c2 in LuaContext::callRaw (state=0x7ffff7f9e380, functionAndArguments=..., outArguments=2) at ext/luawrapper/include/LuaContext.hpp:1454
#2 0x00005555558aa5c4 in LuaContext::call<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, DNSQuestion*> (state=0x7ffff7f9e380, toCall=...) at ext/luawrapper/include/LuaContext.hpp:1413
#3 LuaContext::LuaFunctionCaller<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(DNSQuestion*)>::operator() (this=0x555556008030, params#0=<optimized out>) at ext/luawrapper/include/LuaContext.hpp:1984
#4 std::__invoke_impl<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, LuaContext::LuaFunctionCaller<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(DNSQuestion*)>&, DNSQuestion*> (__f=...) at /usr/include/c++/14.2.1/bits/invoke.h:61
#5 std::__invoke_r<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, LuaContext::LuaFunctionCaller<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(DNSQuestion*)>&, DNSQuestion*> (__fn=...) at /usr/include/c++/14.2.1/bits/invoke.h:116
#6 std::_Function_handler<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(DNSQuestion*), LuaContext::LuaFunctionCaller<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(DNSQuestion*)> >::_M_invoke (__functor=..., __args#0=<optimized out>)
at /usr/include/c++/14.2.1/bits/std_function.h:291
#7 0x0000555555898bdc in std::function<std::tuple<int, boost::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >(DNSQuestion*)>::operator() (this=0x555556011108, __args#0=<optimized out>) at /usr/include/c++/14.2.1/bits/std_function.h:591
#8 LuaAction::operator() (this=0x555556011100, dnsquestion=0x7fffb7ffd210, ruleresult=0x7fffb7ffd020) at dnsdist-lua-actions.cc:516
#9 0x0000555555c35d3c in applyRulesToQuery (holders=..., dq=..., now=...) at dnsdist.cc:1213
#10 processQuery (dq=..., holders=..., selectedBackend=std::shared_ptr<DownstreamState> (empty) = {...}) at dnsdist.cc:1675
#11 0x0000555555c38414 in processUDPQuery (cs=..., holders=..., msgh=<optimized out>, remote=..., dest=..., query=std::vector of length 48, capacity 4368 = {...}, responsesVect=0x0, queuedResponses=0x0, respIOV=0x0, respCBuf=0x0) at dnsdist.cc:1834
#12 0x0000555555c39e76 in operator() (__closure=0x7fffb7ffd5d0, param=...) at dnsdist.cc:2157
#13 udpClientThread (states=...) at dnsdist.cc:2174
#14 0x0000555555c3b170 in std::__invoke_impl<void, void (*)(std::vector<ClientState*, std::allocator<ClientState*> >), std::vector<ClientState*, std::allocator<ClientState*> > > (__f=<optimized out>) at /usr/include/c++/14.2.1/bits/invoke.h:61
#15 std::__invoke<void (*)(std::vector<ClientState*, std::allocator<ClientState*> >), std::vector<ClientState*, std::allocator<ClientState*> > > (__fn=<optimized out>) at /usr/include/c++/14.2.1/bits/invoke.h:96
#16 std::thread::_Invoker<std::tuple<void (*)(std::vector<ClientState*, std::allocator<ClientState*> >), std::vector<ClientState*, std::allocator<ClientState*> > > >::_M_invoke<0ul, 1ul> (this=<optimized out>) at /usr/include/c++/14.2.1/bits/std_thread.h:301
#17 std::thread::_Invoker<std::tuple<void (*)(std::vector<ClientState*, std::allocator<ClientState*> >), std::vector<ClientState*, std::allocator<ClientState*> > > >::operator() (this=<optimized out>) at /usr/include/c++/14.2.1/bits/std_thread.h:308
#18 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(std::vector<ClientState*, std::allocator<ClientState*> >), std::vector<ClientState*, std::allocator<ClientState*> > > > >::_M_run (this=<optimized out>) at /usr/include/c++/14.2.1/bits/std_thread.h:253
#19 0x00007ffff64e1f74 in std::execute_native_thread_routine (__p=0x5555560172b0) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104
#20 0x00007ffff62a370a in start_thread (arg=<optimized out>) at pthread_create.c:448
#21 0x00007ffff6327aac in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
I'm unsure whether I can actually fix it from DNSdist, though, but I'll look into it.
We crash in LuaWrapper indeed, while trying to access the traceback:
// stack top: {error, traceback}
lua_rawgeti(state, -1, 1); // stack top: {error, traceback}, er\
ror
Note to others (and self): the issue is that function is called debug(). debug should not be used, as this masks the debug module, which handles typical (harmless) errors.
Besides checking if there's a better way to handle this situation, we should document a list of "reserved" functions / modules that should not be used in Lua code
I'm unsure whether I can actually fix it from DNSdist, though, but I'll look into it.
after discussion on IRC (thanks ottom and stbuehler):
- just above the
lua_rawgetiwe should be checking for explicit values ofpCallReturnValueinstead of a blanket!= 0. In Frank's case, the value wasLUA_ERRRER(5) which does not promise a present traceback. - in
gettracebackwe should be using adebug.tracebackfunction that we collected earlier, before the user perhaps replaced thedebugglobal
Frank also suggests we document the list of words users can't use for themselves. When we fix (2), that should only be the global functions that users can define that we then call, I think.
Fixed by https://github.com/PowerDNS/pdns/pull/16230