raptorjit weirdness
I just made a fresh checkout of the branch from https://github.com/snabbco/snabb/pull/1264 but I am seeing segfaults in the test suite. I isolated it to something simple; drop this into a test.lua:
local ffi = require("ffi")
local C = ffi.C
ffi.cdef 'int open(const char *pathname, int flags);'
local function open_for_read(filename)
  return C.open(filename, 0)
end
print('opening')
local fd = open_for_read("/proc/self/stat")
print('opened', fd)
If run via RaptorJIT, I get:
$ sudo lib/luajit/src/raptorjit test.lua
opening
opened 3
If run via snabb snsh, however, it segfaults:
$ sudo src/snabb snsh test.lua
opening
snabb[13365]: segfault at 0x9d0e70 ip 0x9d0e70 sp 0x7ffc91c74c58 code 2 errno 0
Segmentation fault
The backtrace in GDB:
opening
Program received signal SIGSEGV, Segmentation fault.
0x00000000009d0e70 in open ()
(gdb) bt
#0 0x00000000009d0e70 in open ()
#1 0x0000000000423557 in lj_vm_ffi_call ()
#2 0x000000000046552a in lj_ccall_func (L=L@entry=0x7ffff7fcf378, cd=<optimized out>) at lj_ccall.c:416
#3 0x000000000045e9a3 in lj_cf_ffi_meta___call (L=0x7ffff7fcf378) at lib_ffi.c:229
#4 0x0000000000421125 in lj_BC_FUNCC ()
#5 0x0000000000455af8 in lj_cf_dofile (L=0x7ffff7fcf378) at lib_base.c:406
#6 0x0000000000421125 in lj_BC_FUNCC ()
#7 0x000000000045ad7a in lj_cf_package_require (L=0x7ffff7fcf378) at lib_package.c:322
#8 0x0000000000421125 in lj_BC_FUNCC ()
#9 0x000000000045ad7a in lj_cf_package_require (L=0x7ffff7fcf378) at lib_package.c:322
#10 0x0000000000421125 in lj_BC_FUNCC ()
#11 0x0000000000413ed8 in lua_pcall (L=<optimized out>, nargs=<optimized out>, nresults=<optimized out>, errfunc=<optimized out>) at lj_api.c:1074
#12 0x000000000040d063 in main ()
How do I debug this? :)
I get the same error if I put require('test.lua') at the top of core/startup.lua, so it's nothing that Snabb Lua code is doing.
Heh, I think the problem is in https://github.com/snabbco/snabb/pull/1264/commits/c2c2ed2a9716636fa817376e876b09ab256242a8: these variables need to be static.
Weird. I changed those file-local variables to be static (clearly they are meant to be static) but I still have the problem. So I built a custom binary, thinking that maybe the way we were linking was causing the problem, but no, the standalone binary works:
wingo@sparrow ~/src/raptorjit-snabb$ cat test.c
/* Use of this source code is governed by the Apache 2.0 license; see COPYING. */
#include <stdio.h>
#include "lua.h"
#include "lualib.h"
#include "lauxlib.h"
#include <stdint.h>
int main(int argc, char **argv)
{
  lua_State* L = luaL_newstate();
  luaL_openlibs(L);
  return luaL_dostring(L, "loadfile('./test.lua')()");
}
wingo@sparrow ~/src/raptorjit-snabb$ gcc -Ilib/luajit/src -Wl,--no-as-needed -Wl,-E -Werror -Wall -o test test.c lib/luajit/src/raptorjit.a -lrt -lc -ldl -lm -lpthread
wingo@sparrow ~/src/raptorjit-snabb$ ./test
opening
opened 3
Additionally, the commit https://github.com/snabbco/snabb/commit/c2c2ed2a9716636fa817376e876b09ab256242a8 adds a second copy of lj_auditlog.c, I think by mistake (it sits at src/lj_auditlog.c, mirroring its location in raptorjit). That second copy gets linked into snabb, and I hadn't noticed that it also has the non-static variables. Fixing the static variable issue in raptorjit is good but not necessary; removing the duplicate file does appear to be necessary.