cosmopolitan
RedBean: LuaJIT support
https://luajit.org/ is known as the fastest implementation of Lua 5.1 on the planet. It also features an FFI that I found easier to use than any other, and it's very fast, too.
Since RedBean is a really fast HTTP(S) server, it might be a good match.
My use case is HTTP API servers, with logic implemented in Lua, and calling out to external C libraries from Lua.
Currently I am using http://openresty.org, which features LuaJIT, but uses much-more-heavyweight nginx under the hood.
Moving to Redbean would be a lightweight alternative with code that a single developer could understand.
OpenResty features Lua code caching, so require'ing a library only compiles the code once. This is crucial for external C libraries that have state and forbid duplicate initialization, which is what happens in my use case. I'm not sure this feature can be replicated in a fork-per-request server; it may need a fixed number of preforked workers.
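To make the preforking idea concrete, here is a minimal sketch in C, assuming the stateful library can be initialized once in the parent and then inherited by forked workers. This is not how redbean or OpenResty do it; stateful_lib_init, ensure_initialized, and run_workers are hypothetical names, and whether a real C library tolerates fork() after initialization depends on the library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counts calls to the (hypothetical) library initializer. */
static int g_init_calls;

/* Stand-in for a C library that must be initialized exactly once. */
static void stateful_lib_init(void) { g_init_calls++; }

/* Guard that makes repeated requests for initialization harmless. */
static void ensure_initialized(void) {
  if (g_init_calls == 0) stateful_lib_init();
}

/* Initializes in the parent, forks n workers one after another, and
   returns how many of them observed exactly one initialization. */
int run_workers(int n) {
  int ok = 0;
  assert(n > 0);
  ensure_initialized(); /* parent, before any fork() */
  for (int i = 0; i < n; i++) {
    pid_t pid = fork();
    if (pid == 0) {
      ensure_initialized(); /* no-op: the child inherited the state */
      _exit(g_init_calls == 1 ? 0 : 1);
    }
    if (pid > 0) {
      int status;
      if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
          WEXITSTATUS(status) == 0) {
        ok++;
      }
    }
  }
  return ok;
}
```

Calling run_workers(3) forks three short-lived children; each reports through its exit status whether the initializer ran only once.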
OpenResty also allows listening on Unix domain sockets, which I use for better performance between API services residing on the same server.
Finally, I am aware that LuaJIT is architecture-specific, but for my use case I don't need architecture portability and could live with a Linux-only build that allows x86 LuaJIT (64bit).
Thoughts? Feasibility? How can we help?
It's pretty straightforward to compile/run LuaJIT with the Cosmopolitan amalgamation, I did it a while back for LuaJIT 2.1.0 beta3.
Just modify src/Makefile to include cosmopolitan.h/cosmopolitan.a rules where necessary.
I haven't tried adding LuaJIT to redbean though.
@ahgamut, do you happen to have a diff for the Makefile? I tried to compile it on Windows, but ran into several issues (including conflicts with the noinline define).
I think the main issue with the redbean integration is going to be the ABI differences (Lua 5.1 for LuaJIT vs. Lua 5.4 for the current version in redbean).
@pkulchenko here you go: https://github.com/ahgamut/LuaJIT-cosmo
There were a few more changes than just the Makefile.
Compiles on Debian 10 Linux with GCC 8.3.0 + the contents of cosmopolitan.zip.
Warning: I have not tested luajit.com at all, I just tried some basic stuff before moving on to Python.
Regarding noinline: I think I should submit a PR, because I've seen the same conflict multiple times now.
@ahgamut, I got it working with your code, but had to remove -ldl, as it's not going to be available:
diff --git a/src/Makefile b/src/Makefile
index 9d291e8..ee39b0a 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -93,7 +93,7 @@ XCFLAGS=
# executable. But please consider that the FFI library is compiled-in,
# but NOT loaded by default. It only allocates any memory, if you actually
# make use of it.
-XCFLAGS+= -DLUAJIT_DISABLE_FFI
+#XCFLAGS+= -DLUAJIT_DISABLE_FFI
#
# Features from Lua 5.2 that are unlikely to break existing code are
# enabled by default. Some other features that *might* break some existing
@@ -128,7 +128,7 @@ XCFLAGS+= -DLUAJIT_ENABLE_GC64
# realloc usually doesn't return addresses in the right address range.
# OTOH this option is mandatory for Valgrind's memcheck tool on x64 and
# the only way to get useful results from it for all other architectures.
-#XCFLAGS+= -DLUAJIT_USE_SYSMALLOC
+XCFLAGS+= -DLUAJIT_USE_SYSMALLOC
#
# This define is required to run LuaJIT under Valgrind. The Valgrind
# header files must be installed. You should enable debug information, too.
@@ -352,10 +352,10 @@ else
endif
endif
ifeq (Linux,$(TARGET_SYS))
- TARGET_XLIBS+= -ldl
+ TARGET_XLIBS+=
endif
ifeq (GNU/kFreeBSD,$(TARGET_SYS))
- TARGET_XLIBS+= -ldl
+ TARGET_XLIBS+=
endif
endif
endif
I also enabled SYSMALLOC and FFI.
The only issue I ran into so far was that os.clock() returned -1 for me on Windows (I compiled this under WSL2). Not sure yet what to do about it.
> The only issue I ran into so far was that os.clock() returned -1 for me on Windows (I compiled this under WSL2).
Actually, it seems to be by design, as CLOCK_PROCESS_CPUTIME_ID is -1 on Windows:
syscon clock CLOCK_PROCESS_CPUTIME_ID 2 -1 15 2 0x40000000 -1 #
@jart, since on Windows clock() measures the wall time ("The clock function tells how much wall-clock time has passed since the CRT initialization during process start." from https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/clock), should we try to emulate it by using CLOCK_MONOTONIC_RAW minus the same value saved during the process initialization?
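A minimal sketch of that emulation in C, assuming a baseline captured once at startup and subtracted on every call. The names wallclock_init and wallclock_clock are hypothetical and not part of Cosmopolitan; the sketch uses the portable CLOCK_MONOTONIC instead of the Linux-specific CLOCK_MONOTONIC_RAW.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Baseline timestamp captured at (simulated) process initialization. */
static struct timespec g_start;
static int g_started;

/* Records the baseline; meant to be called once, early in startup. */
void wallclock_init(void) {
  clock_gettime(CLOCK_MONOTONIC, &g_start);
  g_started = 1;
}

/* Emulates clock() as elapsed monotonic time since wallclock_init(),
   scaled to CLOCKS_PER_SEC. Returns (clock_t)-1 on failure. */
clock_t wallclock_clock(void) {
  struct timespec now;
  assert(g_started);
  if (clock_gettime(CLOCK_MONOTONIC, &now) != 0) return (clock_t)-1;
  long long sec = (long long)now.tv_sec - (long long)g_start.tv_sec;
  long long nsec = (long long)now.tv_nsec - (long long)g_start.tv_nsec;
  return (clock_t)(sec * CLOCKS_PER_SEC +
                   nsec * CLOCKS_PER_SEC / 1000000000LL);
}
```

As with the Windows CRT behavior quoted above, the result is elapsed wall time rather than CPU time, so code that uses clock() for profiling would see sleep and I/O waits included.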
@ahgamut, do you see any problems with errors thrown from LuaJIT with your build? For example, does luajit -e "error('oops')" properly show the error with a stack trace? In my case I don't get anything back (it looks like the app hangs).
@pkulchenko The crash is because I disabled the frame unwinding in the source. At the time LuaJIT didn't compile with external frame unwinding enabled.
https://github.com/ahgamut/LuaJIT-cosmo/commit/e7eece3052cf5f104ea5946567528fde9a221163#diff-dbe25b7494065b51ddde8d2bc71d3b7dae99294b8ecd9a5abf01ae2e4e81ff28L65
Good point; thanks @ahgamut. I'll check if we can compile it with the unwinding enabled.
According to the long comment on frame unwinding, the internal frame unwinding doesn't require any OS/library support, but for some reason doesn't work as expected.
The external unwinding requires _Unwind_GetCFA and _Unwind_DeleteException that we're missing. The first one is the most critical, as it allows LuaJIT to retrieve the frame address and the associated Lua state value, so it can work with the correct Lua state.
@ahgamut, I found a way to avoid the segfaults I was seeing, but it looks like some sort of heisenbug, as it appears to be crashing in err_unwind, but as soon as I add any fprintf/printf call there (for anything longer than an empty string), the issue disappears and the stack trace is produced as expected. I'm not sure what to make of it. @jart, any ideas?
I also figured out what wasn't working with the FFI examples. Any function call uses dlsym to look up the function by name, and dlsym is not implemented in cosmopolitan. @jart, is there any chance we can make dlsym work for the functions in the executable itself? If not, I think the current functionality is fine, as it still works with FFI data structures, just not function calls. Once the dl* functions are implemented, FFI will become more useful.
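As a rough illustration of what resolving "functions in the executable itself" could look like without a dynamic loader, here is a sketch in C that consults a table compiled into the binary. It is an assumption, not an existing Cosmopolitan or LuaJIT facility: static_dlsym, generic_fn, and the table contents are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function pointer type; callers cast back to the real signature. */
typedef void (*generic_fn)(void);

struct sym_entry {
  const char *name;
  generic_fn fn;
};

/* Example function that the table exports by name. */
static int add_ints(int a, int b) { return a + b; }

/* Hypothetical table of symbols linked into the executable. */
static const struct sym_entry k_symbols[] = {
    {"add_ints", (generic_fn)add_ints},
    {"strlen", (generic_fn)strlen},
};

/* Looks up a function by name; returns NULL when it is not registered. */
generic_fn static_dlsym(const char *name) {
  assert(name != NULL);
  for (size_t i = 0; i < sizeof(k_symbols) / sizeof(k_symbols[0]); i++) {
    if (strcmp(k_symbols[i].name, name) == 0) return k_symbols[i].fn;
  }
  return NULL;
}
```

The trade-off is that every callable function has to be listed at build time, so this only covers symbols the author of the binary chose to export.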
> For example, does luajit -e "error('oops')" properly shows the error with a stack trace? In my case I don't get anything back (it looks like the app hangs).
@pkulchenko I get a stack trace for the simple example (cloned the repo, compiled on GCC 8.3.0, run on Debian 10):
@pkulchenko Yes we should polyfill CLOCK_PROCESS_CPUTIME_ID. We're currently working around it manually in the Python implementation. But it should be part of LIBC_CALLS so that all languages benefit from it.
https://github.com/jart/cosmopolitan/blob/48a26682394be11e7ab80dda892a9b505e48e6c9/third_party/python/Modules/timemodule.c#L980-L995
> @jart, is there any chance we can make dlsym working for the functions in the executable itself? If not, I think the current functionality is fine, as it still works with the FFI data structured, just not function calls. When dl* functions are implemented, it will make it more useful.
We have GetSymbolTable() but in order for it to work the .com.dbg binary needs to be present, so availability of symbol information isn't guaranteed.
> _Unwind_GetCFA
Is that the same thing as __builtin_frame_address(0)? We use that in many files to unwind the stack, print backtraces, collect garbage, etc., in pretty much all modes. We also currently use -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer in all build modes to guarantee the information is accurate. RBP always holds a valid frame pointer chain and no DWARF information is required.
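For reference, GCC spells this builtin __builtin_frame_address. Below is a small sketch in C of reading the current frame pointer and return address, assuming GCC or Clang; capture_frame and struct frame_info are hypothetical names. It deliberately does not walk the caller chain, because that is only safe when every frame was built with -fno-omit-frame-pointer, as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Snapshot of the current function's frame pointer and return address. */
struct frame_info {
  uintptr_t frame; /* value of __builtin_frame_address(0) */
  uintptr_t ret;   /* value of __builtin_return_address(0) */
};

/* noinline keeps this function in its own stack frame. */
__attribute__((noinline)) struct frame_info capture_frame(void) {
  struct frame_info fi;
  assert(sizeof(uintptr_t) >= sizeof(void *));
  fi.frame = (uintptr_t)__builtin_frame_address(0);
  fi.ret = (uintptr_t)__builtin_return_address(0);
  return fi;
}
```

On x86-64 with frame pointers enabled, the word stored at the frame address is the caller's saved RBP, which is what makes the chain walkable without DWARF information.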
> @pkulchenko Yes we should polyfill CLOCK_PROCESS_CPUTIME_ID. We're currently working around it manually in the Python implementation. But it should be part of LIBC_CALLS so that all languages benefit from it.
Interesting; thank you for the hint. It looks like we need to polyfill it in the clock() call, rather than clock_gettime(), as the latter doesn't have the information that CLOCK_PROCESS_CPUTIME_ID is being requested (as it only gets -1). I'll submit a PR for it.
> We have GetSymbolTable() but in order for it to work the .com.dbg binary needs to be present, so availability of symbol information isn't guaranteed.
That may still be a good option for those who want to use FFI, especially if there is a way to extract it and package it separately just for that purpose (sort of COSMO_WITH_FFI option; GetSymbolTable() will check that info if .com.dbg is not present). For some future extension...
Speaking of OpenResty, their branch of LuaJIT is actively maintained but still syncs with upstream; that might be more suitable: https://github.com/openresty/luajit2.
Just a note in support of the effort: Besides the eventual speed improvement, an option to build RedBean with LuaJIT would allow bundling and using in an app some existing "pure FFI implementations" (i.e. standard Lua + c-types manipulation utilizing the FFI LuaJIT module, but without calling-out to C libs).
An example of such a useful module is LPegLJ, a memory-efficient, VM-based PEG parsing library, which is actually a superset of the functionality of its prototype, the famous C LPeg library by Lua's lead developer Roberto Ierusalimschy (e.g. it supports left recursion, stream parsing and other goodies).
Some more "pure LuaJIT" modules are listed in the LuaJIT wiki
@ahgamut, @jart, what's the correct way to integrate LuaJIT into the cosmo build process? It has several specific steps in the makefile and I'm not sure how to configure and run them.
If someone can help me integrate LuaJIT into third_party build, I'll take care of making it work with Redbean.
@decalek, good point. There is also LuaJIT Language Toolkit (not listed on the LuaJIT wiki) that can be used for Lua-based DSL development (with compilation to LuaJIT bytecode).
I managed to fix the issue with the crash in stack unwind that I was seeing by setting LUAJIT_NO_UNWIND. I also updated the makefile to add generation of .dbg to enable --ftrace support (which helped me track down the issue).
Here is a complete patch against @ahgamut's repo:
diff --git a/src/Makefile b/src/Makefile
index 9d291e8..432151a 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -87,13 +87,13 @@ BUILDMODE= static
##############################################################################
# Enable/disable these features as needed, but make sure you force a full
# recompile with "make clean", followed by "make".
-XCFLAGS=
+XCFLAGS= -DLUAJIT_NO_UNWIND
#
# Permanently disable the FFI extension to reduce the size of the LuaJIT
# executable. But please consider that the FFI library is compiled-in,
# but NOT loaded by default. It only allocates any memory, if you actually
# make use of it.
-XCFLAGS+= -DLUAJIT_DISABLE_FFI
+#XCFLAGS+= -DLUAJIT_DISABLE_FFI
#
# Features from Lua 5.2 that are unlikely to break existing code are
# enabled by default. Some other features that *might* break some existing
@@ -128,7 +128,7 @@ XCFLAGS+= -DLUAJIT_ENABLE_GC64
# realloc usually doesn't return addresses in the right address range.
# OTOH this option is mandatory for Valgrind's memcheck tool on x64 and
# the only way to get useful results from it for all other architectures.
-#XCFLAGS+= -DLUAJIT_USE_SYSMALLOC
+XCFLAGS+= -DLUAJIT_USE_SYSMALLOC
#
# This define is required to run LuaJIT under Valgrind. The Valgrind
# header files must be installed. You should enable debug information, too.
@@ -190,7 +190,7 @@ endif
COSMO_LIBDIR=../libcosmo
COSMO_STUBDIR=../header_stubs
-COSMO_PREFLAGS= -static -nostdlib -nostdinc -fno-pie -mno-red-zone -fno-omit-frame-pointer -pg
+COSMO_PREFLAGS= -g -static -nostdlib -nostdinc -fno-pie -mno-red-zone -fno-omit-frame-pointer -pg -mnop-mcount
COSMO_CFLAGS= $(COSMO_PREFLAGS) -I$(COSMO_STUBDIR)/ -include $(COSMO_LIBDIR)/cosmopolitan.h
COSMO_POSTFLAGS= -fuse-ld=bfd -Wl,-T,$(COSMO_LIBDIR)/ape.lds
COSMO_FILES= $(COSMO_LIBDIR)/cosmopolitan.h $(COSMO_LIBDIR)/crt.o $(COSMO_LIBDIR)/ape.o $(COSMO_LIBDIR)/cosmopolitan.a
@@ -722,9 +722,9 @@ $(LUAJIT_SO): $(LJVMCORE_O)
$(LUAJIT_T): $(TARGET_O) $(LUAJIT_O) $(TARGET_DEP)
$(E) "LINK $@"
- $(Q)$(TARGET_LD) $(TARGET_ALDFLAGS) -o $@ $(LUAJIT_O) $(TARGET_O) $(TARGET_ALIBS)
- $(Q)$(TARGET_STRIP) $@
- objcopy -SO binary $@ ../[email protected]
+ $(Q)$(TARGET_LD) $(TARGET_ALDFLAGS) -o ../[email protected] $(LUAJIT_O) $(TARGET_O) $(TARGET_ALIBS)
+ objcopy -SO binary ../[email protected] ../[email protected]
+ $(Q)$(TARGET_STRIP) ../[email protected]
$(E) "OK Successfully built LuaJIT"
##############################################################################
diff --git a/src/lib_package.c b/src/lib_package.c
index 2199832..6fac43e 100644
--- a/src/lib_package.c
+++ b/src/lib_package.c
@@ -32,7 +32,7 @@
#define SYMPREFIX_CF "luaopen_%s"
#define SYMPREFIX_BC "luaJIT_BC_%s"
-#if 0 && LJ_TARGET_DLOPEN
+#if LJ_TARGET_DLOPEN
#include <dlfcn.h>
diff --git a/src/lj_arch.h b/src/lj_arch.h
index c8d7138..df8602f 100644
--- a/src/lj_arch.h
+++ b/src/lj_arch.h
@@ -105,7 +105,7 @@
#define LJ_TARGET_OSX (LUAJIT_OS == LUAJIT_OS_OSX)
#define LJ_TARGET_IOS (LJ_TARGET_OSX && (LUAJIT_TARGET == LUAJIT_ARCH_ARM || LUAJIT_TARGET == LUAJIT_ARCH_ARM64))
#define LJ_TARGET_POSIX (LUAJIT_OS > LUAJIT_OS_WINDOWS)
-#define LJ_TARGET_DLOPEN LJ_TARGET_POSIX
+#define LJ_TARGET_DLOPEN 0
#ifdef __CELLOS_LV2__
#define LJ_TARGET_PS3 1
diff --git a/src/lj_err.c b/src/lj_err.c
index 31621f0..b8ffee3 100644
--- a/src/lj_err.c
+++ b/src/lj_err.c
@@ -183,7 +183,7 @@ static void *err_unwind(lua_State *L, void *stopcf, int errcode)
/* -- External frame unwinding -------------------------------------------- */
-#if 0 && defined(__GNUC__) && !LJ_NO_UNWIND && !LJ_ABI_WIN
+#if defined(__GNUC__) && !LJ_NO_UNWIND && !LJ_ABI_WIN
/*
** We have to use our own definitions instead of the mandatory (!) unwind.h,
@pkulchenko the LuaJIT build first involves creating a "minilua" or requires an existing Lua (5.2?) implementation to be available. IIRC The minilua helps generate some files that are required for the eventual LuaJIT build.
One way to add LuaJIT to third_party might be to check in the "minilua"-generated files along with the source and then put together a build config.
There is also an option to do an amalgamated build (make amalg); from the LuaJIT docs: "This compiles the LuaJIT core as one huge C file and allows GCC to generate faster and shorter code. Alas, this requires lots of memory during the build. This may be a problem for some users, that's why it's not enabled by default. But it shouldn't be a problem for most build farms. It's recommended that binary distributions use this target for their LuaJIT builds."
I tested it and it works with the proposed changes, so I'll integrate the amalgamated build.
@jart, any suggestions on how to select between Lua 5.4 and LuaJIT when building redbean? Or should we generate a separate executable for that?
I tried the amalgamated build for SQLite, and it was slower than building from the split files (see #161 vs #162, and related discussion).
@pkulchenko https://github.com/ahgamut/cosmopolitan/tree/thirdparty-luajit
The build may get interrupted because of a bug in luajit.mk (unspecified prerequisite); just resume to get luajit.com.
@jart is it possible to show the line number(s) for unspecified requisite/other Makefile errors? I'm quite slow at debugging my incorrect recipes...
Updated https://github.com/ahgamut/cosmopolitan/tree/thirdparty-luajit to be closer to current HEAD commit
Hello, Would anyone have guidelines to embed LuaJIT within redbean as for now? I could compile LuaJIT with cosmopolitan, but don’t understand how to use it as a Lua replacement within redbean.
@jperon, I don't think it's going to be supported at the moment without redbean changes, as it's written against Lua 5.4 API, which is different in some respects from Lua 5.1 API that LuaJIT requires. It's definitely possible and I can possibly take a look at that, but I really want to get some other things added to redbean that take priority over this work.
If you want to give it a try, then look at net.mk and see if you can figure out how to replace THIRD_PARTY_LUA with THIRD_PARTY_LUAJIT (from @ahgamut's repository) and check what kind of compilation errors you get from redbean. I can help you with solving those.
@ahgamut, are we at the point of merging THIRD_PARTY_LUAJIT into cosmopolitan?
First error I get: "Missing OS support for explicit placement of executable memory"
The error log is attached. I'm sorry I can't do more by myself, as my C skills are next to nothing.
@pkulchenko LuaJIT builds successfully on https://github.com/ahgamut/cosmopolitan/tree/thirdparty-luajit (synced to latest commit https://github.com/jart/cosmopolitan/commit/ee8a86163548befdf01e6f492612b8b0ec2c9a89).
It's been a while since I've built LuaJIT -- needed to update the dependencies. I think the LuaJIT source may need a bit of read-through to ensure it uses everything offered by Cosmopolitan Libc. I'd suggest holding off on merging until we can confirm LuaJIT builds and runs as expected.
With @ahgamut's last changes, I didn’t get the error above, but got further. Here is the error message (till the end):
4,604⏰ 2,073⏳ 256k 1,152iop build/bootstrap/fixupobj.com o//tool/net/redbean-unsecure.o
o//tool/net/net.pkg: undefined symbol 'luaL_buffinitsize' (o//tool/net/largon2.o) not defined by direct dependencies:
o//dsp/scale/scale.a.pkg
o//libc/calls/syscalls.a.pkg
o//libc/dns/dns.a.pkg
o//libc/fmt/fmt.a.pkg
o//libc/intrin/intrin.a.pkg
o//libc/log/log.a.pkg
o//libc/mem/mem.a.pkg
o//libc/nexgen32e/nexgen32e.a.pkg
o//libc/nt/iphlpapi.a.pkg
o//libc/nt/kernel32.a.pkg
o//libc/proc/proc.a.pkg
o//libc/runtime/runtime.a.pkg
o//libc/sock/sock.a.pkg
o//libc/stdio/stdio.a.pkg
o//libc/str/str.a.pkg
o//libc/sysv/sysv.a.pkg
o//libc/sysv/calls.a.pkg
o//libc/time/time.a.pkg
o//libc/thread/thread.a.pkg
o//libc/tinymath/tinymath.a.pkg
o//libc/x/x.a.pkg
o//net/finger/finger.a.pkg
o//net/http/http.a.pkg
o//net/https/https.a.pkg
o//third_party/argon2/argon2.a.pkg
o//third_party/compiler_rt/compiler_rt.a.pkg
o//third_party/gdtoa/gdtoa.a.pkg
o//third_party/getopt/getopt.a.pkg
o//third_party/linenoise/linenoise.a.pkg
o//third_party/luajit/luajit.a.pkg
o//third_party/lua/lunix.a.pkg
o//third_party/maxmind/maxmind.a.pkg
o//third_party/mbedtls/mbedtls.a.pkg
o//third_party/regex/regex.a.pkg
o//third_party/sqlite3/libsqlite3.a.pkg
o//third_party/zlib/zlib.a.pkg
o//tool/args/args.a.pkg
o//tool/build/lib/buildlib.a.pkg
o//tool/decode/lib/decodelib.a.pkg
o//third_party/double-conversion/libdouble-conversion.a.pkg
`make MODE= -j4 o//tool/net/net.pkg` exited with 1:
build/bootstrap/package.com -o o//tool/net/net.pkg -do//dsp/scale/scale.a.pkg -do//libc/calls/syscalls.a.pkg -do//libc/dns/dns.a.pkg -do//libc/fmt/fmt.a.pkg -do//libc/intrin/intrin.a.pkg -do//libc/log/log.a.pkg -do//libc/mem/mem.a.pkg -do//libc/nexgen32e/nexgen32e.a.pkg -do//libc/nt/iphlpapi.a.pkg -do//libc/nt/kernel32.a.pkg -do//libc/proc/proc.a.pkg -do//libc/runtime/runtime.a.pkg -do//libc/sock/sock.a.pkg -do//libc/stdio/stdio.a.pkg -do//libc/str/str.a.pkg -do//libc/sysv/sysv.a.pkg -do//libc/sysv/calls.a.pkg -do//libc/time/time.a.pkg -do//libc/thread/thread.a.pkg -do//libc/tinymath/tinymath.a.pkg -do//libc/x/x.a.pkg -do//net/finger/finger.a.pkg -do//net/http/http.a.pkg -do//net/https/https.a.pkg -do//third_party/argon2/argon2.a.pkg -do//third_party/compiler_rt/compiler_rt.a.pkg -do//third_party/gdtoa/gdtoa.a.pkg -do//third_party/getopt/getopt.a.pkg -do//third_party/linenoise/linenoise.a.pkg -do//third_party/luajit/luajit.a.pkg -do//third_party/lua/lunix.a.pkg -do//third_party/maxmind/maxmind.a.pkg -do//third_party/mbedtls/mbedtls.a.pkg -do//third_party/regex/regex.a.pkg -do//third_party/sqlite3/libsqlite3.a.pkg -do//third_party/zlib/zlib.a.pkg -do//tool/args/args.a.pkg -do//tool/build/lib/buildlib.a.pkg -do//tool/decode/lib/decodelib.a.pkg -do//third_party/double-conversion/libdouble-conversion.a.pkg @o/tmpo__tool_net_net.pkg.tmp.args
consumed 10,914µs wall time
ballooned to 3,828kb in size
needed 8,081us cpu (87% kernel)
caused 192 page faults (95% memcpy)
55 context switches (10% consensual)
performed 144 read and 0 write i/o operations
make: *** [build/rules.mk:76: o//tool/net/net.pkg] Error 1
> o//tool/net/net.pkg: undefined symbol 'luaL_buffinitsize' (o//tool/net/largon2.o) not defined by direct dependencies:
Right; that's an example of the API incompatibilities I was talking about: there is no luaL_buffinitsize in Lua 5.1, so it needs to be modified to call luaL_buffinit and luaL_prepbuffsize. I'm sure there are more, but this is one example.