Prolog in the browser
Prolog is an ideal candidate for something like ClojureScript - a Prolog that runs in the browser.
Such a system could make JavaScript disappear as a language - something we'd all applaud, I think.
There have been some hobby-project-level attempts to build Prolog in JavaScript, but a recent development, WebAssembly, means we could deploy to a sane VM.
WebAssembly
https://medium.com/javascript-scene/what-is-webassembly-the-dawn-of-a-new-era-61256ec5a8f6#.facwrlc6p
https://brendaneich.com/2015/06/from-asm-js-to-webassembly/
https://www.w3.org/community/webassembly/
Is this something (and I'm asking, not advocating) we could get ahead on, and have something head and shoulders better than anyone else ready when the browsers get there?
As part of my Masters Thesis I implemented both a JIT and AOT compiler for Constraint Handling Rules in JavaScript. Have a look at http://chrjs.net to get an insight. (Related repositories)
I am really interested in a way to run Prolog in JavaScript, both in the browser and in Node.js - mainly because I completely disagree with Anne's intention:
Such a system could make JavaScript disappear as a language - something we'd all applaud, I think.
I love both Prolog and JavaScript and see many advantages in using one within the other. But motivations are different :)
I would declare the following targets when implementing Prolog in JavaScript:
- Simply have a SWI-Prolog equivalent for the browser, i.e. translate existing SWI-Prolog programs into JavaScript
- Provide a SWI-Prolog-like REPL:
  > jspl file.pl
  ?- member(X,[1,2]).
  X = 1 ;
  X = 2 ;
  false.
  ?- halt.
- Allow the usage of JavaScript in Prolog .pl files
- Allow the usage of Prolog in JavaScript .js files, for example using tagged template strings
Recently I stumbled upon the article Solving riddles with Prolog and ES6 generators which really motivates me to have a deeper look into some Prolog-to-JavaScript transpilation.
I have recently experimented with compiling SWI to WebAssembly. I managed to compile a core of a relatively old version (5.6). There are 2 issues that prevent it:
- Generation of boot.prc. This requires execution of swipl on the build machine which is not possible. At this step the object code is LLVM IR and not executable on x86.
- Generation of atom table. This again uses a host-dependent executable.
In older versions the atom table is generated using a script, so I was able to compile 5.6, but because of the missing boot.prc I only get the error: Could not find system resources. I do not know yet which other runtime issues it might have. There are many ways to integrate SWI with JS and this is not my top priority.
This is related to #34 which is also made difficult due to cross-compilation issues.
I'd go for a new version. As for the problems, they are the same as for iOS. To deal with the atom table, the Makefile/configure configures two C compilers: one that generates native code and one that may cross-compile. The first is used for the helper tools, the latter for the real target. At some point this was moved from shell script to C to simplify porting ...
The simplest way to create boot.prc is to copy it from a locally installed native machine. The only requirement is that the word size (32/64 bit) matches. Alternatively you'd need something that can run WebAssembly on the host. I would assume that exists or will exist soon. Perhaps we should allow the system to start from the Prolog source in boot or generate the state lazily from source if the state does not exist. Both should be feasible.
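The word-size requirement can be checked mechanically. The following is only a sketch: `word_size_bits` and `state_word_size_matches` are hypothetical helper names, not part of SWI-Prolog, and the idea is simply to compile the same file once with the native compiler and once with the cross compiler and compare the results.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */

/* Pointer width in bits for the platform this file is compiled for.
 * Hypothetical helper, not part of SWI-Prolog. */
static int word_size_bits(void)
{ return (int)(sizeof(void *) * CHAR_BIT);
}

/* A saved state produced on a host with `host_bits` wide words would
 * only be a candidate for this target if both word sizes agree. */
static int state_word_size_matches(int host_bits)
{ assert(host_bits > 0);
  return host_bits == word_size_bits();
}
```

For a WebAssembly target built with Emscripten one would expect `word_size_bits()` to report 32, which is why a state copied from a 64-bit native installation is not a match.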
I welcome this very much. I'm glad to help with advice, little things and figuring out a way to get the changes into the main source repo.
@JanWielemaker I'm still experimenting with 5.6 as I can build it without any patching. Conf tools and multiple compilers is still lots of magic for me. Anyway, I have found out:
- WebAssembly is essentially a 32-bit platform. Getting a 32-bit boot .prc file requires a 32-bit native SWI build, which is not that easy (another cross-compilation target on a 64-bit machine).
- Node.js can run WebAssembly on command line. The Emscripten compiler provides a loader/wrapper that makes usual C-style IO available to the compiled apps.
- There is no signal support in WebAssembly.
I was able to compile the .prc file through the Node.js-wrapped SWI WebAssembly binary using the boot directory with the -b option. However, on a normal start (without the -b option) it rejects the file as invalid: [FATAL ERROR: Not a SWI-Prolog saved state].
I have verified with strace that it opens the .prc file I generated through the same thing. The .prc file looks like it contains something, it's not empty. Hex editor shows <ARCHIVE> header at the beginning and <FOOT> footer at the end.
Signals are required for threading support? I disabled threading in the configure options, full command:
emconfigure ./configure --disable-mt --disable-readline --disable-gmp --disable-mapped-stacks --disable-custom-flags
I also edited pl.sh to use previously built native SWI. This generates 64-bit .prc as said earlier but at least allows to finish the build.
Starting without a .prc file sounds like an option, but for browsers we would still need some compact way to make those files available to the binary.
Still, going for an old system is not a very good idea. Current versions rely far less on non-portable features and have lots of stuff fixed. They also have improved cross-compilation support as the Windows version is cross-compiled. See README.mingw. Configure sets @CC@ for compiling and @CC_FOR_BUILD@ for building tools. It also sets @EXEEXT_FOR_BUILD@ for running Prolog in the build environment. In this case running it using Node.js seems the right way to go.
I'd assume that a couple of additions in configure.in should suffice to get this all working.
Signals were indeed used for threading. The current version doesn't need signals for threading. It does of course need the pthread API.
Why the state is wrong is a harder question. That probably requires comparing the output of the boot compilation with a working version and/or looking in pl-wic.c to see what triggers this message.
I have done some debugging for state loading. I got a 32-bit native state file for the same version and compared it with the wasm-based one. I noticed some minor differences but I think the issue is not in the file.
The file seems to be opened successfully and read-in wholly during rc_open_archive but subsequent rc_read calls fail to read anything (end of stream reached).
AFAIK, the old implementation uses memory mapping if it can. Possibly the backup scenario using ordinary I/O fails as it has not been tested for ages? It surely does require repositioning in the stream. How does web assembly handle additional files? Can it perform memory mapping? Note that 7.7 uses a completely different resource format based on libz. One of the nice things is that you can also add the library to the resource file. This too relies on memory mapping though. In recent versions you can also add the resource file as a string to the executable. This both creates a single file executable and doesn't require memory mapping. Would require some tweaking of the build process though.
There is no concept of filesystem in WebAssembly. The memory is just a linear space. There is no memory mapping (https://github.com/WebAssembly/design/blob/master/FAQ.md#what-about-mmap). It is possible to bind external functions into WebAssembly and that's only way to communicate with things outside. That's how fread and other POSIX functions have been supplied by Emscripten compiler.
Here is more detailed description: https://github.com/WebAssembly/design/blob/master/Semantics.md
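Since everything the compiled program sees is linear memory, a resource that is linked into the executable as a byte array can be read without mmap() or a filesystem. The sketch below shows one possible shape for such a reader; `mem_stream`, `mem_open`, `mem_seek` and `mem_read` are hypothetical names for illustration and are not SWI-Prolog's stream API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read-only stream over a byte array that is linked
 * into the executable; none of these names come from SWI-Prolog. */
typedef struct
{ const unsigned char *data;   /* start of the embedded resource */
  size_t               size;   /* total number of bytes */
  size_t               pos;    /* current read position */
} mem_stream;

static void mem_open(mem_stream *s, const void *data, size_t size)
{ assert(s != NULL && data != NULL);
  s->data = data;
  s->size = size;
  s->pos  = 0;
}

/* Reposition to an absolute offset.  Returns 0 on success and -1 if
 * the offset lies beyond the end of the buffer. */
static int mem_seek(mem_stream *s, size_t offset)
{ if ( offset > s->size )
    return -1;
  s->pos = offset;
  return 0;
}

/* Copy up to `len` bytes into `buf`.  Returns the number of bytes
 * copied, which is 0 once the end of the data is reached. */
static size_t mem_read(mem_stream *s, void *buf, size_t len)
{ size_t left = s->size - s->pos;

  if ( len > left )
    len = left;
  memcpy(buf, s->data + s->pos, len);
  s->pos += len;
  return len;
}
```

Repositioning here is plain pointer arithmetic, so it does not depend on how the host environment implements file seeking.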
I will look into compiling a newer version once I figure out how to effectively tweak the configure file.
As a note, 7.7.14 cannot be built with --disable-mt (without threads), as missing mutex support in pl-zip.c gives a compilation error:
In file included from pl-zip.c:40:0:
pl-zip.h:77:3: error: unknown type name 'simpleMutex'
simpleMutex lock; /* basic lock */
Another note, the configure script assumes that compiler creates certain types of object files, and greps it to detect endianness:
configure: error:
Unknown float word ordering. You need to manually preset
ax_cv_c_float_words_bigendian=no (or yes) according to your system.
Configure script: https://github.com/SWI-Prolog/swipl-devel/blob/9fc9a97b6e0e6d4dcaf25a77602d23d89a1c9016/src/ac/ax_c_float_words_bigendian.m4
I have temporarily removed the endianness check by removing code from the m4 file and replacing it with ax_cv_c_float_words_bigendian=no (WebAssembly is little-endian).
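When the compiled target can actually be executed (for example through Node.js), the float word order can also be probed at run time instead of by grepping an object file. This is a sketch of that alternative, not what the m4 macro does; `float_words_bigendian` is a hypothetical function name and the check assumes IEEE 754 doubles.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == 2 * sizeof(uint32_t),
              "this probe expects 64-bit doubles");

/* Returns 1 if the most significant 32-bit word of a double is stored
 * first, 0 if it is stored last, and -1 if the layout is not
 * recognised.  Relies on 1.0 being 0x3FF0000000000000 in IEEE 754. */
static int float_words_bigendian(void)
{ double   d = 1.0;
  uint32_t w[2];

  memcpy(w, &d, sizeof(w));          /* avoids aliasing problems */
  if ( w[0] == 0x3FF00000u && w[1] == 0 )
    return 1;
  if ( w[1] == 0x3FF00000u && w[0] == 0 )
    return 0;
  return -1;
}
```

On a little-endian target such as WebAssembly this is expected to return 0, which corresponds to the ax_cv_c_float_words_bigendian=no preset used above.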
--disable-mt configuration argument will also hide definition of acquire_def2:
In file included from pl-wam.c:226:
./pl-index.c:1869:2: error: implicit declaration of function 'acquire_def2' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
acquire_def2(def, old);
Trying without --disable-mt results in another error (as __linux__ preprocessor def is not set):
pl-thread.c:5986:34: error: no member named 'pid' in 'struct _PL_thread_info_t'; did you mean 'tid'?
if ( (e=get_procps_entry(info->pid)) )
I also have some issues with modules in subdirectories. For example, object code for files in src/os/ and src/minizip/ is put directly under src/ but the linker is not expecting it.
I just pushed a number of patches that make the current git version (swipl-devel.git) compile and pass all tests with --disable-mt. That fixes your acquire_def2 issue and a few more. Please use the git version, so we can stay in sync. To do so:
git clone https://github.com/SWI-Prolog/swipl-devel.git
cd swipl-devel
./prepare
Say no to cloning the modules (yes is ok too, but it just takes long) and choose option 1 for the docs.
Will have a look at the rest this afternoon, but I have an appointment in 15 min. Probably won't make it before ...
P.s. do I understand webassembly does implement threads?
WebAssembly has no threads on VM level. The Emscripten compiler provides pthread implementation using WebWorkers and shared memory. Shared memory is disabled in browsers for now (Spectre mitigation) which means no threads at the moment but I'm sure that one day they come back. More information: https://kripken.github.io/emscripten-site/docs/porting/pthreads.html
Thanks! The git version resolves lots of issues above but at the moment I'm still stuck with wrong object file locations:
/emsdk/emscripten/1.38.3/emcc -shared -o ../lib/x86_64-linux/libswipl.so.7.7.14 -Wl,-soname=libswipl.so.7.7 \
pl-atom.o pl-wam.o pl-arith.o pl-bag.o pl-error.o pl-comp.o pl-zip.o pl-dwim.o pl-ext.o pl-flag.o pl-funct.o pl-gc.o pl-privitf.o pl-list.o pl-string.o pl-load.o pl-modul.o pl-op.o pl-prims.o pl-pro.o pl-proc.o pl-prof.o pl-read.o pl-rec.o pl-setup.o pl-sys.o pl-trace.o pl-util.o pl-wic.o pl-write.o pl-term.o pl-thread.o pl-xterm.o pl-srcfile.o pl-beos.o pl-attvar.o pl-gvar.o pl-btree.o pl-init.o pl-gmp.o pl-segstack.o pl-hash.o pl-version.o pl-codetable.o pl-supervisor.o pl-dbref.o pl-termhash.o pl-variant.o pl-assert.o pl-copyterm.o pl-debug.o pl-cont.o pl-ressymbol.o pl-dict.o pl-trie.o pl-indirect.o pl-tabling.o pl-rsort.o pl-mutex.o minizip/zip.o minizip/unzip.o minizip/ioapi.o os/pl-buffer.o os/pl-ctype.o os/pl-file.o os/pl-files.o os/pl-glob.o os/pl-os.o os/pl-stream.o os/pl-string.o os/pl-table.o os/pl-text.o os/pl-utf8.o os/pl-fmt.o os/pl-dtoa.o os/pl-option.o os/pl-cstack.o os/pl-codelist.o os/pl-prologflag.o os/pl-tai.o os/pl-locale.o libtai/caltime_utc.o libtai/caltime_tai.o libtai/leapsecs_sub.o libtai/leapsecs_add.o libtai/caldate_fmjd.o libtai/caldate_mjd.o libtai/leapsecs_init.o libtai/leapsecs_read.o libtai/tai_pack.o libtai/tai_unpack.o -L/zlib-1.2.11 -Wl,-rpath=/usr/local/lib/swipl-7.7.14/lib/x86_64-linux -lncursesw -lm -lz -lz
ERROR:root:minizip/zip.o: No such file or directory ("minizip/zip.o" was expected to be an input file, based on the commandline arguments provided)
A compiler command (no output argument):
/emsdk/emscripten/1.38.3/emcc -c -I. -I. -I/zlib-1.2.11 -fPIC os/pl-buffer.c
This is for the native version (-o arguments are present here):
gcc -c -I. -I. -g -O2 -pthread -fPIC os/pl-option.c -o os/pl-option.o
I do not know where the discrepancy between the generated Makefiles comes from.
Seems an issue with the Makefiles or configure not picking up things properly. Do you have a simple recipe to get me where you are now? It is probably easier for me to figure it out myself than to send zillions of log files and edited files around trying to guess what is wrong ...
One of my dev machines runs Ubuntu 18.04 and there I have an emscripten package.
The steps to get there (assuming the last git version, adjust paths):
- Loading emscripten env:
  source /emsdk/emsdk_env.sh
- Building zlib wasm version (adjust paths):
  cd zlib-1.2.11
  emconfigure ./configure
  emmake make
The zlib build result will just reside in this directory; there is no install step. The important files should be zlib.h and libz.so (the latter is LLVM IR bitcode).
Build SWI wasm version (adjust zlib path):
- Go to the src directory.
- Replace the contents of ac/ax_c_float_words_bigendian.m4 with ax_cv_c_float_words_bigendian=no.
- Rebuild the configure script with the autoconf command.
- Configure:
  LDFLAGS=-L/zlib-1.2.11 CPPFLAGS=-I/zlib-1.2.11 emconfigure ./configure --disable-gmp --disable-custom-flags --disable-mt
- Make (try 1 - will fail at defatom execution):
  emmake make
- Copy a working defatom from the native build:
  cp /swipl-7.7.14-native/src/defatom defatom
  touch defatom
  chmod +x defatom
- Make (try 2 - will fail at the libswipl linking step):
  emmake make
- Somehow fix object file locations?
- Linking swipl should link both libswipl and libz into it.
  The result of the last step is still LLVM IR bitcode.
- Boot file generation?
  This will likely require a native swipl binary copied from another location or some other magic. Once we have steps 11 and 12 executed, we should have enough code to generate the boot file by running the wasm binary through Node.js. It did work like that with SWI 5.6.
- (Not yet tried) Add .bc extension:
  mv swipl swipl.bc
- (Not yet tried) Compile LLVM IR to WebAssembly:
  emcc swipl.bc -s NODERAWFS=1 -o swipl.html
This last command produces the files swipl.html, swipl.wasm and swipl.js. The HTML file is for execution in a browser, the WASM file contains the actual code and the JS file is a wrapper to execute the wasm blob in a browser or in Node. I have not tried to launch it in a browser yet, but I used this command with 5.6 to generate the boot file: node pl.js --nosignals -o pl.prc -b boot/init.pl. It shows the core itself works, but it cannot do much useful without loading the same boot file.
I see that Ubuntu 18.04 ships an old Emscripten package. It might be a better idea to install it using the official instructions: https://kripken.github.io/emscripten-site/docs/getting_started/downloads.html
It is self-contained and installs into a single directory without spilling its guts all around the system and possibly screwing up the system compilers.
Thanks. I'll give it a try.
Fixed one. It seems emcc doesn't handle the default output location for -c correctly. After configure, edit the Makefile and change the rule around line 198 to the following (added -o $@):
.c.o:
$(CC) -o $@ -c -I. -I$(srcdir) $(CFLAGS) $<
Now we get a binary :) Next step is running it ...
Need to add -lz to the final link command. Normally -lz is a dependency of libswipl.so, but that seems to be ignored. Now get a boot file using
node swipl.js --nosignals -o swipl.prc -b ../boot/init.pl
Running the result fails:
hppc823 (wasm; master) 75_> node swipl.js --nosignals -x swipl.prc
[FATAL ERROR: at Thu May 31 16:36:05 2018
Could not open resource database "swipl.prc": No such device]
exit(2) called, but NO_EXIT_RUNTIME is set, so halting execution but not exiting the runtime or preventing further async execution (build with NO_EXIT_RUNTIME=0, if you want a true shutdown)
One of the pitfalls is that some functions are present, but only as stubs. E.g., it claims to have mmap() at configure time, but you told me it doesn't do memory mapping. I have some experience dealing with boot issues ....
Hmmm. In recent versions swipl.prc is a zip file. On the native version unzip -t works fine, but for the file generated with this version we find that the zip file is invalid. This may indicate libz or the minizip wrapper is not compiled correctly.
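A very rough first check on such a file, short of running unzip -t, is to look for the ZIP end-of-central-directory record, whose signature is the bytes 'P' 'K' 0x05 0x06 and which sits within the last 22 + 65535 bytes of a zip archive. The sketch below does only that; `has_zip_eocd` is a hypothetical helper and finding the signature does not prove the archive is intact.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if an end-of-central-directory signature is found in the
 * tail of the file, 0 if it is not, and -1 on an I/O error.
 * Hypothetical helper; it does not validate the archive contents. */
static int has_zip_eocd(const char *path)
{ static unsigned char tail[22 + 65535];  /* record + max comment */
  long   size, start;
  size_t n, i;
  FILE  *f;

  assert(path != NULL);
  if ( !(f = fopen(path, "rb")) )
    return -1;
  if ( fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0 )
  { fclose(f);
    return -1;
  }
  start = size > (long)sizeof(tail) ? size - (long)sizeof(tail) : 0;
  if ( fseek(f, start, SEEK_SET) != 0 )
  { fclose(f);
    return -1;
  }
  n = fread(tail, 1, sizeof(tail), f);
  fclose(f);

  /* The record is at least 22 bytes long, so scan backwards over
   * every position where a full record could still start. */
  for (i = n; i >= 22; i--)
  { const unsigned char *p = tail + i - 22;

    if ( p[0] == 'P' && p[1] == 'K' && p[2] == 5 && p[3] == 6 )
      return 1;
  }
  return 0;
}
```

Running this natively on a file produced by the WebAssembly build keeps the check itself independent of the I/O layer under suspicion.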
I got it building on my system too.
I'm actually surprised it gets this far. WebAssembly and Emscripten compiler are still very experimental. It's hard to give definite answer about mmap. Seems like some of it is there indeed. Needs more research.
I compiled minigzip (zlib's test app) to WebAssembly and this passes:
/zlib-1.2.11# echo hello world | node minigzip.js | node minigzip.js -d
There could be more things wrong, lots of moving parts with this setup.
Good to know. Might be an issue configuring minizip code that is linked into Prolog. Now documenting my steps and compiling debugging into Prolog (add -DO_DEBUG to COFLAGS).
The fact that it manages the boot compilation is promising: it is already running quite a bit of Prolog to do that!
Tried the swipl.prc from the 32-bit windows version, but that is not the first problem: same errors.
The first problem is indeed that it tries to mmap() the resource file, which doesn't work. I'll have to reactivate some dead code that was there before it was migrated to memory mapping ...
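Because mmap() can be present at configure time and still fail at run time, one defensive pattern is to try the mapping and fall back to plain read() into a heap buffer. This is a generic POSIX sketch under that assumption, not the code path SWI-Prolog uses; `resource`, `load_resource` and `free_resource` are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

typedef struct
{ void  *data;     /* file contents */
  size_t size;     /* number of bytes in data */
  int    mapped;   /* 1 if data came from mmap(), 0 if from malloc() */
} resource;

/* Load a whole file, preferring mmap() and falling back to read().
 * Returns 0 on success and -1 on failure (including an empty file). */
static int load_resource(const char *path, resource *r)
{ struct stat st;
  int fd;

  assert(path != NULL && r != NULL);
  if ( (fd = open(path, O_RDONLY)) < 0 )
    return -1;
  if ( fstat(fd, &st) != 0 || st.st_size <= 0 )
  { close(fd);
    return -1;
  }
  r->size   = (size_t)st.st_size;
  r->data   = mmap(NULL, r->size, PROT_READ, MAP_PRIVATE, fd, 0);
  r->mapped = (r->data != MAP_FAILED);

  if ( !r->mapped )                  /* stub or failing mmap() */
  { size_t done = 0;

    r->data = malloc(r->size);
    while ( r->data && done < r->size )
    { ssize_t n = read(fd, (char *)r->data + done, r->size - done);

      if ( n <= 0 )                  /* error or unexpected EOF */
      { free(r->data);
        r->data = NULL;
      } else
      { done += (size_t)n;
      }
    }
  }
  close(fd);
  return r->data ? 0 : -1;
}

static void free_resource(resource *r)
{ if ( r->mapped )
    munmap(r->data, r->size);
  else
    free(r->data);
}
```

The fallback reads sequentially from the start of the file, so it does not rely on repositioning within the file.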
I compiled minigzip (zlib's test app) to WebAssembly and this passes:
That is not what is used though. This is the compressor. Good that that works. What is used in Prolog is in contrib/minizip. Compiling and testing that results in similar errors so I fear this needs a little debugging ...
There is indeed a bug in Emscripten fseek(). Tested using:
#include <stdio.h>

int
main(int argc, char **argv)
{ FILE *f = fopen("test.data", "wb");
  char data[]   = "0123456789abcdef";
  char data2[]  = "0123456789ABCDEF";
  char zero[16] = {0};

  if ( !f )
    return 1;

  fwrite(data, 1, 16, f);
  fwrite(zero, 1, 16, f);
  fwrite(data, 1, 16, f);
  fseek(f, 16, SEEK_SET);	/* fseek(stream, offset, whence): back to the zero block */
  fwrite(data2, 1, 16, f);	/* should overwrite the zero block */
  fclose(f);

  return 0;
}
Run using
emcc -o t.bc t.c
emcc t.bc -s NODERAWFS=1 -o t.html
node t.js
od -c test.data
0000000 0 1 2 3 4 5 6 7 8 9 a b c d e f
0000020 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 0 1 2 3 4 5 6 7 8 9 a b c d e f
0000060 0 1 2 3 4 5 6 7 8 9 A B C D E F
The zero block should have been overwritten with data2, but instead, data2 is at the end.
Versions (on Ubuntu 18.04):
- node v8.9.1
- emcc 1.38.4
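To avoid comparing od output by eye, the same scenario can be written as a self-checking function that reads the file back and verifies the layout. This is a sketch with a hypothetical function name, `check_fseek_overwrite`; it uses the standard `fseek(stream, offset, whence)` argument order.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write three 16-byte blocks, seek back to offset 16 and overwrite
 * the middle block.  Returns 0 if the file ends up as expected, 1 if
 * its size or content differs, and -1 if an I/O call fails. */
static int check_fseek_overwrite(const char *path)
{ static const char lower[] = "0123456789abcdef";
  static const char upper[] = "0123456789ABCDEF";
  char   zero[16] = {0};
  char   got[64];
  size_t n;
  FILE  *f;

  assert(path != NULL);
  if ( !(f = fopen(path, "wb")) )
    return -1;
  fwrite(lower, 1, 16, f);
  fwrite(zero,  1, 16, f);
  fwrite(lower, 1, 16, f);
  if ( fseek(f, 16, SEEK_SET) != 0 )
  { fclose(f);
    return -1;
  }
  fwrite(upper, 1, 16, f);
  if ( fclose(f) != 0 )
    return -1;

  if ( !(f = fopen(path, "rb")) )
    return -1;
  n = fread(got, 1, sizeof(got), f);
  fclose(f);

  if ( n != 48 )                     /* 64 means the block was appended */
    return 1;
  if ( memcmp(got,      lower, 16) != 0 ||
       memcmp(got + 16, upper, 16) != 0 ||
       memcmp(got + 32, lower, 16) != 0 )
    return 1;
  return 0;
}
```

A return value of 1 with a 64-byte file would correspond to the od output shown above, where the upper-case block ends up at the end.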
Thanks! This will likely also explain missing fseek behavior with 5.6. I'm going to try to make Emscripten fseek behave correctly.
Yip. Both minizip and the old 5.6 resource management library use fseek(). The latter only if mmap() is not provided. That also holds for the current SWI-Prolog resource manager. We can hack around though in several ways:
- It seems repositioning at the block device API level (open/read/write/lseek) is fine. This should imply that SWI-Prolog's own stream I/O functions should work fine. We can make the minizip embedding using these (the functions used are determined by struct holding function pointers).
- Embedding the resources in a string, which is what we eventually want, should work fine as all the repositioning for that is done by Prolog's I/O. This only leaves creating the resource DB. We can hack around that, but it is not great.
Seems you suggest you can get this fixed in Emscripten? How quickly would this be distributed?
It seems repositioning at the block device API level (open/read/write/lseek) is fine. This should imply that SWI-Prolog's own stream I/O functions should work fine. We can make the minizip embedding using these (the functions used are determined by struct holding function pointers).
Updated bug report. This was already used. I mixed up the test case. lseek()/read()/write() is the thing that is broken. The second workaround gets really dirty, so I propose to see whether Emscripten can be fixed.
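A test at the file-descriptor level, exercising exactly the lseek()/read()/write() combination, could look like the sketch below. The function name `check_lseek_overwrite` is hypothetical and this is not the test case attached to the bug report.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write 12 bytes, seek back to offset 4, overwrite 4 bytes and read
 * the file back.  Returns 0 if the overwrite landed in place, 1 if
 * the size or content differs, and -1 if a system call fails or
 * reports an unexpected position. */
static int check_lseek_overwrite(const char *path)
{ char    got[16];
  ssize_t n;
  int     fd;

  assert(path != NULL);
  if ( (fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600)) < 0 )
    return -1;
  if ( write(fd, "aaaabbbbcccc", 12) != 12 ||
       lseek(fd, 4, SEEK_SET) != 4 ||
       write(fd, "XXXX", 4) != 4 ||
       lseek(fd, 0, SEEK_SET) != 0 )
  { close(fd);
    return -1;
  }
  n = read(fd, got, sizeof(got));
  close(fd);

  if ( n != 12 )                     /* a longer file means an append */
    return 1;
  return memcmp(got, "aaaaXXXXcccc", 12) == 0 ? 0 : 1;
}
```

Compiling the same file natively and with emcc, then comparing the return values, would show whether the repositioning is honoured by the runtime.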