ChezScheme

Port portable bytecode from Racket fork

Open lambdadog opened this issue 1 year ago • 2 comments

The Racket fork has implemented a pb (portable bytecode) target which allows running Chez, albeit slowly, on even entirely unsupported systems.

Past the obvious benefits of Chez running anywhere, this allows for bootstrapping without the numerous bootfiles that are tracked in-repo currently (solving #203), which improves the packaging story for a number of OSes and architectures.

lambdadog avatar Jul 17 '22 23:07 lambdadog

@mflatt would you be able to point to what work would be needed for this?

lambdadog avatar Jul 17 '22 23:07 lambdadog

The two main pieces for pb in the Racket branch are "pb.ss", which is a backend like "x86_64.ss" or "arm32.ss", and "pb.c"/"pb.h", which is the interpreter loop that should drop easily into the kernel. The instruction layout is defined by "cmacro.ss" so that it's shared between "pb.ss" and "pb.c"/"pb.h".
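For readers unfamiliar with the shape of such an interpreter loop, here is a minimal sketch in C. It is not code from "pb.c": the opcodes, the `INSTR` macros, the field layout, and the `run` function are all hypothetical and chosen only to show how one shared instruction layout can serve both the code that emits instructions and the loop that decodes them.

```c
#include <stdint.h>

/* HYPOTHETICAL layout, not the real pb encoding:
   bits 0-7 opcode, 8-15 dest register, 16-23 field a, 24-31 field b.
   In Chez Scheme the real layout comes from "cmacro.ss". */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2, OP_MUL = 3 };

#define INSTR(op, d, a, b) \
    ((uint32_t)(op) | ((uint32_t)(d) << 8) | ((uint32_t)(a) << 16) | ((uint32_t)(b) << 24))
#define INSTR_OP(i) ((i) & 0xFFu)
#define INSTR_D(i)  (((i) >> 8) & 0xFFu)
#define INSTR_A(i)  (((i) >> 16) & 0xFFu)
#define INSTR_B(i)  (((i) >> 24) & 0xFFu)

/* Fetch, decode, and dispatch until OP_HALT (or an unknown opcode),
   then return the contents of register 0. */
static int64_t run(const uint32_t *code) {
    int64_t regs[16] = {0};
    for (const uint32_t *ip = code;; ip++) {
        uint32_t i = *ip;
        unsigned d = INSTR_D(i) & 15u;   /* mask keeps indices inside regs[] */
        unsigned a = INSTR_A(i) & 15u;
        unsigned b = INSTR_B(i) & 15u;
        switch (INSTR_OP(i)) {
        case OP_LOADI: regs[d] = (int64_t)INSTR_A(i); break; /* field a is an 8-bit immediate */
        case OP_ADD:   regs[d] = regs[a] + regs[b];   break;
        case OP_MUL:   regs[d] = regs[a] * regs[b];   break;
        case OP_HALT:
        default:       return regs[0];
        }
    }
}
```

A backend in this toy setup would emit words with `INSTR(...)`, and the loop would decode them with the matching accessors, which is the reason for keeping the layout in one shared place.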

One obstacle to using "pb.ss" from the Racket branch is that backends have a somewhat different interface to support locally unboxed flonum arithmetic and allocation of floating-point registers, and some code that was duplicated across backends has been moved into "cpnanopass.ss". I don't think it would be too difficult to adjust the backend, though. Naturally, some other files like "platform.h" and the makefiles need to be adjusted to glue everything together.

That much would support things like pb64l or pb32b, where the compile-time word size and endianness match the host architecture. To support a single set of pb boot files that work everywhere, the biggest complication is endianness. Using a 64-bit word size everywhere works ok (running on a 32-bit machine will just have a lot of "words" that are half 0) as long as the kernel implementation is changed to have more explicit casts between words and pointers. But trying to interpret as, say, little-endian on a big-endian machine creates all sorts of trouble. To address that problem, the Racket branch of Chez Scheme introduces an "unknown" compile-time endianness, introduces a few primitives and some FFI support around "native" versus "swapped", and makes some operations fall back to dynamic choice in the uncommon case where that's needed. Those changes are more pervasive than plugging in a new backend and an interpreter loop.
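To make the "native" versus "swapped" idea concrete, here is a small C sketch of choosing the byte order dynamically. It is an illustration under assumptions, not the Racket branch's implementation: `host_is_little_endian`, `swap32`, and `load_u32` are hypothetical helper names, and the real changes involve Scheme-level primitives and FFI support rather than a single load function.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: detect the host byte order at run time,
   since an "unknown"-endianness build cannot fix it at compile time. */
static int host_is_little_endian(void) {
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);   /* look at the lowest-addressed byte */
    return first == 1;
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8)  | ((x & 0xFF000000u) >> 24);
}

/* Read a 32-bit value stored at p in the stated byte order.
   The "native" path is a plain load; the "swapped" path is taken
   only when the stored order differs from the host's. */
static uint32_t load_u32(const uint8_t *p, int stored_little_endian) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return (!!stored_little_endian == host_is_little_endian()) ? v : swap32(v);
}
```

The run-time check costs a little on every access, which is presumably why it is reserved for the uncommon case and the native path is used wherever the order is known.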

mflatt avatar Jul 18 '22 12:07 mflatt

@mflatt This seems resolved now. Is portable bytecode documented somewhere?

tisonkun avatar Nov 16 '23 09:11 tisonkun

Looks resolved to me, yes.

lambdadog avatar Nov 16 '23 12:11 lambdadog

@tisonkun There's some general documentation on pb at https://github.com/cisco/ChezScheme/blob/main/IMPLEMENTATION.md#portable-bytecode, but there's no documentation about pb instructions other than comments in the backend at https://github.com/cisco/ChezScheme/blob/main/s/pb.ss.

mflatt avatar Nov 16 '23 13:11 mflatt