rhombus-prototype
Establish boundary between rhombus/racket code and platform code
To make it easier to implement Racket/Rhombus in other environments (different compilers, WASM, JS, etc.), all platform code (i.e. code that calls the FFI or Chez Scheme builtins) should be explicitly called out. Implementing a Racket runtime in a new environment would then only require implementing these builtins plus support for fully expanded Racket/Rhombus programs.
Additionally, once this is set up, adding new builtins to the core language should be discouraged, and reusing existing builtins or implementing features in pure Rhombus should be strongly encouraged.
Isn't this already the case today? The linklet layer acts as such a boundary. My understanding is that different implementations of the Racket VM (such as Racket CS and Pycket) only implement the stuff below the linklet layer, and everything above that layer (including the macro system) is regular Racket code that bootstraps itself on top of the linklet layer.
From what I understand, it is not well defined what is native code for linklets and what is pure Racket. For example, last time I checked, hash tables were implemented in C for mainline Racket. This sort of thing was probably necessary for performance, but the boundary between what is written in Racket and what is written in the VM as a primitive or C extension should be carefully documented and tracked.
If hash tables are implemented in C, this is an important detail for someone trying to write a new runtime for Racket.
> From what I understand, it is not well defined what is native code for linklets and what is pure Racket.
I think this depends on what you mean by "well defined." With the caveat that I don't actually work at this level, my understanding is that no shadowing is allowed in linklets, and an identifier is either:
- One of the primitive syntactic forms documented here, like lambda;
- A variable imported from another linklet;
- A variable defined in this linklet;
- A local variable (e.g. bound by letrec-values); or
- A reference to a primitive.
So, if you have a linklet, any identifier that is not a syntactic form and has no immediately visible definition (imported, defined in the linklet, or locally bound) is a primitive implemented at the VM level. It is unambiguous whether an identifier refers to a primitive or not, which is one sense of "well defined."
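To make that classification rule concrete, here is a rough sketch in Python (not Racket, and not how any Racket implementation actually does it). It assumes a linklet is encoded as nested Python lists of the shape `["linklet", imports, exports, body...]`, with identifiers as strings and literals as numbers. It only handles a small subset of the core forms, and the function name `primitive_refs` and the example linklet are made up for illustration.

```python
def primitive_refs(linklet):
    """Return the identifiers in `linklet` that can only be VM primitives.

    Assumed encoding: ["linklet", [[import ...] ...], [export ...], form ...],
    identifiers are Python strings, literals are numbers.
    Only a simplified subset of the core forms is handled.
    """
    tag, imports, _exports, *body = linklet
    assert tag == "linklet"
    # Variables imported from other linklets (one list per import group).
    imported = {name for group in imports for name in group}
    # Variables defined at the top level of this linklet.
    defined = {name
               for form in body
               if isinstance(form, list) and form and form[0] == "define-values"
               for name in form[1]}
    prims = set()

    def walk(expr, local):
        if isinstance(expr, str):
            # No shadowing: an identifier with no visible binding is a primitive.
            if expr not in local and expr not in defined and expr not in imported:
                prims.add(expr)
            return
        if not isinstance(expr, list) or not expr:
            return  # numeric literal or empty form
        head = expr[0]
        if head == "quote":
            return  # quoted data contains no variable references
        if head == "lambda":
            inner = local | set(expr[1])  # parameters become local variables
            for e in expr[2:]:
                walk(e, inner)
        elif head in ("let-values", "letrec-values"):
            bound = {name for ids, _rhs in expr[1] for name in ids}
            # letrec-values bindings are visible in their own right-hand sides.
            rhs_scope = local | bound if head == "letrec-values" else local
            for _ids, rhs in expr[1]:
                walk(rhs, rhs_scope)
            for e in expr[2:]:
                walk(e, local | bound)
        elif head == "define-values":
            walk(expr[2], local)
        elif head in ("if", "begin"):
            for e in expr[1:]:
                walk(e, local)
        else:
            for e in expr:  # function application: operator and operands
                walk(e, local)

    for form in body:
        walk(form, frozenset())
    return prims


# Hypothetical linklet: imports `helper`, defines and exports `f`.
example = ["linklet", [["helper"]], ["f"],
           ["define-values", ["f"],
            ["lambda", ["x"],
             ["letrec-values",
              [[["loop"], ["lambda", ["n"],
                           ["if", ["zero?", "n"],
                            ["helper", "x"],
                            ["loop", ["sub1", "n"]]]]]],
              ["loop", 10]]]]]

print(sorted(primitive_refs(example)))  # ['sub1', 'zero?']
```

In the example, `helper` is imported, `f` is defined, and `x`, `n`, and `loop` are local, so the only names left over are `zero?` and `sub1`, which must come from the VM.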
On the other hand, you're right that there isn't a documented, stable list of all the primitives, which would be another useful sense of "well defined." Flatt et al. describe the set of primitives as "large": about 1500.
I think the question, then, is whether it would be better, as @slaymaker1907 seems to suggest, to reduce the number of primitives, document what they are, and adopt a policy that "adding in additional builtins in the core language should be discouraged."
Mostly, I think this is an interesting question but ultimately a separate one from the design of a Rhombus language. I'm going to write a few thoughts about it anyway.
On the one hand, I think the benefits of having a small, stable, well-defined set of primitives are clear, clear enough that I'm not going to say much more about them. On the other hand, I think there would also be costs, and I'm not persuaded that the benefits would justify the costs.
> If hash tables are implemented in C, this is an important detail for someone trying to write a new runtime for Racket.
Even with the current state of affairs, I'm not convinced that the above sentence is true. To take regular expressions as an example, today, the legacy Racket VM implements them in C, but Racket on Chez implements them in pure Racket. As part of bootstrapping, the Racket implementation of regular expressions is compiled, and its exports become "primitive" from the perspective of normal Racket programs. In general, a Racket implementation seems to be free to implement linklet-level primitives however it wants. On the other hand, restricting the set of primitives seems like it might effectively ban a Racket implementation from using an optimized representation for speed, memory efficiency, interoperability with the host platform, or some other desirable feature.
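As a toy illustration of that point (in Python, with entirely hypothetical names, and far simpler than real regular expressions), a "primitive" can be thought of as a name in a table that each VM fills in however it likes. One VM backs the name with host-provided code, another with a portable implementation, and code above the boundary cannot tell the difference:

```python
import re


def host_prefix_match(pattern, text):
    """Stand-in for a primitive backed by the host platform's own library."""
    return re.match(re.escape(pattern), text) is not None


def portable_prefix_match(pattern, text):
    """Stand-in for the same primitive written without host library support."""
    if len(pattern) > len(text):
        return False
    return all(p == t for p, t in zip(pattern, text))


# Two hypothetical VMs expose the same primitive name with different backing.
VM_A_PRIMITIVES = {"prefix-match?": host_prefix_match}
VM_B_PRIMITIVES = {"prefix-match?": portable_prefix_match}


def program(primitives):
    """Code above the boundary: it only looks the primitive up by name."""
    return primitives["prefix-match?"]("rack", "racket")


print(program(VM_A_PRIMITIVES), program(VM_B_PRIMITIVES))  # True True
```

A fixed, minimal list of primitives would pin down which names appear in the table, but, as argued above, it could also push an implementation toward the second style even where the first would be faster or interoperate better with the host.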
I also think that, in most circumstances, it would be better to implement a new backend for an existing Racket implementation than to create a new implementation of the Racket VM. To take WASM as an example I know at least a little about, the most likely approach would probably be to implement a WASM backend for the Chez Scheme compiler, with Racket hopefully coming along mostly for free. (There are some potential pitfalls and things it might be worth waiting for WASM to implement, like tail calls and host GC. I think I remember @mflatt having posted something about how this might work, but I can't find it.)