clasp icon indicating copy to clipboard operation
clasp copied to clipboard

bytecode-to-BIR incompatibilities

Open Bike opened this issue 1 year ago • 0 comments

The following needs to be fixed/done to make BTB compilation fully transparent, in the sense of a bytecoded f and its compiled (compile nil f) having the same semantics:

  • [x] load-time-value identify is preserved
  • [x] closure ordering is identical
  • [ ] NLX from compiled code into bytecode

Currently auto-BTB-compilation is only done for globally named functions, which excludes almost all closures. Therefore only the load-time-value thing can really come into play. But it would be good to make BTB really bulletproof by doing all of it.

To fix:

  • load-time-value forms that aren't read-only-p need to be indicated in bytecode somehow, I guess with an annotation (probably spanning the whole module, so that's kind of dumb, but oh well). Then BTB needs to notice that annotation and make a cleavir-bir:load-time-value-reference instead of a constant reference. Currently we always make a constant reference, so type inference applies and we miscompile. It may also be worth noting in the same area that the way we get names for function/variable cells is mildly awkward.
  • When we compile a closure, we get a new simple fun but keep the closure object, so we need the simple fun to use the closure indexing previously laid out by the bytecode compiler. clasp-cleavir lays out closures by using cleavir-set:set-to-list which has nondeterministic order when we use hash sets, and in any case will be an arbitrary order. Probably we need a dynamic variable or whatever to let us impose an order. Cleavir may also sometimes be able to eliminate a closure variable, and in this case we need to skip over that index (and possibly clear the data from the closure to prevent a memory leak?). We also need Cleavir to never introduce new closure variables, but I don't think it does now or that it ever will need to.
  • This one's pretty tricky and I think we'd need some kind of special handling, but it's not as impossible as it might sound. We use setjmp/longjmp and pass multiple values through the thread-local MV for both bytecode and native, so as long as the jmp_buf is the right one it should go okay. The only thing we'd need to do is set the vm._pc to the exit point, from the compiled code. That's new but I don't see any reason why we couldn't. We could provide a cc_sjlj_bytecode_unwind or something that gets the exit point (which is a constant) as an extra argument.

I don't think any of this requires Cleavir changes, so that's nice.

Bike avatar Jan 09 '24 17:01 Bike