ir Question about copy-and-patch technique

Recently, the Python core developers presented an experimental JIT compiler in version 3.13. The work is inspired by Deegen, the next-generation JIT compiler for Lua implemented as part of the LuaJIT Remake project. The features that differ from DynASM are:

No need to have any assembly knowledge.
No need to manually engineer the JIT.
No need to manually keep the JIT updated with new language features.

The "Copy-and-Patch Compilation" paper's authors showed very low startup time overhead and impressive compiled code performance compared with other popular JITs.
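For readers unfamiliar with the technique, here is a rough sketch of the idea as I understand it. It is only a toy: a real copy-and-patch JIT uses stencils that are compiled ahead of time into native machine code, while this example uses a made-up stack machine so that it can run anywhere. The opcode values, the `STENCILS` table, and the `compile_program` and `run` helpers are all hypothetical names invented for this illustration.

```python
import struct

# Toy "machine code" for an imaginary stack machine (an assumption for
# illustration; a real JIT would hold native instructions here).
PUSH, ADD, MUL, RET = 0x01, 0x02, 0x03, 0xFF
HOLE = b"\x00\x00\x00\x00"  # 4-byte placeholder that is patched later

# One prebuilt stencil per source-level operation:
# (template bytes, offset of the hole inside the template or None).
STENCILS = {
    "LOAD_CONST": (bytes([PUSH]) + HOLE, 1),
    "ADD": (bytes([ADD]), None),
    "MUL": (bytes([MUL]), None),
    "RETURN": (bytes([RET]), None),
}


def compile_program(bytecode):
    """Copy each stencil into the output buffer, then patch its hole."""
    code = bytearray()
    for op, arg in bytecode:
        template, hole = STENCILS[op]
        start = len(code)
        code += template  # copy step
        if hole is not None:
            # patch step: write the operand into the placeholder
            struct.pack_into("<i", code, start + hole, arg)
    return bytes(code)


def run(code):
    """Tiny executor standing in for the CPU running the emitted code."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == PUSH:
            (value,) = struct.unpack_from("<i", code, pc + 1)
            stack.append(value)
            pc += 5
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#x} at offset {pc}")


# (2 + 3) * 4
program = [
    ("LOAD_CONST", 2),
    ("LOAD_CONST", 3),
    ("ADD", None),
    ("LOAD_CONST", 4),
    ("MUL", None),
    ("RETURN", None),
]
code = compile_program(program)
result = run(code)
```

Note that each stencil is copied without looking at its neighbours, so in this sketch every intermediate value goes through the stack. That keeps compilation very cheap, but nothing is optimized across operations.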

Is there a possibility that PHP's JIT can benefit from the copy-and-patch technique?

Sep 30 '24 18:09 dbalabka

The IR framework is similar to V8 Turbofan. It should produce code of similar quality but do this a few times faster.
PHP uses a tracing JIT compiler that performs register allocation and optimizations across the code of different opcode handlers. A simple template technique can't do this.
The PHP VM interpreter is much faster than CPython. A simple JIT may cause a slowdown instead of a speedup because of JIT code bloat.

The "Copy-and-Patch" technique is going to be a step backward. However, if someone creates a PoC, I would be glad to review it and may change my mind.

Oct 01 '24 07:10 dstogov

@dstogov, thanks a lot for your quick answer. Everything makes sense.

The IR framework is similar to V8 Turbofan. It should produce code of similar quality but do this a few times faster.

It would be interesting to compare IR with other JIT compilers; however, I don't see how it can be implemented easily. According to the "Copy-and-Patch Compilation" paper, the authors implemented a WebAssembly compiler, WasmNow, using PochiVM. Therefore, it would be necessary to implement WebAssembly compilation using the IR framework to perform a comparison with V8 Turbofan and other JITs. I believe that IR can outperform other compilers.

PHP uses tracing JIT compiler that performs register allocation and optimizations between code of different opcode handlers. A simple template technique can't do this.

Thanks for the clarification.

PHP VM interpreter is much faster than CPython. A simple JIT may make slowdown instead of speedup because of JIT code bloat.

Agreed: CPython can benefit from a simple JIT more than PHP can, particularly in data processing and ML use cases.

The "Copy-and-Patch" technique is going to be a step backward. However, if someone creates a PoC, I would be glad to review it and may change my mind.

In my opinion, investing our precious time in the IR benchmarking that I mentioned above would be much more valuable. It can show the strengths and weaknesses of the IR framework's approach.

Oct 01 '24 08:10 dbalabka

It would be interesting to compare IR with other JIT compilers; however, I don't see how it can be implemented easily. According to the "Copy-and-Patch Compilation" paper, the authors implemented a WebAssembly compiler, WasmNow, using PochiVM. Therefore, it would be necessary to implement WebAssembly compilation using the IR framework to perform a comparison with V8 Turbofan and other JITs. I believe that IR can outperform other compilers.

It's not possible to make an apples-to-apples comparison between JIT engines for different languages with different inputs. IR can load and execute LLVM files (not all LLVM features are supported), so at least it can be compared with C compilers on some benchmarks. E.g. you may compile minilua.c with Clang and then execute it with IR.

clang -Wno-everything -O -Xclang -disable-llvm-passes -w -c -emit-llvm -o tmp.bc minilua.c && opt tmp.bc --passes='function(mem2reg)' -S -o tmp.ll
ir --llvm-asm tmp.ll --run <minilua-args>

As "Copy-and-Patch" uses LLVM internally, it should be possible to use IR as a back-end. Of course, this is not going to be a trivial task, and I wouldn't invest in it just for benchmarking.

Oct 01 '24 08:10 dstogov