Cthulhu.jl

Define a low-level interface to support non-standard compilation

Open serenity4 opened this issue 6 months ago • 1 comment

I would like to share a rough design that I think may fit the upcoming refactors, and that would allow non-AbstractInterpreter compilation pipelines such as DAECompiler to integrate well with Cthulhu. I'd like to see if anyone has objections or concerns before I go ahead with a prototype.

Cthulhu currently requires an AbstractInterpreter and expects it to go through a standard compilation pipeline with proper hooks (caching InferenceResult, remarks, etc). This will not work well with other compilation pipelines, such as DAECompiler, that may still benefit from an integration with Cthulhu. In DAECompiler, for example, we start with a MethodInstance that we infer and optimize, then extract the optimized IR from the resulting CodeInstance. We then copy that IR and rewrite parts of it before running IR interpretation on it, and eventually save the IR in a CodeInstance that is different from the post-inference one. There may be other IRs in DAECompiler that would be interesting to look at, which undergo deeper transformations or may be complete rewrites. From the perspective of Cthulhu, however, we start with a MethodInstance (and a bunch of settings) and emit a CodeInstance that we can't assume was obtained through a standard compilation pipeline with an AbstractInterpreter.

Ultimately, one should be able to provide the relevant pieces of information that Cthulhu needs, even if optional information such as effects is missing: we do have a basic CodeInstance with inferred code that we should be able to display (for example, showing custom remarks in addition to the IR itself may already be very helpful).

Therefore, I believe it would be helpful to have a low-level interface that allows any user implementation to retrieve the bits of information that Cthulhu needs to run its introspection and display systems. Then, the current interface (based on AbstractInterpreter) can sit on top of that low-level interface. This would further decouple Cthulhu from Compiler, as we would have the low-level interface in one part, then the "high"-level interface with AbstractInterpreter/CthulhuInterpreter in another.

The low-level interface would be marked as experimental, to accommodate possibly breaking changes as the Julia compiler evolves. Non-exhaustively, the interface would consist of a set of generic functions to:

  • Map a signature tuple to a MethodInstance (in case method matching is customized via overlay method tables).
  • Map a MethodInstance to an inferred[^1] CodeInstance.
  • Map a MethodInstance + CodeInstance to a CodeInfo/IRCode (as the .inferred field of a CodeInstance may not be an instance of one of these types).
  • Retrieve effects, remarks, and other code metadata.

I believe that conversion to the various code representations (source, lowered, llvm, native) is based on MethodInstance/CodeInfo already, so the low-level interface would mostly cover the generation of typed IR given a MethodInstance/signature tuple.
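To make the shape of such an interface concrete, here is a minimal sketch. Every name in it (`find_instance`, `inferred_code`, `typed_ir`, `get_effects`, `get_remarks`, `TableProvider`, `ToyInstance`, `ToyCode`, `register!`, `describe`) is hypothetical and does not exist in Cthulhu.jl. To stay self-contained, the toy provider uses stand-in handle types instead of real `MethodInstance`/`CodeInstance` objects and fills a lookup table with `code_typed`, so it only mimics a pipeline that produces its results outside of an AbstractInterpreter.

```julia
# Hypothetical low-level interface; none of these names exist in Cthulhu.jl.
# Required functions, declared without methods:
function find_instance end   # signature tuple type -> instance handle (or nothing)
function inferred_code end   # instance handle -> inferred code handle
function typed_ir end        # instance + code handle -> CodeInfo/IRCode

# Optional metadata, with fallbacks meaning "not available".
get_effects(provider, instance, code) = nothing
get_remarks(provider, instance, code) = String[]

# Stand-ins for MethodInstance and CodeInstance (assumption: a real
# provider would return the Core types instead).
struct ToyInstance
    sig::Type
end
struct ToyCode
    ir::Core.CodeInfo
    rettype::Any
    remarks::Vector{String}
end

# A provider that is not an AbstractInterpreter: it serves precomputed results.
struct TableProvider
    table::Dict{Type,ToyCode}
end
TableProvider() = TableProvider(Dict{Type,ToyCode}())

# Record typed code for `f(argtypes...)` under its signature tuple type.
function register!(p::TableProvider, f, argtypes::Tuple; remarks=String[])
    ir, rettype = only(code_typed(f, argtypes; optimize=true))
    sig = Tuple{typeof(f), argtypes...}
    p.table[sig] = ToyCode(ir, rettype, remarks)
    return sig
end

find_instance(p::TableProvider, sig::Type) =
    haskey(p.table, sig) ? ToyInstance(sig) : nothing
inferred_code(p::TableProvider, mi::ToyInstance) = p.table[mi.sig]
typed_ir(::TableProvider, ::ToyInstance, code::ToyCode) = code.ir
get_remarks(::TableProvider, ::ToyInstance, code::ToyCode) = code.remarks

# What a display layer would do: it only talks to the generic functions.
function describe(provider, sig::Type)
    mi = find_instance(provider, sig)
    mi === nothing && return nothing
    code = inferred_code(provider, mi)
    return (ir = typed_ir(provider, mi, code),
            effects = get_effects(provider, mi, code),
            remarks = get_remarks(provider, mi, code))
end

double(x) = 2x
provider = TableProvider()
sig = register!(provider, double, (Int,); remarks=["rewritten by toy pipeline"])
info = describe(provider, sig)
```

In this sketch the provider never defines `get_effects`, so `info.effects` falls back to `nothing` while the IR and the custom remarks are still available, which matches the idea that optional information may be missing.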

Any feedback would be highly appreciated. (cc @Keno, @aviatesk)

[^1]: We could make explicit in the interface whether an implementation supports disabling or enabling optimization, for the cases where we don't have both the pre- and post-optimization versions of the IR available.

serenity4, Jun 20 '25 03:06

I think this refactoring approach is a really good idea overall. I've wanted descend(...; interp=MyInterpreter()) to be truly functional for years, but I still haven't gotten around to it. I'd be really happy if you could do it.

I mostly agree with the basic design of the interface. What I had in mind was to define a type like abstract type CthulhuInfoProvider <: AbstractInterpreter end and provide default interface implementations for CthulhuInfoProvider. That way, you could get basic integration just by having struct MyInterpreter <: CthulhuInfoProvider. But I'm not sure yet if this approach is the best. I also think it might be fine to provide the interface purely as generic functions, without requiring type inheritance. We could still provide a set of default implementations for the interface.
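The two options can be compared side by side in a small sketch. It assumes that the supertype meant above is `Core.Compiler.AbstractInterpreter`; the functions `provider_remarks` and `supports_optimization_toggle` and the type `CustomPipeline` are hypothetical names used only for illustration.

```julia
# Assumption: the AbstractInterpreter in question is the one in Core.Compiler.
const CC = Core.Compiler

# Inheritance-free option: untyped fallbacks act as defaults for any provider.
provider_remarks(provider) = String[]
supports_optimization_toggle(provider) = false

# Inheritance option: a convenience supertype with more specific defaults.
abstract type CthulhuInfoProvider <: CC.AbstractInterpreter end
supports_optimization_toggle(::CthulhuInfoProvider) = true

# Basic integration just by subtyping; every default applies.
struct MyInterpreter <: CthulhuInfoProvider end

# A pipeline that is not an AbstractInterpreter overrides only what it has.
struct CustomPipeline end
provider_remarks(::CustomPipeline) = ["IR rewritten after inference"]
```

With this layout both styles coexist: `MyInterpreter` picks up the defaults attached to the abstract type, while `CustomPipeline` participates through plain method definitions and falls back to the untyped defaults for everything else.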

aviatesk, Jun 24 '25 11:06