js_of_ocaml

Separate compilation: generate and use cross module information regarding function arity

Open vouillon opened this issue 9 years ago • 8 comments

We could greatly improve the code generated when compiling cmo files if we knew the arity of the functions exported by each module. This information could be stored in a separate file for each module compiled when the --dynlink option is used, or when compiling cmo and cma files.
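A minimal sketch of what such per-module information could look like, and how a call site might use it. Everything here is hypothetical: the `arity_info` type, the `classify` function and the example table are illustrative names, not part of js_of_ocaml. The assumption is that a call whose argument count matches the known arity can be compiled to a direct call, while any other call has to go through a generic helper that checks the arity at run time.

```ocaml
(* Hypothetical per-module summary: exported function name -> arity.
   None of these names exist in js_of_ocaml; they are for illustration. *)
module StringMap = Map.Make (String)

type arity_info = int StringMap.t

type call_kind =
  | Direct   (* exact application: the function can be called directly *)
  | Generic  (* unknown or mismatched arity: use a generic call helper *)

(* Decide how to compile a call to [name] with [nargs] arguments. *)
let classify (info : arity_info) (name : string) (nargs : int) : call_kind =
  match StringMap.find_opt name info with
  | Some arity when arity = nargs -> Direct
  | Some _ | None -> Generic

(* Example summary for a hypothetical module exporting [map] and [length]. *)
let list_info : arity_info =
  StringMap.empty |> StringMap.add "map" 2 |> StringMap.add "length" 1
```

With this table, `classify list_info "map" 2` is `Direct`, while a partial application such as `classify list_info "map" 1`, or a call to a function missing from the table, falls back to `Generic`.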

vouillon avatar Dec 06 '16 09:12 vouillon

:+1: Open question: should we share this format with bucklescript https://github.com/bloomberg/bucklescript/blob/master/jscomp/core/js_cmj_format.mli

hhugo avatar Dec 06 '16 11:12 hhugo

There is no reason to share the same format, as the generated code is not compatible.

vouillon avatar Dec 06 '16 12:12 vouillon

@vouillon, any idea how to distinguish immutable (module) blocks from other, possibly mutable ones? We could limit the optimisation to "compilation unit" modules and ignore sub-modules, but that would be a bit sad.
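To illustrate the difficulty, here is a small sketch using `Obj` (for inspection only, not something to rely on in real code). The module `M`, its signature `S` and the record type `r` are made up for the example. It shows that a module block, a record with mutable fields and a `ref` cell all carry the same tag at run time, so the representation alone does not say which blocks are immutable.

```ocaml
module type S = sig
  val x : int
  val f : int -> int
end

module M : S = struct
  let x = 1
  let f n = n + x
end

type r = { mutable a : int; mutable b : int }

(* A module is represented as a plain block, like a record or a ref. *)
let module_tag = Obj.tag (Obj.repr (module M : S))
let record_tag = Obj.tag (Obj.repr { a = 1; b = 2 })
let ref_tag = Obj.tag (Obj.repr (ref 0))

(* All three values are ordinary tag-0 blocks. *)
let same_tag = module_tag = record_tag && record_tag = ref_tag
```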

hhugo avatar Mar 09 '24 08:03 hhugo

One can get some type information from the debugging events.

vouillon avatar Mar 12 '24 10:03 vouillon

I took a look at debug events. I think they are not enough. We can recover some module info when the module appears on the ce_stack and the corresponding Env_module appears in the env summary, but:

  • submodules that are constrained/coerced by an mli don't seem to appear on ce_stack
  • toplevel modules / compilation units don't appear on ce_stack. (Not a real issue as we can look for SETGLOBAL to spot them).

It would be convenient to retrieve information about immutable blocks. I wonder if we can store that information inside debug events or if we need some other mechanism.

hhugo avatar Apr 05 '24 15:04 hhugo

We can get some information from the cmi files. But I'm not sure we want to go this way. (One may have to expand some module types, and thus load cmi files corresponding to other toplevel modules.)

The information in ce_stack could still be useful internally to improve our analyses.

Maybe the module Bytegen could emit some information about block mutability, in a way similar to how debug events are produced. There is some more information in the lambda code which is lost when generating the bytecode. It might make sense to preserve it as well. For instance, int32/int64 comparisons are translated into generic comparisons. Or we lose any information on the kind and layout of bigarrays.
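As a source-level illustration of the comparison case (a sketch with made-up function names, not code from the compiler): in the first function below the type checker knows that both arguments are `int32`, so the lambda code can use a comparison specialised to `int32`. As described above, that specialisation is not visible in the bytecode, where the call looks like the generic comparison used by the second function.

```ocaml
(* Arguments are known to be int32: lambda can specialise the comparison. *)
let lt_int32 (a : int32) (b : int32) : bool = a < b

(* Fully polymorphic: only the generic structural comparison applies. *)
let lt_generic a b = a < b

(* Same test written with Int32.compare from the standard library. *)
let lt_explicit (a : int32) (b : int32) : bool = Int32.compare a b < 0
```

All three functions return the same result on `int32` values; the difference is only in how much type information a later compilation stage can still see.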

vouillon avatar Apr 05 '24 16:04 vouillon

@gasche, I'm pointing you here to give you early notice of what we might want to contribute upstream at some point. I don't have time to work on a PoC or RFC right now. In short, there is information in lambda that is not propagated to the bytecode. Jsoo would most likely benefit from it. In particular, it would be useful to know that a block is immutable. We could maybe store additional information in the debug section, after the debug events, or even create a new section.

hhugo avatar Apr 06 '24 12:04 hhugo