
Long compile times for RedPRL

Open MatthewFluet opened this issue 7 years ago • 4 comments

See RedPRL/sml-redprl#394; present at RedPRL/sml-redprl@ba89d597751dfccdde2d3dcc0f690a985b646e3b.

Both the native amd64 and C codegens exhibit excessively long compile times (> 10min), which suggests that the Machine IR program has some inherent structure that is poorly handled. The C codegen generates a file with 28807 local CPointer variables, which takes gcc >10min to compile.

MatthewFluet avatar Oct 06 '17 00:10 MatthewFluet

Incidentally, I was talking about this with Frank Pfenning today and he said he had a similar issue using MLton to build the compiler for C0, and he worked around it by breaking up large case statements into auxiliary functions... I wonder if something is getting out of control in that neighborhood.

jonsterling avatar Oct 06 '17 22:10 jonsterling

@jonsterling I don't think that large (source) case statements are the problem, but mentioning C0 made me remember that @robsimmons reported a similar issue about C0: https://sourceforge.net/p/mlton/mailman/message/31031310/

I think that the comments in that thread are probably relevant, in particular the observation that the generated code triggers worst-case behavior in liveness analysis. Looking at some of the intermediate language programs during a compilation of redprl, there is an SSA IR function that has 202757 basic blocks; in a self-compile of mlton, the largest number of basic blocks in any SSA IR function is 5241. I haven't been able to piece together what source function it corresponds to, but the reason for this massive number of basic blocks is that the function has 217 case transfers over a datatype with 759 constructors. That datatype was introduced by MLton's closure conversion via defunctionalization; it means that MLton's 0CFA has determined that 759 distinct source lambdas can flow to these application sites. My best guess is that it is due to some code written in CPS and/or some partial applications of curried functions.
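For readers unfamiliar with defunctionalization, here is a minimal sketch of the idea in Python. It is not MLton's implementation: the names `AddK`, `MulK`, `SubK`, `apply`, and `branch_targets` are made up for illustration. Each source lambda becomes a constructor carrying its free variables, and each application site becomes a case dispatch over the constructors that can reach it. Assuming every case transfer needs one target per constructor, 217 case transfers over 759 constructors would give 217 × 759 = 164703 branch targets, which is the same order of magnitude as the 202757 basic blocks reported above.

```python
from dataclasses import dataclass


# Higher-order source program: three distinct lambdas can reach one call site.
def make_closures(k):
    return [lambda x: x + k, lambda x: x * k, lambda x: x - k]


# Defunctionalized form: one constructor per source lambda (hypothetical names),
# each carrying the lambda's free variable as a field.
@dataclass(frozen=True)
class AddK:
    k: int


@dataclass(frozen=True)
class MulK:
    k: int


@dataclass(frozen=True)
class SubK:
    k: int


def apply(f, x):
    # A former application site: a case dispatch with one branch per
    # constructor that flow analysis says can reach this site.
    if isinstance(f, AddK):
        return x + f.k
    if isinstance(f, MulK):
        return x * f.k
    if isinstance(f, SubK):
        return x - f.k
    raise TypeError(f"unknown closure constructor: {f!r}")


def branch_targets(case_transfers, constructors):
    # Assumption: each case transfer has one target per constructor.
    return case_transfers * constructors


if __name__ == "__main__":
    closures = make_closures(3)
    records = [AddK(3), MulK(3), SubK(3)]
    print([f(10) for f in closures])        # direct closure calls
    print([apply(r, 10) for r in records])  # same results via dispatch
    print(branch_targets(217, 759))
```

With three lambdas the dispatch is trivial; the point is that its size grows with the number of lambdas that can flow to a site, multiplied by the number of such sites in the function.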

MatthewFluet avatar Oct 07 '17 01:10 MatthewFluet

BTW, compiling redprl with -native-live-transfer 0 as suggested in the linked thread gives:

MLton starting
   Compile SML starting
      pre codegen starting
      pre codegen finished in 59.26 + 10.53 (15% GC)
      amd64 code gen starting
      amd64 code gen finished in 112.16 + 12.80 (10% GC)
   Compile SML finished in 171.46 + 23.33 (12% GC)
   Compile and Assemble starting
   Compile and Assemble finished in 6.98 + 0.00 (0% GC)
   Link starting
   Link finished in 0.94 + 0.00 (0% GC)
MLton finished in 179.51 + 23.36 (12% GC)
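
To compare this against the >10min figure from the original report, here is a small Python sketch that totals the timings in the log above. It assumes each `finished in A + B (P% GC)` line reports non-GC seconds, GC seconds, and GC share of the sum; that reading is an assumption, though it is consistent with the percentages shown. The names `PATTERN` and `parse_timings` are only for illustration.

```python
import re

LOG = """\
MLton starting
   Compile SML starting
      pre codegen starting
      pre codegen finished in 59.26 + 10.53 (15% GC)
      amd64 code gen starting
      amd64 code gen finished in 112.16 + 12.80 (10% GC)
   Compile SML finished in 171.46 + 23.33 (12% GC)
   Compile and Assemble starting
   Compile and Assemble finished in 6.98 + 0.00 (0% GC)
   Link starting
   Link finished in 0.94 + 0.00 (0% GC)
MLton finished in 179.51 + 23.36 (12% GC)
"""

# Matches lines such as "pre codegen finished in 59.26 + 10.53 (15% GC)".
PATTERN = re.compile(r"^\s*(.+?) finished in ([\d.]+) \+ ([\d.]+) \((\d+)% GC\)$")


def parse_timings(log):
    """Map each phase name to (non-GC seconds, GC seconds, total seconds)."""
    rows = {}
    for line in log.splitlines():
        m = PATTERN.match(line)
        if m:
            work, gc = float(m.group(2)), float(m.group(3))
            rows[m.group(1)] = (work, gc, work + gc)
    return rows


if __name__ == "__main__":
    for phase, (work, gc, total) in parse_timings(LOG).items():
        print(f"{phase}: {total:.2f}s total, {total / 60:.1f} min")
```

Under that reading, the whole run takes 179.51 + 23.36 = 202.87 seconds, roughly 3.4 minutes, with most of it spent in the amd64 code generator.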

MatthewFluet avatar Oct 07 '17 01:10 MatthewFluet

@MatthewFluet Thank you, that did the trick!

jonsterling avatar Oct 08 '17 01:10 jonsterling