initial work on flattening all immutable tuples into sequence containers

Open • shwestrick opened this issue 2 months ago • 1 comment

Introduces an SSA2 pass called flattenIntoSequences, implemented within mlton/ssa/flatten-into-sequences.fun.

The idea of the flattenIntoSequences pass is to force the flattening of immutable tuples into sequence containers. For example, the SSA2 type ((real64, word32) tuple, word8) tuple mut sequence would be rewritten to (real64 mut, word32 mut, word8 mut) sequence, with the appropriate compensation code generated at sequence loads and stores.
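
To make the rewrite concrete, here is a rough Python model of the idea (not the actual SSA2 implementation). Nested Python tuples stand in for SSA2 tuple types and values; `flatten`, `rebuild`, and `FlatSequence` are hypothetical names invented for this sketch. Each element is stored inline as a run of leaf fields, and the "compensation code" is the destructuring on store and the rebuilding on load.

```python
def flatten(t):
    """Flatten a nested tuple (a type shape or a value) into a list of leaves."""
    if isinstance(t, tuple):
        leaves = []
        for component in t:
            leaves.extend(flatten(component))
        return leaves
    return [t]


def rebuild(shape, leaves):
    """Rebuild a nested tuple of the given shape from an iterator of leaves."""
    if isinstance(shape, tuple):
        return tuple(rebuild(component, leaves) for component in shape)
    return next(leaves)


class FlatSequence:
    """Hypothetical model of a flattened sequence: fields are stored inline."""

    def __init__(self, shape, length, fill):
        self.shape = shape                  # original nested element type
        self.width = len(flatten(shape))    # number of leaf fields per element
        self.data = flatten(fill) * length  # one flat run of fields per element

    def store(self, i, value):
        # Compensation code at a store: destructure the tuple into its fields.
        start = i * self.width
        self.data[start:start + self.width] = flatten(value)

    def load(self, i):
        # Compensation code at a load: rebuild the tuple from its fields.
        start = i * self.width
        return rebuild(self.shape, iter(self.data[start:start + self.width]))


# ((real64, word32) tuple, word8) tuple, written as a nested shape.
shape = (("real64", "word32"), "word8")
seq = FlatSequence(shape, length=2, fill=((0.0, 0), 0))
seq.store(1, ((3.5, 42), 7))
print(flatten(shape))  # leaf field types of the flattened element
print(seq.load(1))     # the nested tuple, rebuilt on load
```

In this model a load allocates a fresh tuple each time; whether the real pass can avoid that allocation depends on how the loaded value is used, which the sketch does not try to capture.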

When it works, the performance benefits are significant: in the nn example below, finding all neighbors drops from 1.6588s to 1.2261s.

This is an attempt to address a problem with the deepFlatten pass, which does not always succeed at flattening. The specifics are a bit mysterious. We will need to investigate more closely where (and why) deepFlatten chooses not to flatten into sequences.

One issue with flattenIntoSequences at the moment is that it blindly flattens, which may not be correct in all cases, e.g., for primitive CAS operations on tuples.
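
As a sketch of why CAS is delicate, assume (this is an assumption, not something confirmed above) that a primitive CAS on a sequence of tuples compares the stored reference by identity and swaps a single slot. After flattening there is no reference left to compare, only field contents, and several fields must change together. The Python model below uses hypothetical names (`Point`, `cas_boxed`, `cas_flat`) and is not thread-safe; it only shows that the two versions can disagree on the same inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Stand-in for an immutable (real64, word32) tuple."""
    x: float
    tag: int


def cas_boxed(seq, i, expected, new):
    """CAS on a sequence of references: identity comparison, single slot."""
    old = seq[i]
    if old is expected:
        seq[i] = new
    return old


def cas_flat(xs, tags, i, expected, new):
    """Hypothetical CAS after flattening: only field contents can be compared,
    and two slots have to be updated together."""
    old = Point(xs[i], tags[i])
    if (xs[i], tags[i]) == (expected.x, expected.tag):
        xs[i], tags[i] = new.x, new.tag
    return old


a = Point(1.0, 7)
b = Point(1.0, 7)  # equal contents, but a distinct object

boxed = [a]
cas_boxed(boxed, 0, expected=b, new=Point(2.0, 9))
print(boxed[0] is a)  # the boxed CAS fails: b is not the stored reference

xs, tags = [1.0], [7]
cas_flat(xs, tags, 0, expected=b, new=Point(2.0, 9))
print(xs, tags)       # the flattened CAS succeeds: the contents matched
```

Beyond the change in comparison semantics, the flattened update touches more than one slot, so under the same assumption a single-word hardware CAS would no longer cover the whole element.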

Current status

I've found at least one example (a quickhull benchmark) which is not compiling correctly. More investigation needed...

That being said, the nn example seems to be working correctly, with significant performance improvements (measurements taken on my Mac M2, 2022):

$ bin/nn @mpl procs 4 -- -N 10000000   # with new pass
N 10000000
generated input in 0.0303s
built quadtree in 0.8498s
found all neighbors in 1.2261s
...

$ bin/nn.sysmpl @mpl procs 4 -- -N 10000000   # without new pass
N 10000000
generated input in 0.1686s
built quadtree in 0.9750s
found all neighbors in 1.6588s
...

shwestrick • Oct 14 '25 08:10

More progress: the quickhull correctness issue seems to have been a red herring. Previously, I was compiling MaPLe with make smlnj-mlton, but when I switched to standard make the correctness issue went away.

(The correctness issue seemed to be due to problems with real arithmetic. We may need to investigate this separately; perhaps there is some deeper issue with the 'make smlnj-mlton' build target.)

As of now, flattening seems to be working correctly. The results are really promising.

Lots more testing will be needed, and the issue with CAS is going to be difficult (and interesting!) to solve.

shwestrick • Oct 24 '25 00:10