singletons Figure out how match flattening affects us

In the master branch, all pattern-matches are "flattened" before further processing. This is done from scLetDec in squashWildcards, called in singTopLevelDecs. This is a drastic change, meaning that no patterns are overlapped anymore. How is this change affecting us? Is the code bloat truly terrible? How are compilation times? Pattern-match flattening allows some nice things -- in particular, it allows singletonization of functions with overlapping patterns. But is the cost too high? If the cost is high, is there a way to detect when match flattening is a good idea? Should we ask the user for assistance?

Apr 09 '15 16:04 goldfirere

It looks like match flattening was the cause of huge memory usage during compilation of Data.Singletons.Prelude.List module. See #96.

Apr 19 '15 07:04 jstolarek

It also seems that there might have been a bug in the implementation of match flattening, which caused failure of Singletons/Nat test. See #97.

Apr 19 '15 08:04 jstolarek

There is lots of good data on this issue in #96. But it's not exactly the data that we need.

The big question I have about the performance blowup is where, precisely, it's happening.

First, we need to know what the trigger is. The data in #96 suggests that the blowup happens with both zipWithN and unzipN. unzipN is simpler, so let's focus on that. Question 1: why unzipN and not zipN?

Question 2: Where, precisely, is the blowup happening? Here are some possibilities:

th-desugar takes a long time to process the code, but its output is reasonable.
th-desugar produces lots and lots of code. Way more than it needs to. A good basis of comparison here is to see what GHC produces when it translates normal (that is, with no TH silliness) definitions of unzipN to Core. The match-flattening algorithm is copied directly out of GHC, so the output should be similar.
singletons takes a very long time processing the output of th-desugar. By "very long", I mean worse than linear in the length of the th-desugar output (do not compare against the N in unzipN -- we need to look at the size of the AST that th-desugar produces).
singletons produces lots and lots of code during promotion. Way more than it needs to. A good basis of comparison is to take the output of th-desugar and promote by hand. Is the trickery that singletons is doing to get promotion working producing the blowup? Or is it doing a reasonably efficient thing?
singletons produces lots and lots of code during singletonization. Way more than it needs to. A good basis of comparison is to take the output of th-desugar and singletonize by hand. Is the trickery that singletons is doing to deal with binders producing the blowup? Or is it doing a reasonably efficient thing?
GHC's typechecker is choking on the output from singletons. That is, GHC's performance is worse-than-linear in the size of the code passed to it. But the output of the typechecker (as seen by -ddump-ds) is reasonable. Does it matter if you write the code by hand vs. by TH? If you were to write unzipN in a natural way (without all the silliness that singletons does), is GHC still slow? Dealing with the promoted code or the singleton code? Can we isolate the possibility that the problem is simply type family reduction? (Type family reduction has been a bottleneck in the past.)
GHC's translation to GHC code is choking. That is, the size of the -ddump-ds output is growing too fast. Way faster than it should.

Perhaps the blowup is happening at multiple places! But we need to discover which pass is causing problems before we can start addressing them. I don't think it's terribly hard to do this, but it would take some time and attention to details. Once we're sure that unzipN causes the problem, it may just be easiest to work with unzip7 and observe what happens manually -- I'm not convinced that finding the growth rate is going to unlock this problem as much as a higher-level understanding of what's wrong.

And then there's question 3: even when things aren't terrible, how bad is match-flattening in the average case? Does it increase all compilations by 20%? Code size? Is it worth the cost?

If you want to explore this, you'll want to enable match-flattening again. Take a look at commit 87aa90120e7335c342cf959dbc283c9132ffa08d. Restore the Data.Singletons.Single.Squash module and the few lines of change in Data.Singletons.Single. There's a lot of other stuff in that commit, but don't worry about it -- none of it should matter for testing purposes.

Do let me know if you're taking this on! :)

Oct 31 '15 15:10 goldfirere

Do let me know if you're taking this on! :)

Not in the next weeks - I'm totally swamped with teaching and once this is over (in two weeks) I plan to take on generalized injectivity.

Nov 02 '15 13:11 jstolarek

To be clear: my "you" was addressed to any reader out there in cyberspace. Not you, Janek, specifically. :)

Nov 02 '15 15:11 goldfirere

I admit that this possibility crossed my mind when writing my comment :-)

Nov 02 '15 15:11 jstolarek

singletons singletons copied to clipboard

Figure out how match flattening affects us

singletons
singletons copied to clipboard