happy
happy copied to clipboard
GHC takes a very long time to compile parser generated by Happy 1.19.8 compared to Happy 1.19.5
I maintain a compiler that uses a reasonably sized parser written with Happy: https://github.com/diku-dk/futhark/blob/e3f5344ac74f80a7d74ee57cbff27597b86eb545/src/Language/Futhark/Parser/Parser.y
There is nothing fancy going on, and I believe that I have type signatures for all productions. I do use parameterized productions a little bit, though. Unfortunately, while the parser generated by Happy 1.19.5 compiles with GHC in a dozen seconds, the parser generated by Happy 1.19.8 takes upward of twenty minutes. I have identified these hole-y type signatures as the culprit:
#if __GLASGOW_HASKELL__ >= 710
happyReduce_6 :: () => Happy_GHC_Exts.Int# -> L Token -> Happy_GHC_Exts.Int# -> Happy_IntList -> HappyStk (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) -> ParserMonad (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
#endif
If I remove these type signatures, then compile times drop down to a dozen seconds again. Perhaps GHC has some quadratic behaviour in the implementation of typed holes, or perhaps Happy shouldn't be generating these - I'm not sure. But given the relative simplicity of my grammar, I don't think I'll be the only one encountering this. Is there perhaps a workaround, or am I doing something stupid?
(Happy 1.19.6 and 1.19.7 produce code that does not compile for this parser, likely due to #94.)
Significantly reducing (but not totally eliminating) the use of parameterized productions seems to function as a workaround.
Unless most of GHC's work here is parsing (something I very much doubt, given that we are talking about a huge slowdown), this sounds like GHC could be at fault here - we are giving it strictly more information about the types, yet it is taking much longer.
Yes, I think you are correct. Probably the implementation of typed holes was not written to cope with large quantities of holes.
I had this bug too, but then it disappeared mysteriously. Now it is back.
My generated Parser.hs
contains 92 functions with 92 arguments and 8.2.1 takes forever to compile. I left if to run for the weekends and it still wasn't done.
UPD: with 8.2.2 it dropped to a dozen-minutes range. Still unacceptable if you want to hack on the parser itself.
Does anyone have a standalone repro for this bug please?
Not exactly minimal and does not resemble the real parser, but has almost same number of type vars used and, I hope, shows the slowdown.
https://github.com/wiz/too-happy
I did a round of too-happy
builds with different GHC versions.
7.10 takes 20 seconds to build. 8.0, 8.2.1 and 8.2.2 all take 3.5 minutes.
@wiz That sounds a look like a GHC regression with respect to partial type signatures. Do you want to report it to the GHC Trac?
That makes sense. But, given the time scale of GHC fixes and releases, we wouldn't see the results of the fix for quite a time. If package could provide a workaround for GHC 8.{0,2} this would save us a lot of compilation time until a 8.4-enabled LTS snapshot comes out.
@wiz I completely agree. It is going to be difficult to do that without breaking backwards compatibility though (perhaps that's OK) - I spent a long time trying to figure out how to fix https://github.com/simonmar/happy/issues/94 and PartialTypeSignatures
were the only way I found to not regress in features (here is the commit: https://github.com/harpocrates/happy/commit/b79dbdbcd1530ffd0916317e08097b4a7f5d0fd0).
I see a couple avenues forward, but I will not be implementing any of them:
- do not generate the problematic type signatures only when there is a certain flag enabled (grammars not relying on https://github.com/simonmar/happy/pull/49 could enable that flag)
- revert PR https://github.com/simonmar/happy/pull/49 and https://github.com/harpocrates/happy/commit/b79dbdbcd1530ffd0916317e08097b4a7f5d0fd0 (grammars relying on https://github.com/simonmar/happy/pull/49 would simply be broken - we go back to the behaviour Happy had in in 1.19.5)
- we fix GHC's regression
Also, if you would like to undertake the last of these (my favorite), getting a fix into GHC 8.4.1 will be tough (the branch has been cut and the release is due in only a couple of weeks).
Note that I just tried this on my working GHC HEAD and performance is even worse... 😞
Actually, perhaps there is a mediocre (hopefully temporary) workaround: do not generate signatures when there is an empty monad context (basically, pay the price of partial type signatures only when you want the features of https://github.com/simonmar/happy/pull/49).
Reported: https://ghc.haskell.org/trac/ghc/ticket/14683#ticket
While producing a test case, I realized this has nothing to do with Happy 1.19.5 or 1.19.8 or partial type signatures - it is just a straight GHC compile time regression. After inlining the data definition of Token
from too-happy
, here are the compile times of Grammer.hs
:
GHC 7.10.3 | GHC 8.0.2 | GHC 8.2.2 | GHC HEAD | |
---|---|---|---|---|
Happy 1.19.5 | 17.37s | 127.89s | 115.26s | 125.44s |
Happy 1.19.8 | 17.71s | 129.23s | 114.42s | 128.42s |
So: please ignore everything I've said so far in this thread and refer to the GHC thread (https://ghc.haskell.org/trac/ghc/ticket/14683#ticket).
I've noticed some #if
s in generated code containing types with holes and added a hack to redefine __GLASGOW_HASKELL__
to 709 in the parser file to remove those and the slowdown got reduced massively.
{
...
#undef __GLASGOW_HASKELL__
#define __GLASGOW_HASKELL__ 709
}
#if __GLASGOW_HASKELL__ >= 710
happyReduce_1 :: () => Happy_GHC_Exts.Int# -> T.Token -> Happy_GHC_Exts.Int# -> Happy_IntList -> HappyStk (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) -> P (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
#endif
However I could not reproduce this hack with the too-happy
example because there was no code generated inside #if
s :confused:
#if __GLASGOW_HASKELL__ >= 710
#endif
Snapshot (lts-10.3) and happy(1.19.8) versions used were the same.
@wiz Do you have a commit of your project you could point me to where you are seeing this longer compilation times? I've tried compiling master
(of Futhark) with Happy 1.19.8 and a recent GHC and compile times seemed fine...
@harpocrates The code haven't changed. We were forced to move to 8.x series and after patching some imports everything but parser compilation time was fine.
If these slow-downs are really related to the type holes in certain versions of GHC, then maybe happy should not add them unless it is a version of GHC known to not have that regression?
@nomeata Do you have an example of a slowdown caused by type-holes? If so, I would like to distill it to a test case for a GHC bug. I haven't been able to replicate these slowdowns.
Yes; see the commit https://github.com/antalsz/hs-to-coq/commit/c538e6a92362d22cf6abf45c8f2746ea938ac35a. Without this hack, GHC-8.0.2 and latest happy, GHC sits there staring at Parsers.hs
; with this hack compilation is fast.
Thanks @nomeata, I can reproduce now! I've opened https://ghc.haskell.org/trac/ghc/ticket/14766 for this. It looks partial type signatures are an issue. Copying over the table of compile-times from the ticket, I observed for Happy-1.19.8:
GHC 8.0.2 | GHC 8.2.2 | GHC HEAD | |
---|---|---|---|
Signatures disabled using CPP hack | 24.13s | 22.93s | 34.05s |
With signatures | >15m | ~13m | >15m |
Hi all,
Have we seen any movement on this issue? I noticed that https://gitlab.haskell.org/ghc/ghc/issues/14683 has been closed, but I am still having the issue with GHC 8.6.5
I've been working on a standalone parser for the MLIR syntax (https://gitlab.com/arc-compiler/arc2-mlir/blob/master/src/ARC/MLIR/Parser.y) which is now taking upwards of 30 minutes to compile with GHC 8.6.5, making iterating on the parser very painful and time consuming.
Updating this for posterity (since this issue is high on the list of Google results for GHC slowdown compiling Happy-generated code).
After applying the CPP hack I was still getting the same ludicrous compile time with GHC 8.6.5. I decided to just throw up my hands and try the most recent GHC I could find in the hope that the performance issues were fixed there. I'm happy to report that with GHC 8.8.1, the 30 minute compile time for the Happy parser I linked above is reduced to about ten seconds! So GHC 8.8.1 (specifically, the build in the stack nightly resolver right now) does not exhibit this slowdown compiling Happy-generated parsers.
Please update the title of this issue to reflect the results of the discussion:
- include the range of GHC versions affected by this regresssion(?)
- is it actually a happy issue (generation of partial type signatures) or just a GHC issue?
Closing this issue; it is likely just an issue within GHC that long has been resolved. Besides, this issue has been dead for nearly 5 years. If it becomes pressing again, feel free to reopen.