happy icon indicating copy to clipboard operation
happy copied to clipboard

GHC takes a very long time to compile parser generated by Happy 1.19.8 compared to Happy 1.19.5

Open athas opened this issue 7 years ago • 23 comments

I maintain a compiler that uses a reasonably sized parser written with Happy: https://github.com/diku-dk/futhark/blob/e3f5344ac74f80a7d74ee57cbff27597b86eb545/src/Language/Futhark/Parser/Parser.y

There is nothing fancy going on, and I believe that I have type signatures for all productions. I do use parameterized productions a little bit, though. Unfortunately, while the parser generated by Happy 1.19.5 compiles with GHC in a dozen seconds, the parser generated by Happy 1.19.8 takes upward of twenty minutes. I have identified these hole-y type signatures as the culprit:

#if __GLASGOW_HASKELL__ >= 710
happyReduce_6 :: () => Happy_GHC_Exts.Int# -> L Token -> Happy_GHC_Exts.Int# -> Happy_IntList -> HappyStk (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) -> ParserMonad (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
#endif

If I remove these type signatures, then compile times drop down to a dozen seconds again. Perhaps GHC has some quadratic behaviour in the implementation of typed holes, or perhaps Happy shouldn't be generating these - I'm not sure. But given the relative simplicity of my grammar, I don't think I'll be the only one encountering this. Is there perhaps a workaround, or am I doing something stupid?

(Happy 1.19.6 and 1.19.7 produce code that does not compile for this parser, likely due to #94.)

athas avatar Dec 20 '17 12:12 athas

Significantly reducing (but not totally eliminating) the use of parameterized productions seems to function as a workaround.

athas avatar Dec 20 '17 13:12 athas

Unless most of GHC's work here is parsing (something I very much doubt, given that we are talking about a huge slowdown), this sounds like GHC could be at fault here - we are giving it strictly more information about the types, yet it is taking much longer.

harpocrates avatar Dec 27 '17 17:12 harpocrates

Yes, I think you are correct. Probably the implementation of typed holes was not written to cope with large quantities of holes.

athas avatar Dec 27 '17 23:12 athas

I had this bug too, but then it disappeared mysteriously. Now it is back.

My generated Parser.hs contains 92 functions with 92 arguments and 8.2.1 takes forever to compile. I left if to run for the weekends and it still wasn't done.

UPD: with 8.2.2 it dropped to a dozen-minutes range. Still unacceptable if you want to hack on the parser itself.

dpwiz avatar Jan 10 '18 14:01 dpwiz

Does anyone have a standalone repro for this bug please?

simonmar avatar Jan 10 '18 16:01 simonmar

Not exactly minimal and does not resemble the real parser, but has almost same number of type vars used and, I hope, shows the slowdown.

https://github.com/wiz/too-happy

dpwiz avatar Jan 10 '18 18:01 dpwiz

I did a round of too-happy builds with different GHC versions.

7.10 takes 20 seconds to build. 8.0, 8.2.1 and 8.2.2 all take 3.5 minutes.

dpwiz avatar Jan 11 '18 13:01 dpwiz

@wiz That sounds a look like a GHC regression with respect to partial type signatures. Do you want to report it to the GHC Trac?

harpocrates avatar Jan 11 '18 20:01 harpocrates

That makes sense. But, given the time scale of GHC fixes and releases, we wouldn't see the results of the fix for quite a time. If package could provide a workaround for GHC 8.{0,2} this would save us a lot of compilation time until a 8.4-enabled LTS snapshot comes out.

dpwiz avatar Jan 16 '18 12:01 dpwiz

@wiz I completely agree. It is going to be difficult to do that without breaking backwards compatibility though (perhaps that's OK) - I spent a long time trying to figure out how to fix https://github.com/simonmar/happy/issues/94 and PartialTypeSignatures were the only way I found to not regress in features (here is the commit: https://github.com/harpocrates/happy/commit/b79dbdbcd1530ffd0916317e08097b4a7f5d0fd0).

I see a couple avenues forward, but I will not be implementing any of them:

  • do not generate the problematic type signatures only when there is a certain flag enabled (grammars not relying on https://github.com/simonmar/happy/pull/49 could enable that flag)
  • revert PR https://github.com/simonmar/happy/pull/49 and https://github.com/harpocrates/happy/commit/b79dbdbcd1530ffd0916317e08097b4a7f5d0fd0 (grammars relying on https://github.com/simonmar/happy/pull/49 would simply be broken - we go back to the behaviour Happy had in in 1.19.5)
  • we fix GHC's regression

Also, if you would like to undertake the last of these (my favorite), getting a fix into GHC 8.4.1 will be tough (the branch has been cut and the release is due in only a couple of weeks).

Note that I just tried this on my working GHC HEAD and performance is even worse... 😞

harpocrates avatar Jan 16 '18 23:01 harpocrates

Actually, perhaps there is a mediocre (hopefully temporary) workaround: do not generate signatures when there is an empty monad context (basically, pay the price of partial type signatures only when you want the features of https://github.com/simonmar/happy/pull/49).

harpocrates avatar Jan 18 '18 00:01 harpocrates

Reported: https://ghc.haskell.org/trac/ghc/ticket/14683#ticket

harpocrates avatar Jan 18 '18 00:01 harpocrates

While producing a test case, I realized this has nothing to do with Happy 1.19.5 or 1.19.8 or partial type signatures - it is just a straight GHC compile time regression. After inlining the data definition of Token from too-happy, here are the compile times of Grammer.hs:

GHC 7.10.3 GHC 8.0.2 GHC 8.2.2 GHC HEAD
Happy 1.19.5 17.37s 127.89s 115.26s 125.44s
Happy 1.19.8 17.71s 129.23s 114.42s 128.42s

So: please ignore everything I've said so far in this thread and refer to the GHC thread (https://ghc.haskell.org/trac/ghc/ticket/14683#ticket).

harpocrates avatar Jan 19 '18 00:01 harpocrates

I've noticed some #ifs in generated code containing types with holes and added a hack to redefine __GLASGOW_HASKELL__ to 709 in the parser file to remove those and the slowdown got reduced massively.

{
...
#undef __GLASGOW_HASKELL__
#define __GLASGOW_HASKELL__ 709
}
#if __GLASGOW_HASKELL__ >= 710
happyReduce_1 :: () => Happy_GHC_Exts.Int# -> T.Token -> Happy_GHC_Exts.Int# -> Happy_IntList -> HappyStk (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _) -> P (HappyAbsSyn _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _)
#endif

However I could not reproduce this hack with the too-happy example because there was no code generated inside #ifs :confused:

#if __GLASGOW_HASKELL__ >= 710
#endif

Snapshot (lts-10.3) and happy(1.19.8) versions used were the same.

dpwiz avatar Jan 19 '18 09:01 dpwiz

@wiz Do you have a commit of your project you could point me to where you are seeing this longer compilation times? I've tried compiling master (of Futhark) with Happy 1.19.8 and a recent GHC and compile times seemed fine...

harpocrates avatar Jan 20 '18 05:01 harpocrates

@harpocrates The code haven't changed. We were forced to move to 8.x series and after patching some imports everything but parser compilation time was fine.

dpwiz avatar Jan 22 '18 08:01 dpwiz

If these slow-downs are really related to the type holes in certain versions of GHC, then maybe happy should not add them unless it is a version of GHC known to not have that regression?

nomeata avatar Jan 29 '18 16:01 nomeata

@nomeata Do you have an example of a slowdown caused by type-holes? If so, I would like to distill it to a test case for a GHC bug. I haven't been able to replicate these slowdowns.

harpocrates avatar Feb 05 '18 01:02 harpocrates

Yes; see the commit https://github.com/antalsz/hs-to-coq/commit/c538e6a92362d22cf6abf45c8f2746ea938ac35a. Without this hack, GHC-8.0.2 and latest happy, GHC sits there staring at Parsers.hs; with this hack compilation is fast.

nomeata avatar Feb 05 '18 01:02 nomeata

Thanks @nomeata, I can reproduce now! I've opened https://ghc.haskell.org/trac/ghc/ticket/14766 for this. It looks partial type signatures are an issue. Copying over the table of compile-times from the ticket, I observed for Happy-1.19.8:

GHC 8.0.2 GHC 8.2.2 GHC HEAD
Signatures disabled using CPP hack 24.13s 22.93s 34.05s
With signatures >15m ~13m >15m

harpocrates avatar Feb 06 '18 17:02 harpocrates

Hi all,

Have we seen any movement on this issue? I noticed that https://gitlab.haskell.org/ghc/ghc/issues/14683 has been closed, but I am still having the issue with GHC 8.6.5

I've been working on a standalone parser for the MLIR syntax (https://gitlab.com/arc-compiler/arc2-mlir/blob/master/src/ARC/MLIR/Parser.y) which is now taking upwards of 30 minutes to compile with GHC 8.6.5, making iterating on the parser very painful and time consuming.

andrew-wja avatar Nov 24 '19 18:11 andrew-wja

Updating this for posterity (since this issue is high on the list of Google results for GHC slowdown compiling Happy-generated code).

After applying the CPP hack I was still getting the same ludicrous compile time with GHC 8.6.5. I decided to just throw up my hands and try the most recent GHC I could find in the hope that the performance issues were fixed there. I'm happy to report that with GHC 8.8.1, the 30 minute compile time for the Happy parser I linked above is reduced to about ten seconds! So GHC 8.8.1 (specifically, the build in the stack nightly resolver right now) does not exhibit this slowdown compiling Happy-generated parsers.

andrew-wja avatar Nov 24 '19 23:11 andrew-wja

Please update the title of this issue to reflect the results of the discussion:

  • include the range of GHC versions affected by this regresssion(?)
  • is it actually a happy issue (generation of partial type signatures) or just a GHC issue?

andreasabel avatar Feb 15 '21 13:02 andreasabel

Closing this issue; it is likely just an issue within GHC that long has been resolved. Besides, this issue has been dead for nearly 5 years. If it becomes pressing again, feel free to reopen.

sgraf812 avatar Sep 13 '24 09:09 sgraf812