ghc-mod

High memory consumption when compiled with GHC 8

Open vasily-kirichenko opened this issue 9 years ago • 12 comments

Editing any file in this repo https://github.com/vasily-kirichenko/haskell-book (for example, this one https://github.com/vasily-kirichenko/haskell-book/blob/master/src/SemigroupsAndMonoids.hs) in Atom (haskell-ide / repl / autocomplete / hasktags / pointful / pointfree plugins) causes the ghc-mod process to consume memory very quickly. It reaches ~1.4GiB after a few edits and ~10GiB after a few minutes of editing (after which I kill ghc-mod and it is restarted).

I switched to Stackage 7.0 yesterday and ran stack install for ghc-mod / stylish-haskell / hlint etc. I believe the leak started to appear after this upgrade; there were no memory problems on Stackage 6.17, just hours of happy work.

Ubuntu 16.04.1 x64, Atom 1.10.0

$ stack repl
...
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
...
$ ghc-mod --version
ghc-mod version 5.6.0.0 compiled by GHC 8.0.1

vasily-kirichenko avatar Sep 16 '16 09:09 vasily-kirichenko

I've confirmed that the issue is caused by Stackage 7.0. I switched back to 6.17 and ran stack install ghc-mod, after which ghc-mod reached only ~300MiB of memory after 10 minutes of editing.

vasily-kirichenko avatar Sep 16 '16 09:09 vasily-kirichenko

The file you link to doesn't seem to exist. Do you mean src/SemigroupsAndMonoids.hs?

DanielG avatar Sep 17 '16 13:09 DanielG

Yes. Fixed, sorry.

vasily-kirichenko avatar Sep 17 '16 13:09 vasily-kirichenko

@lierdakil can you try to reproduce this? I don't see any abnormal memory usage with just ghc-mod legacy-interactive.

DanielG avatar Sep 17 '16 13:09 DanielG

So... I was able to make memory consumption grow consistently by repeatedly requesting the type at literally every character with ghc-mod-5.6 and both ghc-7.10.3 and ghc-8.0.1. It's nowhere near the reported levels though. Still, memory consumption does grow consistently, so I would argue there's a space leak somewhere, and it doesn't seem like GC is kicking in at any point. I also noticed that ghc-mod is considerably slower with ghc-8.

Besides, with ghc-8 ghc-mod consumes roughly 5-10 times more memory, but I think that ghc-8.0.1 is known for its memory consumption, so it's not necessarily a problem with ghc-mod itself.

lierdakil avatar Sep 17 '16 15:09 lierdakil

I think I finally found what is leaking space all over the place, see below.

Basically I added a forever (threadDelay 1000000) to the end of main in src/GHCMod.hs. This allows everything except this space leak to be garbage collected. Then, using a retainer profile (+RTS -hr), I can see that FastString.<CAF> is holding on to all the remaining memory after we enter that loop. The only thing in that module that looks like it could grow to the sizes I'm seeing is string_table, so that's probably the culprit.
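For anyone who wants to repeat this kind of experiment on their own program, here is a minimal sketch of the keep-alive trick. It is not the actual ghc-mod patch: `realWork` and `parkAfter` are hypothetical names used only for illustration. Note that a retainer profile (+RTS -hr) is only available when the binary is built with profiling enabled (-prof).

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forever)

-- Hypothetical stand-in for the program's real entry point.
realWork :: IO ()
realWork = putStrLn "real work finished"

-- Run the given action, then idle forever in one-second sleeps.
-- Once the action has returned, anything it allocated that is no longer
-- referenced can be collected, so whatever still shows up in the heap
-- profile during the idle loop is being retained by something long-lived.
parkAfter :: IO () -> IO a
parkAfter action = do
  action
  forever (threadDelay 1000000)

-- Intended use (assumption): main = parkAfter realWork
```

The process never exits on its own with this change, so it has to be interrupted by hand once the profile has collected enough samples.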

At first I thought that getOrSetLibHSghcFastStringTable looked suspicious, but it seems it doesn't impact GC at all. I've tried re-initializing it with initGlobalStore and that doesn't change anything. I've also tried getting rid of that memory using revertCAFs, but that doesn't work either; I'm not sure why. The next step would be to try adding a function to GHC to empty out that cache. If that doesn't work either, I probably misidentified the culprit :/

{-
Internally, the compiler will maintain a fast string symbol table, providing
sharing and fast comparison. Creation of new @FastString@s then covertly does a
lookup, re-using the @FastString@ if there was a hit.

The design of the FastString hash table allows for lockless concurrent reads
and updates to multiple buckets with low synchronization overhead.

See Note [Updating the FastString table] on how it's updated.
-}
data FastStringTable =
 FastStringTable
    {-# UNPACK #-} !(IORef Int)  -- the unique ID counter shared with all buckets
    (MutableArray# RealWorld (IORef [FastString])) -- the array of mutable buckets

string_table :: FastStringTable
{-# NOINLINE string_table #-}
string_table = unsafePerformIO $ do
  uid <- newIORef 603979776 -- ord '$' * 0x01000000
  tab <- IO $ \s1# -> case newArray# hASH_TBL_SIZE_UNBOXED (panic "string_table") s1# of
                          (# s2#, arr# #) ->
                              (# s2#, FastStringTable uid arr# #)
  forM_ [0.. hASH_TBL_SIZE-1] $ \i -> do
     bucket <- newIORef []
     updTbl tab i bucket

  -- use the support wired into the RTS to share this CAF among all images of
  -- libHSghc
#if STAGE < 2
  return tab
#else
  sharedCAF tab getOrSetLibHSghcFastStringTable

-- from the RTS; thus we cannot use this mechanism when STAGE<2; the previous
-- RTS might not have this symbol
foreign import ccall unsafe "getOrSetLibHSghcFastStringTable"
  getOrSetLibHSghcFastStringTable :: Ptr a -> IO (Ptr a)
#endif
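To illustrate why a table like this could keep memory alive in a long-running process, here is a toy model of a global intern table. This is only a sketch under simplifying assumptions: it uses a plain list instead of GHC's bucketed hash table, and `Interned`, `internTable`, `intern` and `tableSize` are hypothetical names, not GHC's API. The point is just that the table is a top-level CAF that only ever gains entries, so every string ever interned stays reachable.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Toy interned string: a unique id plus the text it stands for.
data Interned = Interned { internId :: !Int, internText :: !String }

-- Comparing interned strings only needs the ids.
instance Eq Interned where
  a == b = internId a == internId b

-- Global table created once via unsafePerformIO. NOINLINE keeps it a
-- single shared CAF, which is also what keeps every entry reachable.
{-# NOINLINE internTable #-}
internTable :: IORef [(String, Interned)]
internTable = unsafePerformIO (newIORef [])

-- Look the string up; reuse the existing entry on a hit, otherwise add a
-- fresh one. Nothing in this sketch ever removes an entry.
intern :: String -> IO Interned
intern s = atomicModifyIORef' internTable $ \entries ->
  case lookup s entries of
    Just hit -> (entries, hit)
    Nothing  ->
      let fresh = Interned (length entries) s
      in ((s, fresh) : entries, fresh)

-- Number of distinct strings interned so far.
tableSize :: IO Int
tableSize = length <$> readIORef internTable
```

In a one-shot compiler run this kind of growth is harmless because the process exits. In a resident process that keeps re-checking files, each new identifier or literal would add an entry that is never dropped, which would be consistent with the steady growth seen above if string_table really is the culprit.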

DanielG avatar Jan 07 '17 02:01 DanielG

Filed a GHC bug https://ghc.haskell.org/trac/ghc/ticket/13110.

DanielG avatar Jan 12 '17 13:01 DanielG

Is there any known workaround for that?

ryukinix avatar Jul 20 '17 11:07 ryukinix

The open ticket is stale; any news?

maelvls avatar Nov 13 '17 16:11 maelvls

Nope. Trying to reproduce the problem with GHC 8.2 would be helpful though, and it would give the GHC guys a nudge too ;)

DanielG avatar Nov 13 '17 16:11 DanielG

If you are affected by this bug feel free to say so in the GHC ticket. Knowing how many users are actually affected by a given issue is hard and that helps.

DanielG avatar Nov 13 '17 16:11 DanielG

Thank you for the advice!! I should have thought of that!

maelvls avatar Nov 13 '17 19:11 maelvls