haddock
Haddock uses a lot of memory
I've been doing some profiling of haddock so this ticket is for me to write down what I have found out.
Here is the allocation graph when running haddock on base.
The only observation I have at the moment is that the specialisation code seems to leak considerably more memory than the previous version.
It might be worth investigating the suggestion on the old ticket to write interfaces to file but I have no intuition as to how much space this will save.
More graphs: a -hy profile, and one with the hyperlinker enabled:
https://usercontent.irccloud-cdn.com/file/l7vmLw7e/
https://usercontent.irccloud-cdn.com/file/rN8oni3d/hyperlinked
Here is the .prof. http://lpaste.net/147305
@mrhania Is there any chance you could look at this?
Yeah, so the code was written with simplicity in mind and I didn't put much thought into performance (and honestly, I didn't have much experience with performance tweaking in Haskell at the time). For example, for almost every token type I run a separate query on the Haskell AST, which is just plain terrible.
I will rework this at some point, I am just not sure when as I am quite a busy person right now. But I will, that's for sure.
+1
@mrhania When will the day come?
@andreasabel @asr have you tried building haddock documentation with haddock-2.18.1 (it is bundled with ghc-8.2.1)? Much has been done in this area. Please report back if you do a before-after comparison. Curious to know if it improved for you.
@alexbiehl Out of interest, what was done more concretely? The biggest issue with Haddock has always been that it has to ask GHC to re-do a lot of work, load modules, typecheck, …. Were there improvements in that area? I always suspected that improvements to Haddock code itself would provide only a minor benefit and that its use of the GHC API would have to get smarter.
- Optimization of syb-traversal in the hyperlinker (see https://github.com/haskell/haddock/pull/621).
- AttachInstances makes use of getNameToInstancesIndex from recent GHCs (see https://github.com/haskell/haddock/pull/636).
- Haddock now ignores optimization flags passed by Cabal (see https://github.com/haskell/haddock/commit/e9cd7b1b52228b9ef8e1bd4e6cb1f2583740fcee).
Currently in ghc-head:
- Instead of enabling compilation on all modules in a package if one module uses TemplateHaskell, GHC is now smart enough to figure out only the modules which really need compilation (see https://github.com/haskell/haddock/pull/624).
But still, in general, if a package uses TemplateHaskell, haddock is quite slow as it needs to generate code first.
@andreasabel @asr have you tried building haddock documentation with haddock-2.18.1 (it is bundled with ghc-8.2.1)? Much has been done in this area. Please report back if you do a before-after comparison. Curious to know if it improved for you.
When building Haddock documentation for Agda there is a significant improvement using Haddock 2.18.1. Thanks!
Unfortunately I'm having problems reporting a comparison:
- First, I enabled the profiling in Haddock 2.18.1
$ cabal get haddock-2.18.1
$ cd haddock-2.18.1
$ cabal install --enable-profiling --program-suffix=-2.18.1
- I saw that cabal is calling haddock in the Agda upstream repository by creating a file and running Haddock on this file
$ git clone https://github.com/agda/agda.git
$ cd agda
$ cabal haddock -v2 > /tmp/log.txt
$ cat /tmp/log.txt
...
dist/doc/html/Agda/haddock-response27033-1.txt contents: <<<
--prologue=dist/doc/html/Agda/haddock-prologue27033-0.txt
--dump-interface=dist/doc/html/Agda/Agda.haddock
...
>>> dist/doc/html/Agda/haddock-response27033-1.txt
/usr/local/bin/haddock '@dist/doc/html/Agda/haddock-response27033-1.txt'
- I created the haddock-response.txt file (I didn't include the above --prologue=... option).
- Now, running Haddock on the above file I got the following error:
$ haddock-2.18.1 '@haddock-response.txt'
...
100% ( 7 / 7) in 'Agda.Syntax.Parser.Layout'
<command line>: can't load .so/.DLL for: libHSunordered-containers-0.2.8.0-4H1ZIlO2rbIFiwyuQdvUPX.so (libHSunordered-containers-0.2.8.0-4H1ZIlO2rbIFiwyuQdvUPX.so: cannot open shared object file: No such file or directory)
Am I missing something?
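The manual step above, copying the response file that cabal generated while leaving out the --prologue=... option, could be scripted. A minimal sketch, assuming the response file holds one option per line as in the log excerpt; `drop_options` and `copy_response_file` are hypothetical helper names, not part of cabal or Haddock:

```python
from pathlib import Path
from typing import Iterable, List


def drop_options(lines: Iterable[str], prefixes: Iterable[str]) -> List[str]:
    """Return the lines that do not start with any of the given option prefixes."""
    unwanted = tuple(prefixes)
    return [line for line in lines if not line.startswith(unwanted)]


def copy_response_file(src: str, dst: str) -> None:
    """Copy a response file, leaving out the --prologue=... option.

    Assumes one option per line, as in the cabal -v2 log above.
    """
    lines = Path(src).read_text().splitlines()
    kept = drop_options(lines, ["--prologue="])
    Path(dst).write_text("\n".join(kept) + "\n")
```

The resulting file can then be passed to Haddock with the '@haddock-response.txt' syntax shown in the log.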
@asr Thanks for confirming!
So you have built a profiled haddock. You should be able to do:
cabal haddock --with-haddock="<path to your haddock-2.18.1>" --haddock-options="+RTS -p -hy -RTS"
Other than that, your error seems to be related to GHC's linker, which is used with TemplateHaskell. But I am not sure how to fix that. I am just hoping cabal does the right thing (tm) when invoking haddock.
cabal haddock --with-haddock="<path to your haddock-2.18.1>" --haddock-options="+RTS -p -hy -RTS"
Running the above command I got the same error:
$ cabal haddock --with-haddock="haddock-2.18.1" --haddock-options="+RTS -p -hy -RTS"
...
100% ( 7 / 7) in 'Agda.Syntax.Parser.Layout'
<command line>: can't load .so/.DLL for: libHSunordered-containers-0.2.8.0-4H1ZIlO2rbIFiwyuQdvUPX.so (libHSunordered-containers-0.2.8.0-4H1ZIlO2rbIFiwyuQdvUPX.so: cannot open shared object file: No such file or directory)
@asr Awww. I have one last idea: try adding -optghc=-dynamic to the haddock-options parameter.
It didn't work.
I got the error only when using a profiled Haddock:
$ cabal haddock --with-haddock=haddock-2.18.1
100% ( 7 / 7) in 'Agda.Syntax.Parser.Layout'
<command line>: can't load .so/.DLL for: libHSunordered-containers-0.2.8.0-4H1ZIlO2rbIFiwyuQdvUPX.so (libHSunordered-containers-0.2.8.0-4H1ZIlO2rbIFiwyuQdvUPX.so: cannot open shared object file: No such file or directory)
If I use a non-profiled Haddock, it works.
I have not been able to reproduce the problem with other libraries.
Just to add another data point: There seems to be a severe regression with recent Haddock versions regarding memory usage. This came up on Travis-CI for my OpenGLRaw project, see https://travis-ci.org/haskell-opengl/OpenGLRaw/builds/356459606. Starting with GHC 8.4.1 and the Haddock shipped with it, the build gets killed due to OOM. I made an experiment locally using stack, measuring the peak RSS with time -v:
- LTS 11.1 (GHC 8.2.2/Haddock 2.18.1): Haddock uses 3.1GB maximum RSS
- nightly-2018-03-22 (GHC 8.4.1/Haddock 2.19.0): Haddock uses 4.4GB maximum RSS
This is a 42% regression, forcing me to switch off Haddock generation for GHC >= 8.4.1, which is a bit unfortunate. Fun fact: Given enough RAM, the actual user/system/wall clock time remains basically constant across the versions. The generated HTML is about 138MB.
I'm not sure if the regression is caused by Haddock itself or the GHC used to compile Haddock, but I thought reporting it here first might be a good idea.
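For anyone wanting to repeat this comparison, here is a rough sketch of the arithmetic and of a peak-RSS measurement similar in spirit to time -v. It assumes a Unix system where Python's resource module is available; `regression_percent` and `peak_rss_of_children` are hypothetical helper names, and the unit of ru_maxrss is platform dependent (kilobytes on Linux).

```python
import resource
import subprocess
from typing import List


def regression_percent(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def peak_rss_of_children(cmd: List[str]) -> int:
    """Run `cmd` and return the peak RSS recorded for waited-for child processes.

    Note: RUSAGE_CHILDREN reports the maximum over all children reaped so far,
    so call this from a fresh process to get a clean number.
    """
    subprocess.run(cmd, check=False)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


# Figures from the comment above: 3.1GB (Haddock 2.18.1) vs 4.4GB (Haddock 2.19.0).
print(f"{regression_percent(3.1, 4.4):.0f}%")  # prints "42%"
```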
Thanks for the data point. That doesn't sound too good.
In the light of this ticket I made https://github.com/haskell/haddock/pull/785 to measure the cost of each haddock phase for OpenGLRaw.
Here are two interesting data points:
*** attachInstances:
attachInstances: alloc=5443993440 time=8645.793
and
*** ppHtml:
ppHtml: alloc=24836718280 time=38637.774
We are spending quite some time in attachInstances and ppHtml and both allocate a considerable amount of memory.
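To put these figures into more familiar units, the two lines can be parsed and converted. A small sketch, assuming the `name: alloc=<bytes> time=<milliseconds>` layout shown above (the units are an assumption based on GHC's own timing output); `parse_phase` is a hypothetical helper:

```python
import re
from typing import NamedTuple, Optional


class Phase(NamedTuple):
    name: str
    alloc_gib: float  # total allocation, converted from bytes to GiB
    seconds: float    # elapsed time, converted from milliseconds


_LINE = re.compile(r"^(\w+): alloc=(\d+) time=([\d.]+)$")


def parse_phase(line: str) -> Optional[Phase]:
    """Parse one 'name: alloc=N time=T' line; return None for other lines."""
    m = _LINE.match(line.strip())
    if m is None:
        return None
    name, alloc, millis = m.groups()
    return Phase(name, int(alloc) / 2**30, float(millis) / 1000.0)


for line in ["attachInstances: alloc=5443993440 time=8645.793",
             "ppHtml: alloc=24836718280 time=38637.774"]:
    phase = parse_phase(line)
    print(f"{phase.name}: {phase.alloc_gib:.1f} GiB allocated, {phase.seconds:.1f} s")
```

Keep in mind that alloc counts everything allocated during the phase, not peak residency, so it is only an indirect hint about the maximum RSS.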
Pinging @harpocrates, maybe you have an idea what might be causing so much memory usage?
I'm unfortunately not surprised. The change to attachInstances that likely caused this regression is https://github.com/haskell/haddock/pull/724. The last version of Haddock was not looking for instances in enough places, which led to some instances not showing up under the data types on which they were defined. (At the bottom of the PR you'll see some of the places this was showing up.)
The reason for this is that we weren't loading all the interfaces where instances might be defined. The PR I linked fixed exactly that: it led to a whole bunch more interfaces being loaded, which led to all the missing instances coming back. That said, because we are loading a whole bunch more interfaces, things also get slower.
There is probably followup work that could be done to be more intelligent about looking in just the right places for interfaces - all I did was replicate what :info does for GHCi.
... That said, because we are loading a whole bunch more interfaces, things also get slower. ...
Just to repeat: In my case, there was basically no speed difference if there is enough memory available, so this isn't really an issue. It is the exorbitant memory usage which makes Haddock unusable on Travis-CI (and other VMs with only a few GB of RAM).
Well, I still think the increased memory usage is probably due to us loading more interfaces than before. I'm not sure what clever tricks we can do to fix this. Going back to having instances missing sounds worse.
One thing that might help on your side is trying to limit the number of modules with orphan instances (or limiting import of modules with orphan instances). I'd have to look at the code again though. I'll try to get around to this sometime in the next week.
From the POV of OpenGLRaw, there are no orphans: OpenGLRaw defines neither classes nor instances. I don't know (and shouldn't know) if there are orphans in the packages OpenGLRaw depends on, and even if there were: There is absolutely nothing I could do about that.
I would be happy to trade time for space, e.g. by having some Haddock command line flag which increases runtime, but reduces peak memory usage (e.g. by reading/forgetting interface files repeatedly or something like that).
Why is it necessary to keep all interface files in memory at once? Should it not be possible to use much less memory by retaining only the information that is needed from each one?
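The idea in this question can be illustrated with a toy model. The sketch below is not Haddock's or GHC's actual code: `Interface`, its fields and `build_instance_index` are hypothetical stand-ins. It shows only the general pattern of consuming interfaces one at a time and keeping a small summary, instead of retaining every loaded interface.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Interface:
    """Hypothetical stand-in for a loaded interface file (not GHC's real type)."""
    module: str
    instances: Dict[str, List[str]]  # type name -> classes it has instances for
    other_data: bytes = b""          # everything else, which the index does not need


def build_instance_index(interfaces: Iterable[Interface]) -> Dict[str, List[str]]:
    """Build a type-to-instances index while holding one interface at a time.

    If `interfaces` is a lazy iterator (e.g. a generator that loads files on
    demand), each interface becomes garbage once its summary has been copied.
    """
    index: Dict[str, List[str]] = {}
    for iface in interfaces:
        for type_name, classes in iface.instances.items():
            # Copy only the small piece of information we need.
            index.setdefault(type_name, []).extend(classes)
    return index
```

Whether GHC's interface loading permits this kind of discard-as-you-go processing is exactly the open question here.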
I suspect it's merely the case of "patches welcome" and not something more fundamental.
This situation is getting to be quite critical; at this point we can no longer build Haddocks for the ghc library on 32-bit Windows without blowing through the 2 GB address space limit.
Here is GHC 8.6.1-alpha2 producing Haddocks for the ghc library (with +RTS -pa -hc -RTS):
And here it is with deddced31cabadf62fe01fff77b094cd005e52a1 reverted:
Given such a drastic difference, it looks like there is potentially a lot to gain by trying out @mpickering's suggestion:
Why is it necessary to keep all interface files in memory at once? Should it not be possible to use much less memory by retaining only the information that is needed from each one?
If everything goes well, this should be entirely a GHC-side optimization of the getNameToInstancesIndex function. I'll try it out.
For now, my nix builds need haddock deactivating if I am to run anything else at all on my 16GB machine, or I get freezes and oomkills everywhere. Has there been any more progress made?
Some progress is happening in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6224. That patch should mean that Haddock doesn't need to use the GHC plugin interface anymore, which hopefully should reduce the amount of memory used by Haddock to be closer to what GHC uses itself (although we'll really have to wait, see, and profile).
I have the vague feeling that something somewhere recently caused a regression in this regard. I have an unchanged project which with recent nixpkgs now needs 8GB of RAM to build documentation.
Edit: Especially: This is not a slow growth, RAM spikes only when rendering some modules and 80% get used up in the one module where the OOM killer kicks in.
I'm closing this venerable ticket; let's resume the discussion on GitLab and deal with regressions as they arrive.