hackage-server icon indicating copy to clipboard operation
hackage-server copied to clipboard

Module listings in the presence of Backpack

Open ezyang opened this issue 8 years ago • 9 comments

Let's talk about package index pages; you know, the thing you see when you navigate to a package page like https://hackage.haskell.org/package/time

Currently, Hackage duplicates the algorithm for rendering a module tree with Haddock. Here is Hackage's copy of ModuleForest: https://github.com/haskell/hackage-server/blob/master/Distribution/Server/Packages/ModuleForest.hs and here is Haddock's copy https://github.com/haskell/haddock/blob/master/haddock-api/src/Haddock/ModuleTree.hs Haddock generated its index from the list of modules that would have been passed to GHC to compile; Hackage generates it from a flattened package description. For the longest time, this didn't really cause any problems because the module tree format didn't ever really change (just read it off of exposed-modules).

Unfortunately, I recently added two new features to Cabal which have broken this assumption:

  • Reexported modules means that exposed modules are not limited to exposed-modules; they may also be defined in reexported-modules. This has meant that Hackage does not actually display reexported modules. To make matters worse, if reexported modules are implemented as links to the reexported module in another package (a reasonable strategy) this logic must be duplicated in both Haddock and Hackage. And the cherry on top: Hackage doesn't even have enough information to actually compute this correctly. More context at https://github.com/haskell/haddock/issues/563

  • Signatures can be inherited from packages, and we don't make it mandatory for a user to specify all signatures in their Cabal file (we can infer it when we are configuring the package.) But there will be Haddock documentation for every real signature in the package. Once again, Hackage has no way of figuring this out when it is generating its index.

I want to argue that it is morally wrong for Hackage to be in the business of generating the module tree that connects to the documentation in question. Not only does Hackage not have enough information to do it properly, Haddock is already in the business of computing it.

So, I think the correct model for Haddock to generate the HTML for the module tree, and then Hackage to transclude it into the index page, in much the same way Hackage transcludes the README into the index page.

Here's the annoying thing though: current Haddock, when invoked by Cabal to build documentation for Hackage, doesn't actually generate the index, because it is passed the --use-contents= flag which stops Haddock from generating the index. And even if it did generate the index, the HTML we are interested in is embedded inside a div inside the index HTML proper.

So, it would seem, to properly fix this, we would have to modify Haddock to generate a module tree which Hackage can transclude in. This is not great, since Haddock is on the same release cycle as GHC. We can always fall back on the old code if Haddock doesn't support this, but this means we can't really delete anything.

An alternative, pragmatic solution is to continue limping along with some less-than-perfect-but-good-enough module tree generation code in Hackage. For example, to get accurate signature listing, you'd simply be forced to list every signature explicitly in your Cabal file to help Cabal out. reexported-modules would remain a disaster for the forseeable future, or perhaps we can levy a Haddock level fix to copy in the documentation.

What do people think?

ezyang avatar Mar 21 '17 06:03 ezyang

"Haddock generated its index from the list of modules that would have been passed to GHC to compile; Hackage generates it from "

I think you accidentally a phrase.

gbaz avatar Mar 21 '17 07:03 gbaz

Fixed thanks.

ezyang avatar Mar 21 '17 07:03 ezyang

So, I think the correct model for Haddock to generate the HTML for the module tree, and then Hackage to transclude it into the index page, in much the same way Hackage transcludes the README into the index page.

as already mentioned on IRC, here I disagree; I think Hackage should have full control about how to render the HTML for the documentation tree; it should instead get all it needs to know about the module tree structure from Haddock, so it can generate the HTML. Same goes for README's btw, which we render ourselves, so we have full control over CSS and what HTML features we support, and can make sure things seamlessly integrate with the rest of the package cover page styling.

PS: We also want to know an approximate module tree for when Haddock hasn't been able to build the docs yet (or fails to build them)

hvr avatar Mar 21 '17 07:03 hvr

An alternate possibility which solves the signatures problem is for us to probe the TarIndex of the documentation tarball to find all files which "look like" modules; in practice, files which are capitalized, to find the complete signatures list.

ezyang avatar Mar 21 '17 07:03 ezyang

In any case, a major concern is that the builder is in better shape than it used to be but should never be relied on to always work -- packages may be unbuildable for a variety of reasons that the builder can never help with.

So it would be wrong to not be able to even generate a module tree when the builder fails.

On the reexports thing I vote to just fix it in hackage. (see also: https://github.com/haskell/hackage-server/issues/63)

On the backpack inferred signature thing I can't follow the exact problem or solution being described.

gbaz avatar Mar 21 '17 07:03 gbaz

So it would be wrong to not be able to even generate a module tree when the builder fails.

Yes, we would have to fall back, so we wouldn't really get to delete any code.

ezyang avatar Mar 21 '17 07:03 ezyang

OK, here is my proposed patch: https://github.com/haskell/hackage-server/pull/575/commits/389d9a46096256ee6de08e42d1d083507a000573

ezyang avatar Mar 21 '17 07:03 ezyang

@ezyang I've merged your proposed patch, as to me it represented an incremental improvement over the current state. If a better approach/patch comes along we can upgrade... :-)

hvr avatar Jul 16 '17 20:07 hvr

This is an issue for the optics package as we rely heavily on reexported-modules (see https://github.com/well-typed/optics/issues/237). Looking at the corresponding pages on Hackage and Stackage I noticed that the latter omits reexported-modules entirely. While that isn't ideal either, it does avoid the long list of links with no docs.

Perhaps a better interim solution would be for Hackage to divide the "Modules" list into "Modules" and "Re-exported Modules"? Then the former would have docs and the latter would not.

adamgundry avatar May 20 '22 10:05 adamgundry