opam-doc
High level view, why bin-doc, and a proposal
This turned into a wall of text, sorry.
I've been reading the code and would appreciate help with my mental model.
The way I see it:
- opam-doc proxies calls to ocamlc(.opt), running bin-doc against the resulting .cmt file.
- bin-doc reads the .cmt file, extracts the doc info, and stores it in a new (?) cmd format.
- opam-doc-index then reads the cmd files and builds html from that using Cow.
That raises questions in my mind:
- Why not just grab the compile target at the ocamlc proxy point and pass it on to ocamldoc, utilizing existing tools/format?
- Would ocamldoc fail with such "arbitrary" input?
- Is this why we need to compile each package in the first place? To get at the cmt files, which contain type information after all the includes and opens are done?
- Is feeding cmt files to ocamldoc possible?
Docs and opam in general
- How does this project relate to build-doc sections in the 'opam' file spec?
Proposal
Provide a custom generator for ocamldoc that spits out json (perfect for web), and give package authors the option to specify build-doc steps that must output a single json file in a known format (e.g. { "package_name" : { "modules" : [...] } }).
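To make the proposed format a bit more concrete, here is a minimal sketch of what emitting such a file could look like. It does not use ocamldoc's generator API at all; the record fields (name, doc) and the function names are assumptions made for illustration, and only the outer { "package_name" : { "modules" : [...] } } shape comes from the proposal.

```ocaml
(* Hypothetical per-module payload; the proposal only fixes the outer shape. *)
type module_doc = { name : string; doc : string }

(* Escape the characters that must not appear raw inside a JSON string. *)
let escape s =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '"' -> Buffer.add_string b "\\\""
      | '\\' -> Buffer.add_string b "\\\\"
      | '\n' -> Buffer.add_string b "\\n"
      | c when Char.code c < 0x20 ->
          Buffer.add_string b (Printf.sprintf "\\u%04x" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* One entry of the "modules" array. *)
let json_of_module m =
  Printf.sprintf "{\"name\":\"%s\",\"doc\":\"%s\"}" (escape m.name) (escape m.doc)

(* The whole document: { "<package>": { "modules": [ ... ] } } *)
let json_of_package pkg modules =
  Printf.sprintf "{\"%s\":{\"modules\":[%s]}}" (escape pkg)
    (String.concat "," (List.map json_of_module modules))

let () =
  print_endline
    (json_of_package "my_pkg"
       [ { name = "Foo"; doc = "Utilities for \"foo\" values." } ])
```

A real generator would fill module_doc from whatever the documentation tool provides; the point here is only that the output side is small.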
Benefits:
- Alleviates the need for compiling the world just to get the docs.
- Gives package owners control over doc output for opam.
- Gives us structured docs for the entire opam repo (eventually).
- Makes building unique doc sites much easier, since the json format is universally understood and accessible.
- Is more robust, as there is no garbage collected from tests/examples/etc.
- Is simpler: the code for a custom generator is trivial and more approachable, compared to the lexer, parser, and custom format requirements of bin-doc.
Cost:
- If opam-doc-index is to be used it will require a significant rewrite to consume json instead of cmd.
- Working code would be dropped, namely, all of bin-doc.
Personally, I don't grow attached to code, and consider this a benefit of a simpler solution; others may disagree.
Anecdotally
To build the doc site that I want, I would have to build a cmd to json serializer of some sort. I would much prefer to build an ocamldoc-json custom generator, because that can be used outside of opam. In fact, I've started, and there is also this.
Any and all comments are very much appreciated. If I'm not making sense, tell me. =)
An update for posterity, based on conversation in #ocaml.
(Note: Some of this is fit for a wiki, but https://github.com/ocamllabs/opam-doc/wiki is inaccessible for some reason.)
The most important point that was made is this: bin-doc is meant to replace ocamldoc upstream.
ocamldoc has the following, difficult to resolve, issues:
- It does not support -packing, meaning it cannot combine multiple packages into a single doc page.
- It cannot handle module includes, meaning that include Module confuses ocamldoc.
The last (both?) of these points are a result of ocamldoc trying to make its own sense out of the input source files, ignoring the work that the compiler does.
bin-doc, on the other hand, takes advantage of the type information that is output during the compile step (with -bin-annot enabled).
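For anyone unfamiliar with why include is awkward for a tool that only looks at source text, here is a small self-contained illustration. The module names are made up, and this says nothing about how ocamldoc or bin-doc behave internally; it only shows that the full contents of a module are not visible in its own source once include is involved.

```ocaml
module Greeter = struct
  (** Greets [name]. *)
  let greet name = "hello, " ^ name
end

module Loud_greeter = struct
  (* Everything in Greeter becomes part of Loud_greeter, but none of it
     is written out here; only the type checker sees the expanded result. *)
  include Greeter

  (** Like [greet], but louder. *)
  let shout name = String.uppercase_ascii (greet name) ^ "!"
end

let () =
  (* greet is reachable through Loud_greeter although it is never
     defined textually inside that module. *)
  print_endline (Loud_greeter.greet "world");
  print_endline (Loud_greeter.shout "world")
```

A documentation page for Loud_greeter should list both greet and shout, and the first of those can only be found by following the include, which is work the compiler has already done.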
With this in mind:
- opam-index-doc is to bin-doc what a custom generator is to ocamldoc.
- The cmd format is a good thing.
- Hence, something like bin-doc-to-json is a valid way forward, as is opam-index-doc. These would be analogous to -html, -latex, etc. generators in ocamldoc.
This addresses most of the questions that I raised before.
The only remaining question is the relationship between opam-doc and the build-doc section in an opam file.
What if we give package authors the option to specify build-doc steps that must output cmd files?
- The resulting cmd files can then be used by opam to build doc pages (similarly to ocamldoc allowing for different output formats).
- As before, this gives developers finer control over what goes into the docs.
- As before, it alleviates some of the problem of having to create a separate compiler switch and rebuilding the world.
That is unless I'm still misunderstanding something. =)
> bin-doc reads the .cmt file, extracts the doc info, and stores it in a new (?) cmd format.
It actually reads the .ml(i) file to extract the doc info. Currently the doc info is not in the .cmt(i) file. opam-doc then combines the .cmt(i) file with the .cmd(i) file to produce the documentation.
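As a rough sketch of the file pairing described here: for each source file there is a typed artifact from the compiler and a doc artifact next to it. The .cmt/.cmti names follow the compiler's -bin-annot convention; the .cmd/.cmdi names are taken from the description above, and the type and function names are made up for illustration. Build directories and -o are ignored.

```ocaml
(* The two files that would be combined to document one source file. *)
type artifacts = { typed : string; docs : string }

(* Map foo.ml to foo.cmt + foo.cmd, and foo.mli to foo.cmti + foo.cmdi.
   Anything else is not an OCaml source file. *)
let artifacts_of_source src =
  let base = Filename.remove_extension src in
  if Filename.check_suffix src ".mli" then
    Some { typed = base ^ ".cmti"; docs = base ^ ".cmdi" }
  else if Filename.check_suffix src ".ml" then
    Some { typed = base ^ ".cmt"; docs = base ^ ".cmd" }
  else None

let () =
  List.iter
    (fun src ->
      match artifacts_of_source src with
      | Some a -> Printf.printf "%s -> %s + %s\n" src a.typed a.docs
      | None -> Printf.printf "%s: not an OCaml source file\n" src)
    [ "foo.ml"; "foo.mli"; "README" ]
```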
> ocamldoc has the following, difficult to resolve, issues:
> - It does not support -packing, meaning it cannot combine multiple packages into a single doc page.
> - It cannot handle module includes, meaning that include Module confuses ocamldoc
Another important issue is with producing fully cross-referenced documentation across all the packages in OPAM. Since ocamldoc just uses strings to handle cross-references, it cannot differentiate between the Foo module in one package and the Foo module in another package. opam-doc, on the other hand, knows which module was actually linked to during compilation and can produce a correct reference.
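A toy model of the difference, to illustrate the ambiguity: a reference stored as a bare module name can match several packages, while a reference that records the owning package matches at most one. The record fields, package names, and URL scheme below are all hypothetical and are not taken from opam-doc's actual data structures.

```ocaml
(* A name-only reference, as a string-based tool would store it. *)
type by_name = string

(* A resolved reference that also records the owning package. *)
type resolved = { package : string; modname : string }

(* Two packages that both ship a module called Foo. *)
let universe =
  [ { package = "pkg_a"; modname = "Foo" };
    { package = "pkg_b"; modname = "Foo" };
    { package = "pkg_b"; modname = "Bar" } ]

(* Looking up by name alone may return several candidates. *)
let lookup_by_name (name : by_name) =
  List.filter (fun r -> r.modname = name) universe

(* A resolved reference identifies at most one entry. *)
let lookup_resolved r = List.find_opt (fun r' -> r' = r) universe

(* Hypothetical URL scheme for a cross-package doc site. *)
let url_of r = Printf.sprintf "/%s/%s.html" r.package r.modname

let () =
  Printf.printf "candidates for Foo: %d\n" (List.length (lookup_by_name "Foo"));
  match lookup_resolved { package = "pkg_b"; modname = "Foo" } with
  | Some r -> print_endline (url_of r)
  | None -> print_endline "unresolved"
```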
> The last (both?) of these points are a result of ocamldoc trying to make its own sense out of the input source files, ignoring the work that the compiler does.
Yes, that's basically it. The compiler keeps a lot more information than it did when ocamldoc was originally written; most of ocamldoc's front-end is now obsolete.
> Hence, something like bin-doc-to-json is a valid way forward, as is opam-index-doc. These would be analogous to -html, -latex, etc. generators in ocamldoc.
Yes, but I wouldn't rush to make something just yet. This is still very much a prototype and the final version will probably be a bit different. For a start, I'm currently thinking about putting the documentation info in the .cmt(i) files by default, and only using a separate .cmd file for alternative documentation (e.g. translations) and things which have no corresponding .cmt file (like tutorials, another intended use case for this work).
It is also worth noting that when the work is upstreamed there will no longer be any need for a special compiler switch.