alpaca
Doc strings
We need to be able to (eventually) output docs for modules (e.g. javadoc, godoc, etc). I'm fine with something like:
The block comment immediately preceding the first definition of a function (or its type specification) is used as the documentation for it. The block comment immediately preceding the module declaration is used as the overall module documentation.
And maybe use markdown for formatting. Open questions to me are:
- listing parameter names and descriptions. I like this about javadoc and would prefer to have it in Alpaca as well.
- code blocks in triple back-ticks or drop in a little `<code>...</code>` set of tags?
- referring to variables with `@`, typer checks to make sure the references are to good variables but this might depend a bit on what we decide with #133.
Would love more ideas/opinions on this.
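To make the proposed rule concrete, here is a minimal Python sketch of "the block comment immediately preceding a definition is its documentation". It is an illustration only, not Alpaca's parser. It assumes block comments are written `{- ... -}`, that top-level functions start with `let` and modules with `module`, and the name `extract_docs` is hypothetical.

```python
import re

# Assumptions (not confirmed by this thread): block comments look like
# {- ... -}, functions start with `let name`, modules with `module name`.
# The comment must end on the line directly above the definition.
DOC_PATTERN = re.compile(
    r"\{-(?P<doc>(?:(?!-\}).)*)-\}[ \t]*\n"    # one block comment, no nesting
    r"(?P<kind>module|let)\s+(?P<name>\w+)",   # the definition right after it
    re.DOTALL,
)


def extract_docs(source):
    """Map (kind, name) to the block comment directly above that definition."""
    docs = {}
    for match in DOC_PATTERN.finditer(source):
        key = (match.group("kind"), match.group("name"))
        # setdefault keeps the comment on the FIRST definition of a name
        docs.setdefault(key, match.group("doc").strip())
    return docs


SAMPLE = """{- Math helpers. -}
module math_utils

{- Adds two integers. -}
let add x y = x + y

{- not attached to anything -}

let sub x y = x - y
"""

print(extract_docs(SAMPLE))
```

The comment separated from `sub` by a blank line is ignored, which is one possible reading of "immediately preceding".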
I really like how Elm and Idris handle docstrings, especially the idrisexample syntax and markdown support.
Or look at Elixir, it has @doc strings that are markdown, can contain doctests (actual test code that is run at test time), etc...
I agree with @OvermindDL1, LFE also has docstrings (but we don't support doctests though)
I'm wary of heredocs encouraging overly verbose module files, but I suppose support does not equate to encouragement of verbosity. Maybe in a user guide or similar we could advocate against giant in-module docs. FWIW I recommend using the EDoc format internally for maximum compatibility. I was experimenting with that in lodox but have been tied up with other things lately.
@yurrriq is right, EDoc format should be used to ensure compatibility. Also keep in mind that documentation for alpaca code is probably going to be published in hexdocs.pm
Regarding verbose module files filled with documentation I think there are pros/cons. In Elixir there are modules that contain 5 lines of code and 50 lines of documentation, but that plays nice once you render the docs. I also find it useful to have some examples available.
An agreeable solution to me would be to add support for collapsing docstrings in alpaca-mode.
I would generally prefer block comments for docs rather than heredoc strings, I think it keeps things a little cleaner in general and I believe is still amenable to collapsing in alpaca-mode. Additionally doctests are something I'm a bit jealous of in both Rust and Elixir - I'd definitely like for code blocks in a doc comment/string to be type-checked and run as tests as well. I do like parts of EDoc (e.g. @param as in javadoc as well) but I suspect it will be limiting to adopt it entirely. Does Elixir just output edoc-compatible stuff or something different?
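As a rough illustration of the doctest idea, the first step would be pulling the code blocks out of a doc comment so they can later be type-checked and run. The Python sketch below covers only that extraction step. It assumes markdown-style triple-backtick fences, and `extract_code_blocks` is a hypothetical name; the type-checking and execution step is left out because it depends on the compiler.

```python
# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def extract_code_blocks(doc):
    """Return the body of every closed fenced code block in a doc string."""
    blocks = []
    current = None  # None means we are outside a code block
    for line in doc.splitlines():
        if line.strip().startswith(FENCE):
            if current is None:
                current = []                       # opening fence
            else:
                blocks.append("\n".join(current))  # closing fence
                current = None
        elif current is not None:
            current.append(line)
    return blocks  # an unclosed trailing block is dropped


DOC = "Adds two numbers.\n\n" + FENCE + "\nadd 1 2\n" + FENCE + "\n\nMore prose."
print(extract_code_blocks(DOC))
```

Each extracted block could then be handed to the type checker and a test runner, or linked to an existing test instead of being written in-line.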
Something I'd like to consider - but have no idea how to go about it yet - is multi-lingual docs or some relatively simple way to internationalize them/support simpler translation efforts.
Could someone explain to me the need for compatibility with EDoc? Elixir's ExDoc outputs HTML files, and in my opinion the documentation it produces is more readable and comprehensible than what EDoc produces. Anyway, I'm wondering what the advantages are.
FWIW I'd opt for markdown formatting too 🙂
EDIT: I've just realised it might be for generating docs of entire project, written in both Erlang and Alpaca.
Another interesting idea would be to embed documentation into compiled files (which is done by the Elixir compiler), which would make docs accessible from within the REPL (which is awesome). It's probably a really long-term idea, but might be worth considering.
@arkgil: I'm advocating using EDoc as the internal format. From that you can generate whatever HTML you want. I agree the default output is not so desirable. The point of using that format is that we'd be completely compatible with existing Erlang docs and tooling, which would make, e.g. generating polyglot docs a breeze.
@j14159: Elixir uses its own doc records, something like #elixir_doc_v1{} which are not (immediately) compatible with EDoc. They then embed the docs in a chunk of the compiled .beam. We do something similar in LFE.
I suppose another approach would be to use a custom #alpaca_doc_v1{}-filled chunk and then write code to translate that to/from EDoc format. Perhaps then we can have our cake and eat it too.
@arkgil: Embedding docs in beam chunks is simple, so I'd say that's short-term low-hanging fruit. :smile:
PS One issue I ran into when working toward translating LFE docs to EDoc format was that we deal with docs on an application/module level (as in, we know about sibling modules in the parent app, etc) whereas EDoc seems to go about it on a per-file basis.
Potentially informative references:
Thanks @yurrriq ! I wasn't aware it's simple, thank you for the references 🙂
For reference, after Erlang/OTP 20 is out, I will send a proposal to OTP and the community to unify how documentation is shared across BEAM languages. The goal is to have a BEAM chunk that stores the documentation with some metadata. For example, you could think of a "Docs" chunk as a list of tuples where each entry looks like this (written in Erlang):
{{foo, 1}, <<"this is the documentation">>, [{line, 13}, ...]}
Somewhere we also need to store the format of the documentation (markdown, edoc, etc).
This chunk should replace the "LDoc" chunk used by LFE and the "ExDc" chunk used by Elixir. If we can agree on the same chunk format, then we can make tools like ExDoc generate documentation regardless of the language. Functions for accessing documentation in the shell should also work across languages, regardless of whether they were written in Elixir, Alpaca or LFE.
The only issue is the documentation format. If Elixir docs are written in Markdown and a language does not know how to parse Markdown, then they will have to choose to either not show the Elixir docs or show them in a raw format (i.e. in Markdown). This works fine for Markdown but it would likely be painful if your documentation is written in XML. However, regardless of the format, if this new chunk is accepted, all languages should be capable of showing docs for themselves and at least Erlang.
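The shape of such a chunk, and the "show it raw if you cannot parse the format" fallback, can be sketched in Python as below. This is a model of the idea, not the proposed chunk layout: the `format` field, its MIME-style value, and the names `DOCS_CHUNK` and `fetch_doc` are all assumptions made for illustration.

```python
# Each entry mirrors the {{Name, Arity}, Doc, Metadata} tuple shown above.
# ASSUMPTION: the format is stored once per chunk, as a MIME-style string.
DOCS_CHUNK = {
    "format": "text/markdown",
    "entries": [
        (("foo", 1), "this is the documentation", {"line": 13}),
    ],
}


def fetch_doc(chunk, name, arity, renderers):
    """Look up docs for name/arity; render them if the format is understood.

    `renderers` maps a format string to a function taking and returning text.
    When no renderer is known for the chunk's format, the raw text is returned.
    """
    for (entry_name, entry_arity), doc, _metadata in chunk["entries"]:
        if (entry_name, entry_arity) == (name, arity):
            render = renderers.get(chunk["format"])
            return render(doc) if render else doc
    return None  # no documentation stored for this function


# A language that cannot parse Markdown passes no renderers and gets raw text.
print(fetch_doc(DOCS_CHUNK, "foo", 1, renderers={}))
```

A shell or doc tool written for one BEAM language could use the same lookup for modules compiled from any other, differing only in which renderers it supplies.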
This new chunk is almost fully orthogonal to this discussion except when it comes to storage. Elixir (and possibly LFE - please correct me @yurrriq) stores the documentation in the BEAM chunk at compilation time. This is easy in Elixir because the documentation is an annotation:
@doc "says hello world"
def hello_world do
  IO.puts "hello world"
end
However, if you write the documentation in the same syntax as code comments, converting them into docs may be more complicated. If you want to do it during compilation time, you will need to parse the comments out in the alpaca parser. Otherwise you will need an explicit step which takes the documentation and embeds it into the .beam file (which will likely be the approach that needs to be taken in Erlang itself - since a lot of the documentation is in a separate XML file).
A third approach is the one done by Rust, which is a mix of both solutions. In Rust they use the code comments syntax plus an extra token to mark it as documentation. So in Rust:
/// Hello world
fn main ...
becomes:
#[doc = "Hello world"]
fn main ...
So in Rust you use the code comments syntax but they end up embedded as an annotation, quite similar to Elixir.
FYI, I am working on a documentation viewer and extractor for the Erlang shell - docsh. This is a free time effort, though, so the pace is rather slow.
@josevalim's idea of a common chunk format for all BEAM languages is definitely the way to go - let's see how the OTP team responds. It would solve the problem of accessing docs across languages without hacks like this one to access docsh-generated documentation from IEx (where :beam_doc_provider is docsh_iex). Without a common format, we would need such a doc provider for each of ExDc, LDoc, Alpaca doc chunk, etc. On the other hand, the providers would give language authors freedom to choose the documentation format.
Regarding the format, I do not second @yurrriq's idea to make EDoc the default. My personal impression is that the leeway in its design (or evolution?) led to too many similar-yet-different @-tags, which end up not being used consistently. The most prominent examples are @type and @spec, now redundant due to the corresponding attributes. Moreover, even OTP itself is not documented completely in EDoc, but mostly in out-of-source XML. A lightweight Markdown-based approach (e.g. Elixir's) seems much more convenient.
I like the BEAM chunk idea that @josevalim is suggesting, seems in line with the AST chunk work lately as well. I don't really have a problem with producing edoc-compatible stuff (or leaving that to a tool or some sort of plugin) but the issue with external docs in XML raised by @erszcz makes me second-guess the utility of it.
I'm pretty firm on wanting things like doc-tests but have discussed this with others (h/t @talentdeficit for ideas and inspiration) and have thoughts about linking or tagging tests in docs for outputting as examples in-line later rather than requiring in-line up front.
Things I don't know how to handle yet include but are not limited to:
- validating docs against code, e.g. types in docs have to agree with the code, or argument lists, etc. I don't know how this should work yet but I'd like to have it.
- documentation for different function heads. E.g. should we interleave as examples?
I'm also still pretty keen on markdown for docs in block comments but I think we'll need some sort of annotations in them to make them really useful.
> I'm also still pretty keen on markdown for docs in block comments but I think we'll need some sort of annotations in them to make them really useful.
FWIW, Elixir has a single "extension" to markdown which is using backticks to provide autolinks. When generating the HTML documentation, `Foo.bar/3` will automatically link to the function bar with arity 3 in module Foo. `c:Foo.bar/3` links to a callback in the module Foo with name bar and arity 3. `t:Foo.bar/2` links to the type bar/2 in Foo. `Foo` links to a module named Foo. `bar/3` links to a local function named bar with arity 3.
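For a sense of how little machinery these autolinks need, here is a small Python sketch that classifies backticked references of the five shapes listed above. It is not ExDoc's implementation. The regular expressions are a simplification (they assume capitalized module names and ignore function names ending in `?` or `!`), and `classify` and `find_references` are hypothetical names.

```python
import re

BACKTICKED = re.compile(r"`([^`]+)`")

# ASSUMPTION: module names are capitalized and dot-separated, as in Elixir.
MODULE_ONLY = re.compile(r"^[A-Z]\w*(?:\.[A-Z]\w*)*$")
FUNCTION_LIKE = re.compile(
    r"^(?:(?P<prefix>[ct]):)?"                      # c: callback, t: type
    r"(?:(?P<module>[A-Z]\w*(?:\.[A-Z]\w*)*)\.)?"   # optional Module.
    r"(?P<name>[a-z_]\w*)/(?P<arity>\d+)$"          # name/arity
)

KINDS = {"c": "callback", "t": "type", None: "function"}


def classify(reference):
    """Describe what a backticked reference links to, or None for plain code."""
    if MODULE_ONLY.match(reference):
        return {"kind": "module", "module": reference, "name": None, "arity": None}
    match = FUNCTION_LIKE.match(reference)
    if match is None:
        return None
    return {
        "kind": KINDS[match.group("prefix")],
        "module": match.group("module"),  # None means local to this module
        "name": match.group("name"),
        "arity": int(match.group("arity")),
    }


def find_references(doc):
    """Classify every backticked span in a doc string, skipping plain code."""
    found = (classify(text) for text in BACKTICKED.findall(doc))
    return [ref for ref in found if ref is not None]


print(find_references("See `Foo.bar/3`, `t:Foo.bar/2` and `x + 1`."))
```

Since Alpaca already has type information, a resolver like this could also check that each reference points at something that exists.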
> documentation for different function heads
In Erlang, function heads are an implementation detail. I am not sure how much that holds in Alpaca since it is a statically typed language. But imagine that you decided to change the implementation of a function and move the different clauses to a private function: would this force you to rewrite the docs? If so, is that a good thing or a bad thing? A lot of people object to writing documentation alongside the source code exactly because you may end up coupling the two, so it is worth considering how and when code changes should affect the docs.
I'd honestly say that the 'format' of the doc (and thus you could embed multiple formats in the BEAM files too) should be a MIME type or some specific equivalent thereof.
In the short term, if we parse the docstrings and store in the AST, we can just write it out when we write out the type information in the AST into the module attributes. It would be easy enough to walk that structure to extract the comments straight from the compiled module. It is a little ugly doing it like this so having a dedicated chunk in the generated BEAM would be great if that gets adopted.
I'm broadly in favour of markdown and the simple approach of Elixir with autolinking backticks as @josevalim describes - we'll already have type information, whether inferred or specified via type signatures, so I don't know if we'd need specific kinds of annotations in docstrings other than simple hotlinks.