Comonicon.jl
Proposal: Improved docstring handling
As described in #212, I think the current state of docstring parsing could benefit from some TLC. I see there's been some effort with the (undocumented) # Intro/# Introduction section, however this isn't enough.
I propose that docstrings be handled the following way:
- The "brief description" is simply taken to be the first paragraph of the docstring, e.g.
    The brief is this bit

    But not this bit
- Special/known sections (arguments, options, etc.) be extracted, as they are now
- All unrecognized sections/content be put in a big descriptions object
When called/rendered
- The description should be shown, then the special/known sections
- Markdown paragraphs should be reflowed (as `splitlines` currently does), ideally not counting control sequences as part of the line length (see tecosaur/Org.jl:src/render/formatting.jl#L28-L102 as an example, there's also an equivalent function in stdlib/Markdown:src/render/terminal/formatting.jl but it simply isn't as good)
- Ideally, some other markdown objects should be formatted in a syntax-sensitive manner, e.g. wrapped items should be indented appropriately, but this is very much a lower priority compared to the rest
- Remaining markdown objects (e.g. code blocks) should be inserted verbatim, without modification
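As a rough illustration of the reflow point above, here is a minimal Julia sketch of word wrapping that measures text by its visible width. The names `ANSI_SGR`, `visible_width`, and `reflow` are hypothetical and are not part of Comonicon, Org.jl, or the Markdown stdlib. It assumes greedy wrapping and only recognises SGR colour/style sequences.

```julia
# Hypothetical helpers for illustration only.

# Matches ANSI SGR sequences such as "\e[1m" (bold) or "\e[0m" (reset).
const ANSI_SGR = r"\e\[[0-9;]*m"

# Width of `s` as displayed, ignoring SGR control sequences.
visible_width(s::AbstractString) = textwidth(replace(s, ANSI_SGR => ""))

# Greedy word wrap; a word is kept on the current line while the
# visible width of the line stays within `width`.
function reflow(text::AbstractString, width::Integer)
    lines = String[]
    current = ""
    for word in split(text)
        if isempty(current)
            current = String(word)
        elseif visible_width(current) + 1 + visible_width(word) <= width
            current = current * " " * word
        else
            push!(lines, current)
            current = String(word)
        end
    end
    isempty(current) || push!(lines, current)
    return lines
end

# Usage: the bold markers do not count towards the 7-column limit.
foreach(println, reflow("\e[1maaa\e[0m bbb ccc", 7))
```

A wrapper that counted raw characters would break the first line early here, since the escape sequences alone take up eight characters.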
What do you think?
> The "brief description" is simply taken to be the first paragraph of the docstring. e.g.
What's the difference between this and the Intro section approach? I think the reason I chose to do this instead of the first paragraph (actually it was the first sentence of the description) is to be more explicit.
> Markdown paragraphs should be reflowed (as splitlines currently does), ideally not counting control sequences as part of the line length (see tecosaur/Org.jl:src/render/formatting.jl#L28-L102 as an example, there's also an equivalent function in stdlib/Markdown:src/render/terminal/formatting.jl but it simply isn't as good)
>
> Ideally, some other markdown objects should be formatted in a syntax-sensitive manner, e.g. wrapped items should be indented appropriately, but this is very much a lower priority compared to the rest
Yeah, I have to admit not much effort has gone into Comonicon's help page formatter (it just reuses whatever Julia's Markdown module provides). It would be nice to have something more specific for the CLI help page (or even the man page). I was too lazy to write it, but yes, ideally we should also include the markdown AST in the Description node and pass it to the code generators.
> What's the difference between this and the Intro section approach? I think the reason I chose to do this instead of the first paragraph (actually it was the first sentence of the description) is to be more explicit.
Ah, I may have got the behavior of Intro slightly mixed up. I thought it sets the long description?
If Intro does set the brief and not the long description, and only supports a single paragraph, isn't that just equivalent to grabbing the first paragraph with some extra syntax?
I'd say the advantage of not bothering with Intro in such a case is that it is (I think, at least) a fairly intuitive behavior, and one less thing the user needs to read about.
Reading the first sentence of the description sounds a bit dodgy, considering that you could have a sub-command like `jekyll - Expose Mr. Hyde`, which would be naively split at `Mr.`. By comparison, I don't anticipate any such issues with the first paragraph.
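To make the difference concrete, here is a small Julia sketch comparing the two ways of extracting a brief. The docstring text and variable names are made up for illustration, and the "first sentence" rule shown is a deliberately naive one (cut at the first period followed by a space), not necessarily what Comonicon did.

```julia
doc = """
Expose Mr. Hyde and report what he did.

A longer description that belongs in the full help text.
"""

# Naive "first sentence": cut at the first period followed by a space.
first_sentence = first(split(doc, ". "))

# "First paragraph": cut at the first blank line.
first_paragraph = strip(first(split(doc, r"\n[ \t]*\n")))

println(first_sentence)   # stops after "Mr"
println(first_paragraph)  # the whole first paragraph
```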
> Yeah, I have to admit not much effort has gone into Comonicon's help page formatter (it just reuses whatever Julia's Markdown module provides). It would be nice to have something more specific for the CLI help page (or even the man page). I was too lazy to write it, but yes, ideally we should also include the markdown AST in the Description node and pass it to the code generators.
Thinking more about this, I think you've actually done too much work perhaps. The Markdown stdlib already supports reflowing the text to fit a certain width, so you could just use IOContext to set the appropriate width + 2 and strip the 2 space margin from the start of each line.
> Ah, I may have got the behavior of Intro slightly mixed up. I thought it sets the long description?
Sorry, I forgot that the convention is actually as follows: Intro is for the long description, and the short description is the first paragraph that has no section header, e.g.
https://github.com/comonicon/Comonicon.jl/blob/master/test/frontend/markdown.jl#L31
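For readers who don't want to open the test file, the convention can be sketched roughly as below. This is a simplified, hypothetical splitter written for this thread (`split_docstring` is not Comonicon's actual parser, which works on the Markdown AST); it assumes sections are introduced by level-one `# ` headers.

```julia
# Hypothetical helper, not Comonicon's implementation.
function split_docstring(doc::AbstractString)
    preamble = String[]                        # lines before the first header
    sections = Dict{String,Vector{String}}()   # header title => body lines
    current = nothing
    for line in split(doc, '\n')
        m = match(r"^#\s+(.*)$", line)
        if m !== nothing
            current = String(strip(m.captures[1]))
            sections[current] = String[]
        elseif current === nothing
            push!(preamble, line)
        else
            push!(sections[current], line)
        end
    end
    # Brief: the first paragraph of the header-less leading text.
    brief = String(first(split(strip(join(preamble, '\n')), r"\n[ \t]*\n")))
    bodies = Dict(k => String(strip(join(v, '\n'))) for (k, v) in sections)
    return (brief = brief, sections = bodies)
end

parts = split_docstring("""
Subcommand that does something.

# Intro

Some longer description of the subcommand.

# Options

- `-o, --opt`: description
""")

println(parts.brief)              # short description
println(parts.sections["Intro"])  # long description
```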
> Thinking more about this, I think you've actually done too much work perhaps. The Markdown stdlib already supports reflowing the text to fit a certain width, so you could just use IOContext to set the appropriate width + 2 and strip the 2 space margin from the start of each line.
I don't think so? https://github.com/JuliaLang/julia/blob/master/stdlib/Markdown/src/render/plain.jl#L116
So I think the current behaviour is actually already what you want, but it needs to be documented better. I have to admit the documentation is a bit out of date and needs some more work to make it more understandable.
> I'd say the advantage of not bothering with Intro in such a case is that it is (I think, at least) a fairly intuitive behavior, and one less thing the user needs to read about.
I have some further thoughts on this. I think:
- this does not necessarily reduce what the user needs to learn: you still need to know that you have to write a second paragraph, so it is no different from writing an `Intro` section.
- this makes things less explicit, which is what I'm against. I'd rather users know what they are doing than learn less.
So I'll close this issue after having more documentation mentioning the docstring syntax.
OK I added more description about this in the convention section now.
> I have some further thoughts on this, I think
> - this does not necessarily reduce the knowledge user needs to learn, you still need to know that you have to write a second paragraph, so it has no difference between doing an Intro section.
If you're going to write an expanded description, you're going to write a second paragraph naturally. Without reading any documentation I'd think many people would write something like this (I know I have):
Subcommand that does something.

Some longer description of why this subcommand exists and
the precise nature of its behaviour, perhaps also
mentioning or even discussing how it interacts with other
subcommands.
A large part of what's nice about Comonicon is implicit information/specification, such as being able to do:
# Options
- `-o, --opt`: description
Instead of being more explicit with something like:
Option(shortform="-o", longform="--opt", description="...")
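To illustrate what the implicit form buys you, here is a minimal sketch of turning such a bullet into structured data. `parse_option` and its regex are hypothetical, not how Comonicon parses options, and they only cover the `-o, --opt` shape shown above.

```julia
# Hypothetical helper for illustration only.
# Parses a line like "- `-o, --opt`: description" into a named tuple,
# or returns `nothing` if the line does not have that shape.
function parse_option(line::AbstractString)
    m = match(r"^-\s+`(-\w),\s*(--[\w-]+)`:\s*(.*)$", line)
    m === nothing && return nothing
    return (short = String(m.captures[1]),
            long = String(m.captures[2]),
            description = String(m.captures[3]))
end

opt = parse_option("- `-o, --opt`: description")
println(opt)
```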
Also, with # Intro it seems to shout at you if you try to have more than one paragraph? This is a strange restriction...
ERROR: section Intro/Introduction can only have one paragraph
> If you're going to write an expanded description, you're going to write a second paragraph naturally. Without reading any documentation I'd think many people would write something like this (I know I have):
This is hard to decide, as I don't want people to be unaware of the effect it has on their generated CLI, but I agree that separating these into paragraphs is more natural. Maybe I'll just leave this issue open for now and see if anyone else wants to comment.
I also lean towards the first/second paragraph division, which is what is usually done in documentation for Python. I would leave the "Intro" section for more detailed docstrings maybe (?)