Overhaul of the Manual
Over the last few days, I've completely overhauled the jq manual. Some of the most important changes:
- I've regrouped the chapters. I tried to make it such that the chapters build on each other; for example, to explain basic filters such as |, it is useful to know how to construct values first, and to understand paths such as .posts[0], it is useful to understand | first.
- To help with finding the information, the TOC can now be expanded to show subsections of sections.
- I've annotated many filters with compatibility notes that document divergent behaviour in other implementations. Currently, this covers mostly my own jq interpreter jaq, but I expect this to be useful for gojq too.
- I've reworked the visual appearance of the man page, such that longer blocks (such as examples or notes) are visible as such.
- I've corrected bugs in the manual as I went along; for example, the behaviour of arithmetic-update assignments.
- I've also added a large number of new examples.
- I significantly reworked the sections on paths, definitions, and assignments. These are nearly rewrites.
All in all, this PR adds a lot of new information to the jq manual and tries to improve its structure.
To see it in all its glory, you need Pandoc >3.0. Go to jq/docs, run pipenv sync, then
pipenv run python3 build_website.py --root /output && python3 -m http.server
and open http://0.0.0.0:8000/output/manual in your browser.
Please let me know if I can do something to get this merged. This builds on and supersedes my previous PR (#3186).
Very impressive!
I followed your directions for viewing the documentation and would just point out that they assume 'markdown' is already installed.
Here is a batch of comments and suggestions based on a quick perusal.
(1) Re:
Note The booleans can be defined by: def true: 0 == 0; def false: 0 != 0;
This information box should be deleted, mainly because it isn't really true as can be seen by running:
jq -n 'def true: 0 == 0 | debug; true'
(Perhaps you meant 'might have been defined' but even that would be too much of a distraction here.)
(2) Re:
but in expressions such 1E1234567890,
Should read: ... such as 1E1234567890,
(3) Re: jq disposes of many builtin functions f
Should read: jq defines many ...
(4) Re: Parenthesis work as a grouping operator
Should read: Parentheses work ...
(Parenthesis is a singular noun.)
(5) Re: complex path
I think that the phrase "compound path" would be more suitable and less confusing here. (It's potentially confusing in part because the jq manual uses the phrase "complex assignment" appropriately, and the two concepts should be kept distinct.)
(6) Re:
These filters only produce the values true and false, and so are only useful for genuine Boolean operations,
Suggestion: omit the clause:
and so are only useful for genuine Boolean operations,
as it suggests there is something wrong about using and and or
with non-boolean arguments.
(7) Re: Note The expressions f and g and f or g ...
Suggestion: omit the box entirely.
(First, it's a big distraction; second, the point has been made elsewhere; third, the last line ("This yields twice true ...") is confusing. "This yields true twice, thus certifying the equivalence." would be clearer.)
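As a quick illustration of point (6): and and or also accept non-boolean operands, with only null and false counting as falsy. The following is a minimal sketch, assuming a jq binary is available on the PATH; the operand values are arbitrary choices made for this illustration.

```shell
# Each parenthesised expression mixes non-boolean operands.
# In jq, only null and false are falsy; 0 and strings are truthy.
result=$(jq -n -c '[(1 and "a"), (null or 0), (null and true), (false or null)]')
printf '%s\n' "$result"
# prints [true,true,false,false]
```

The outputs are always booleans, but the inputs need not be, which is why the clause about "genuine Boolean operations" could mislead.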
Thanks for your remarks, @pkoppstein --- I've integrated all of them in 471b16e.
I suppose that pipenv sync suffices to install the markdown dependency?
@01mf02 wrote:
> I've integrated all of them
I hope that means you found them useful :-)
> markdown
I just installed it.
Here's Batch #2:
- Re:
It is also possible to define functions in jq, which is used pervasively to define jq’s standard library. (In fact, many jq functions such as map and select are written in jq.)
The above is a bit ungrammatical. Also, is "standard library" the same as builtin.jq or does it mean all the defined functions, including builtin.jq? Anyway it might be best to go for brevity here, perhaps something along the lines of
It is also possible to define functions in jq itself. In fact, many of jq's built-in functions,
including map and select, are written in jq.
- Re:
So, there’s generally a cleaner way to solve most problems in jq than defining variables.
Comment: I realize this sentence comes from older versions of the manual, but I think it's a good time to improve it (especially in light of the need for variables described in the "Defining Functions" subsection) or even retire it completely. After all, the first paragraph of this subsection already states that "variables aren’t usually necessary in order to use a value twice". How about combining it with the sentence that follows it, perhaps along the lines:
So, variables are often unnecessary and sometimes even best avoided, but jq does let you define variables using the syntax: f as $x.
- Re: def while1( ....
Comment: Introducing while1 here is confusing. As I understand it, you want to juxtapose the two defs for ease of comparison. One way to do this would be with "#" comments, but if you want to avoid additional commentary, I would suggest simply renaming "while1" to "while_defined_naively", or some such.
Also, these wide code blocks (the ones that typically cause a slider to appear)
make reading very difficult. In this particular case, the problem would
be solved if the two snippets were written using one of the "indentation" forms
of if then else end e.g.
if ... then ... else end
- Re: The filter a = b is equivalent to b as $x | a |= b.
My guess is that you meant b as $x | a |= $x here;
otherwise it would be pointless at best and incorrect at worst (e.g. if b is input).
Also, should some caveat or caution be introduced about the assumptions behind the choice of the name of the variable?
- Re:
jaq uses a different approach than jq and gojq to run assignments, which does not construct compound paths during assignments.
There are a couple of English-language problems with this sentence. First, one would not normally say "run assignments" here; and secondly, the "which" is in apposition to "assignments" but is intended to refer back to jaq.
How about reworking the entire "Compatibility" note along the following lines:
jaq's approach to handling assignments is quite different from that of
jq and gojq. Specifically, jaq executes assignments without
constructing compound path expressions. This means that jaq does not
allow certain filters on the left-hand side of assignments, notably
f? and label $x | f. jaq’s approach is generally more
performant, but in certain scenarios, jaq and jq will produce different
results, in particular when using f |= empty. However, for the
examples in this section, jq and jaq yield the same outputs.
I have now also integrated all remarks from your second batch.
> I hope that means you found them useful :-)
Yes, tremendously!
> Also, these wide code blocks (the ones that typically cause a slider to appear) make reading very difficult.
Are you reading the jq manual on a phone? Because I typically do not see a slider appearing ... Anyway, I followed your suggestion and broke the code example into shorter lines.
> My guess is that you meant b as $x | a |= $x here; otherwise it would be pointless at best and incorrect at worst (e.g. if b is input). Also, should some caveat or caution be introduced about the assumptions behind the choice of the name of the variable?
There must be some evil spell around = and += that makes every attempt to define these operators erroneous. :)
Thank you for spotting this, I have corrected my mistake. I also clarified that $x must be fresh at this point.
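For illustration, the difference between the two formulations, and the reason $x has to be fresh, can be sketched with the jq CLI. This assumes a jq binary on the PATH; the sample objects and the shell variable names are made up for this example and do not come from the manual.

```shell
# Sample inputs (assumptions for this illustration).
doc='{"a": {"b": 10}, "b": 20}'
doc2='{"a": [10, 20], "b": 1}'

# a = b: the right-hand side sees the whole input, so .a becomes 20.
plain=$(printf '%s' "$doc" | jq -c '.a = .b')

# Corrected expansion: bind b first, then update with the bound value.
expanded=$(printf '%s' "$doc" | jq -c '.b as $x | .a |= $x')

# Uncorrected expansion: .b is evaluated relative to the old value of .a.
uncorrected=$(printf '%s' "$doc" | jq -c '.a |= .b')

# Freshness: if the path already mentions $x, reusing that name shadows it.
fresh=$(printf '%s' "$doc2" | jq -c '0 as $x | .b as $y | .a[$x] |= $y')
clash=$(printf '%s' "$doc2" | jq -c '0 as $x | .b as $x | .a[$x] |= $x')

printf '%s\n' "$plain" "$expanded" "$uncorrected" "$fresh" "$clash"
# prints:
# {"a":20,"b":20}
# {"a":20,"b":20}
# {"a":10,"b":20}
# {"a":[1,20],"b":1}
# {"a":[10,1],"b":1}
```

In the last case the inner binding of $x changes which array element the path selects, so the result no longer matches 0 as $x | .a[$x] = .b.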
> There are a couple of English-language problems with this sentence.
Thank you again so much for taking the time to reformulate my not-so-native sentences. I highly appreciate it.
Batch #3 (actually just one comment!)
- Re:
When using input, it is generally necessary to invoke jq with the -n command-line option, otherwise the first entity will be lost.
At first blush, this gives the impression that the warning does not apply to inputs.
Also, since the first example immediately below shows the opposite case, it might be worthwhile coming
up with a more helpful phrasing, perhaps:
When using input and/or inputs, it is often necessary to invoke jq with the -n command-line option to avoid losing the first value in the input stream.
> When using input, it is generally necessary to invoke jq with the -n command-line option, otherwise the first entity will be lost.
> At first blush, this gives the impression that the warning does not apply to inputs. Also, since the first example immediately below shows the opposite case, it might be worthwhile coming up with a more helpful phrasing, perhaps:
> When using input and/or inputs, it is often necessary to invoke jq with the -n command-line option to avoid losing the first value in the input stream.
Good catch. I have integrated your suggestion.
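The effect described by the reworded sentence can be reproduced with a short sketch, again assuming a jq binary on the PATH; the three-value input stream is an arbitrary example.

```shell
# Without -n, jq consumes the first value as the filter's input (.),
# so inputs only sees the remaining values.
without_n=$(printf '1\n2\n3\n' | jq -c '[inputs]')

# With -n, the filter runs on null and inputs sees the whole stream.
with_n=$(printf '1\n2\n3\n' | jq -c -n '[inputs]')

printf '%s\n' "$without_n" "$with_n"
# prints:
# [2,3]
# [1,2,3]
```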
It is now also possible to generate a PDF version of the manual. For this, you need Typst (I use 0.11.1).
pandoc content/manual/manual.md --lua-filter filters/filter.lua -o manual.typ
typst c manual.typ
Here is the current output. The appearance is still a bit rough in some places (especially example tables) --- keep in mind that I have not applied any Typst-specific styling. With a bit of work, this could be made to look quite nice. Right now, this is more of a demo to show that the new manual enables a wide range of output formats with relatively little effort.
Sorry to spam you again, but: @wader, @nicowilliams, @itchyny, @pkoppstein, @emanuele6, may I ask you what the status is on this PR? I've invested quite a significant amount of time into this, and it would be nice to have at least some basic feedback (which @pkoppstein has already given). If you have concerns about certain parts of this PR, then I'd be happy to address these.
@wader wrote:
> also i think personally i would rather merge early, as it includes so many improvements
I wholeheartedly agree. In particular, I hope that any potential revisions regarding gojq won't hold things up. Indeed, there's a lot to be said for waiting until the jq+jaq version of the manual has been merged before embarking on a gojq-related project. Perhaps that's something @itchyny might be willing to undertake?
Nice work! note that as i'm reading thru nearly all of the document I might have added comments about texts that are already present.
Sorry to answer a bit late --- I have been quite busy with other stuff, and did not have the time so far to integrate your comments.
> Read/Edit directly in markdown is much nicer than yaml i would say
Yes, that's also why I did this change in the first place. :)
> What to do with old versions of the manual? Keep as is? Try move things into one big manual with version compatibility section etc?
I would propose to keep the old versions of the manual up to (including) 1.7, and to start annotating the current version with compatibility notes for added/changed functions. That way, finding out which functions are available in older versions of jq should become much easier, because you have the information in the manual. (Nowadays, you would have to browse through all manuals from previous versions until you do not find a reference to a function anymore.)
> How to get it merged? approval from 3 maintainers is enough? also i think personally i would rather merge early, as it includes so many improvements, and then fix up stuff in further PRs rather than keep iterating on the same PR for a long time. That would also make it easier to work in parallel.
Of course, I do not have a say in this, but I wholeheartedly agree with this. It would be really great if some other maintainers would speak up here ...
> Sorry to answer a bit late --- I have been quite busy with other stuff, and did not have the time so far to integrate your comments.
No worries! hope they are useful
> I would propose to keep the old versions of the manual up to (including) 1.7, and to start annotating the current version with compatibility notes for added/changed functions. That way, finding out which functions are available in older versions of jq should become much easier, because you have the information in the manual. (Nowadays, you would have to browse through all manuals from previous versions until you do not find a reference to a function anymore.)
Yes that is a good idea i think. Would that be per implementation?
> Of course, I do not have a say in this, but I wholeheartedly agree with this. It would be really great if some other maintainers would speak up here ...
Yeap would be great
OK, I have now integrated all relevant suggestions by @wader, and updated the man page / tests. From my point of view, this is now ready.
(I also integrated all changes to the manual in master since the opening of this PR.)
What do ppl think about this? think i'm happy about it. Merging this i think will unblock motivations to improve the docs even more and maybe also cleanup/simplify the docs generation process (maybe generate html directly from md?)
@01mf02 could you do a rebase/regenerate to fix the conflicts?
@wader - Yes, long overdue.
@wader, it's done!
@01mf02 - A tiny enhancement request whenever you get around to it:
The very first line in the subsection on Modules currently reads:
jq has a library/module system. Modules are files whose names end in .jq.
I think it would be very helpful to add something to the effect that a module cannot have a "main" program. That should make it clear that not every .jq file is a valid jq module, while also making it clear why the word "module" is used.
@itchyny, thanks a lot for your improvement suggestions!
> Example styles are lost; the Examples link should be in var(--bs-secondary-color) and the outputs should be aligned each lines. The Run buttons lack the border and target="_blank" because they are links to the external website.
I adapted the run buttons and the examples links (although adapting the latter to precisely match the current manual would probably take quite a bit more time, just let me know if the current solution is an issue). Concerning the output alignment, I'm not sure whether I understand the issue, because outputs seem aligned to me:
> Anchor link icons are lost, this is important for users to notice we have anchor links each sections and filters. Also, I want to keep the existing anchor links; #recurse to the recurse filter section.
I added the anchor link icons. I also tried to preserve more of the existing anchor links; however, this is sometimes not possible where the structure between the old and the new version has changed too much. Still, many more links should match their old IDs now. (Including your example, #recurse.)
> @01mf02 - A tiny enhancement request whenever you get around to it:
> The very first line in the subsection on Modules currently reads:
> jq has a library/module system. Modules are files whose names end in .jq.
> I think it would be very helpful to add something to the effect that a module cannot have a "main" program. That should make it clear that not every .jq file is a valid jq module, while also making it clear why the word "module" is used.
Good suggestion. Let us do this once this PR is merged. I'm reluctant to make content changes at this point.
Thanks a lot for all your comments, @itchyny!
From @wader:
> maybe generate html directly from md?
I think this is a very good idea, and a much more elegant solution than the current manual conversion of md -> yml -> html with the Python code. You could either (1) use a static site generator, like Jekyll (GitHub pages native) or Hugo (faster, more modern, bit easier to use), or (2) since pandoc is already used, just use that to convert everything from md to html and add that to the template. I think (1) is probably the easier approach. It has builtin support for TOC creation, templates, includes and more. I could create a POC for this if you want.
Another benefit would be that you can have a website repo with just the website code, that pulls its content (the markdown files) from this repo on changes. This separates all markup and website code from this repo, while still keeping the actual documentation close to the code.
@yochem yeap would be great to look into such things after we get this PR merged, it's already quite large and dragged on for a long time 😅
Yes definitely haha! I was not suggesting to include that in this PR, just a suggestion for after that. Maybe would've been better to create a separate issue for it. Sorry for the noise!
> I think this is a very good idea, and a much more elegant solution than the current manual conversion of md -> yml -> html with the Python code. You could either (1) use a static site generator, like Jekyll (GitHub pages native) or Hugo (faster, more modern, bit easier to use), or (2) since pandoc is already used, just use that to convert everything from md to html and add that to the template. I think (1) is probably the easier approach. It has builtin support for TOC creation, templates, includes and more. I could create a POC for this if you want.
> Another benefit would be that you can have a website repo with just the website code, that pulls its content (the markdown files) from this repo on changes. This separates all markup and website code from this repo, while still keeping the actual documentation close to the code.
I also agree with you, but the problem is that the manuals for the older jq versions also need to be generated somehow. And they are in YAML. So while the current solution in this PR is a bit clumsy, it ensures backwards compatibility. If the older manuals were not around, then generating HTML from Markdown directly would become feasible.
May I ask what is the status of this PR? @itchyny, are you happy with my answers to your change requests?
Would be great to get this merged. Are there any blockers or concerns left?