dvc.org
ref: automatic generation
There have been several issues regarding inconsistencies between the information provided by `dvc <command> -h` and the Options / Synopsis sections of the docs.
I think it would make sense to autogenerate some parts of the ~~API/~~ Command Reference directly from dvc, to ensure consistency and to reduce both manual editing (editing the usage section is kind of tedious) and duplication (e.g. when a new option is added in dvc core, the docs PR is often just a copy-paste).
I think that we could autogenerate the markdown for the Options/Synopsis sections using some placeholders.
So the actual .md page would look like:
`usage: dvc diff`
`args: dvc diff`
Would generate:
## Synopsis
usage: dvc diff [-h] [-q | -v]
[--targets [<paths> [<paths> ...]]]
[--show-json] [--show-hash] [--show-md]
[a_rev] [b_rev]
positional arguments:
a_rev Old Git commit to compare (defaults to HEAD)
b_rev New Git commit to compare (defaults to current workspace)
## Options
- `--targets <paths>` - specific DVC-tracked files to compare.
When specifying arguments for `--targets` before `a_rev`/`b_rev`, you should
use `--` after this option's arguments (POSIX terminals), e.g.:
$ dvc diff --targets t1.json t2.yaml -- HEAD v1
- `--show-json` - prints the command's output in easily parsable JSON format,
instead of a human-readable table.
- `--show-md` - prints the command's output in Markdown table format.
- `--show-hash` - prints file and directory hash values along with their paths.
Useful for debugging purposes.
- `--hide-missing` - do not list data missing from both workspace and cache
(`not in cache`). Only list files and directories which have been explicitly
added, modified, or deleted. This option does nothing when comparing two Git
commits.
- `-h`, `--help` - prints the usage/help message, and exits.
- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
problems arise, otherwise 1.
- `-v`, `--verbose` - displays detailed tracing information.
Additional info that is not covered by `dvc <command> -h` would be added in separate sections like Description or Examples.
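To make the placeholder idea more concrete, here is a minimal sketch of how the two placeholders could be expanded from an `argparse` parser. Everything here is an assumption for illustration: `build_diff_parser` is a hypothetical stand-in (not DVC's real parser), the placeholder syntax is the one proposed above, and `parser._actions` is a private `argparse` attribute that a real implementation might prefer to avoid.

```python
import argparse
import re

# Matches the proposed placeholder lines, e.g. `usage: dvc diff`
PLACEHOLDER = re.compile(r"^`(usage|args): (.+)`$")


def build_diff_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real `dvc diff` parser (assumption).
    parser = argparse.ArgumentParser(prog="dvc diff")
    parser.add_argument(
        "--targets",
        nargs="*",
        metavar="<paths>",
        help="specific DVC-tracked files to compare.",
    )
    parser.add_argument(
        "a_rev", nargs="?", help="Old Git commit to compare (defaults to HEAD)"
    )
    return parser


def render_usage(parser: argparse.ArgumentParser) -> str:
    # format_usage() is the same text argparse prints at the top of -h.
    return "## Synopsis\n\n" + parser.format_usage().rstrip()


def render_options(parser: argparse.ArgumentParser) -> str:
    lines = ["## Options", ""]
    # NOTE: _actions is private API; used here only to keep the sketch short.
    for action in parser._actions:
        if not action.option_strings:
            continue  # positional arguments are already shown in the usage
        flags = ", ".join(f"`{flag}`" for flag in action.option_strings)
        lines.append(f"- {flags} - {action.help}")
    return "\n".join(lines)


def expand_placeholders(markdown: str, parsers: dict) -> str:
    renderers = {"usage": render_usage, "args": render_options}
    out = []
    for line in markdown.splitlines():
        match = PLACEHOLDER.match(line.strip())
        if match and match.group(2) in parsers:
            kind, command = match.groups()
            out.append(renderers[kind](parsers[command]))
        else:
            out.append(line)  # hand-written content is left untouched
    return "\n".join(out) + "\n"
```

Sections like Description or Examples would stay hand-written, since only lines that match a placeholder are replaced.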
In addition, for the Python API, we could use the docstring content. This could also be a way of "enforcing" detailed and up-to-date docstrings in the core repo, which are very helpful when using the API inside code editors.
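For the docstring side, a rough sketch of what generating a reference section from a function could look like, using only the standard `inspect` module. `example_open` is a hypothetical function written for this example, not part of the DVC API, and the output layout is just one possible choice.

```python
import inspect


def example_open(path, repo=None, rev=None):
    """Open a tracked file.

    Hypothetical function used only to illustrate the idea; the
    parameter names are assumptions, not the real API.
    """
    raise NotImplementedError


def render_api_section(func) -> str:
    signature = inspect.signature(func)
    # getdoc() cleans up indentation; fall back if there is no docstring,
    # which is also where a docs build could fail to "enforce" docstrings.
    doc = inspect.getdoc(func) or "(no docstring)"
    summary, _, details = doc.partition("\n\n")
    lines = [
        f"## {func.__name__}()",
        "",
        summary,
        "",
        f"    {func.__name__}{signature}",
    ]
    if details:
        lines += ["", details]
    return "\n".join(lines) + "\n"


print(render_api_section(example_open))
```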
UPDATE: See https://github.com/iterative/dvc.org/issues/3595 for this.
In #2108 we discussed this in January. I also believe there must be some connection between the command reference and the command output, but it shouldn't be automated that way.
~~should we close this as duplicate of #2108?~~
summary from discussion (in #2108 etc.):
1. CLI `--help` is brief.
2. https://dvc.org/doc/command-reference is meant to have some more details, edge cases, explanation.
3. https://dvc.org/doc/ in general (e.g. user-guide) contains use cases etc.
So (1) & (2) can't be identical. There's a subset/subsection of (2) which could be auto-generated from (1).
Maybe we should do this. Or as per #2207 have a CI check. But maybe too much effort. In any case we could also start collecting examples of where things are currently broken and would benefit from the effort.
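As a lighter alternative to full generation, the CI check idea could be sketched roughly like this: extract the option flags from the `-h` output and from the Options section of the docs page, and report the difference. This is only a sketch under assumptions: `compare_flags` and the regex are made up for this example, and flags mentioned in prose (for instance, options of other commands) would show up as false positives.

```python
import re

# A flag is one or two dashes followed by a letter, not preceded by a
# word character or dash (so "DVC-tracked" or a bare "--" do not match).
FLAG = re.compile(r"(?<![\w-])(--?[A-Za-z][\w-]*)")


def flags_in(text: str) -> set:
    return set(FLAG.findall(text))


def compare_flags(help_text: str, options_markdown: str) -> dict:
    in_help = flags_in(help_text)
    in_docs = flags_in(options_markdown)
    return {
        # Options the CLI knows about but the docs do not mention.
        "missing_from_docs": sorted(in_help - in_docs),
        # Options the docs mention but the CLI does not report.
        "missing_from_help": sorted(in_docs - in_help),
    }
```

A CI job could fail whenever either list is non-empty. It would need the full `-h` output rather than only the usage line, since the usage line shows just the short forms of some options.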
Related: https://github.com/iterative/dvc/issues/5392#issuecomment-869664566
> In any case we could also start collecting examples of where things are currently broken and would benefit from the effort.
Could we add a new label to identify these cases (e.g. cli-docs-discrepancy) or use an existing one (dvc-update)?
To get started, I added dvc-update to some PRs I was aware of: https://github.com/iterative/dvc.org/pull/2743 https://github.com/iterative/dvc.org/pull/2762
> Might we add a new label to identify these cases
@daavoo or just list them in the description of this issue as checkboxes and label this issue as an epic.
From #3219 :
Maybe the exp show table output is a different case from the ones discussed previously, but it is especially tedious to update manually in order to keep it in sync with the current state of DVC.
I think we may need a hand from the website team for dvcauto and dvcautotable code blocks. These will be just like dvc and dvctable, but they will not show the initial $ ... line and we'll run that line to fill this section on demand. It requires some other type of markup to "generate output when requested."
e.g.
```dvcautotable
$ dvc exp show
<table output>
```
Only the table output will be shown on the HTML output, and we can write a script to run the first line and update these blocks. This removes the runtime dependency between dvc and gatsby.
One big consideration particularly with exp show over the --help pages is that exp show requires a repo for the command to be run against, which will likely need to be different for a few tables.
#3219 handled this by having two functions/scripts that cloned different repos, example-get-started and example-dvc-experiments, but that's not as easy to translate into the code block format that takes a command on the first line.
Could we do something like:
```dvctableoutput
$ git clone https://github.com/iterative/example-dvc-experiments /tmp/ede
$ cd /tmp/ede
$ dvc exp show --no-pager --json | custom-script-to-draw-table.py
<<<colored table>>>
```
The first lines with $ won't be shown in the HTML output. Only the remaining part with table (or output) will be shown.
Then, we'll write something like what @daavoo did to run the lines starting with $ and update this block with the commands and output. This script can be run when necessary, not on every build, and we won't update the pages during the build. We'll check in commands and outputs to Git, and Gatsby won't need to run the scripts. It's a fairly general way to embed command outputs in pages.
It can have two flavors, one derived from dvc that shows the command output as is, one from dvctable that shows the colored tables.
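A rough sketch of what such an update script might look like: it finds the special fenced blocks, runs the `$` lines, and rewrites the block with fresh output. The block names are the ones proposed in this thread and do not exist yet; `refresh_blocks` and `run_shell` are hypothetical names. The runner is injectable so the rewriting logic can be tried without cloning repos or having dvc installed.

```python
import re
import subprocess

# Built at runtime so this example can itself live inside a fenced block.
FENCE = "`" * 3

# Proposed (not yet existing) block types from this discussion.
BLOCK = re.compile(
    rf"^{FENCE}(dvcauto|dvcautotable|dvctableoutput)\n(.*?)^{FENCE}$",
    re.DOTALL | re.MULTILINE,
)


def run_shell(commands):
    # Run all lines in one shell so that `cd` affects later commands.
    # check=True only reflects the exit status of the last command.
    done = subprocess.run(
        "\n".join(commands),
        shell=True,
        check=True,
        capture_output=True,
        text=True,
    )
    return done.stdout


def refresh_blocks(markdown, runner=run_shell):
    def replace(match):
        language, body = match.group(1), match.group(2)
        # Keep the `$ ...` lines; everything else is stale output.
        prompts = [line for line in body.splitlines() if line.startswith("$ ")]
        output = runner([line[2:] for line in prompts]).rstrip("\n")
        new_body = "\n".join(prompts + [output])
        return f"{FENCE}{language}\n{new_body}\n{FENCE}"

    return BLOCK.sub(replace, markdown)
```

Because the commands and their output are both checked in to Git, this can run on demand (for example, before a release) rather than during the Gatsby build.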
Ideally we could develop a custom Prism highlighter indeed (dvctable) but hopefully no need to run any commands or clone repos, just copy-paste the output and it colors it automatically.
p.s. I'm not sure how that relates to auto-generating the cmd ref specs though, which is what I understand from this issue.
> p.s. I'm not sure how that relates to auto-generating the cmd ref specs though, which is what I understand from this issue.
It relates in the sense that I wanted to avoid just copy-pasting the output when updating the cmd refs involving exp show tables (and also to simplify how we get the concrete output, which currently is not straightforward).
But those table samples are mainly in cmd ref Examples and in Guides I think. Maybe there's one in the Description of the exp show ref itself, but even that's not a section we can automate, right? I thought that in this issue we were talking about auto-generating the Synopsis and Options sections, basically.
You are right, the dvctable automation would be better discussed in a separate issue.
> maybe too much effort. In any case we could also start collecting examples of where things are currently broken and would benefit from the effort
Unless we can generate this I don't think this issue is actionable.