cargo
cargo copied to clipboard
Add an unstable `--output-format` option to `cargo rustdoc`
Problem
rustdoc (on nightly) exposes an --output-format option to choose the format of the generated documentation: either HTML (the default) or JSON (the new unstable backend).
If you choose the JSON format, the output file of the unit is no longer /target/doc/<crate_name>/index.html; it becomes /target/doc/<crate name>.json.
Since there is no way to specify the output format as an option of cargo rustdoc itself, cargo assumes that the only possible output path for a doc unit is the HTML one. As a consequence, the JSON output is never considered "fresh" and we miss out on caching (see the test case in #12100, which fails on master).
Proposed Solution
Since cargo must remain unaware of the semantics of rustdoc's extra arguments, I suggest we add --output-format as an unstable cargo rustdoc argument, mimicking the semantics of the underlying --output-format option for rustdoc.
This would allow cargo to reason about it and take it into account in the fingerprinting logic.
Notes
No response
It might be good to list all the concerns that need to be considered for this. For example:
- What happens with
--open? Does that generate an error? - Is there any interaction with
-Z rustdoc-scrape-examples? Does the JSON include the scraped data? Or should scraping be disabled when generating the JSON? - This mentions adding it to
cargo rustdoc, would it also be added tocargo doc? If not, is there a reason?- If we do add this to
cargo doc, would it also generate JSON files for all the dependencies?
- If we do add this to
Overall seems like a reasonable flag to add to me.
@obi1kenobi since you rely on rustdoc's output format, I'm curious on your thoughts on this issue.
Thanks for raising those points @ehuss, I'll try to address them based on what I know.
* What happens with `--open`? Does that generate an error?
I don't think it's necessary to return an error, at least for the JSON format. Most browsers have native support for rich viewers when it comes to JSON documents, and I do find mysel opening the JSON docs in the browser from time to time for exploration. I suggest we keep honouring the current behaviour, even if the format is JSON.
* Is there any interaction with [`-Z rustdoc-scrape-examples`](https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#scrape-examples)? Does the JSON include the scraped data? Or should scraping be disabled when generating the JSON?
I'm not familiar with the feature, nor with its implications for the JSON format. @aDotInTheVoid perhaps?
* This mentions adding it to `cargo rustdoc`, would it also be added to `cargo doc`? If not, is there a reason? * If we do add this to `cargo doc`, would it also generate JSON files for all the dependencies?
Yes, adding it to cargo doc is the natural extension and I do expect it to behave as you describe—generate the JSON docs for all dependencies.
There are some edge cases to keep in mind in that case (i.e. the JSON filename is the crate name, leading to issues if you have multiple versions of the same crate in your dependency tree), but they must probably must resolved regardless.
I have limited this issue to cargo rustdoc since that is the primary interface at the moment and it looks like a reasonable first step.
Thanks for the ping @epage. I'm strongly in favor of this. It's a reasonable feature to have, and from my perspective, improving the rustdoc JSON cache hit rate is extremely worth it. Thanks to @LukeMathWalker for figuring this out.
Anything that improves rustdoc JSON performance would be hugely impactful for cargo-semver-checks, since that's by far the "long pole" in its perf graph right now.
For context, cargo-semver-checks v0.20 shipped a massive perf improvement: 109x faster checking time in tokio, 2354x in aws-sdk-ec2, etc. We've also added caching of the "baseline" rustdoc JSON which gives us another 2x speedup on top of those numbers.
Since we've significantly reduced the time taken by all the other steps, the "current" version's rustdoc JSON generation is easily 70-80% of cargo-semver-checks wall-clock time right now. In turn, our users report that cargo-semver-checks even after the optimizations can be up to 50% of their total CI time
the JSON filename is the crate name
This is ever so slightly imprecise, the filename is the lib target's name with all - chars turned into _. This bit cargo-semver-checks recently so I'm bringing it up so it doesn't bite anyone else.
-Z rustdoc-scrape-examples
I believe the JSON does not have an ability to include examples. It only includes doc comments that are explicitly set on items. For the time being, scraping can be disabled. I have no opinion at the moment on whether rustdoc JSON should include examples, and I think it's fine to decide on that later since the JSON format is unstable anyway.
--open
I'm fine with attempting to open the JSON file with the user's default viewer (mine would be VSCode), falling back to the browser if no default viewer is available. If this is complex to achieve on day 1, I'm also fine with this being an error for now.
cargo doc
Totally fine with adding it there too, it makes sense to me.
Is there any concern here that remains to be addressed? Or should we start sketching the implementation?
Most concerns seem to be answered. I got some minor questions here: How stable is the rustdoc JSON output? Does it make sense that a user wants to keep different versions of JSON output at the same time?
Apart from my trivial questions, I feel like the flag isn't something that cannot be rolled back, and It also won't affect major build workflow of Cargo. Thus I am happy to discuss the design and implementation.
How stable is the rustdoc JSON output? Does it make sense that a user wants to keep different versions of JSON output at the same time?
It is versioned (via the format_version field), but a specific rustdoc build can only emit a single version of the output.
Supporting multiple versions is primarily a "client-side" concern, if you want your tool to be able to work across multiple nightly releases (e.g. what @obi1kenobi does for cargo-semver-checks).
Let's do a quick poll before starting.
@rfcbot fcp merge
Proposed solution
Since
cargomust remain unaware of the semantics ofrustdoc's extra arguments, I suggest we add--output-formatas an unstablecargo rustdocargument, mimicking the semantics of the underlying--output-formatoption forrustdoc.This would allow
cargoto reason about it and take it into account in the fingerprinting logic.
Team member @weihanglo has proposed to merge this. The next step is review by the rest of the tagged team members:
- [x] @Eh2406
- [x] @Muscraft
- [ ] @arlosi
- [x] @ehuss
- [x] @epage
- [ ] @joshtriplett
- [x] @weihanglo
No concerns currently listed.
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
So long as this is unstable, I have no concern with this. Once we have it,. it'll then also make it easier to explore the benefits and challenges with making it stable (assuming the rustdoc flag is stable)
:bell: This is now entering its final comment period, as per the review above. :bell:
There is no need to wait util FCP ends. People interested in doing it can start. The Cargo Team may not have a large capacity to mentor but for the review part we'll try our best.
I'll get started next week then 👍🏻 Thanks!
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
This will be merged soon.
I'll get started next week then 👍🏻 Thanks!
Hi @LukeMathWalker , were you able to start working on this?
No, a few other things had to take priority.
No, a few other things had to take priority.
Can other folks(including me) give it a try? Else I believe it's better to assign it to yourself.
If you have the time to start on it immediately, go ahead 😁
@rustbot claim
@obi1kenobi @LukeMathWalker
The --output-format flag for cargo rustdoc is available on nightly. It would be much appreciated if you can provide feedback on the cache story, or other UX issues :)