cargo-semver-checks
Check if using `simd-json` to parse rustdoc produces any speedup
Many of the rustdoc JSON files we parse are large: from a few MB to ~500MB in size. In the largest cases, we spend ~5s parsing JSON per `cargo-semver-checks` run.
Speeding up JSON parsing by switching from `serde_json` to `simd-json` might be an easy win. Let's check!
- [ ] rewrite the JSON loading code to use `simd-json`
- [ ] benchmark (on your own machine) the perf difference of loading rustdoc JSON of a large crate like `aws-sdk-ec2`
  - to generate rustdoc JSON, clone that repo, `cd` into the crate's directory, and use the `RUSTDOCFLAGS="-Z unstable-options --document-private-items --document-hidden-items --output-format=json --cap-lints=allow" cargo doc --lib --no-deps` shell command, which will put `aws-sdk-ec2.json` into the `target/doc` directory of the workspace
  - consider using tools like `divan` or `criterion` to ensure the perf delta is real and not a measurement artifact
- [ ] smaller rustdoc JSON file loading shouldn't regress either: regular `cargo test` loads dozens of small JSON files, so you can check those
- [ ] bonus: as a separate PR, produce a benchmark of sufficient quality that we could include it in the repo itself, so we can reuse it for future optimizations
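For a quick sanity check before setting up `divan` or `criterion`, a std-only timing helper along these lines might be enough. This is only a sketch: `median_time` is a hypothetical helper name, not something from this repo. It keeps the setup step out of the measurement, which matters here because `simd-json` parses from a `&mut [u8]` and modifies the buffer in place, so each run needs a fresh copy of the input.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `setup` then `work` `runs` times and returns
/// the median time spent inside `work` only.
/// Time spent in `setup` (e.g. cloning the input buffer) and in dropping
/// the output of `work` is not measured.
fn median_time<I, O>(
    runs: usize,
    mut setup: impl FnMut() -> I,
    mut work: impl FnMut(I) -> O,
) -> Duration {
    assert!(runs > 0, "need at least one run");
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let input = setup();
        let start = Instant::now();
        let output = work(black_box(input));
        let elapsed = start.elapsed();
        // Keep the result observable so the work is not optimized away.
        black_box(output);
        samples.push(elapsed);
    }
    samples.sort();
    samples[samples.len() / 2]
}

// Possible usage, assuming `serde_json` and `simd-json` are dependencies and
// `bytes: Vec<u8>` holds the contents of the rustdoc JSON file.
// `serde_json::Value` is a stand-in for whatever type the real loader uses.
//
// let serde = median_time(10, || bytes.clone(), |buf| {
//     serde_json::from_slice::<serde_json::Value>(&buf).unwrap()
// });
// let simd = median_time(10, || bytes.clone(), |mut buf| {
//     simd_json::serde::from_slice::<serde_json::Value>(&mut buf).unwrap()
// });
// println!("serde_json: {serde:?}, simd-json: {simd:?}");
```

The median is less sensitive to one-off outliers than the mean, but this is still a rough check. A proper benchmark harness handles warm-up and statistical analysis, which is what the `divan`/`criterion` suggestion above is for.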
This is a good first issue for someone who is already familiar with Rust and wants to start contributing to this project. It might not be a good first issue for folks new to Rust.