mbe: tweak matcher allocation patterns for benchmark_expand_macro_rules

Open hanblack26 opened this issue 1 month ago • 1 comments

Issue Attempt to optimize rust-analyzer#7857 (“Optmize mbe::match_ for performance”).

Summary This PR makes small, localized changes to the mbe matcher to reduce unnecessary allocations and intermediate work in the macro matcher. Initializes with small capacities to avoid repeated allocations and shave off some time. Functionally, no behavior has been changed. The goal was to make the path of the NFA-based matcher cheaper.

Testing To test, I ran the mbe create tests before and after the change. Before: `running 7 tests test benchmark::benchmark_expand_macro_rules ... ok test benchmark::benchmark_parse_macro_rules ... ok test tests::unbalanced_brace ... ok test tests::token_mapping_smoke_test ... ok test tests::token_mapping_floats ... ok test tests::minus_belongs_to_literal ... ok test tests::expr_2021 ... ok

test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.02s After:running 7 tests test benchmark::benchmark_expand_macro_rules ... ok test benchmark::benchmark_parse_macro_rules ... ok test tests::unbalanced_brace ... ok test tests::token_mapping_smoke_test ... ok test tests::token_mapping_floats ... ok test tests::expr_2021 ... ok test tests::minus_belongs_to_literal ... ok

test result: ok. 7 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s `

All the unit tests and existing benchmark tests continue to pass.

Benchmarks To measure the impact on benchmark_expand_macro_rules, I ran the slow benchmark command before and after: `RUN_SLOW_TESTS=1 cargo test --release --package mbe -- benchmark::benchmark_expand_macro_rules --nocapture

Single-run before: `running 1 test test benchmark::benchmark_expand_macro_rules ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 6 filtered out; finished in 0.26s`

5 runs after the change: `=== Run 1 === mbe expand macro rules: 87.29ms, 0b test result: ... finished in 0.27s

=== Run 2 === mbe expand macro rules: 112.52ms, 0b test result: ... finished in 0.30s

=== Run 3 === mbe expand macro rules: 84.34ms, 0b test result: ... finished in 0.25s

=== Run 4 === mbe expand macro rules: 85.42ms, 0b test result: ... finished in 0.26s

=== Run 5 === mbe expand macro rules: 99.77ms, 0b test result: ... finished in 0.28s`

So across multiple runs after the change, the end-to-end time stays in the same 0.25–0.30s range as the original single-run measurement (0.26s). Within that noise window, this change does not appear to significantly regress the benchmark_expand_macro_rules microbenchmark. The main benefit is simplifying the allocation patterns in the matcher so that repeated traversals of binding trees incur fewer small allocations.

Dec 11 '25 01:12 hanblack26

:warning: Warning :warning:

There are issue links (such as #123) in the commit messages of the following commits. Please move them to the PR description, to avoid spamming the issues with references to the commit, and so this bot can automatically canonicalize them to avoid issues with subtree.
- b21e738e99d99216c0c2369b7f8705a8026be97f

Dec 11 '25 01:12 rustbot