`toml_edit` is pretty slow to compile
> `toml_edit` is pretty slow to compile, one of the slowest dependencies of cargo. My suspicion is that the number of macros for combine is the core problem (due to the extra parsing steps the compiler has to do). I want to explore using nom once some ergonomic improvements are made since it doesn't use macros.
>
> Originally posted by @epage in https://github.com/ordian/toml_edit/issues/323#issuecomment-1182006418
It's funny I migrated the parser from nom to combine a while ago (#26) because of too many macros. Combine was mainly based on traits at the time. It would be ironic to go full circle.
Did some profiling:
```console
$ cargo +nightly rustc -- -Z self-profile
$ summarize summarize toml_edit-554834.mm_profdata | head -20
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| Item                         | Self time | % of total time | Time   | Item count | Incremental result hashing time |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| LLVM_passes                  | 17.92s    | 18.750          | 17.95s | 1          | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| finish_ongoing_codegen       | 16.27s    | 17.022          | 16.28s | 1          | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| LLVM_module_codegen_emit_obj | 16.20s    | 16.949          | 16.20s | 121        | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| LLVM_lto_optimize            | 13.10s    | 13.699          | 13.10s | 121        | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| LLVM_module_optimize         | 12.63s    | 13.216          | 12.63s | 121        | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| LLVM_thin_lto_import         | 3.39s     | 3.545           | 3.39s  | 121        | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| typeck                       | 3.15s     | 3.296           | 6.77s  | 1516       | 10.39ms                         |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| codegen_module_optimize      | 1.86s     | 1.949           | 14.50s | 121        | 0.00ns                          |
+------------------------------+-----------+-----------------+--------+------------+---------------------------------+
| codegen_module_perform_lto   | 1.86s     | 1.948           | 35.04s | 121        | 0.00ns                          |
```
It spends a lot of time on monomorphization and linking, which suggests a generics explosion.
```console
❯ cargo llvm-lines | head -20
  Lines          Copies         Function name
  -----          ------         -------------
  884919 (100%)  15631 (100%)   (TOTAL)
   81961 (9.3%)    120 (0.8%)   combine::parser::sequence::<impl combine::parser::Parser<Input> for (A,B)>::parse_mode_impl
   59663 (6.7%)     79 (0.5%)   <combine::parser::combinator::AndThen<P,F> as combine::parser::Parser<Input>>::parse_mode_impl
   56629 (6.4%)     59 (0.4%)   <(Y,Z) as combine::parser::choice::ChoiceParser<Input>>::parse_mode_choice
   51181 (5.8%)    335 (2.1%)   <combine::parser::combinator::Map<P,F> as combine::parser::Parser<Input>>::parse_mode_impl
   42252 (4.8%)     37 (0.2%)   combine::parser::sequence::<impl combine::parser::Parser<Input> for (A,B,C)>::parse_mode_impl
   31767 (3.6%)    256 (1.6%)   combine::error::ParseResult<T,E>::map
   26308 (3.0%)     62 (0.4%)   combine::parser::ParseMode::parse_committed
   25442 (2.9%)     50 (0.3%)   <combine::parser::range::RecognizeWithValue<P> as combine::parser::Parser<Input>>::parse_mode
   21348 (2.4%)     68 (0.4%)   <combine::parser::repeat::Iter<Input,P,S,M> as core::iter::traits::iterator::Iterator>::next
   20884 (2.4%)   1402 (9.0%)   combine::parser::Parser::parse_mode
   14444 (1.6%)    628 (4.0%)   <combine::parser::PartialMode as combine::parser::ParseMode>::parse
   14369 (1.6%)     67 (0.4%)   combine::parser::sequence::PartialState2<A,B>::add_errors
   12562 (1.4%)     17 (0.1%)   combine::parser::range::parse_partial_range
   12328 (1.4%)    134 (0.9%)   combine::parser::sequence::PartialState2<A,B>::add_errors::{{closure}}
   11365 (1.3%)      6 (0.0%)   <(V,X,Y,Z) as combine::parser::choice::ChoiceParser<Input>>::parse_mode_choice
   11342 (1.3%)      8 (0.1%)   <(X,Y,Z) as combine::parser::choice::ChoiceParser<Input>>::parse_mode_choice
   10836 (1.2%)    774 (5.0%)   <combine::parser::FirstMode as combine::parser::ParseMode>::parse
```
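To make the "generics explosion" concrete, here is a rough, simplified sketch of the pattern (illustrative only, not combine's actual code): every distinct nesting of combinator wrappers is a new concrete type, and the compiler emits a separate copy of the parse method for each one.

```rust
trait Parser {
    type Output;
    fn parse<'a>(&self, input: &'a str) -> Option<(Self::Output, &'a str)>;
}

/// Matches a single expected character.
struct Char(char);
impl Parser for Char {
    type Output = char;
    fn parse<'a>(&self, input: &'a str) -> Option<(char, &'a str)> {
        let mut chars = input.chars();
        (chars.next()? == self.0).then(|| (self.0, chars.as_str()))
    }
}

/// Sequencing two parsers yields a *new generic type* `Seq<A, B>`; `parse` is
/// monomorphized once per distinct `(A, B)` pair used in the grammar, which is
/// how combine ends up with 120 copies of the `(A,B)` impl in the output above.
struct Seq<A, B>(A, B);
impl<A: Parser, B: Parser> Parser for Seq<A, B> {
    type Output = (A::Output, B::Output);
    fn parse<'a>(&self, input: &'a str) -> Option<(Self::Output, &'a str)> {
        let (a, rest) = self.0.parse(input)?;
        let (b, rest) = self.1.parse(rest)?;
        Some(((a, b), rest))
    }
}

fn main() {
    // Seq<Seq<Char, Char>, Char> and Seq<Char, Seq<Char, Char>> are different
    // types, so each nesting gets its own generated code for the same grammar.
    let abc = Seq(Seq(Char('a'), Char('b')), Char('c'));
    assert!(abc.parse("abc").is_some());
}
```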
I'd say rewriting the parser manually is still on the table and might be beneficial long-term (e.g. precise control over errors). Short-term, we could try to reduce the number of macros and generics in `toml_edit` itself.
> It's funny I migrated the parser from nom to combine a while ago (https://github.com/ordian/toml_edit/pull/26) because of too many macros. Combine was mainly based on traits at the time. It would be ironic to go full circle.
And combine can still be light on macros except in some specific cases.
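For reference, a macro-free combine parser in the method-chaining style its documentation shows (this is an illustrative snippet, not toml_edit's grammar) looks roughly like this:

```rust
use combine::parser::char::{char, digit, spaces};
use combine::{many1, sep_by, Parser};

fn main() {
    // Parsers are plain values built from functions and method combinators;
    // no `parser!` macro involved.
    let number = many1(digit()).map(|digits: String| digits.parse::<i64>().unwrap());
    let mut list = sep_by(number, char(',').skip(spaces()));

    let result: Result<(Vec<i64>, &str), _> = list.parse("1, 2, 3");
    assert_eq!(result, Ok((vec![1, 2, 3], "")));
}
```

The macros mainly come into play when a parser needs to be a named, reusable function, which is where toml_edit's two layers of macros sit.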
> Short-term, we could try to reduce the number of macros and generics in `toml_edit` itself.
I had looked at trying to drop the macros where possible, but combine's API is such a mess to work with because everything is a combinator returning a generic parser, rather than nom's model of operating on the actual parsers themselves (which also simplifies some parser cases in my experience).
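For contrast, here is a minimal sketch of nom's function-based model (nom 7 style; the `key = "value"` grammar below is a made-up fragment, not toml_edit's actual parser): each parser is an ordinary function that can be called, tested, and composed directly.

```rust
use nom::{
    bytes::complete::tag,
    character::complete::{alphanumeric1, space0},
    sequence::{delimited, separated_pair},
    IResult,
};

/// A bare key, e.g. `name` (simplified: alphanumerics only).
fn key(input: &str) -> IResult<&str, &str> {
    alphanumeric1(input)
}

/// A basic string value, e.g. `"hello"` (simplified: no escapes).
fn basic_string(input: &str) -> IResult<&str, &str> {
    delimited(tag("\""), alphanumeric1, tag("\""))(input)
}

/// `key = "value"`: composition is just functions calling functions.
fn key_value(input: &str) -> IResult<&str, (&str, &str)> {
    separated_pair(key, delimited(space0, tag("="), space0), basic_string)(input)
}

fn main() {
    assert_eq!(key_value(r#"name = "toml""#), Ok(("", ("name", "toml"))));
}
```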
> I'd say rewriting the parser manually is still on the table and might be beneficial long-term (e.g. precise control over errors).
It's interesting to see what perspectives exist in the community. I've heard from some that the use of a parser library in toml_edit made it preferable to toml-rs. I've also seen others say to just write a parser by hand. I lean towards preferring parser libraries, as I feel they make the parsing code more maintainable unless there is a very specific and critical need otherwise.
> I had looked at trying to drop the macros where possible, but combine's API is such a mess to work with because everything is a combinator returning a generic parser, rather than nom's model of operating on the actual parsers themselves (which also simplifies some parser cases in my experience).
Fair point, I haven't looked at the current state of nom. Feel free to explore this approach.
> It's interesting to see what perspectives exist in the community. I've heard from some that the use of a parser library in toml_edit made it preferable to toml-rs. I've also seen others say to just write a parser by hand. I lean towards preferring parser libraries, as I feel they make the parsing code more maintainable unless there is a very specific and critical need otherwise.
Interesting indeed. I do agree that using parser libraries results in more readable and maintainable code. The downside is that we lose precise control: control over compile times, control over error messages, and, when things don't work as expected, we're at the mercy of the library developers if the library is complex enough. Another concern is security/vulnerability response.
That being said, I'm not sure whether the problem is with combine or how we (ab)use it (probably a combination of both). But if you think nom's approach will have fewer downsides, go for it :)
> I want to explore using nom once some ergonomic improvements are made since it doesn't use macros.
As a follow-up to my earlier comment: like most parser libraries, development of nom has slowed down, so we do not yet have these ergonomic improvements. I am coordinating with the maintainer to get access to make these improvements. They will be back from vacation soon, which will hopefully get things moving forward.
Btw, I had forgotten to call out that I made a parser benchmark repo. This was in part to test my combine theory. While combine performs fairly well there, that is for an old version of combine (the only one I could find a JSON implementation for) and it uses only one macro, rather than our approach of two layers of macros for every parser.
Also, for more background on why I had originally suspected combine and our use of it: when nnethercote was analyzing hot spots in compilation, he found that tt munchers were slow to compile because the compiler has to re-parse the remaining input at each layer of recursion to check which pattern it matches. While we aren't doing tt munching, we put the entire body of every parser function into two layers of macros.
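For anyone unfamiliar with the term, a tt muncher recursively peels off one token tree at a time; a toy example (not from toml_edit) looks like this:

```rust
// Toy "tt muncher": counts tokens by recursing one token tree at a time.
// Each recursion step makes the compiler re-match the rules against the
// remaining input, which is why deep inputs get expensive to expand.
macro_rules! count_tts {
    () => { 0usize };
    ($_head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    assert_eq!(count_tts!(a b c d e), 5);
}
```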
However, if all of that work happens during typecheck or earlier, then the numbers earlier in this thread blow that theory out of the water. Still, the generics that combine forces on us and the complexity of only being able to work through combinators are both reasons for us to re-evaluate our use of combine.
I have a prototype of `toml_edit` on top of nom. I do not expect the remaining polishing work to significantly change build performance.
Running `cargo clean -p toml_edit && time cargo check`, I get 3s for combine and 0.5s for nom. This is with Rust 1.66 on an i9-12900H processor.
While it's too early to read into these numbers, I ran `cargo bench -F perf`:

- `cargo_manifest` was 20% slower
- `map` cases were 20% faster
- `array` cases were 30% faster
The parser is now fully compliant. The only regression is in error messages.
Out of curiosity, how does the `toml` crate perform its parsing? Since it uses neither combine nor nom, I suppose it parses everything manually?

What's the plan for `toml`?
`toml` has always been faster. I'd recommend checking out my old post to understand how far off they are from each other, why, and how much it actually matters.

The plan is #340. Since we met cargo's performance bar and no one else has commented on the performance, we consider it good enough for migrating `toml` to use `toml_edit`'s parser. We will do the migration in a new breaking release, which will allow patching the old one if there are people who can't immediately upgrade, and will allow us to collect feedback.
> we consider it good enough for migrating `toml` to use `toml_edit`'s parser
Would `toml` simply use `toml_edit` as a dependency and re-export `toml_edit::easy`, or would there be a `toml_parser` crate which `toml` uses, with `toml_edit::easy` then re-exporting `toml`?
From #340:

> We plan to move `toml_edit::easy` into the `toml` crate, providing nearly the same API at nearly the same performance while reducing overall maintenance effort and allowing people who have both needs to have a common parser.
We do not want `toml_edit` in `toml`'s public API, as `toml` is easier to maintain compatibility on and I'd like it to be something people can rely on not changing. In other words, I want to take `toml` to 1.0 soon after this changeover happens.
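As a hypothetical illustration of that encapsulation (this is not the actual toml code; the function and error type below are made up), `toml` could call into `toml_edit` internally while only its own types appear in public signatures:

```rust
use std::fmt;

/// toml's own error type; `toml_edit`'s error never escapes this module.
#[derive(Debug)]
pub struct ParseError(String);

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "TOML parse error: {}", self.0)
    }
}

/// Hypothetical helper: validate a document using `toml_edit` internally.
/// The `toml_edit::Document` value is dropped before returning, so the
/// dependency stays an implementation detail and never shows up in the API.
pub fn validate(input: &str) -> Result<(), ParseError> {
    input
        .parse::<toml_edit::Document>()
        .map(|_| ())
        .map_err(|e| ParseError(e.to_string()))
}
```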
btw this seems more like a discussion to be had in #340 than in this issue