Monomorphization improvement
Hey,
I was doing some build-time profiling on a project and tried cargo llvm-lines, which helps visualize the amount of code generated by the monomorphization of generic functions. Doing so, I saw that some of the biggest contributors were functions from tonic.
For some of them, it is possible to reduce the amount of monomorphization quite easily.
This is the result of my iteration on the biggest contributor that was refactorable. It should help produce smaller binaries and faster build times.
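For readers unfamiliar with the technique, here is a minimal sketch of the general pattern: move the logic that does not depend on the type parameter into a non-generic inner struct, so it is compiled once instead of once per message type. The types below are illustrative assumptions, not tonic's actual definitions; `Decode`, `next_frame` and `next_message` are hypothetical names, and the 1-byte length prefix is a simplification.

```rust
use std::marker::PhantomData;

// Illustrative only: these are NOT tonic's real types or framing rules.

/// Non-generic state and logic. It is compiled once, no matter how many
/// message types the generic wrapper is instantiated with.
struct StreamingInner {
    buf: Vec<u8>,
}

impl StreamingInner {
    /// Splits off one length-prefixed frame (1-byte length, for brevity).
    fn next_frame(&mut self) -> Option<Vec<u8>> {
        let len = *self.buf.first()? as usize;
        if self.buf.len() < 1 + len {
            return None; // frame is not complete yet
        }
        let frame = self.buf[1..1 + len].to_vec();
        self.buf.drain(..1 + len);
        Some(frame)
    }
}

/// Hypothetical per-message decoding trait.
trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Option<Self>;
}

/// Generic wrapper: only this thin, type-dependent layer is monomorphized.
struct Streaming<T> {
    inner: StreamingInner,
    _marker: PhantomData<T>,
}

impl<T: Decode> Streaming<T> {
    fn new(buf: Vec<u8>) -> Self {
        Streaming {
            inner: StreamingInner { buf },
            _marker: PhantomData,
        }
    }

    fn next_message(&mut self) -> Option<T> {
        let frame = self.inner.next_frame()?; // shared, non-generic code
        T::decode(&frame) // type-specific code
    }
}

impl Decode for String {
    fn decode(bytes: &[u8]) -> Option<Self> {
        String::from_utf8(bytes.to_vec()).ok()
    }
}

impl Decode for u32 {
    fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 4 {
            return None;
        }
        Some(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
    }
}
```

With this layout, `Streaming::<String>::next_message` and `Streaming::<u32>::next_message` each get their own copy, but both call the single compiled `StreamingInner::next_frame`.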
Output of cargo llvm-lines on my project, before:
Lines          Copies       Function name
-----          ------       -------------
30754 (1.2%)   17 (0.0%)    <tonic::codec::decode::Streaming<T> as futures_core::stream::Stream>::poll_next
13977 (0.5%)   17 (0.0%)    tonic::codec::decode::Streaming<T>::decode_chunk
 7158 (0.3%)   14 (0.0%)    tonic::codec::decode::Streaming<T>::trailers::{{closure}}
  742 (0.0%)   42 (0.1%)    tonic::codec::decode::Streaming<T>::trailers::{{closure}}::{{closure}}
  210 (0.0%)   14 (0.0%)    tonic::codec::decode::Streaming<T>::trailers
And after:
Lines          Copies       Function name
-----          ------       -------------
 6339 (0.3%)   17 (0.0%)    <tonic::codec::decode::Streaming<T> as futures_core::stream::Stream>::poll_next
 5730 (0.2%)   14 (0.0%)    tonic::codec::decode::Streaming<T>::trailers::{{closure}}
 4952 (0.2%)   17 (0.0%)    tonic::codec::decode::Streaming<T>::decode_chunk
  221 (0.0%)    1 (0.0%)    tonic::codec::decode::StreamingInner::trailers::{{closure}}
  210 (0.0%)   14 (0.0%)    tonic::codec::decode::Streaming<T>::trailers
   53 (0.0%)    3 (0.0%)    tonic::codec::decode::StreamingInner::trailers::{{closure}}::{{closure}}
This gives a reduction of about 35_336 llvm-lines.
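As a quick sanity check, the figure can be reproduced by summing the Lines column of the two listings above and taking the difference. The constant and function names here are only for illustration.

```rust
// Lines column of the "before" listing.
const BEFORE: [u64; 5] = [30_754, 13_977, 7_158, 742, 210];
// Lines column of the "after" listing.
const AFTER: [u64; 6] = [6_339, 5_730, 4_952, 221, 210, 53];

/// Sums the llvm-lines of the listed functions.
fn total(lines: &[u64]) -> u64 {
    lines.iter().sum()
}

/// Difference between the two totals (52_841 before, 17_505 after).
fn reduction() -> u64 {
    total(&BEFORE) - total(&AFTER)
}
```

`reduction()` evaluates to 35_336, which matches the number quoted above for the listed functions.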
This seems really cool! Do you have any compile time benchmarks by chance? Curious what the real impact is.
That awkward moment when the benchmark decides not to go your way.
With the attached minimal project:
Before this commit
> cargo b --release
> stat target/release/minimal-tonic-monomorphization -c "%s"
7_463_512
> hyperfine --prepare "rm -rf target/release/build/minimal-tonic-monomorphization-*" "cargo b --release"
Benchmark #1: cargo b --release
Time (mean ± σ): 12.133 s ± 0.185 s [User: 48.109 s, System: 0.762 s]
Range (min … max): 11.855 s … 12.439 s 10 runs
After this commit
> cargo b --release
> stat target/release/minimal-tonic-monomorphization -c "%s"
7_451_720
> hyperfine --prepare "rm -rf target/release/build/minimal-tonic-monomorphization-*" "cargo b --release"
Benchmark #1: cargo b --release
Time (mean ± σ): 12.670 s ± 0.178 s [User: 48.070 s, System: 0.744 s]
Range (min … max): 12.530 s … 13.127 s 10 runs
The binary did shrink by about 12 kB, but the build time appears a bit longer, which feels weird.
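To put numbers on that, here is a small sketch that derives the deltas from the measurements above. The values are copied from the `stat` and hyperfine output; the names are only for illustration.

```rust
// Binary sizes in bytes, from the `stat` output above.
const SIZE_BEFORE: u64 = 7_463_512;
const SIZE_AFTER: u64 = 7_451_720;

// Mean wall-clock build times in seconds, from hyperfine.
const MEAN_BEFORE_S: f64 = 12.133;
const MEAN_AFTER_S: f64 = 12.670;

/// Bytes saved in the release binary.
fn size_saved_bytes() -> u64 {
    SIZE_BEFORE - SIZE_AFTER
}

/// Absolute increase of the mean build time, in seconds.
fn slowdown_seconds() -> f64 {
    MEAN_AFTER_S - MEAN_BEFORE_S
}

/// Relative increase of the mean build time, in percent.
fn slowdown_percent() -> f64 {
    slowdown_seconds() / MEAN_BEFORE_S * 100.0
}
```

That works out to 11_792 bytes saved and a mean build time about 0.537 s (roughly 4.4%) longer. The User CPU time reported by hyperfine is nearly identical in both runs (48.109 s vs 48.070 s).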