Implicit boxing of large futures causes excessive monomorphization
`async-task` contains an optimisation for handling large futures in the definition of `spawn_unchecked`: large futures are boxed before being stored in the task. This leads to excessive IR size, because one branch instantiates `RawTask` with `Fut` and the other with `Pin<Box<Fut>>`. The unused branch probably gets eliminated within LLVM (the condition itself is trivial), but it's still a bummer that the choice cannot truly be made at compile time, before both instantiations are generated. I recently took several stabs at getting rid of the unnecessary instantiation, without luck. I do understand why we need the boxing, but it'd be nice not to spend time generating code we're going to throw away in LLVM anyway.
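For anyone not familiar with the code, here is a minimal sketch of the pattern, not the actual `async-task` source. `allocate` and `spawn_sketch` are hypothetical stand-ins for `RawTask::allocate` and `spawn_unchecked`, and `BOX_THRESHOLD` is an assumed value. The point is that both arms of the `if` are type-checked and monomorphized for every `Fut`, even though the condition is a constant for any given `Fut`.

```rust
use std::future::Future;
use std::mem;

/// Assumed threshold in bytes, for illustration only.
const BOX_THRESHOLD: usize = 2048;

/// Hypothetical stand-in for `RawTask::allocate`. It is generic over the
/// stored future type, so each distinct `F` gets its own monomorphized copy.
fn allocate<F: Future>(future: F) -> usize {
    // Report how many bytes would be stored inline for this future type.
    mem::size_of_val(&future)
}

/// Hypothetical stand-in for `spawn_unchecked`.
fn spawn_sketch<Fut: Future>(future: Fut) -> usize {
    if mem::size_of::<Fut>() >= BOX_THRESHOLD {
        // Instantiates `allocate::<Pin<Box<Fut>>>`.
        allocate(Box::pin(future))
    } else {
        // Instantiates `allocate::<Fut>`.
        allocate(future)
    }
}
```

Calling `spawn_sketch` with a single small future type still makes rustc emit IR for `allocate::<Pin<Box<Fut>>>`; only one of the two copies can ever run for that `Fut`.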
Getting rid of the large-future boxing reduces the LLVM IR size of my medium-sized crate (~1.5M lines of LLVM IR in debug) by 7%. The effect is also reproducible with the examples from this crate:
| Example name | LLVM IR before | LLVM IR after | % of original |
|---|---|---|---|
| spawn | 18276 | 15631 | 85.52% |
| spawn-local | 39801 | 34537 | 86.77% |
| spawn-on-thread | 18667 | 16031 | 85.87% |
| with-metadata | 32834 | 24887 | 75.79% |
Related: https://github.com/rust-lang/rust/issues/85836