tools
tools copied to clipboard
Formatter: Performance [Stretch]
This issue is mainly a place where I collect ideas on how we can improve the Formatter's performance (probably in significant ways).
Flat FormatElement Structure
I expect this to improve the performance because it can greatly reduce the number of allocations the formatter does.
What's this about. The idea is to change the nested FormatElement structure to a flat structure:
Empty->[]List(Vec(Elements))->[...Elements]Group(content)->[StartGroup, ...Content, EndGroup]Fill(elements)->[StartFill, ...elements, EndFill]Indent(content)->[StartIndent, ...content, EndFill]
Why should this improve performance:
- It removes the need to perform heap allocations for
Groups, andIndent. - The
Printerget's a single continuous array with elements. Removes the derefContent/ Vec pointers. May even be possible to write thePrinterin a way that it no longer needs it's inner queue and instead can just use a pointer to the position in the input document.
Challenges
The main challenges will be
- Rewriting the
Printerto work with a flat list - Rewriting
split_triviato work efficiently. (Maybe no longer needed after re-writting the comment formatting?)
Shared FormatElement Buffer #2634
The gist of the idea is to change Format to no longer return the formatted document but instead make the format method write to a mutable `formatter.
// Before
fn format(&self, formatter: &Formatter) -> FormatResult<Document> {
return formatted![formatter, self.if_token().format(), space_token(), self.l_paren().format()]
}
// After
fn format(&self, formatter: &mut Formatter) -> FormatResult<Document> {
// Writes the format elements to `formatter` that maintains a shared buffer
write!(formatter, self.if_token().format(), space_token())?;
write!(formatter, self.l_paren().format())?;
}
This may look familiar to you. And there's good reason why, this design is heavily inspired by Rust's format, format_args, and write.
How does this improve performance
- Reduces the heap allocation from one allocation per
formatcall to 1 for formatting the whole document (excluding the heap allocations happening inside of a specificFormatimplementation) - ... I think that's more than reason enough
Why I'm excited about this
I haven't ran any benchmarks myself but there exists a benchmark for rust template libraries and the interesting thing is that you can group the libs in two groups:
- Libraries that do heap allocations
- Libraries that don't do heap allocations
And the libraries not doing heap allocations play in a complete different league: link. Just take a look how fast write is compared to something like Tera or Handlebars.
Challenges
- Again,
SplitTrivia,is_emptyetc. - This changes the formatter ideom from a functional style to an imperative API where order matters. That means it will be necessary to re-write/review every formatter.
Migration
I think it should be doable to implement a cheap v1/v2 adapter if Flat FormatElement is in place. This would allow us to rewrite the formatting on a per-node basis
Inspiration
Take a look at this branch where I played around with this a little while ago.
Depends on Flat FormatElement Structure