genanki-rs
genanki-rs copied to clipboard
Don't recompile the regexes every `Note::new`/`Note::write_to_db`
Hi there - first of all thank you for making this crate, it's super useful!
Motivation
I noticed when profiling my application that quite a chunk of time was being spent in Note::new
(I end up creating 193 notes in this example):
jacksonriley@192 ~/zw_to_anki (main)> hyperfine -- "./target/release/zw_to_anki -f E08_417.txt -o test.apkg"
Benchmark #1: ./target/release/zw_to_anki -f E08_417.txt -o test.apkg
Time (mean ± σ): 4.450 s ± 0.176 s [User: 4.002 s, System: 0.216 s]
Range (min … max): 4.254 s … 4.867 s 10 runs
This looks to be mainly due to compiling the "sentinel regex" for every field every time: https://github.com/yannickfunk/genanki-rs/blob/fe983cf519e727f8265f920f61e6c06ffe94bee4/src/model.rs#L245
Since the fields never change, I think we can pre-compile the regexes to save this time.
Benchmark post-change
With this change:
jacksonriley@192 ~/zw_to_anki (main)> hyperfine -- "./target/release/zw_to_anki -f E08_417.txt -o test.apkg"
Benchmark #1: ./target/release/zw_to_anki -f E08_417.txt -o test.apkg
Time (mean ± σ): 1.978 s ± 0.223 s [User: 1.628 s, System: 0.190 s]
Range (min … max): 1.820 s … 2.573 s 10 runs
Which comes out as making Note::new
go from taking 12.6ms per-call to 0.1ms.
Adding some once-cells
I then noticed that there were some more regexes getting compiled every Note::write_to_db
, so have made these only get compiled once using once_cell
.
This then cuts the time down further:
jacksonriley@192 ~/zw_to_anki (main)> hyperfine -- "./target/release/zw_to_anki -f E08_417.txt -o test.apkg"
Benchmark #1: ./target/release/zw_to_anki -f E08_417.txt -o test.apkg
Time (mean ± σ): 1.613 s ± 0.051 s [User: 1.374 s, System: 0.186 s]
Range (min … max): 1.548 s … 1.737 s 10 runs
and now genanki-rs
is barely visible anymore on the flamegraph!
Implementation
I think that what I've done is reasonable - fine to add a new private field to Model
and the implementation seems to work. Happy to change up the implementation however you see fit though!