strsim-rs
strsim-rs copied to clipboard
avoid string copy in sorensen dice
Since we only need to iterate over the bigrams for each string once, we can create them lazily instead of collecting them into a string. This reduces the binary size by around 7%. In addition it reduces runtime in our current benchmark by around 11%.
For reference in my example binary this gives:
File .text Size Crate
6.0% 96.7% 275.8KiB std
0.1% 1.5% 4.3KiB strsim
0.0% 0.0% 124B rf_test
0.0% 0.0% 102B [Unknown]
6.3% 100.0% 285.3KiB .text section size, the file size is 4.5MiB
while previously it was:
File .text Size Crate
6.1% 96.6% 276.5KiB std
0.1% 1.6% 4.6KiB strsim
0.0% 0.0% 124B rf_test
0.0% 0.0% 102B [Unknown]
6.3% 100.0% 286.3KiB .text section size, the file size is 4.5MiB