word_smith
Modernize and update for latest Elixir.
- Updates the unaccent.rules
- Replaces Benchfella with Benchee (interesting results below)
- Handles new unaccent rules that have no replacement and instead remove the character entirely
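To illustrate the last point, here is a minimal sketch of how a rule with no replacement could be treated as a removal. It assumes the rules file uses the PostgreSQL-style layout (source character, whitespace, replacement, with a lone source character meaning "delete"). The module name `UnaccentRulesSketch` and both functions are hypothetical and are not the library's actual code.

```elixir
defmodule UnaccentRulesSketch do
  # Hypothetical helper: turns rule lines into a map of source => replacement.
  # A line holding only a source character maps to "", i.e. full removal.
  def parse(text) do
    text
    |> String.split("\n", trim: true)
    |> Enum.map(&String.split/1)
    |> Enum.reduce(%{}, fn
      [from, to], acc -> Map.put(acc, from, to)
      [from], acc -> Map.put(acc, from, "")
      # Blank or malformed lines are skipped in this sketch.
      _other, acc -> acc
    end)
  end

  # Applies the rules codepoint by codepoint, so a lone combining mark
  # can be matched and dropped. Unknown codepoints pass through unchanged.
  def remove_accents(string, rules) do
    string
    |> String.codepoints()
    |> Enum.map_join(fn cp -> Map.get(rules, cp, cp) end)
  end
end

# Usage: the third rule has no replacement, so U+0301 is removed outright.
rules = UnaccentRulesSketch.parse("\u00E9\te\n\u00DF\tss\n\u0301\n")
IO.puts(UnaccentRulesSketch.remove_accents("caf\u00E9 stra\u00DFe", rules))
```

A runtime map lookup is only one option; the point is that the one-field rule needs its own clause rather than falling through as a parse error.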
Squish
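The performance gain over the built-in regex approach appears negligible: squish is about 4x faster on the 30-byte input, but the two are within 1-2% of each other from 300 bytes upward. It can probably be swapped back to the regex.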
Operating System: Linux
CPU Information: AMD Ryzen 9 5900X 12-Core Processor
Number of Available Cores: 24
Available memory: 125.72 GB
Elixir 1.18.2
Erlang 27.2
JIT enabled: true
Benchmark suite executing with the following configuration:
warmup: 5 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: 30 bytes, 300 bytes, 3000 bytes, 30000 bytes, 300000 bytes
Estimated total run time: 1 min 40 s
Benchmarking regex with input 30 bytes ...
Benchmarking regex with input 300 bytes ...
Benchmarking regex with input 3000 bytes ...
Benchmarking regex with input 30000 bytes ...
Benchmarking regex with input 300000 bytes ...
Benchmarking squish with input 30 bytes ...
Benchmarking squish with input 300 bytes ...
Benchmarking squish with input 3000 bytes ...
Benchmarking squish with input 30000 bytes ...
Benchmarking squish with input 300000 bytes ...
Calculating statistics...
Formatting results...
##### With input 30 bytes #####
Name           ips     average   deviation      median      99th %
squish      2.13 M     0.47 μs   ±4226.03%     0.43 μs     0.61 μs
regex       0.52 M     1.91 μs    ±621.34%     1.84 μs     2.09 μs

Comparison:
squish      2.13 M
regex       0.52 M - 4.08x slower +1.44 μs

##### With input 300 bytes #####
Name           ips     average   deviation      median      99th %
squish     88.30 K    11.33 μs     ±35.02%    11.14 μs    13.24 μs
regex      87.72 K    11.40 μs     ±35.39%    11.16 μs    13.78 μs

Comparison:
squish     88.30 K
regex      87.72 K - 1.01x slower +0.0750 μs

##### With input 3000 bytes #####
Name           ips     average   deviation      median      99th %
regex       9.56 K   104.60 μs      ±4.27%   104.34 μs   111.23 μs
squish      9.48 K   105.44 μs      ±4.31%   105.30 μs   111.78 μs

Comparison:
regex       9.56 K
squish      9.48 K - 1.01x slower +0.83 μs

##### With input 30000 bytes #####
Name           ips     average   deviation      median      99th %
regex       964.17     1.04 ms      ±1.52%     1.03 ms     1.08 ms
squish      959.31     1.04 ms      ±0.88%     1.04 ms     1.06 ms

Comparison:
regex       964.17
squish      959.31 - 1.01x slower +0.00526 ms

##### With input 300000 bytes #####
Name           ips     average   deviation      median      99th %
regex        77.60    12.89 ms      ±7.27%    12.84 ms    15.29 ms
squish       76.30    13.11 ms      ±7.89%    13.07 ms    15.73 ms

Comparison:
regex        77.60
squish       76.30 - 1.02x slower +0.22 ms
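For reference, here is a rough sketch of what a standard-library squish could look like if the hand-rolled version is dropped. The exact regex used in the benchmark is not shown in this PR, so `~r/\s+/u` and the module name `SquishSketch` are assumptions, not the library's actual code.

```elixir
defmodule SquishSketch do
  # Regex variant: collapse every whitespace run to one space, then trim.
  def squish_regex(string) do
    string
    |> String.replace(~r/\s+/u, " ")
    |> String.trim()
  end

  # Regex-free variant using only String.split/1, which splits on
  # whitespace and drops leading, trailing, and repeated separators.
  def squish_split(string) do
    string
    |> String.split()
    |> Enum.join(" ")
  end
end

IO.inspect(SquishSketch.squish_regex("  hello \n\t world   foo "))
IO.inspect(SquishSketch.squish_split("  hello \n\t world   foo "))
```

Both variants should agree on ordinary inputs; their treatment of unusual Unicode whitespace may differ, which would be worth a test before swapping.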
Remove Accents
Because the unaccent rules list is now so large, this implementation probably needs some rethinking. I'm not sure what that looks like yet.
Operating System: Linux
CPU Information: AMD Ryzen 9 5900X 12-Core Processor
Number of Available Cores: 24
Available memory: 125.72 GB
Elixir 1.18.2
Erlang 27.2
JIT enabled: true
Benchmark suite executing with the following configuration:
warmup: 5 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: 24 bytes, 240 bytes, 2400 bytes, 24000 bytes, 240000 bytes, 2400000 bytes
Estimated total run time: 1 min
Benchmarking remove_accents with input 24 bytes ...
Benchmarking remove_accents with input 240 bytes ...
Benchmarking remove_accents with input 2400 bytes ...
Benchmarking remove_accents with input 24000 bytes ...
Benchmarking remove_accents with input 240000 bytes ...
Benchmarking remove_accents with input 2400000 bytes ...
Calculating statistics...
Formatting results...
##### With input 24 bytes #####
Name                   ips     average   deviation      median      99th %
remove_accents      4.02 M   248.76 ns   ±7741.29%      211 ns      390 ns

##### With input 240 bytes #####
Name                   ips     average   deviation      median      99th %
remove_accents    527.33 K     1.90 μs    ±760.80%     1.67 μs     2.86 μs

##### With input 2400 bytes #####
Name                   ips     average   deviation      median      99th %
remove_accents     64.49 K    15.51 μs     ±37.02%    14.42 μs    22.03 μs

##### With input 24000 bytes #####
Name                   ips     average   deviation      median      99th %
remove_accents      5.57 K   179.41 μs      ±5.74%   177.60 μs   211.60 μs

##### With input 240000 bytes #####
Name                   ips     average   deviation      median      99th %
remove_accents      348.19     2.87 ms     ±22.63%     2.96 ms     4.12 ms

##### With input 2400000 bytes #####
Name                   ips     average   deviation      median      99th %
remove_accents       23.96    41.74 ms     ±31.20%    37.05 ms    96.72 ms