elixir
elixir copied to clipboard
do Enum.with_index/2 tail-recursively on lists
Hi folks!
This PR modifies the implementation of Enum.with_index/2, specifically with regard to lists.
The tl;dr is that there is a fair bit of throughput to be gained on both large and small lists at the expense of some memory.
Is this a tradeoff the project wants to make? That's not for me to say, but I just wanted to share this work in case it was. I will say I'm going to be using Enum.with_index/2 on some fairly large data processing at work, so I do have a personal bias toward throughput in this scenario.
The usual caveats apply here, as I ran all benchmarking on my development machine while trying to keep as few things running as possible to minimize noisy neighbors, but it is a development machine, not a dedicated host.
Benchmark code:
IO.inspect(Mix.env(), label: "Mix env")
f = fn index, item ->
{index, item + 100}
end
tiny = Enum.into(1..100, [])
small = Enum.into(1..1000, [])
medium = Enum.into(1..10_000, [])
bigger = Enum.into(1..100_000, [])
large = Enum.into(1..1_000_000, [])
Benchee.run(
%{
"Enum.with_index/2 with fn" => fn input -> Enum.with_index(input, f) end,
"Enum.with_index/2 with offset" => fn input -> Enum.with_index(input, 8) end
},
inputs: %{
"a. tiny - 100" => tiny,
"b. small - 1,000" => small,
"c. medium - 10,000" => medium,
"d. bigger - 100,000" => bigger,
"e. large - 1,000,000" => large
},
time: 6,
memory_time: 2
)
To run the benchmarks I first built Elixir on main like make clean && make, then ran them with ~/code/elixir/bin/elixir ~/code/elixir/bin/mix clean && MIX_ENV=prod ~/code/elixir/bin/elixir ~/code/elixir/bin/mix run bench.exs, then changed branches to this branch, then rebuilt Elixir with make clean && make, and then reran the benchmarks with the same command as before.
Benchmark on main
$ ~/code/elixir/bin/elixir ~/code/elixir/bin/mix clean && MIX_ENV=prod ~/code/elixir/bin/elixir ~/code/elixir/bin/mix run bench.exs
Compiling 2 files (.ex)
Generated wi2 app
Mix env: :prod
Operating System: macOS
CPU Information: Apple M1 Max
Number of Available Cores: 10
Available memory: 64 GB
Elixir 1.18.0-dev
Erlang 27.1.2
JIT enabled: true
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 6 s
memory time: 2 s
reduction time: 0 ns
parallel: 1
inputs: a. tiny - 100, b. small - 1,000, c. medium - 10,000, d. bigger - 100,000, e. large - 1,000,000
Estimated total run time: 1 min 40 s
Benchmarking Enum.with_index/2 with fn with input a. tiny - 100 ...
Benchmarking Enum.with_index/2 with fn with input b. small - 1,000 ...
Benchmarking Enum.with_index/2 with fn with input c. medium - 10,000 ...
Benchmarking Enum.with_index/2 with fn with input d. bigger - 100,000 ...
Benchmarking Enum.with_index/2 with fn with input e. large - 1,000,000 ...
Benchmarking Enum.with_index/2 with offset with input a. tiny - 100 ...
Benchmarking Enum.with_index/2 with offset with input b. small - 1,000 ...
Benchmarking Enum.with_index/2 with offset with input c. medium - 10,000 ...
Benchmarking Enum.with_index/2 with offset with input d. bigger - 100,000 ...
Benchmarking Enum.with_index/2 with offset with input e. large - 1,000,000 ...
Calculating statistics...
Formatting results...
##### With input a. tiny - 100 #####
Name ips average deviation median 99th %
Enum.with_index/2 with fn 1.10 M 0.91 μs ±1950.65% 0.83 μs 1.04 μs
Enum.with_index/2 with offset 0.98 M 1.02 μs ±2040.57% 0.79 μs 1.13 μs
Comparison:
Enum.with_index/2 with fn 1.10 M
Enum.with_index/2 with offset 0.98 M - 1.13x slower +0.118 μs
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with fn 3.91 KB
Enum.with_index/2 with offset 3.91 KB - 1.00x memory usage +0 KB
**All measurements for memory usage were the same**
##### With input b. small - 1,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 107.96 K 9.26 μs ±97.51% 7.83 μs 35.46 μs
Enum.with_index/2 with fn 104.87 K 9.54 μs ±75.92% 9.13 μs 13.50 μs
Comparison:
Enum.with_index/2 with offset 107.96 K
Enum.with_index/2 with fn 104.87 K - 1.03x slower +0.27 μs
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 39.06 KB
Enum.with_index/2 with fn 39.06 KB - 1.00x memory usage +0 KB
**All measurements for memory usage were the same**
##### With input c. medium - 10,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 10.44 K 95.82 μs ±3.89% 96.17 μs 103.42 μs
Enum.with_index/2 with fn 7.65 K 130.64 μs ±12.93% 128.17 μs 252.25 μs
Comparison:
Enum.with_index/2 with offset 10.44 K
Enum.with_index/2 with fn 7.65 K - 1.36x slower +34.82 μs
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 390.63 KB
Enum.with_index/2 with fn 390.63 KB - 1.00x memory usage +0 KB
**All measurements for memory usage were the same**
##### With input d. bigger - 100,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 868.12 1.15 ms ±9.55% 1.15 ms 1.18 ms
Enum.with_index/2 with fn 585.40 1.71 ms ±7.09% 1.70 ms 1.83 ms
Comparison:
Enum.with_index/2 with offset 868.12
Enum.with_index/2 with fn 585.40 - 1.48x slower +0.56 ms
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 3.81 MB
Enum.with_index/2 with fn 3.81 MB - 1.00x memory usage +0.00005 MB
**All measurements for memory usage were the same**
##### With input e. large - 1,000,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 40.75 24.54 ms ±33.75% 22.54 ms 44.82 ms
Enum.with_index/2 with fn 28.37 35.24 ms ±90.64% 29.13 ms 174.37 ms
Comparison:
Enum.with_index/2 with offset 40.75
Enum.with_index/2 with fn 28.37 - 1.44x slower +10.71 ms
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 38.15 MB
Enum.with_index/2 with fn 38.15 MB - 1.00x memory usage -0.00060 MB
**All measurements for memory usage were the same**
Benchmark on this branch
$ ~/code/elixir/bin/elixir ~/code/elixir/bin/mix clean && MIX_ENV=prod ~/code/elixir/bin/elixir ~/code/elixir/bin/mix run bench.exs
Compiling 2 files (.ex)
Generated wi2 app
Mix env: :prod
Operating System: macOS
CPU Information: Apple M1 Max
Number of Available Cores: 10
Available memory: 64 GB
Elixir 1.18.0-dev
Erlang 27.1.2
JIT enabled: true
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 6 s
memory time: 2 s
reduction time: 0 ns
parallel: 1
inputs: a. tiny - 100, b. small - 1,000, c. medium - 10,000, d. bigger - 100,000, e. large - 1,000,000
Estimated total run time: 1 min 40 s
Benchmarking Enum.with_index/2 with fn with input a. tiny - 100 ...
Benchmarking Enum.with_index/2 with fn with input b. small - 1,000 ...
Benchmarking Enum.with_index/2 with fn with input c. medium - 10,000 ...
Benchmarking Enum.with_index/2 with fn with input d. bigger - 100,000 ...
Benchmarking Enum.with_index/2 with fn with input e. large - 1,000,000 ...
Benchmarking Enum.with_index/2 with offset with input a. tiny - 100 ...
Benchmarking Enum.with_index/2 with offset with input b. small - 1,000 ...
Benchmarking Enum.with_index/2 with offset with input c. medium - 10,000 ...
Benchmarking Enum.with_index/2 with offset with input d. bigger - 100,000 ...
Benchmarking Enum.with_index/2 with offset with input e. large - 1,000,000 ...
Calculating statistics...
Formatting results...
##### With input a. tiny - 100 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 2.48 M 403.93 ns ±5368.21% 333 ns 541 ns
Enum.with_index/2 with fn 1.69 M 591.08 ns ±4174.85% 541 ns 709 ns
Comparison:
Enum.with_index/2 with offset 2.48 M
Enum.with_index/2 with fn 1.69 M - 1.46x slower +187.15 ns
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 3.94 KB
Enum.with_index/2 with fn 5.47 KB - 1.39x memory usage +1.53 KB
**All measurements for memory usage were the same**
##### With input b. small - 1,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 247.39 K 4.04 μs ±186.39% 3.58 μs 13.88 μs
Enum.with_index/2 with fn 172.72 K 5.79 μs ±89.57% 5.25 μs 17.17 μs
Comparison:
Enum.with_index/2 with offset 247.39 K
Enum.with_index/2 with fn 172.72 K - 1.43x slower +1.75 μs
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 54.69 KB
Enum.with_index/2 with fn 54.69 KB - 1.00x memory usage +0 KB
**All measurements for memory usage were the same**
##### With input c. medium - 10,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with fn 16.55 K 60.43 μs ±7.29% 60.08 μs 67.42 μs
Enum.with_index/2 with offset 15.75 K 63.49 μs ±87.38% 42.71 μs 260.67 μs
Comparison:
Enum.with_index/2 with fn 16.55 K
Enum.with_index/2 with offset 15.75 K - 1.05x slower +3.06 μs
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with fn 430.33 KB
Enum.with_index/2 with offset 430.36 KB - 1.00x memory usage +0.0313 KB
**All measurements for memory usage were the same**
##### With input d. bigger - 100,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 1.10 K 0.91 ms ±4.67% 0.90 ms 0.99 ms
Enum.with_index/2 with fn 0.95 K 1.06 ms ±14.70% 1.05 ms 1.27 ms
Comparison:
Enum.with_index/2 with offset 1.10 K
Enum.with_index/2 with fn 0.95 K - 1.17x slower +0.151 ms
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 4.83 MB
Enum.with_index/2 with fn 4.92 MB - 1.02x memory usage +0.0931 MB
**All measurements for memory usage were the same**
##### With input e. large - 1,000,000 #####
Name ips average deviation median 99th %
Enum.with_index/2 with offset 56.54 17.69 ms ±32.77% 18.57 ms 32.38 ms
Enum.with_index/2 with fn 42.61 23.47 ms ±31.16% 24.80 ms 40.53 ms
Comparison:
Enum.with_index/2 with offset 56.54
Enum.with_index/2 with fn 42.61 - 1.33x slower +5.79 ms
Memory usage statistics:
Name Memory usage
Enum.with_index/2 with offset 51.49 MB
Enum.with_index/2 with fn 53.32 MB - 1.04x memory usage +1.82 MB
**All measurements for memory usage were the same**
Thank you for the PR and the excellent work! The general guideline is that list-recursive vs tail-recursive is roughly the same (often trading CPU and memory, as you measured) and we should go with the code that is cleaner (which tends to be the one that does not pass the accumulator around). So I'd vote to keep it as is. But maybe @sabiwara has other thoughts.
Here is the PR where we originally discussed this: https://github.com/elixir-lang/elixir/pull/11765
I think nobody had strong opinions back then, either seemed fine so we kept the simpler one.
Wow, interesting. Regardless of your decision, that PR (and the links in it) is great context I had not seen before, thank you.
Closing this then, thank you :)