From chatter on Zulip, it looks like flume is a candidate to replace
std::sync::mpsc if it is not deprecated. This is an experiment to try
it in place of std or crossbeam-channel.
My benchmarking indicates that flume is faster than std, but not quite
as fast as crossbeam. I'm curious @sharkdp if you still see a perf
regression with this implementation?
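For anyone following along, here is a minimal sketch of the multi-producer, single-consumer pattern under discussion, written against std::sync::mpsc only. It is not fd's actual walker code: `collect_entries` and the fake "worker/file" names are hypothetical and only stand in for worker threads sending results to one receiver. flume and crossbeam-channel offer similar sender/receiver pairs (for example `flume::unbounded()` and `crossbeam_channel::unbounded()`), so the experiment is roughly a matter of swapping the constructor and types.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a parallel directory walk: each worker thread
/// sends the entries it "finds" to a single receiver over one channel.
fn collect_entries(workers: usize, per_worker: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();

    let handles: Vec<_> = (0..workers)
        .map(|w| {
            // Every worker gets its own clone of the sender.
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..per_worker {
                    tx.send(format!("worker{}/file{}", w, i))
                        .expect("receiver should still be alive");
                }
            })
        })
        .collect();

    // Drop the original sender so the receiver's iterator ends once all
    // worker clones have been dropped.
    drop(tx);

    let mut entries: Vec<String> = rx.iter().collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    entries.sort();
    entries
}

fn main() {
    let entries = collect_entries(4, 3);
    println!("received {} entries", entries.len());
}
```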
I ran some benchmarks comparing master, this branch, and using crossbeam-channel.
For my photos directory (on spinning disk):
fd regression benchmark
No pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master --hidden --no-ignore '' '/home/thayne/bulk-home/Pictures'` | 10.5 ± 3.3 | 5.4 | 20.6 | 1.00 |
| `./fd-flume --hidden --no-ignore '' '/home/thayne/bulk-home/Pictures'` | 10.8 ± 3.1 | 5.8 | 19.0 | 1.03 ± 0.44 |
| `./fd-crossbeam --hidden --no-ignore '' '/home/thayne/bulk-home/Pictures'` | 11.6 ± 3.9 | 5.0 | 22.3 | 1.10 ± 0.51 |
Simple pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 11.3 ± 3.5 | 5.8 | 22.2 | 1.00 |
| `./fd-flume '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 11.7 ± 3.3 | 7.2 | 22.5 | 1.04 ± 0.44 |
| `./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 13.4 ± 4.6 | 6.9 | 26.6 | 1.19 ± 0.55 |
Simple pattern (-HI)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 10.7 ± 3.5 | 5.8 | 21.9 | 1.03 ± 0.46 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 10.4 ± 3.1 | 6.1 | 24.1 | 1.00 |
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 11.5 ± 4.0 | 5.1 | 23.7 | 1.10 ± 0.51 |
File extension
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --extension jpg '' '/home/thayne/bulk-home/Pictures'` | 14.8 ± 3.5 | 9.7 | 24.5 | 1.01 ± 0.31 |
| `./fd-flume -HI --extension jpg '' '/home/thayne/bulk-home/Pictures'` | 14.7 ± 2.9 | 9.8 | 24.6 | 1.00 |
| `./fd-crossbeam -HI --extension jpg '' '/home/thayne/bulk-home/Pictures'` | 15.0 ± 3.8 | 9.5 | 27.6 | 1.02 ± 0.33 |
File type
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --type l '' '/home/thayne/bulk-home/Pictures'` | 10.7 ± 3.8 | 5.2 | 21.5 | 1.06 ± 0.56 |
| `./fd-flume -HI --type l '' '/home/thayne/bulk-home/Pictures'` | 10.1 ± 3.9 | 3.7 | 21.6 | 1.00 |
| `./fd-crossbeam -HI --type l '' '/home/thayne/bulk-home/Pictures'` | 11.3 ± 3.9 | 5.0 | 20.1 | 1.12 ± 0.58 |
Cold cache
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 112.4 ± 69.8 | 71.1 | 193.0 | 1.75 ± 1.18 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 70.8 ± 28.1 | 57.1 | 144.3 | 1.11 ± 0.53 |
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'` | 64.1 ± 16.8 | 55.5 | 111.5 | 1.00 |
flume and std seem to be pretty close; in most tests, crossbeam seems to be a little bit slower. On the cold cache, master was significantly slower than both, and crossbeam was fastest. But maybe the reset-cache command wasn't working as expected, and the order mattered? Does hyperfine run all the tests for the first command before doing the second, or does it intersperse them?
Haproxy repository (SSD) order 1
fd regression benchmark
No pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-crossbeam --hidden --no-ignore '' '/home/thayne/dev/haproxy'` | 10.4 ± 3.5 | 5.8 | 22.6 | 1.02 ± 0.47 |
| `./fd-flume --hidden --no-ignore '' '/home/thayne/dev/haproxy'` | 10.1 ± 3.1 | 6.0 | 21.6 | 1.00 |
| `./fd-master --hidden --no-ignore '' '/home/thayne/dev/haproxy'` | 10.6 ± 3.3 | 5.3 | 19.8 | 1.05 ± 0.45 |
Simple pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 8.8 ± 4.0 | 3.0 | 16.8 | 1.00 |
| `./fd-flume '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 8.9 ± 4.0 | 2.4 | 16.6 | 1.00 ± 0.64 |
| `./fd-master '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 9.4 ± 4.0 | 3.3 | 17.1 | 1.06 ± 0.65 |
Simple pattern (-HI)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 12.2 ± 4.1 | 6.4 | 22.5 | 1.02 ± 0.50 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 11.9 ± 4.2 | 6.1 | 21.6 | 1.00 |
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 12.2 ± 4.5 | 6.0 | 22.1 | 1.03 ± 0.52 |
File extension
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-crossbeam -HI --extension jpg '' '/home/thayne/dev/haproxy'` | 12.1 ± 3.7 | 6.9 | 21.4 | 1.00 |
| `./fd-flume -HI --extension jpg '' '/home/thayne/dev/haproxy'` | 13.1 ± 4.3 | 7.0 | 24.1 | 1.09 ± 0.49 |
| `./fd-master -HI --extension jpg '' '/home/thayne/dev/haproxy'` | 13.2 ± 4.3 | 7.2 | 24.1 | 1.09 ± 0.49 |
File type
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-crossbeam -HI --type l '' '/home/thayne/dev/haproxy'` | 11.9 ± 4.5 | 5.7 | 24.1 | 1.08 ± 0.56 |
| `./fd-flume -HI --type l '' '/home/thayne/dev/haproxy'` | 11.0 ± 3.9 | 5.6 | 21.6 | 1.00 |
| `./fd-master -HI --type l '' '/home/thayne/dev/haproxy'` | 11.6 ± 4.2 | 5.9 | 22.2 | 1.05 ± 0.53 |
Cold cache
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 123.6 ± 39.0 | 103.4 | 193.2 | 1.14 ± 0.37 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 128.1 ± 41.3 | 108.2 | 212.2 | 1.18 ± 0.39 |
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 108.5 ± 9.1 | 101.7 | 129.4 | 1.00 |
Haproxy repository (SSD) order 2
fd regression benchmark
No pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master --hidden --no-ignore '' '/home/thayne/dev/haproxy'` | 9.1 ± 2.9 | 5.5 | 18.5 | 1.00 |
| `./fd-flume --hidden --no-ignore '' '/home/thayne/dev/haproxy'` | 9.6 ± 3.1 | 5.5 | 21.2 | 1.06 ± 0.48 |
| `./fd-crossbeam --hidden --no-ignore '' '/home/thayne/dev/haproxy'` | 9.8 ± 3.4 | 5.5 | 20.8 | 1.08 ± 0.51 |
Simple pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 8.5 ± 3.8 | 1.4 | 16.0 | 1.04 ± 0.67 |
| `./fd-flume '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 8.4 ± 3.9 | 1.8 | 15.8 | 1.03 ± 0.68 |
| `./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 8.1 ± 3.8 | 1.3 | 15.5 | 1.00 |
Simple pattern (-HI)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 11.1 ± 3.4 | 6.9 | 21.3 | 1.00 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 11.8 ± 4.1 | 6.9 | 22.8 | 1.06 ± 0.49 |
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 13.5 ± 4.2 | 7.3 | 24.5 | 1.21 ± 0.53 |
File extension
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --extension jpg '' '/home/thayne/dev/haproxy'` | 10.5 ± 4.2 | 4.6 | 20.7 | 1.00 |
| `./fd-flume -HI --extension jpg '' '/home/thayne/dev/haproxy'` | 11.1 ± 3.9 | 5.9 | 21.7 | 1.05 ± 0.56 |
| `./fd-crossbeam -HI --extension jpg '' '/home/thayne/dev/haproxy'` | 11.6 ± 4.3 | 5.0 | 20.9 | 1.10 ± 0.59 |
File type
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --type l '' '/home/thayne/dev/haproxy'` | 12.8 ± 4.1 | 7.0 | 22.0 | 1.04 ± 0.47 |
| `./fd-flume -HI --type l '' '/home/thayne/dev/haproxy'` | 12.4 ± 3.9 | 6.5 | 22.3 | 1.00 |
| `./fd-crossbeam -HI --type l '' '/home/thayne/dev/haproxy'` | 13.0 ± 4.4 | 7.3 | 26.6 | 1.05 ± 0.49 |
Cold cache
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 132.1 ± 25.5 | 118.4 | 177.7 | 1.20 ± 0.26 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 110.2 ± 10.6 | 101.6 | 134.8 | 1.00 |
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'` | 112.4 ± 3.4 | 108.0 | 116.4 | 1.02 ± 0.10 |
Rust-lang repository (spinning disk)
fd regression benchmark
No pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master --hidden --no-ignore '' '/home/thayne/bulk-home/devel/rust-lang'` | 254.5 ± 2.1 | 251.7 | 259.7 | 1.01 ± 0.01 |
| `./fd-flume --hidden --no-ignore '' '/home/thayne/bulk-home/devel/rust-lang'` | 264.0 ± 2.4 | 259.5 | 267.4 | 1.05 ± 0.01 |
| `./fd-crossbeam --hidden --no-ignore '' '/home/thayne/bulk-home/devel/rust-lang'` | 251.2 ± 2.7 | 247.3 | 255.5 | 1.00 |
Simple pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 346.4 ± 2.6 | 341.9 | 351.0 | 1.00 |
| `./fd-flume '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 347.9 ± 11.8 | 337.7 | 373.0 | 1.00 ± 0.03 |
| `./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 354.4 ± 8.7 | 344.8 | 367.6 | 1.02 ± 0.03 |
Simple pattern (-HI)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 245.1 ± 3.4 | 240.7 | 253.1 | 1.00 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 249.9 ± 5.4 | 246.1 | 264.5 | 1.02 ± 0.03 |
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 246.5 ± 4.8 | 241.4 | 256.7 | 1.01 ± 0.02 |
File extension
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --extension jpg '' '/home/thayne/bulk-home/devel/rust-lang'` | 253.7 ± 4.0 | 249.3 | 260.8 | 1.00 |
| `./fd-flume -HI --extension jpg '' '/home/thayne/bulk-home/devel/rust-lang'` | 254.6 ± 2.9 | 250.5 | 259.5 | 1.00 ± 0.02 |
| `./fd-crossbeam -HI --extension jpg '' '/home/thayne/bulk-home/devel/rust-lang'` | 254.6 ± 3.5 | 248.2 | 261.7 | 1.00 ± 0.02 |
File type
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --type l '' '/home/thayne/bulk-home/devel/rust-lang'` | 243.7 ± 2.4 | 239.5 | 248.0 | 1.00 |
| `./fd-flume -HI --type l '' '/home/thayne/bulk-home/devel/rust-lang'` | 247.4 ± 2.5 | 245.0 | 254.2 | 1.02 ± 0.01 |
| `./fd-crossbeam -HI --type l '' '/home/thayne/bulk-home/devel/rust-lang'` | 249.5 ± 2.3 | 246.7 | 253.4 | 1.02 ± 0.01 |
Cold cache
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 29.077 ± 0.413 | 28.751 | 29.542 | 1.00 ± 0.02 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 28.985 ± 0.155 | 28.849 | 29.153 | 1.00 |
| `./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'` | 29.034 ± 0.238 | 28.760 | 29.183 | 1.00 ± 0.01 |
I'm not entirely sure if the differences are due to actual performance differences, or something with how I'm running the benchmarks.
Thank you for looking into this again @tavianator. And thank you for the benchmark results, @tmccombs.
> but maybe the reset cache command wasn't working as expected, and the order mattered? Does hyperfine run all the tests for the first command before doing the second, or does it intersperse them?
It does run all the benchmarks for the first command before doing the second. See also https://github.com/sharkdp/hyperfine/issues/21
The general problem with your benchmarks is the large statistical noise. Look at the very first benchmark, for example. A result like 1.03 ± 0.44 for flume (with respect to master) means: flume was 3% slower, but there is a statistical uncertainty of 44 percentage points, i.e. the error is an order of magnitude larger than the measured effect. Maybe hyperfine should come with a big warning in a case like this. That 3% difference really shouldn't be trusted.
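To make the arithmetic concrete: the "Relative" column is consistent with a ratio of the two means, with the uncertainty obtained by first-order error propagation. The sketch below assumes that formula (the function name `relative` is made up for illustration, and this is not hyperfine's actual source); plugging in the flume and master numbers from the first "No pattern" table reproduces the quoted figure.

```rust
/// Ratio of two mean runtimes with a first-order propagated uncertainty.
/// Assumption: the two measurements are independent, so the relative
/// errors add in quadrature.
fn relative(mean: f64, sd: f64, ref_mean: f64, ref_sd: f64) -> (f64, f64) {
    let ratio = mean / ref_mean;
    let rel_err = ((sd / mean).powi(2) + (ref_sd / ref_mean).powi(2)).sqrt();
    (ratio, ratio * rel_err)
}

fn main() {
    // flume: 10.8 ± 3.1 ms, master (reference): 10.5 ± 3.3 ms
    let (ratio, err) = relative(10.8, 3.1, 10.5, 3.3);
    println!("{:.2} ± {:.2}", ratio, err); // prints "1.03 ± 0.44"
}
```

Because the uncertainty scales with the relative standard deviation of each run, a spread of roughly 30% per command (as in those rows) swamps any effect of a few percent.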
Compare that to the "No pattern" benchmark on my machine (note: this is on a larger folder, to increase signal-to-noise even more):
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master --hidden --no-ignore '' '/home/shark/Informatik/'` | 385.0 ± 0.9 | 383.3 | 386.2 | 1.00 |
| `./fd-flume --hidden --no-ignore '' '/home/shark/Informatik/'` | 402.1 ± 1.8 | 400.0 | 406.2 | 1.04 ± 0.01 |
Here, the statistical error (0.01) is quite a bit smaller than the effect we are seeing (0.04).
It's annoying, but it's really important to switch off background processes. Especially the ones that might be reading from / writing to disk. Largest offenders for me are typically: dropbox(!), spotify, the browser.
Please find the full benchmark results from my machine (solid state disk) below. I would summarize them as: there is no statistically significant difference between the master version and the version from this branch, except for the "no pattern" benchmark, where the flume-version is 4% slower (reproducibly).
@tavianator Could you maybe also share your benchmark results?
fd regression benchmark
No pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master --hidden --no-ignore '' '/home/shark/Informatik/'` | 385.0 ± 0.9 | 383.3 | 386.2 | 1.00 |
| `./fd-flume --hidden --no-ignore '' '/home/shark/Informatik/'` | 402.1 ± 1.8 | 400.0 | 406.2 | 1.04 ± 0.01 |
Simple pattern
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master '.*[0-9]\.jpg$' '/home/shark/Informatik/'` | 177.3 ± 1.1 | 175.9 | 179.2 | 1.00 ± 0.01 |
| `./fd-flume '.*[0-9]\.jpg$' '/home/shark/Informatik/'` | 176.7 ± 0.9 | 175.0 | 179.5 | 1.00 |
Simple pattern (-HI)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'` | 361.7 ± 1.6 | 359.0 | 364.2 | 1.01 ± 0.01 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'` | 358.7 ± 2.1 | 356.2 | 361.8 | 1.00 |
File extension
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --extension jpg '' '/home/shark/Informatik/'` | 392.7 ± 9.0 | 379.9 | 406.6 | 1.01 ± 0.03 |
| `./fd-flume -HI --extension jpg '' '/home/shark/Informatik/'` | 388.7 ± 6.6 | 383.0 | 400.2 | 1.00 |
File type
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --type l '' '/home/shark/Informatik/'` | 359.2 ± 0.8 | 357.9 | 360.7 | 1.01 ± 0.01 |
| `./fd-flume -HI --type l '' '/home/shark/Informatik/'` | 357.2 ± 2.2 | 353.8 | 360.2 | 1.00 |
Cold cache
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'` | 3.008 ± 0.017 | 2.995 | 3.027 | 1.00 ± 0.01 |
| `./fd-flume -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'` | 3.007 ± 0.015 | 2.994 | 3.023 | 1.00 |
Interesting. The no-pattern case would probably fill up the queues more. This seems to contradict the benchmarks in flume's README. Is it worth bringing up with flume and/or crossbeam-channel that you aren't seeing the performance gains that they claim over mpsc?
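One way to look at the channel in isolation, away from disk and regex costs, would be a small throughput harness like the sketch below. It is only an assumption about how such a measurement could be set up: `time_channel` is a hypothetical helper, the message counts are arbitrary, and it uses std::sync::mpsc so that it runs without extra crates. Swapping in another channel crate's constructor would give a comparable number for that implementation.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical micro-benchmark: several producers push integers through an
/// unbounded channel as fast as they can while one consumer drains it.
/// Returns the number of messages received and the elapsed wall-clock time.
fn time_channel(producers: usize, msgs_per_producer: usize) -> (usize, Duration) {
    let (tx, rx) = mpsc::channel::<usize>();
    let start = Instant::now();

    let handles: Vec<_> = (0..producers)
        .map(|_| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..msgs_per_producer {
                    tx.send(i).expect("receiver should still be alive");
                }
            })
        })
        .collect();

    // Without this, the receiver would wait forever on the original sender.
    drop(tx);

    // Drain everything; the iterator ends when all senders are gone.
    let received = rx.iter().count();
    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    (received, start.elapsed())
}

fn main() {
    let (received, elapsed) = time_channel(4, 100_000);
    println!("{} messages in {:?}", received, elapsed);
}
```

A single run like this is as noisy as any other timing, so it would need to be repeated and compared with its spread in mind.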
Here are my results. Weirdly, --extension is 7% slower with flume, but otherwise flume wins by 1-5%.
fd regression benchmark
No pattern
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master --hidden --no-ignore '' '/home/tavianator/code/android'` | 2.069 ± 0.006 | 2.058 | 2.076 | 1.05 ± 0.01 |
| `./fd-feature --hidden --no-ignore '' '/home/tavianator/code/android'` | 1.973 ± 0.024 | 1.955 | 2.039 | 1.00 |
Simple pattern
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master '.*[0-9]\.jpg$' '/home/tavianator/code/android'` | 1.370 ± 0.010 | 1.358 | 1.389 | 1.01 ± 0.01 |
| `./fd-feature '.*[0-9]\.jpg$' '/home/tavianator/code/android'` | 1.359 ± 0.003 | 1.355 | 1.365 | 1.00 |
Simple pattern (-HI)
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'` | 2.046 ± 0.003 | 2.043 | 2.051 | 1.05 ± 0.01 |
| `./fd-feature -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'` | 1.946 ± 0.020 | 1.933 | 2.000 | 1.00 |
File extension
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --extension jpg '' '/home/tavianator/code/android'` | 1.947 ± 0.006 | 1.938 | 1.956 | 1.00 |
| `./fd-feature -HI --extension jpg '' '/home/tavianator/code/android'` | 2.088 ± 0.032 | 2.065 | 2.153 | 1.07 ± 0.02 |
File type
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI --type l '' '/home/tavianator/code/android'` | 2.031 ± 0.004 | 2.025 | 2.038 | 1.05 ± 0.00 |
| `./fd-feature -HI --type l '' '/home/tavianator/code/android'` | 1.937 ± 0.005 | 1.932 | 1.948 | 1.00 |
Cold cache
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `./fd-master -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'` | 2.178 ± 0.004 | 2.175 | 2.183 | 1.05 ± 0.00 |
| `./fd-feature -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'` | 2.076 ± 0.003 | 2.074 | 2.079 | 1.00 |
@tavianator @tmccombs How do we proceed with this std-channels vs crossbeam vs flume topic? It feels to me like we need to work on #893 first. What do you think?
I don't know. I'd like to have a better understanding of why switching to crossbeam-channel or flume doesn't perform as well in some cases. Do either of you have much experience profiling Rust code?
I just found this by accident: https://github.com/fereidani/kanal
@sharkdp Yeah I saw that recently too! Curious to try it out, might be a little early though
I guess we can close this in favor of #1146