From chatter on Zulip, it looks like flume is a candidate to replace std::sync::mpsc if it is not deprecated. This is an experiment to try it instead of std or crossbeam-channel.

My benchmarking indicates that flume is faster than std, but not quite as fast as crossbeam. @sharkdp, I'm curious whether you still see a perf regression with this implementation?

Jan 11 '22 18:01 tavianator

I ran some benchmarks comparing master, this branch, and using crossbeam-channel.

For my photos directory (on spinning disk):

`fd` regression benchmark

No pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master --hidden --no-ignore '' '/home/thayne/bulk-home/Pictures'`	10.5 ± 3.3	5.4	20.6	1.00
`./fd-flume --hidden --no-ignore '' '/home/thayne/bulk-home/Pictures'`	10.8 ± 3.1	5.8	19.0	1.03 ± 0.44
`./fd-crossbeam --hidden --no-ignore '' '/home/thayne/bulk-home/Pictures'`	11.6 ± 3.9	5.0	22.3	1.10 ± 0.51

Simple pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	11.3 ± 3.5	5.8	22.2	1.00
`./fd-flume '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	11.7 ± 3.3	7.2	22.5	1.04 ± 0.44
`./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	13.4 ± 4.6	6.9	26.6	1.19 ± 0.55

Simple pattern (-HI)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	10.7 ± 3.5	5.8	21.9	1.03 ± 0.46
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	10.4 ± 3.1	6.1	24.1	1.00
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	11.5 ± 4.0	5.1	23.7	1.10 ± 0.51

File extension

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --extension jpg '' '/home/thayne/bulk-home/Pictures'`	14.8 ± 3.5	9.7	24.5	1.01 ± 0.31
`./fd-flume -HI --extension jpg '' '/home/thayne/bulk-home/Pictures'`	14.7 ± 2.9	9.8	24.6	1.00
`./fd-crossbeam -HI --extension jpg '' '/home/thayne/bulk-home/Pictures'`	15.0 ± 3.8	9.5	27.6	1.02 ± 0.33

File type

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --type l '' '/home/thayne/bulk-home/Pictures'`	10.7 ± 3.8	5.2	21.5	1.06 ± 0.56
`./fd-flume -HI --type l '' '/home/thayne/bulk-home/Pictures'`	10.1 ± 3.9	3.7	21.6	1.00
`./fd-crossbeam -HI --type l '' '/home/thayne/bulk-home/Pictures'`	11.3 ± 3.9	5.0	20.1	1.12 ± 0.58

Cold cache

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	112.4 ± 69.8	71.1	193.0	1.75 ± 1.18
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	70.8 ± 28.1	57.1	144.3	1.11 ± 0.53
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/Pictures'`	64.1 ± 16.8	55.5	111.5	1.00

flume and std seem to be pretty close in most tests, and crossbeam seems to be a little bit slower. On the cold cache, master was significantly slower than both, and crossbeam was fastest. But maybe the reset cache command wasn't working as expected and the order mattered? Does hyperfine run all the tests for the first command before doing the second, or does it intersperse them?

Haproxy repository (SSD) order 1

`fd` regression benchmark

No pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-crossbeam --hidden --no-ignore '' '/home/thayne/dev/haproxy'`	10.4 ± 3.5	5.8	22.6	1.02 ± 0.47
`./fd-flume --hidden --no-ignore '' '/home/thayne/dev/haproxy'`	10.1 ± 3.1	6.0	21.6	1.00
`./fd-master --hidden --no-ignore '' '/home/thayne/dev/haproxy'`	10.6 ± 3.3	5.3	19.8	1.05 ± 0.45

Simple pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	8.8 ± 4.0	3.0	16.8	1.00
`./fd-flume '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	8.9 ± 4.0	2.4	16.6	1.00 ± 0.64
`./fd-master '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	9.4 ± 4.0	3.3	17.1	1.06 ± 0.65

Simple pattern (-HI)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	12.2 ± 4.1	6.4	22.5	1.02 ± 0.50
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	11.9 ± 4.2	6.1	21.6	1.00
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	12.2 ± 4.5	6.0	22.1	1.03 ± 0.52

File extension

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-crossbeam -HI --extension jpg '' '/home/thayne/dev/haproxy'`	12.1 ± 3.7	6.9	21.4	1.00
`./fd-flume -HI --extension jpg '' '/home/thayne/dev/haproxy'`	13.1 ± 4.3	7.0	24.1	1.09 ± 0.49
`./fd-master -HI --extension jpg '' '/home/thayne/dev/haproxy'`	13.2 ± 4.3	7.2	24.1	1.09 ± 0.49

File type

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-crossbeam -HI --type l '' '/home/thayne/dev/haproxy'`	11.9 ± 4.5	5.7	24.1	1.08 ± 0.56
`./fd-flume -HI --type l '' '/home/thayne/dev/haproxy'`	11.0 ± 3.9	5.6	21.6	1.00
`./fd-master -HI --type l '' '/home/thayne/dev/haproxy'`	11.6 ± 4.2	5.9	22.2	1.05 ± 0.53

Cold cache

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	123.6 ± 39.0	103.4	193.2	1.14 ± 0.37
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	128.1 ± 41.3	108.2	212.2	1.18 ± 0.39
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	108.5 ± 9.1	101.7	129.4	1.00

haproxy repository (SSD) order 2

`fd` regression benchmark

No pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master --hidden --no-ignore '' '/home/thayne/dev/haproxy'`	9.1 ± 2.9	5.5	18.5	1.00
`./fd-flume --hidden --no-ignore '' '/home/thayne/dev/haproxy'`	9.6 ± 3.1	5.5	21.2	1.06 ± 0.48
`./fd-crossbeam --hidden --no-ignore '' '/home/thayne/dev/haproxy'`	9.8 ± 3.4	5.5	20.8	1.08 ± 0.51

Simple pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	8.5 ± 3.8	1.4	16.0	1.04 ± 0.67
`./fd-flume '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	8.4 ± 3.9	1.8	15.8	1.03 ± 0.68
`./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	8.1 ± 3.8	1.3	15.5	1.00

Simple pattern (-HI)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	11.1 ± 3.4	6.9	21.3	1.00
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	11.8 ± 4.1	6.9	22.8	1.06 ± 0.49
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	13.5 ± 4.2	7.3	24.5	1.21 ± 0.53

File extension

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --extension jpg '' '/home/thayne/dev/haproxy'`	10.5 ± 4.2	4.6	20.7	1.00
`./fd-flume -HI --extension jpg '' '/home/thayne/dev/haproxy'`	11.1 ± 3.9	5.9	21.7	1.05 ± 0.56
`./fd-crossbeam -HI --extension jpg '' '/home/thayne/dev/haproxy'`	11.6 ± 4.3	5.0	20.9	1.10 ± 0.59

File type

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --type l '' '/home/thayne/dev/haproxy'`	12.8 ± 4.1	7.0	22.0	1.04 ± 0.47
`./fd-flume -HI --type l '' '/home/thayne/dev/haproxy'`	12.4 ± 3.9	6.5	22.3	1.00
`./fd-crossbeam -HI --type l '' '/home/thayne/dev/haproxy'`	13.0 ± 4.4	7.3	26.6	1.05 ± 0.49

Cold cache

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	132.1 ± 25.5	118.4	177.7	1.20 ± 0.26
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	110.2 ± 10.6	101.6	134.8	1.00
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/dev/haproxy'`	112.4 ± 3.4	108.0	116.4	1.02 ± 0.10

Rust-lang repository (spinning disk)

`fd` regression benchmark

No pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master --hidden --no-ignore '' '/home/thayne/bulk-home/devel/rust-lang'`	254.5 ± 2.1	251.7	259.7	1.01 ± 0.01
`./fd-flume --hidden --no-ignore '' '/home/thayne/bulk-home/devel/rust-lang'`	264.0 ± 2.4	259.5	267.4	1.05 ± 0.01
`./fd-crossbeam --hidden --no-ignore '' '/home/thayne/bulk-home/devel/rust-lang'`	251.2 ± 2.7	247.3	255.5	1.00

Simple pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	346.4 ± 2.6	341.9	351.0	1.00
`./fd-flume '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	347.9 ± 11.8	337.7	373.0	1.00 ± 0.03
`./fd-crossbeam '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	354.4 ± 8.7	344.8	367.6	1.02 ± 0.03

Simple pattern (-HI)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	245.1 ± 3.4	240.7	253.1	1.00
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	249.9 ± 5.4	246.1	264.5	1.02 ± 0.03
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	246.5 ± 4.8	241.4	256.7	1.01 ± 0.02

File extension

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --extension jpg '' '/home/thayne/bulk-home/devel/rust-lang'`	253.7 ± 4.0	249.3	260.8	1.00
`./fd-flume -HI --extension jpg '' '/home/thayne/bulk-home/devel/rust-lang'`	254.6 ± 2.9	250.5	259.5	1.00 ± 0.02
`./fd-crossbeam -HI --extension jpg '' '/home/thayne/bulk-home/devel/rust-lang'`	254.6 ± 3.5	248.2	261.7	1.00 ± 0.02

File type

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --type l '' '/home/thayne/bulk-home/devel/rust-lang'`	243.7 ± 2.4	239.5	248.0	1.00
`./fd-flume -HI --type l '' '/home/thayne/bulk-home/devel/rust-lang'`	247.4 ± 2.5	245.0	254.2	1.02 ± 0.01
`./fd-crossbeam -HI --type l '' '/home/thayne/bulk-home/devel/rust-lang'`	249.5 ± 2.3	246.7	253.4	1.02 ± 0.01

Cold cache

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	29.077 ± 0.413	28.751	29.542	1.00 ± 0.02
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	28.985 ± 0.155	28.849	29.153	1.00
`./fd-crossbeam -HI '.*[0-9]\.jpg$' '/home/thayne/bulk-home/devel/rust-lang'`	29.034 ± 0.238	28.760	29.183	1.00 ± 0.01

I'm not entirely sure if the differences are due to actual performance differences, or something with how I'm running the benchmarks.

Jan 13 '22 07:01 tmccombs

Thank you for looking into this again @tavianator. And thank you for the benchmark results, @tmccombs.

> but maybe the reset cache command wasn't working as expected, and the order mattered? Does hyperfine run all the tests for the first command before doing the second, or does it intersperse them?

It does run all the benchmarks for the first command before doing the second. See also https://github.com/sharkdp/hyperfine/issues/21
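If order effects are a concern, one way to rule them out is to time the commands in interleaved (round-robin) order instead of one command at a time. The following is a minimal sketch of that idea, not a hyperfine feature: `time_once` and `interleaved_benchmark` are hypothetical helper names, the placeholder commands only start a Python interpreter, and no cache reset is performed between runs.

```python
import statistics
import subprocess
import sys
import time


def time_once(cmd):
    """Run one command (an argv list) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,
    )
    return time.perf_counter() - start


def interleaved_benchmark(commands, rounds=10):
    """Time every command once per round, so slow drift (caches, background
    load) affects all commands roughly equally.

    `commands` maps a label to an argv list; `rounds` must be at least 2
    because the sample standard deviation needs two data points.
    """
    samples = {name: [] for name in commands}
    for _ in range(rounds):
        for name, cmd in commands.items():
            samples[name].append(time_once(cmd))
    return {
        name: (statistics.mean(times), statistics.stdev(times))
        for name, times in samples.items()
    }


if __name__ == "__main__":
    # Placeholder commands; substitute the fd binaries under test.
    commands = {
        "first": [sys.executable, "-c", "pass"],
        "second": [sys.executable, "-c", "pass"],
    }
    for name, (mean, stddev) in interleaved_benchmark(commands, rounds=3).items():
        print(f"{name}: {mean * 1000:.1f} ms ± {stddev * 1000:.1f} ms")
```

If the ranking of the commands stays the same under interleaving, the order of execution is unlikely to explain the differences.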

The general problem with your benchmarks is the large statistical noise. Look at the very first benchmark, for example. A result like 1.03 ± 0.44 for flume (with respect to master) means: flume was 3% slower, but there is a statistical uncertainty of 44 percentage points, i.e. the error is an order of magnitude larger than the measured effect. Maybe hyperfine should come with a big warning in a case like this. That 3% difference really shouldn't be trusted.

Compare that to the "No pattern" benchmark on my machine (note: this is on a larger folder, to increase signal-to-noise even more):

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master --hidden --no-ignore '' '/home/shark/Informatik/'`	385.0 ± 0.9	383.3	386.2	1.00
`./fd-flume --hidden --no-ignore '' '/home/shark/Informatik/'`	402.1 ± 1.8	400.0	406.2	1.04 ± 0.01

Here, the statistical error (0.01) is quite a bit smaller than the effect we are seeing (0.04).
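To make the comparison concrete, the "Relative" column can be recomputed from the means and standard deviations in the tables. The sketch below uses standard first-order error propagation for a ratio; it reproduces the two values quoted above, but that hyperfine computes the column exactly this way is an assumption. The function names are hypothetical.

```python
import math


def relative_speed(mean, stddev, ref_mean, ref_stddev):
    """Return (ratio, ratio_stddev) of a result relative to a reference.

    Uses first-order error propagation for a quotient of two independent
    measurements: the relative errors add in quadrature.
    """
    ratio = mean / ref_mean
    rel_err = math.sqrt((stddev / mean) ** 2 + (ref_stddev / ref_mean) ** 2)
    return ratio, ratio * rel_err


def effect_to_noise(ratio, ratio_stddev):
    """How many standard deviations the ratio is away from 1.0 (no difference)."""
    return abs(ratio - 1.0) / ratio_stddev


# "No pattern" on the photos directory: flume 10.8 ± 3.1 ms vs master 10.5 ± 3.3 ms
noisy = relative_speed(10.8, 3.1, 10.5, 3.3)
# "No pattern" on the larger folder: flume 402.1 ± 1.8 ms vs master 385.0 ± 0.9 ms
quiet = relative_speed(402.1, 1.8, 385.0, 0.9)

for label, (ratio, err) in (("noisy", noisy), ("quiet", quiet)):
    print(f"{label}: {ratio:.2f} ± {err:.2f}, effect/noise = {effect_to_noise(ratio, err):.1f}")
```

In the first case the effect is a small fraction of one standard deviation; in the second it is several standard deviations, which is why only the second measurement says anything about flume versus master.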

It's annoying, but it's really important to switch off background processes. Especially the ones that might be reading from / writing to disk. Largest offenders for me are typically: dropbox(!), spotify, the browser.

Please find the full benchmark results from my machine (solid state disk) below. I would summarize them as: there is no statistically significant difference between the master version and the version from this branch, except for the "no pattern" benchmark, where the flume-version is 4% slower (reproducibly).

@tavianator Could you maybe also share your benchmark results?

`fd` regression benchmark

No pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master --hidden --no-ignore '' '/home/shark/Informatik/'`	385.0 ± 0.9	383.3	386.2	1.00
`./fd-flume --hidden --no-ignore '' '/home/shark/Informatik/'`	402.1 ± 1.8	400.0	406.2	1.04 ± 0.01

Simple pattern

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master '.*[0-9]\.jpg$' '/home/shark/Informatik/'`	177.3 ± 1.1	175.9	179.2	1.00 ± 0.01
`./fd-flume '.*[0-9]\.jpg$' '/home/shark/Informatik/'`	176.7 ± 0.9	175.0	179.5	1.00

Simple pattern (-HI)

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'`	361.7 ± 1.6	359.0	364.2	1.01 ± 0.01
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'`	358.7 ± 2.1	356.2	361.8	1.00

File extension

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --extension jpg '' '/home/shark/Informatik/'`	392.7 ± 9.0	379.9	406.6	1.01 ± 0.03
`./fd-flume -HI --extension jpg '' '/home/shark/Informatik/'`	388.7 ± 6.6	383.0	400.2	1.00

File type

Command	Mean [ms]	Min [ms]	Max [ms]	Relative
`./fd-master -HI --type l '' '/home/shark/Informatik/'`	359.2 ± 0.8	357.9	360.7	1.01 ± 0.01
`./fd-flume -HI --type l '' '/home/shark/Informatik/'`	357.2 ± 2.2	353.8	360.2	1.00

Cold cache

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'`	3.008 ± 0.017	2.995	3.027	1.00 ± 0.01
`./fd-flume -HI '.*[0-9]\.jpg$' '/home/shark/Informatik/'`	3.007 ± 0.015	2.994	3.023	1.00

Jan 23 '22 19:01 sharkdp

Interesting. The no-pattern case would probably fill up the queues more. This seems to contradict the benchmarks in flume's README. Is it worth bringing up with flume and/or crossbeam-channel that you aren't seeing the performance gains that they claim over mpsc?

Jan 26 '22 08:01 tmccombs

Here are my results. Weirdly, --extension is 7% slower with flume, but otherwise flume wins by 1-5%.

`fd` regression benchmark

No pattern

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master --hidden --no-ignore '' '/home/tavianator/code/android'`	2.069 ± 0.006	2.058	2.076	1.05 ± 0.01
`./fd-feature --hidden --no-ignore '' '/home/tavianator/code/android'`	1.973 ± 0.024	1.955	2.039	1.00

Simple pattern

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master '.*[0-9]\.jpg$' '/home/tavianator/code/android'`	1.370 ± 0.010	1.358	1.389	1.01 ± 0.01
`./fd-feature '.*[0-9]\.jpg$' '/home/tavianator/code/android'`	1.359 ± 0.003	1.355	1.365	1.00

Simple pattern (-HI)

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'`	2.046 ± 0.003	2.043	2.051	1.05 ± 0.01
`./fd-feature -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'`	1.946 ± 0.020	1.933	2.000	1.00

File extension

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master -HI --extension jpg '' '/home/tavianator/code/android'`	1.947 ± 0.006	1.938	1.956	1.00
`./fd-feature -HI --extension jpg '' '/home/tavianator/code/android'`	2.088 ± 0.032	2.065	2.153	1.07 ± 0.02

File type

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master -HI --type l '' '/home/tavianator/code/android'`	2.031 ± 0.004	2.025	2.038	1.05 ± 0.00
`./fd-feature -HI --type l '' '/home/tavianator/code/android'`	1.937 ± 0.005	1.932	1.948	1.00

Cold cache

Command	Mean [s]	Min [s]	Max [s]	Relative
`./fd-master -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'`	2.178 ± 0.004	2.175	2.183	1.05 ± 0.00
`./fd-feature -HI '.*[0-9]\.jpg$' '/home/tavianator/code/android'`	2.076 ± 0.003	2.074	2.079	1.00

Jan 26 '22 16:01 tavianator

@tavianator @tmccombs How do we proceed with this std-channels vs crossbeam vs flume topic? It feels to me like we need to work on #893 first. What do you think?

Mar 04 '22 07:03 sharkdp

I don't know. I'd like to have a better understanding of why switching to crossbeam-channel or flume doesn't perform as well in some cases. Does either of you have much experience profiling Rust code?

Mar 11 '22 04:03 tmccombs

I just found this by accident: https://github.com/fereidani/kanal

Oct 17 '22 19:10 sharkdp

@sharkdp Yeah I saw that recently too! Curious to try it out, might be a little early though

Oct 17 '22 19:10 tavianator

I guess we can close this in favor of #1146

Oct 31 '22 20:10 sharkdp

Switch from std::sync::mpsc to flume
