not sure if this does what it should be doing but tests are passing

fixes #112

Feb 04 '22 07:02 thecotne

Sweet Could you please add a BENCHMARKING.md file like https://github.com/uutils/coreutils/blob/main/src/uu/sort/BENCHMARKING.md with some examples?

Did you see some performance wins?

Feb 04 '22 08:02 sylvestre

Codecov Report

Merging #142 (836d06c) into main (58070d6) will increase coverage by 0.13%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #142      +/-   ##
==========================================
+ Coverage   52.98%   53.12%   +0.13%     
==========================================
  Files          24       24              
  Lines        4756     4757       +1     
  Branches     1548     1553       +5     
==========================================
+ Hits         2520     2527       +7     
+ Misses       1687     1675      -12     
- Partials      549      555       +6

Impacted Files	Coverage Δ
src/find/matchers/mod.rs	`62.54% <ø> (ø)`
src/find/mod.rs	`63.33% <100.00%> (+1.53%)`	:arrow_up:
src/xargs/mod.rs	`77.46% <0.00%> (-2.08%)`	:arrow_down:
src/lib.rs	`36.36% <0.00%> (+0.49%)`	:arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 58070d6...836d06c. Read the comment docs.

Feb 04 '22 08:02 codecov[bot]

Seems that it broke some BFS tests:

2022-02-04T08:16:27.7485162Z /home/runner/work/findutils/findutils/findutils/target/release/find basic \( -name foo -type d -o -name bar -a -type f \) -print \, \! -empty -type f -print 
2022-02-04T08:16:27.7548222Z test_print0
2022-02-04T08:16:27.7552535Z /home/runner/work/findutils/findutils/findutils/target/release/find basic/a basic/b -print0 
2022-02-04T08:16:27.7567667Z thread '<unnamed>' panicked at 'already borrowed: BorrowMutError', src/find/matchers/printer.rs:42:52
2022-02-04T08:16:27.7568120Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2022-02-04T08:16:27.7585129Z test_print0 failed!

Feb 04 '22 08:02 sylvestre

i ran some benchmarks and this branch is slightly slower then main and both are faster then gnu find

$ hyperfine --warmup 3 'find ~ -name yarn.lock' './find ~ -name yarn.lock' './find-par ~ -name yarn.lock'
Benchmark 1: find ~ -name yarn.lock
  Time (mean ± σ):     798.2 ms ±   9.0 ms    [User: 332.7 ms, System: 465.3 ms]
  Range (min … max):   783.5 ms … 809.0 ms    10 runs

Benchmark 2: ./find ~ -name yarn.lock
  Time (mean ± σ):     608.6 ms ±  11.1 ms    [User: 236.1 ms, System: 372.5 ms]
  Range (min … max):   590.4 ms … 621.0 ms    10 runs

Benchmark 3: ./find-par ~ -name yarn.lock
  Time (mean ± σ):     618.9 ms ±   7.5 ms    [User: 224.5 ms, System: 395.0 ms]
  Range (min … max):   606.6 ms … 629.4 ms    10 runs

Summary
  './find ~ -name yarn.lock' ran
    1.02 ± 0.02 times faster than './find-par ~ -name yarn.lock'
    1.31 ± 0.03 times faster than 'find ~ -name yarn.lock'

$ hyperfine --warmup 3 'find ~ -name *.json' './find ~ -name *.json' './find-par ~ -name *.json'
Benchmark 1: find ~ -name *.json
  Time (mean ± σ):     836.2 ms ±   6.4 ms    [User: 374.1 ms, System: 461.8 ms]
  Range (min … max):   826.4 ms … 847.3 ms    10 runs

Benchmark 2: ./find ~ -name *.json
  Time (mean ± σ):     723.5 ms ±   9.4 ms    [User: 313.4 ms, System: 410.1 ms]
  Range (min … max):   710.3 ms … 735.8 ms    10 runs

Benchmark 3: ./find-par ~ -name *.json
  Time (mean ± σ):     741.5 ms ±   5.4 ms    [User: 339.3 ms, System: 402.8 ms]
  Range (min … max):   735.1 ms … 752.0 ms    10 runs

Summary
  './find ~ -name *.json' ran
    1.02 ± 0.02 times faster than './find-par ~ -name *.json'
    1.16 ± 0.02 times faster than 'find ~ -name *.json'

./find is build from main branch
./find-par is build from this branch
and find is gnu find

Feb 04 '22 11:02 thecotne

slower, even on directories with plenty of files?

Feb 04 '22 14:02 sylvestre

730550 files to be exact

Feb 04 '22 14:02 thecotne

Spinning rust, or SSD. I could see how on the former, parallel operations could end up just fighting over head seeks.

Mark

On Fri, Feb 4, 2022 at 2:24 PM Tsotne Nazarashvili @.***> wrote:

730550 files to be exact

— Reply to this email directly, view it on GitHub https://github.com/uutils/findutils/pull/142#issuecomment-1030035001, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC3MQF7GOHRUG5WPX76ABZ3UZPOQZANCNFSM5NRBOYTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

Feb 04 '22 15:02 mcharsley

i am testing on SSD

Feb 05 '22 01:02 thecotne

ok, so, is there an advantage merging this ? (besides the fun of it ? :)

Feb 05 '22 13:02 sylvestre

no. in current state this branch is slightly more complex and slightly less efficient version so there is no point in merging it.

if you can give any advice on possible problems/solutions and/or what to investigate etc. to actually improve performance using parallelism i can try to do it. if not we can close this PR i guess

Feb 05 '22 13:02 thecotne

I believe there's no performance benefit because with walkdir, all the filesystem traversal happens on a single thread. par_iter() then splits up the processing work between multiple threads, but the bottleneck is the traversal which remains single-threaded.

You could try the ignore crate which natively supports parallel traversal.

Feb 06 '22 22:02 tavianator

findutils
findutils copied to clipboard

Leverage parallel capabilities

Codecov Report

findutils findutils copied to clipboard

Leverage parallel capabilities

Codecov Report

findutils
findutils copied to clipboard