Speed up
Whilst working on my other PR I found --tree to be quite slow compared to tree and so analysed cargo flamegraph results resulting in reducing unnecessary allocations and system calls, producing a significant speed gain for the --tree and -R options.
Original flamegraph: (--tree)

After optimisations: (--tree)

After optimisations: (-R)

Originally, the main bottleneck was system calls in order to retrieve uid and gid properties and the following processing into strings when the -R and --tree options do not need this information so now on unix, the information is lazy loaded.
Now, in the optimised --tree graph, we see the main bottleneck now is creating the rest of the Meta struct and that the sorting and display isn't too shabby. In the optimised -R graph, we see the displaying and sorting of information into a grid is the biggest bottleneck for this option. This would indicate for future optimisations, some information and processing may need to be lazy loaded for the Meta struct or perhaps simply the algorithms need optimising, on top of improving the grid/tree display performance.
Nevertheless, overall this PR has managed to make the following speed gains, tested only on my linux x86_64 machine with hyperfine, compared with native equivalents ls -R and tree:
~ 5 000 directories, 56 000 files, 27G
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
lsd --tree |
2.374 ± 0.016 | 2.350 | 2.400 | 6.89 ± 0.15 |
lsd --tree (opt.) |
0.537 ± 0.007 | 0.528 | 0.550 | 1.56 ± 0.04 |
tree |
0.345 ± 0.007 | 0.338 | 0.360 | 1.00 |
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
lsd -R |
2.495 ± 0.027 | 2.458 | 2.535 | 19.77 ± 0.33 |
lsd -R (opt.) |
0.649 ± 0.011 | 0.639 | 0.668 | 5.14 ± 0.11 |
ls -R |
0.126 ± 0.002 | 0.123 | 0.130 | 1.00 |
Codecov Report
Merging #441 (3224f87) into master (7254816) will decrease coverage by
0.19%. The diff coverage is73.10%.
@@ Coverage Diff @@
## master #441 +/- ##
==========================================
- Coverage 88.48% 88.29% -0.20%
==========================================
Files 36 36
Lines 3535 3289 -246
==========================================
- Hits 3128 2904 -224
+ Misses 407 385 -22
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/core.rs | 0.00% <0.00%> (ø) |
|
| src/main.rs | 9.09% <0.00%> (-0.91%) |
:arrow_down: |
| src/meta/permissions.rs | 86.95% <ø> (+3.62%) |
:arrow_up: |
| src/meta/mod.rs | 38.79% <29.62%> (-0.69%) |
:arrow_down: |
| src/meta/inode.rs | 66.66% <40.00%> (-2.09%) |
:arrow_down: |
| src/flags.rs | 91.66% <50.00%> (-8.34%) |
:arrow_down: |
| src/meta/symlink.rs | 88.88% <63.63%> (+19.50%) |
:arrow_up: |
| src/display.rs | 75.73% <68.88%> (+2.80%) |
:arrow_up: |
| src/meta/owner.rs | 61.11% <75.00%> (-13.89%) |
:arrow_down: |
| src/meta/name.rs | 94.14% <80.88%> (+0.36%) |
:arrow_up: |
| ... and 8 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 7254816...3224f87. Read the comment docs.
What do you think about jwalk?
What do you think about jwalk?
I tried to re-implement it with walkdir and it is ideal for walking directory trees whereas in lsd we are mostly listing the contents of a single given directory, not the tree unless -R or --tree. When I benchmarked, the original lsd implementation was faster which I assume is due to the overhead of setting up threads and preparing to walk the directories in walkdir even though typically we only need to get the entries in 1 directory.
If jwalk is anything like this I would presume it would have the same effect - just adding overhead. At the moment, the walking and filtering is rather efficient since we only descend into directories if needed. I think the most useful speed gains now would be the general workings of the Meta struct and its values (trying to avoid allocations), and the displaying of entries into a grid/tree.
What do you think about jwalk?
Upon looking into it I will see if it helps specifically for these recursive descents I'll try it
@0jdxt Do you have planed to finish this PR and resolve the merge conflicts?
@0jdxt Do you have planed to finish this PR and resolve the merge conflicts?
Hi! I've been busy so feel free. It's just such a mess I feel i should be responsible for cleaning it up. Alas, this PR could be merged soon, I'm just a bit concerned on how I've changed the flow of the program a bit and the structure of the program is getting quite complex. Either way the repo needs a tidy up at least with some comments describing the flow of the program so its easier to follow, especially for those who want to contribute to the repo and is reading the code for the first time.
Hey @0jdxt , you will have to rebase on master instead of merging from master.
summary: Some code is merely tidied. Replaced String of Name with Cow, reducing unnecessary allocs. Meta keeps raw Metadata to avoid second syscall for the metadata during colouring/rendering. Keep Owner as u64 until rendered. Change icons to chars and use the faster FxHashMap, delegates separator flag to Name::render. Name::escape function was O(2n), reduced to O(n). Name::Ord now uses to_ascii_lowercase instead, increasing speed of comparisons.
Hey @0jdxt , please don't squash and force push every change. It makes it harder to review just the changes.
Hey @0jdxt , please don't squash and force push every change. It makes it harder to review just the changes.
You can click on force pushed to see an interdiff.
https://github.com/Peltoche/lsd/compare/04bb7141df9226a774cb1cba91977514d20b3625..0f3eb3e459267d78207e3cfa8eca8b23b81a3fb0
Hey @0jdxt , please don't squash and force push every change. It makes it harder to review just the changes.
Hi sorry still not used to this PR stuff
@0jdxt Had to merge in https://github.com/Peltoche/lsd/pull/489 . You will have to rebase your PR on top of current master. This looks mostly good to go otherwise.
@0jdxt Had to merge in #489 . You will have to rebase your PR on top of current master. This looks mostly good to go otherwise.
Rebased! should be good to go
Codecov Report
Merging #441 (38a6933) into master (b6bd1d3) will increase coverage by
0.44%. The diff coverage is63.06%.
@@ Coverage Diff @@
## master #441 +/- ##
==========================================
+ Coverage 80.71% 81.16% +0.44%
==========================================
Files 35 35
Lines 3449 3413 -36
==========================================
- Hits 2784 2770 -14
+ Misses 665 643 -22
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/core.rs | 0.00% <0.00%> (ø) |
|
| src/display.rs | 25.26% <0.00%> (-0.23%) |
:arrow_down: |
| src/main.rs | 10.00% <ø> (+10.00%) |
:arrow_up: |
| src/meta/permissions.rs | 36.95% <ø> (+1.53%) |
:arrow_up: |
| src/meta/owner.rs | 16.66% <15.38%> (-25.01%) |
:arrow_down: |
| src/meta/mod.rs | 18.81% <18.18%> (+0.88%) |
:arrow_up: |
| src/meta/symlink.rs | 90.24% <50.00%> (+20.43%) |
:arrow_up: |
| src/app.rs | 92.15% <57.14%> (-3.93%) |
:arrow_down: |
| src/meta/name.rs | 94.85% <75.55%> (+1.05%) |
:arrow_up: |
| src/meta/date.rs | 98.79% <95.45%> (+5.53%) |
:arrow_up: |
| ... and 15 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update b6bd1d3...38a6933. Read the comment docs.
Hi @0jdxt, thanks so much for working for this massive PR!
I want to try to make this PR fire just now, but I found it mixed up with several functionality changes, and some of it break the original function.
for example:
Cargo.lockshould not be added in.gitignore, check explains heretermsizewhich used in the PR checking whether it is tty failed in pipeline
how about break down the PR into several ones which contains one functionality only and we can make sure it work as expected, also we can do some quicker reviews and merges.
I'm closing this PR for now as it has deviated quite a bit from master, but let me know if you are still interested in this and we can open it back up.