Improve local search performance

Open bartvanandel opened this issue 2 years ago • 9 comments

Description

Improve local search performance by pruning candidates using git ls-files and git grep. This drastically reduces the number of files that are read into memory to find the desired results.

The code also handles buckets that are not maintained using git. I'm not sure such buckets exist in practice, but for them the code falls back to the previous approach.
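
As a rough illustration of the pruning idea (not the actual PowerShell code in this PR), here is a minimal Python sketch. The function names `candidate_manifests` and `_git_lines` are hypothetical, and the sketch assumes manifests are `*.json` files and that a match means the query appears in the file name or anywhere in the file contents.

```python
import subprocess
from pathlib import Path
from typing import List, Set


def _git_lines(bucket: Path, *args: str) -> List[str]:
    """Run a git command inside the bucket and return its non-empty output lines."""
    proc = subprocess.run(
        ["git", "-C", str(bucket), *args],
        capture_output=True, text=True, check=False,
    )
    # `git grep` exits with status 1 when nothing matches; treat that as "no lines".
    if proc.returncode not in (0, 1):
        raise RuntimeError(proc.stderr.strip())
    return [line for line in proc.stdout.splitlines() if line]


def candidate_manifests(bucket: Path, query: str) -> Set[Path]:
    """Return manifests whose file name or contents contain the query (case-insensitive)."""
    q = query.lower()
    if (bucket / ".git").exists():
        # Name matches: list tracked manifests without opening any of them.
        # (Assumption: paths contain no characters that git would quote.)
        tracked = _git_lines(bucket, "ls-files", "--", "*.json")
        by_name = {p for p in tracked if q in Path(p).stem.lower()}
        # Content matches: let git search the tracked files and report only file names.
        by_content = set(
            _git_lines(bucket, "grep", "-l", "-i", "-F", "-e", query, "--", "*.json")
        )
        return {bucket / p for p in by_name | by_content}
    # Fallback for buckets not maintained with git: read every manifest.
    hits = set()
    for path in bucket.rglob("*.json"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if q in path.stem.lower() or q in text.lower():
            hits.add(path)
    return hits
```

Only the files returned by such a function would then need to be parsed in full, which is where the reduction in files read into memory would come from.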

Motivation and Context

Closes #4239

How Has This Been Tested?

Compared the output of the previous search method with the new method. The code uses standard PowerShell and git functions that have existed for a long time.

Checklist:

  • [x] I have read the Contributing Guide.
  • [ ] ~I have updated the documentation accordingly.~ N/A. Well, I sparsely documented the approach in code.
  • [ ] ~I have updated the tests accordingly.~ N/A, there were no pre-existing tests.

bartvanandel avatar Mar 16 '22 12:03 bartvanandel

Sorry, but this approach took longer on my PC:

image

image

I've tried several times. What about on your machine, @rashil2000?

niheaven avatar Mar 17 '22 06:03 niheaven

Strange. On my system, scoop search net went down from 32 seconds to 4 seconds with this change. Likewise, scoop search zulu went from 8 seconds to 2 seconds (this contains hits in the java bucket).

BTW, presumably due to file system caching, subsequent runs of the same command on the same develop branch are significantly faster. So measuring this fairly may be tricky, especially since in practice you usually don't search for the same thing twice in a row.
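
One way to reduce the ordering bias described above is to warm the cache once and then alternate between the variants, comparing medians. The following is a minimal Python sketch under assumptions: `time_once` and `interleaved_medians` are hypothetical names, and it presumes the two variants live in separate checkouts so they can be run back to back. It only compares warm-cache timings; it says nothing about the cold-cache case that matters in everyday use.

```python
import statistics
import subprocess
import time
from typing import Dict, List, Sequence


def time_once(cmd: Sequence[str]) -> float:
    """Run a command once, discarding its output, and return the elapsed seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def interleaved_medians(commands: Dict[str, Sequence[str]], rounds: int = 5) -> Dict[str, float]:
    """Time each command `rounds` times, alternating between them, and return medians."""
    samples: Dict[str, List[float]] = {name: [] for name in commands}
    # One untimed pass per command, so no variant benefits from running second.
    for cmd in commands.values():
        time_once(cmd)
    for _ in range(rounds):
        for name, cmd in commands.items():
            samples[name].append(time_once(cmd))
    return {name: statistics.median(times) for name, times in samples.items()}


# Hypothetical usage, with two checkouts at illustrative paths:
# interleaved_medians({
#     "develop": ["pwsh", "-File", r"C:\scoop-develop\bin\scoop.ps1", "search", "rstudio"],
#     "fix":     ["pwsh", "-File", r"C:\scoop-fix\bin\scoop.ps1", "search", "rstudio"],
# })
```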

bartvanandel avatar Mar 17 '22 07:03 bartvanandel

I'm seeing a marginal improvement.

image

image


image

image

rashil2000 avatar Mar 17 '22 13:03 rashil2000

Search twice or use different methods?

Same here: the original method took 8 s, this one 13 s (scoop search rstudio).

Buckets list:

Name       Source                                             Updated            Manifests
----       ------                                             -------            ---------
dorado     https://github.com/h404bi/dorado                   2022/3/18 8:10:29        221
extras     https://github.com/ScoopInstaller/Extras           2022/3/18 8:33:48       1437
java       https://github.com/ScoopInstaller/Java             2022/3/17 20:26:07       220
main       https://github.com/ScoopInstaller/Main             2022/3/18 8:32:58        992
nerd-fonts https://github.com/matthewjberger/scoop-nerd-fonts 2022/3/16 3:11:59        187
nih        https://github.com/niheaven/scoop-nih.git          2022/3/18 9:53:30         28
rasa       https://github.com/rasa/scoops                     2022/3/15 5:01:36         70
tests      https://github.com/ScoopInstaller/Tests            2022/3/17 4:32:31         62
versions   https://github.com/ScoopInstaller/Versions         2022/3/18 9:55:40        280

niheaven avatar Mar 18 '22 02:03 niheaven

Search twice or use different methods?

Different methods. See .\bin\scoop search zulu vs scoop search zulu

rashil2000 avatar Mar 18 '22 03:03 rashil2000

More feedback is needed, IMO. Don't know why this one took more time.

niheaven avatar Mar 18 '22 08:03 niheaven

DESKTOP in current on develop
❯ scoop bucket list

Name     Source                                     Updated            Manifests
----     ------                                     -------            ---------
akira    https://github.com/chawyehsu/scoop-akira   2022/4/13 23:52:50         2
dorado   https://github.com/chawyehsu/dorado        2022/5/13 17:08:19       217
extras   https://github.com/ScoopInstaller/Extras   2022/5/16 12:30:48      1501
java     https://github.com/ScoopInstaller/Java     2022/5/15 4:01:48        220
main     https://github.com/ScoopInstaller/Main     2022/5/16 0:30:16       1019
versions https://github.com/ScoopInstaller/Versions 2022/5/16 15:36:44       314

DESKTOP in current on develop
❯ Measure-Command { .\bin\scoop.ps1 search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 6
Milliseconds      : 911
Ticks             : 69116116
TotalDays         : 7.99955046296296E-05
TotalHours        : 0.00191989211111111
TotalMinutes      : 0.115193526666667
TotalSeconds      : 6.9116116
TotalMilliseconds : 6911.6116


DESKTOP in current on develop took 6s
❯ git sw fix/4239_searchPerformance
Switched to branch 'fix/4239_searchPerformance'
DESKTOP in current on fix/4239_searchPerformance
❯ Measure-Command { .\bin\scoop.ps1 search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 1
Milliseconds      : 320
Ticks             : 13204317
TotalDays         : 1.52827743055556E-05
TotalHours        : 0.000366786583333333
TotalMinutes      : 0.022007195
TotalSeconds      : 1.3204317
TotalMilliseconds : 1320.4317


chawyehsu avatar May 16 '22 11:05 chawyehsu

Tested it again, same result:

Scoop\apps\scoop took 23s
❯ git -C "current" checkout develop
Switched to branch 'develop'
Your branch is behind 'origin/develop' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Scoop\apps\scoop
❯ Measure-Command { scoop search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 10
Milliseconds      : 352
Ticks             : 103529888
TotalDays         : 0.000119826259259259
TotalHours        : 0.00287583022222222
TotalMinutes      : 0.172549813333333
TotalSeconds      : 10.3529888
TotalMilliseconds : 10352.9888



Scoop\apps\scoop took 10s
❯ git -C "current" checkout fix/4239_searchPerformance
Switched to branch 'fix/4239_searchPerformance'

Scoop\apps\scoop
❯ Measure-Command { scoop search rstudio }
'extras' bucket:
'versions' bucket:

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 23
Milliseconds      : 881
Ticks             : 238818129
TotalDays         : 0.000276409871527778
TotalHours        : 0.00663383691666667
TotalMinutes      : 0.398030215
TotalSeconds      : 23.8818129
TotalMilliseconds : 23881.8129

Really strange...

niheaven avatar May 17 '22 09:05 niheaven

Well, if the results are this unpredictable and unreliable, I'd suggest not merging at the moment. I don't currently have the time to dig any further.

bartvanandel avatar May 17 '22 10:05 bartvanandel

Closing in favour of #5644

rashil2000 avatar Oct 02 '23 12:10 rashil2000