Scoop
Improve local search performance
Description
Improve local search performance by pruning candidates using `git ls-files` and `git grep`. This drastically reduces the number of files that are read into memory to find the desired results.
The code accounts for buckets that are not maintained using `git`. Not sure if that really applies, but for such buckets the code falls back to the previous approach.
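The pruning idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not the PowerShell code from this PR: the function names, the `bucket/*.json` layout, and the case-insensitive substring matching are all assumptions made for the example.

```python
import subprocess
from pathlib import Path
from typing import List


def is_git_bucket(bucket_dir: Path) -> bool:
    """Return True when the bucket directory looks like a git working tree."""
    return (bucket_dir / ".git").exists()


def _git_lines(bucket_dir: Path, *args: str) -> List[str]:
    """Run git inside the bucket and return its non-empty output lines."""
    result = subprocess.run(
        ["git", "-C", str(bucket_dir), *args],
        capture_output=True,
        text=True,
        check=False,  # git grep exits with 1 when nothing matches
    )
    return [line for line in result.stdout.splitlines() if line]


def candidate_manifests(bucket_dir: Path, query: str) -> List[str]:
    """List manifests worth opening for `query` (hypothetical helper).

    Assumes manifests live in bucket/*.json and that a case-insensitive
    substring match on the file name or file contents is good enough.
    """
    if not is_git_bucket(bucket_dir):
        # Fallback for non-git buckets: no pruning, every manifest is a candidate.
        return sorted(
            p.relative_to(bucket_dir).as_posix()
            for p in bucket_dir.glob("bucket/*.json")
        )

    # Name hits: ask git for tracked manifests, then filter on the file name.
    tracked = _git_lines(bucket_dir, "ls-files", "--", "bucket/*.json")
    by_name = [p for p in tracked if query.lower() in Path(p).stem.lower()]

    # Content hits: let git grep report only the files that mention the query.
    by_content = _git_lines(
        bucket_dir, "grep", "-l", "-i", "-F", "-e", query, "--", "bucket/*.json"
    )
    return sorted(set(by_name) | set(by_content))
```

Only the files returned here would then be parsed as JSON, which is where the saving is expected to come from. Whether spawning two git processes per bucket pays off likely depends on the machine and on the number of buckets.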
Motivation and Context
Closes #4239
How Has This Been Tested?
Compared the output of the previous search method with the new method. The code uses standard PowerShell and git functions that have existed for a long time.
Checklist:
- [x] I have read the Contributing Guide.
- [ ] ~I have updated the documentation accordingly.~ N/A. Well, I sparsely documented the approach in code.
- [ ] ~I have updated the tests accordingly.~ N/A, there were no pre-existing tests.
Sorry, but this approach took longer on my PC, and I've tried several times. What about yours, @rashil2000?
Strange. On my system, `scoop search net` went down from 32 seconds to 4 seconds with this change. Likewise, `scoop search zulu` went from 8 seconds to 2 seconds (this contains hits in the java bucket).
BTW, presumably due to file system caching, subsequent runs of the same command on the same develop branch are significantly faster. So measuring this fairly may be tricky, especially since in practice you usually don't search for the same thing twice in a row.
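One way to reduce the caching effect when comparing branches is to discard a warm-up run and look at the median of several timed runs. The sketch below is only an illustration in Python. The helper names are made up for this example, and it compares warm-cache timings only, so it says nothing about the cold first search a user typically experiences.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(action: Callable[[], object], runs: int = 5, warmup: int = 1) -> List[float]:
    """Call `action` warmup + runs times; return only the timed durations in seconds."""
    for _ in range(warmup):
        action()  # untimed, so the OS file cache is filled before measuring
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Reduce a list of durations to min, median, and max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Hypothetical usage on Windows, repeated once per checked-out branch:
#
#   import subprocess
#   search = lambda: subprocess.run(
#       ["powershell", "-NoProfile", "-Command", "scoop search rstudio"],
#       capture_output=True, check=False)
#   print(summarize(time_runs(search)))
```

Running the same harness on `develop` and on the PR branch, with the same query and the same bucket set, should make the numbers more comparable than single `Measure-Command` runs.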
I'm seeing a marginal improvement.
Search twice or use different methods?
Same here: the original method took 8 seconds, this one 13 seconds (`scoop search rstudio`).
Buckets list:
Name        Source                                              Updated             Manifests
----        ------                                              -------             ---------
dorado      https://github.com/h404bi/dorado                    2022/3/18 8:10:29   221
extras      https://github.com/ScoopInstaller/Extras            2022/3/18 8:33:48   1437
java        https://github.com/ScoopInstaller/Java              2022/3/17 20:26:07  220
main        https://github.com/ScoopInstaller/Main              2022/3/18 8:32:58   992
nerd-fonts  https://github.com/matthewjberger/scoop-nerd-fonts  2022/3/16 3:11:59   187
nih         https://github.com/niheaven/scoop-nih.git           2022/3/18 9:53:30   28
rasa        https://github.com/rasa/scoops                      2022/3/15 5:01:36   70
tests       https://github.com/ScoopInstaller/Tests             2022/3/17 4:32:31   62
versions    https://github.com/ScoopInstaller/Versions          2022/3/18 9:55:40   280
> Search twice or use different methods?
Different methods. See `.\bin\scoop search zulu` vs `scoop search zulu`.
More feedback is needed, IMO. Don't know why this one took more time.
DESKTOP in current on develop
❯ scoop bucket list
Name      Source                                      Updated             Manifests
----      ------                                      -------             ---------
akira     https://github.com/chawyehsu/scoop-akira    2022/4/13 23:52:50  2
dorado    https://github.com/chawyehsu/dorado         2022/5/13 17:08:19  217
extras    https://github.com/ScoopInstaller/Extras    2022/5/16 12:30:48  1501
java      https://github.com/ScoopInstaller/Java      2022/5/15 4:01:48   220
main      https://github.com/ScoopInstaller/Main      2022/5/16 0:30:16   1019
versions  https://github.com/ScoopInstaller/Versions  2022/5/16 15:36:44  314
DESKTOP in current on develop
❯ Measure-Command { .\bin\scoop.ps1 search rstudio }
'extras' bucket:
'versions' bucket:
Days : 0
Hours : 0
Minutes : 0
Seconds : 6
Milliseconds : 911
Ticks : 69116116
TotalDays : 7.99955046296296E-05
TotalHours : 0.00191989211111111
TotalMinutes : 0.115193526666667
TotalSeconds : 6.9116116
TotalMilliseconds : 6911.6116
DESKTOP in current on develop took 6s
❯ git sw fix/4239_searchPerformance
Switched to branch 'fix/4239_searchPerformance'
DESKTOP in current on fix/4239_searchPerformance
❯ Measure-Command { .\bin\scoop.ps1 search rstudio }
'extras' bucket:
'versions' bucket:
Days : 0
Hours : 0
Minutes : 0
Seconds : 1
Milliseconds : 320
Ticks : 13204317
TotalDays : 1.52827743055556E-05
TotalHours : 0.000366786583333333
TotalMinutes : 0.022007195
TotalSeconds : 1.3204317
TotalMilliseconds : 1320.4317
Tested it again, same result:
Scoop\apps\scoop took 23s
❯ git -C "current" checkout develop
Switched to branch 'develop'
Your branch is behind 'origin/develop' by 1 commit, and can be fast-forwarded.
(use "git pull" to update your local branch)
Scoop\apps\scoop
❯ Measure-Command { scoop search rstudio }
'extras' bucket:
'versions' bucket:
Days : 0
Hours : 0
Minutes : 0
Seconds : 10
Milliseconds : 352
Ticks : 103529888
TotalDays : 0.000119826259259259
TotalHours : 0.00287583022222222
TotalMinutes : 0.172549813333333
TotalSeconds : 10.3529888
TotalMilliseconds : 10352.9888
Scoop\apps\scoop took 10s
❯ git -C "current" checkout fix/4239_searchPerformance
Switched to branch 'fix/4239_searchPerformance'
Scoop\apps\scoop
❯ Measure-Command { scoop search rstudio }
'extras' bucket:
'versions' bucket:
Days : 0
Hours : 0
Minutes : 0
Seconds : 23
Milliseconds : 881
Ticks : 238818129
TotalDays : 0.000276409871527778
TotalHours : 0.00663383691666667
TotalMinutes : 0.398030215
TotalSeconds : 23.8818129
TotalMilliseconds : 23881.8129
Really strange...
Well, if the results are this unpredictable and unreliable, I'd suggest not merging at the moment. I don't currently have the time to dig any further.
Closing in favour of #5644