aptly icon indicating copy to clipboard operation
aptly copied to clipboard

Improve publish performance, especially for prefixes with a large number of snapshots

Open refi64 opened this issue 9 months ago • 4 comments

This contains a variety of improvements for publish performance, specifically speeding up the cleanup operation by:

  • using a separate, faster package to walk the filesystem
  • avoiding re-loading the same packages repeatedly (this will happen if a single prefix has a large number of snapshots or repos that share most of their packages)
  • slightly improving the package list loading performance w/ zero-copy deserialization

Benchmarks were added for all of these, and some unit tests were added specifically to test aspects of cleanup.

We have a relatively large aptly repository with >90 repositories, ~207k packages across all of the repositories, and >3.5k snapshots; a testing version of that repository was used to measure the publishing performance. Prior to these changes, publishing took >9 minutes, with over 8 minutes of that time just in the cleanup phase. With these, the cleanup time goes down to ~13 seconds, for a total publish time of a little under a minute.

Checklist

  • [x] unit-test added (if change is algorithm)
  • [ ] functional test added/updated (if change is functional)
  • [ ] man page updated (if applicable)
  • [ ] bash completion updated (if applicable)
  • [ ] documentation updated
  • [ ] author name in AUTHORS (I already have a commit for this in #1220 and don't want merge conflicts)

refi64 avatar Sep 25 '23 22:09 refi64

Codecov Report

Attention: Patch coverage is 83.72093% with 14 lines in your changes are missing coverage. Please review.

Project coverage is 66.06%. Comparing base (f1649a6) to head (ae09dfc). Report is 40 commits behind head on master.

:exclamation: Current head ae09dfc differs from pull request most recent head 75b858b. Consider uploading reports for the commit 75b858b to get more accurate results

Files Patch % Lines
api/files.go 0.00% 8 Missing :warning:
deb/publish.go 79.31% 4 Missing and 2 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1222      +/-   ##
==========================================
- Coverage   74.85%   66.06%   -8.80%     
==========================================
  Files         143      143              
  Lines       16187    16215      +28     
==========================================
- Hits        12117    10712    -1405     
- Misses       3134     4752    +1618     
+ Partials      936      751     -185     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Sep 26 '23 14:09 codecov[bot]

I think this is a fantastic change!

@neolynx @dario-gallucci you wanna have a look at this?

randombenj avatar Nov 08 '23 18:11 randombenj

I think this is a fantastic change!

yes indeed,thanks !

@neolynx @dario-gallucci you wanna have a look at this?

sure!

neolynx avatar Nov 08 '23 18:11 neolynx

As an aside, we've been running this on our production infra for a bit now, and nothing has exploded, which is probably a good thing.

refi64 avatar Dec 07 '23 16:12 refi64