Add a rebuild cache

Open Trenly opened this issue 1 year ago • 3 comments

Description of the new feature / enhancement

When rebuilding the entire index, it takes a long time as each manifest must be fully parsed and rebuilt. However, many of these manifests may not have changed since the last time a rebuild was run. With nearly 60,000 manifests, it would be beneficial to have some method of doing a partial rebuild.

Proposed technical implementation details

When a rebuild is performed, a copy of the manifests and the indexes could be saved to a storage blob as a gzip archive. When the next rebuild is performed, this archive could be downloaded and expanded, and the indexes loaded into memory as if it were the publishing pipeline. Then, instead of rebuilding the index from scratch, each manifest could be compared against its cached copy. If a manifest has changed, the index would be updated based on the diff between the old manifest file and the new one. If a manifest has not changed, the index does not need to be updated for it. Once all the manifests have been processed, the new indexes can be published, and a copy of the manifests and indexes can be saved as the cache for the next rebuild.
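
As a rough illustration of the comparison step described above, here is a minimal Python sketch. It is not how the winget pipelines work today; the names `RebuildPlan` and `plan_rebuild` are hypothetical, and it assumes the cached and current manifests have already been read into dictionaries keyed by relative path. Only the added, changed, and removed entries would need to touch the index.

```python
from dataclasses import dataclass, field


@dataclass
class RebuildPlan:
    """Manifest paths grouped by what the index update would need to do."""
    added: list[str] = field(default_factory=list)
    changed: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)
    unchanged: list[str] = field(default_factory=list)


def plan_rebuild(cached: dict[str, bytes], current: dict[str, bytes]) -> RebuildPlan:
    """Compare cached manifest contents against the current ones.

    Both arguments map a manifest's relative path to its raw bytes
    (an assumption made for this sketch).
    """
    plan = RebuildPlan()
    # Walk the union of paths so additions and removals are both seen.
    for path in sorted(cached.keys() | current.keys()):
        if path not in cached:
            plan.added.append(path)
        elif path not in current:
            plan.removed.append(path)
        elif cached[path] != current[path]:
            plan.changed.append(path)
        else:
            plan.unchanged.append(path)
    return plan


# Example with made-up manifest paths and contents.
cached = {"Acme.App/1.0.0.yaml": b"v1", "Acme.Tool/2.0.0.yaml": b"v1"}
current = {"Acme.App/1.0.0.yaml": b"v1", "Acme.Tool/2.0.0.yaml": b"v2"}
plan = plan_rebuild(cached, current)
print(len(plan.changed), "changed,", len(plan.unchanged), "unchanged")
```

If keeping full manifest copies in the cache turned out to be too large, the same classification could be done against stored content hashes instead, at the cost of losing the old file needed to compute a diff.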

Of course the pipelines will still need to have an option to perform a full rebuild, if necessary, but adding a caching layer could significantly reduce the amount of time it takes by starting from the last known-good index.

With this caching strategy, it could also be beneficial to perform a rebuild on a regular cadence (every 3 months?) to help ensure a well-maintained cache.
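
The choice between a full rebuild and a cached one could be sketched like this. Everything here is an assumption for illustration: `RebuildMode` and `choose_rebuild_mode` are hypothetical names, and the 90-day default simply mirrors the "every 3 months?" idea above rather than any agreed value.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class RebuildMode(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


def choose_rebuild_mode(
    last_full_rebuild: Optional[datetime],
    now: datetime,
    force_full: bool = False,
    max_cache_age: timedelta = timedelta(days=90),  # assumed cadence
) -> RebuildMode:
    """Pick a rebuild mode from the cache state and an explicit override."""
    if force_full:
        # The pipeline option to rebuild from scratch always wins.
        return RebuildMode.FULL
    if last_full_rebuild is None:
        # No cache has ever been produced, so there is nothing to start from.
        return RebuildMode.FULL
    if now - last_full_rebuild >= max_cache_age:
        # The cache is older than the cadence; refresh it with a full rebuild.
        return RebuildMode.FULL
    return RebuildMode.INCREMENTAL


now = datetime.now(timezone.utc)
print(choose_rebuild_mode(now - timedelta(days=10), now))
```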

Trenly • Oct 10 '24 04:10

This seems to cover the pipelines, not the CLI application - should it be moved to winget-pkgs?

stephengillie • Oct 10 '24 17:10

I'll leave that up to @denelon, but considering that the index creation is part of the CLI implementation, I opted to put it here, mostly for planning purposes within the team, especially since the rebuild pipeline isn't typically run as a regular part of verification/publishing.

Trenly • Oct 10 '24 17:10

I'll let the engineering team take a look to see if this is beneficial, and if it should be here or at winget-pkgs. 😊

denelon • Oct 10 '24 21:10