[Package Upgrades] Avoid repeatedly fetching dependencies

Open amnn opened this issue 2 years ago • 4 comments

Currently, the Move toolchain fetches a dependency once during resolution, to discover that dependency's own dependencies, and then again to ensure it has an up-to-date version of its source. The second fetch exists because some packages may not have been fetched during resolution (their details may have been supplied by an external resolver), but it means that packages that were already fetched during resolution go through a redundant second fetch.

This issue occurs because fetching is implemented as a stateless function, download_and_update_if_remote, which does not remember whether a package was previously fetched as part of resolution.

So this can be fixed by making this component stateful -- an in-memory cache that remembers the packages it has already fetched as part of a given build to avoid re-fetching.

amnn avatar Feb 16 '23 12:02 amnn

@amnn, the download_and_update_if_remote function attempts to check whether a given package (at a given version) has already been fetched and is available in the respective directory. That check does require running some git commands, which is presumably pretty slow. Is the idea here to bypass running git commands altogether by caching enough info about fetched packages and their versions? Or is the existing mechanism not quite working for our use case?

awelc avatar Feb 17 '23 21:02 awelc

Right, the existing mechanism works but performs a redundant git fetch the second time. That fetch doesn't fully download the repo again, but it does do some work, and it also produces confusing output for the user, who sees the same package being fetched multiple times.

amnn avatar Feb 18 '23 01:02 amnn

In other words, the idea is to bypass download_and_update_if_remote altogether when the cache already holds info about the loaded package?

awelc avatar Feb 18 '23 03:02 awelc

Yes, implementation-wise I was imagining replacing the function with a stateful cache which is essentially just a set, plus this function. Only fetch the dependency and add it to the set if it's not already in there. Pass this cache to the dependency graph and resolution processes.

amnn avatar Feb 18 '23 08:02 amnn
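
For illustration, the "set plus this function" idea from the last comment could be sketched roughly as below. This is only a sketch under stated assumptions, not the toolchain's actual code: PackageKey, FetchCache, and fetch_once are hypothetical names, and since the thread does not show the real signature of download_and_update_if_remote, a caller-supplied closure stands in for it.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Hypothetical key identifying one dependency at one pinned revision.
/// The fields are assumptions; the real toolchain may identify packages differently.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PackageKey {
    pub git_url: String,
    pub rev: String,
    pub subdir: PathBuf,
}

/// Stateful replacement for a stateless fetch function: a set of packages
/// already fetched during this build, plus the fetch function itself.
pub struct FetchCache<F>
where
    F: FnMut(&PackageKey) -> Result<(), String>,
{
    fetched: HashSet<PackageKey>,
    fetch: F, // stand-in for something like download_and_update_if_remote
}

impl<F> FetchCache<F>
where
    F: FnMut(&PackageKey) -> Result<(), String>,
{
    pub fn new(fetch: F) -> Self {
        FetchCache {
            fetched: HashSet::new(),
            fetch,
        }
    }

    /// Fetches the package only if it has not been fetched in this build.
    /// Returns Ok(true) if a fetch ran and Ok(false) if it was skipped.
    pub fn fetch_once(&mut self, key: &PackageKey) -> Result<bool, String> {
        if self.fetched.contains(key) {
            return Ok(false);
        }
        // Record the package only after a successful fetch, so that a
        // failed fetch can be retried later in the same build.
        (self.fetch)(key)?;
        self.fetched.insert(key.clone());
        Ok(true)
    }
}
```

A single FetchCache would be created per build and passed by mutable reference to both the dependency graph construction and the resolution step, so that a package fetched in one phase is skipped in the other. Because the cache lives only in memory, a new build starts empty and still refreshes every remote dependency once.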