conan
[bug] Slow package resolution from local cache when remotes are defined
Context:
Recently we have been having issues with our network infrastructure and our internal Conan remotes. In general it takes up to 5 seconds for a small package to be downloaded. We have a CI job which installs a Conan package and then runs a number of conan info and conan inspect commands on it.
Issue: Because of this we noticed that even though packages are available in the local Conan cache, it takes a long time to access them when remotes are defined.
Expected:
- First conan install A/1.0.0@user/channel takes 5 sec because of the slow remotes
- Further calls to A/1.0.0@user/channel are very fast because the package is already available in the local Conan cache
Actual experience:
- First conan install A/1.0.0@user/channel takes 5 sec because of the slow remotes
- Further calls take very long as well, even when the final package is taken from the local Conan cache anyway
- We do not pass -u (--update) anywhere, so it should not query any data from the remotes
Some strange side effects:
- This happens when there are any remotes configured at all. It doesn't matter whether they are all disabled.
- This does not happen when there are no remotes configured. Then the package is taken from the local Conan cache in no time.
Environment Details (include every applicable attribute)
- Operating System+version: Dockerized build-environment based on Linux Suse 12.2 and 13.1
- Compiler+version: -
- Conan version: 1.21.0
- Python version: Python 3.7
Steps to reproduce (Include if Applicable)
Repeat
time conan info <ref>
a few times. Note the execution times.
Then do a
conan remote clean
Repeat
time conan info <ref>
a few times. Note the execution times, which are now faster than with remotes defined.
Redefine the remote(s)
and do the
time conan info <ref>
again. Times are now slower again.
Times are still slow when all remotes are disabled via
conan remote disable
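The timing comparison in the steps above can be scripted so that the same measurement is repeated before and after changing the remotes. The following is a minimal sketch, not part of the original report: the helper name time_command is hypothetical, and the commented example assumes Conan is on PATH and uses a placeholder reference.

```python
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=False: a failing command is still timed instead of aborting the loop.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


# Example (assumes Conan is installed; the reference is a placeholder):
# print(time_command(["conan", "info", "A/1.0.0@user/channel"]))
```

Running it once with remotes defined and once after conan remote clean gives two lists of durations that can be compared directly.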
Hi, @fourbft, just a question: is this something you started to experience with 1.21.0, or does it happen with all versions? It looks like we are calling the remotes even though they are not needed for the operation (as you report, your use-case works the same without the remotes).
Something to look into, for sure.
We only began noticing this when our remotes slowed down and the issue caught our eye.
Ok, thanks! So it should be Conan iterating and calling the remotes for nothing... maybe activating the trace_file is an easy way to know if we are doing any HTTP call.
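One way to check a trace for HTTP calls is to count the relevant entries after running the command. This is a rough sketch under stated assumptions: that tracing was enabled (for example through the CONAN_TRACE_FILE environment variable in Conan 1.x), that the trace file holds one JSON object per line, and that HTTP requests are recorded with an _action value of REST_API_CALL. The field name and value should be verified against an actual trace; the function name count_rest_calls is hypothetical.

```python
import json


def count_rest_calls(trace_lines):
    """Count trace entries whose `_action` field marks an HTTP call.

    Assumption: one JSON object per line, HTTP calls tagged "REST_API_CALL".
    """
    calls = 0
    for line in trace_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(entry, dict) and entry.get("_action") == "REST_API_CALL":
            calls += 1
    return calls


# Example (the path is a placeholder):
# with open("/tmp/conan_trace.log") as f:
#     print(count_rest_calls(f))
```

A count of zero after a conan info run on a cached package would indicate that no remote was contacted.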
Here's what I've found so far:
- Emptied my local Conan cache with conan remove -f *
- Added conan-center as my only remote
- Running conan info bzip2/1.0.8@conan/stable will download the package recipe, but no binaries
- Running subsequent conan info bzip2/1.0.8@conan/stable will take time, as Conan will query the remote (even if none is explicitly defined by the user) to query the binary data. This happens in graph_binary.py
- Running conan install bzip2/1.0.8@conan/stable will install the binary into the local cache
- Running conan info bzip2/1.0.8@conan/stable now will not query any remotes, as the needed binary data is already available
- Running the above without any remotes defined will simply skip querying the remotes for binary data.
The question now is if it is really necessary to query the binary data when no local data is available. The code within graph_binary.py could simply check remotes.selected instead of iterating through all remotes regardless of what the user selected.
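To make the difference concrete, here is a toy model of the two strategies. It is not Conan's code: the Remote and Remotes classes and both function names are hypothetical stand-ins, and only the idea of a selected remote versus the full list of configured remotes is taken from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Remote:
    name: str


@dataclass
class Remotes:
    # Hypothetical stand-in for the set of configured remotes.
    configured: List[Remote] = field(default_factory=list)
    # Set only when the user explicitly picks a remote.
    selected: Optional[Remote] = None


def remotes_queried_iterating_all(remotes):
    """Behaviour described above: every configured remote is a candidate."""
    return [r.name for r in remotes.configured]


def remotes_queried_selected_only(remotes):
    """Suggested alternative: query only the remote the user selected, if any."""
    return [remotes.selected.name] if remotes.selected else []
```

With one configured remote and nothing selected, the first function returns that remote and the second returns an empty list, which is the saving the suggestion is after.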
I've created a PR with the necessary changes. As the code I changed is used in a lot of other parts of Conan, I have no idea whether it breaks something.
It works for conan info though :D
To copy what's been discussed in the PR:
The "easy fix" does not solve the issue. It makes conan info work, but many other parts of Conan stop working.
The question is if conan info really needs to gather info about remote binaries or not.
- If it does, perhaps there could be a parameter to control whether to query remotes or not
- If it doesn't, perhaps it's possible to internally avoid querying the remotes for binary data
This is also relevant to our use-case. We'd like to be able to query the state of packages in the local cache and ignore remotes.
So ideally, there would be some kind of special remote that is the local cache:
conan info -r _LOCAL_CACHE_
Or a flag that tells Conan to skip remotes for this command. Or something to that effect.
Hi @cassava
You can use conan remote disable * to disable remotes.
In any case please note:
- conan info will only hit the cache if the packages are already there, and will not hit the servers
- If the packages are not in the local cache, then it needs to hit the servers; otherwise the command will fail, as it will not be able to resolve the graph
So there must be something else that you are seeing, but this command should be very fast if the packages are already in the cache.
Proposing https://github.com/conan-io/conan/pull/12808 to force temporarily disabling the remotes (that doesn't invalidate https://github.com/conan-io/conan/pull/12807: if servers need to be contacted because packages are not in the cache, and --no-remote is used, it will fail to resolve).