[Feature] Conan 2 cache concurrency
This ticket gathers and centralizes all the related tickets. There are three different aspects to making the cache concurrent:
- `conan config install` and other commands that can change the Conan home configuration concurrently
- Package installation and building concurrently by different processes
- Better management of the "dirty" status, to recover gracefully from crashes and interrupted operations (already a feature, but it requires refactoring and alignment with the concurrency work)
Out of scope:
- Concurrency at the multi-machine or multi-OS level. The planned support is concurrency at the machine level, within the same OS. OS file-lock synchronization mechanisms will be used, and those are only guaranteed to work at the OS level, not on shared or mounted drives.
- Sharing the same cache among different Conan versions. The same Conan version must be used for the same cache.
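To illustrate why file-lock based coordination stops at the machine boundary, here is a minimal Python sketch of OS advisory locking with `flock()`. This is a hypothetical illustration of the mechanism, not Conan's actual implementation; file and variable names are made up:

```python
# Illustrative sketch only (NOT Conan's code): advisory file locking,
# the OS sync mechanism the ticket refers to. flock() coordinates
# processes on the same machine; it is not reliable on network or
# shared drives, hence the scope limitation above.
import fcntl
import os
import tempfile

lock_path = os.path.join(tempfile.gettempdir(), "cache_lock_demo.lock")

# First opener (standing in for process A) takes the exclusive lock.
fd_a = os.open(lock_path, os.O_CREAT | os.O_RDWR)
fcntl.flock(fd_a, fcntl.LOCK_EX)

# A second open file description (standing in for process B) cannot
# acquire the lock without blocking while A holds it.
fd_b = os.open(lock_path, os.O_RDWR)
try:
    fcntl.flock(fd_b, fcntl.LOCK_EX | fcntl.LOCK_NB)
    contended = False
except BlockingIOError:
    contended = True

fcntl.flock(fd_a, fcntl.LOCK_UN)  # release so B could now proceed
os.close(fd_a)
os.close(fd_b)
print(contended)  # → True: the lock serializes concurrent cache writers
```

Because the lock lives in the kernel of one machine, two Conan processes on different hosts sharing a mounted drive would not see each other's locks, which is exactly the out-of-scope case.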
Concurrent `conan config install`:
- https://github.com/conan-io/conan/issues/6233
- https://github.com/conan-io/conan/issues/15018
Package concurrency
- https://github.com/conan-io/conan/issues/14570
- https://github.com/conan-io/conan/issues/11033
- https://github.com/conan-io/conan/issues/8241
- https://github.com/conan-io/conan/issues/6753
- https://github.com/conan-io/conan/issues/4648
- https://github.com/conan-io/conan/issues/5505
- https://github.com/conan-io/conan/issues/7181
I was told to comment here for posterity:
My team has a large repo consisting of a "core" package and hundreds of components, each packaged as an individual Conan package with a dependency on the "core" package. In Conan 1, our build script, which called `conan create` in parallel for all of the component packages, worked flawlessly (build times of around 3 minutes on our CI boxes). After switching to Conan 2, however, I noticed that the `conan create` command would often error out due to cache concurrency issues. To solve this I have had to run our script serially, which massively inflates our build time to approximately 25 minutes (these packages are header-only, so in theory we are IO-bound anyway, but it appears we still benefit massively from parallelization). Given that building multiple Conan packages in parallel worked in Conan 1, it would be great if the cache concurrency could be improved here in Conan 2.
Thanks for the feedback @darakelian, this is something that is planned, but it might take a little while.
In the meantime, you might want to try other approaches to speed things up. For example, you can run the parallelizable jobs in different `CONAN_HOME` folders, then accumulate the packages into a single cache with `conan cache save`/`conan cache restore`. Extra downloads can be avoided by sharing the "download cache" folder, which is concurrency-safe and can be shared among the parallel Conan home folders. For building a dependency graph in parallel, `conan graph build-order` will give a list of lists of things that can safely be built in parallel too.
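The per-job cache approach described above could be sketched roughly like this. The package directories, paths, and job layout are placeholders, and the script assumes a working Conan 2 installation; `conan cache save`/`restore` and the `core.download:download_cache` conf do exist in Conan 2, but check the docs for your exact version:

```shell
# Hypothetical sketch of the suggested workaround: one CONAN_HOME per
# parallel job, plus a shared download cache to avoid re-downloading.
SHARED_DL=/tmp/conan-dl-cache          # shared, concurrency-safe download cache
mkdir -p "$SHARED_DL"

build_one() {                          # $1 = package dir, $2 = job id
    export CONAN_HOME="/tmp/conan-home-$2"
    mkdir -p "$CONAN_HOME"
    # Point every job at the same download cache folder
    echo "core.download:download_cache=$SHARED_DL" >> "$CONAN_HOME/global.conf"
    conan create "$1"
    # Export what this job built so it can be merged later
    conan cache save "*:*" --file "/tmp/pkgs-$2.tgz"
}

build_one pkg-a 1 &
build_one pkg-b 2 &
wait

# Merge everything into the main cache sequentially, avoiding the race
unset CONAN_HOME
for f in /tmp/pkgs-*.tgz; do
    conan cache restore "$f"
done
```

The key design point is that only the final `conan cache restore` loop touches the shared cache, and it runs serially, so the parallel `conan create` jobs never contend on it.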
Since the build phases themselves are often already parallel, the main utility of this, to me, is parallelizing across the configure steps of all packages, which are completely serial today and therefore under-utilize the machine. I'm also interested, as I want to build a not-insignificant matrix of packages * profiles * build_types.
Hello,
Not opening a new issue, since I assume the best place for my bug is this issue.
Parallel `conan install`s always fail on an empty Conan package cache. Steps to reproduce:
rm -r ~/.conan2/p
conan install . --output-folder=build-a &
conan install . --output-folder=build-b &
I get errors like:
ERROR: Package 'zlib/1.3.1' not resolved: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497' already exists.
I've set up a small test case to illustrate. Here's the CI run.
I care about this because in my workflow I need to run four conan installs - one for each of Android's architectures. As a workaround, I run one of them, wait for it to finish and then run the three remaining in parallel. It just takes longer to complete. Since I have a workaround, this would be more of a performance improvement, than a correctness issue.
Thanks for the feedback @ViliusSutkus89, good in this thread, yes.
Another possible workaround is to first fetch the recipes (with something like one `conan graph info ...`), which will be faster than having to wait for one full configuration, and then launch the builds for the different architectures in parallel. As the race condition seems to happen in the download of the `zlib` recipe, this will likely work better and be faster.
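The suggested ordering could look something like this sketch; the profile names stand in for the four Android architecture profiles and are placeholders, and it assumes Conan 2 is installed and the profiles exist:

```shell
# Hypothetical sketch: warm the cache with the recipes in one
# sequential pass, then run the per-architecture installs in parallel.
conan graph info . -pr:h android-armv8   # fetches the shared recipes once

for profile in android-armv8 android-armv7 android-x86 android-x86_64; do
    conan install . --output-folder="build-$profile" -pr:h "$profile" &
done
wait
```

As the comment below notes, this only moves the race: the parallel installs can still collide when they write the same binary package revision into the cache.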
I tried graph generation. It could work, but with the caveat that the different builds can't share any dependencies between them. Previously the error was:
ERROR: Package 'zlib/1.3.1' not resolved: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497' already exists
With the graph pre-generated and with the revisions known, the concurrent write error is triggered a bit later:
ERROR: Reference 'zlib/1.3.1#f52e03ae3d251dec704634230cd806a2%1708593606.497:b647c43bfefae3f830561ca202b6cfd935b56205#6b307bbcbae23635c4006543ffdbf3ef%1708593932.513' already exists
In my test case both builds use the same zlib, because both builds are actually for the same arch. But I've checked, and this also happens with different profiles that share the same arch-agnostic dependency.
I see, yes, the `conan graph info` approach helps in the case of parallel builds of the same graph, but there might still be issues with other, different parallel jobs. If sharing the same cache, all the `conan graph info` + failed `conan install` commands for all profiles should be launched sequentially first, before launching the parallelism. But it seems this would result in a "dirtier" pipeline, so it sounds like your original approach would be better here.