Support concurrent access to the oras cache
What is the version of your ORAS CLI
1.2.0
What would you like to be added?
Please add locks to the local cache when using the ORAS_CACHE environment variable.
Why is this needed for ORAS?
When multiple oras processes run at the same time and share the same cache location, I sometimes get errors when two processes access the same file (for example a blob) simultaneously. We use process-level parallelization in our build pipeline to increase upload and download performance. Typically we download around 50 OCI artifacts, often with only 1-2 layers each, in quick succession. Without parallelization, oras cannot saturate the network bandwidth. The same is true for uploading. We typically create around 10 distinct OCI artifacts. Since the compression of folders is only single-threaded (from my tests), we want to run as many compressions in parallel as possible to fully utilize all CPU cores.
Implementation brainstorming
I had a quick look at the code but got lost pretty quickly because I'm not a Go developer. I guess a proper implementation would be to extend the oci-layout implementation with a locking mechanism. As far as I understand, the cache is just an internal oci-layout: oras first copies to the cache and then copies from the cache to the destination.
Are you willing to submit PRs to contribute to this feature?
- [ ] Yes, I am willing to implement it.
Is there an error message you could provide and what environment are you using with the CLI
Could you also provide any information about your platform like OS?
I'm using Windows, and this is the error I got (I believe this was with oras version 1.2.0):
Might need something like flock to resolve this. There is a similar issue in Helm: https://github.com/helm/helm/pull/31128
We have turned off ORAS_CACHE completely in our pipelines, which is a shame because there is a lot to be gained from not downloading the same files on every build.
I'm not sure about the performance cost of using flock, if that were the solution.