envd
                                
feat(cache): separate python installation from base image by adding pre-built remote cache
Description
In the current LLB compilation (shown in the figure below), we pull the base (custom) image as the first layer. This means that if the user changes the base image (a different CUDA version, a different OS, etc.), every cache from previous builds is missed.
Under my network conditions, creating the user group and installing Python with conda took around 1 minute. I think this step can be sped up by using a pre-built remote cache for each Python version, built from a fixed image. As shown in the figure below, whenever the user changes the base image, we can simply pull the cached llb.Diff(fixStage, pythonStage) layer and llb.Merge() it with the base image.
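To make the idea concrete, here is a minimal toy model of the diff-then-merge flow in plain Go. It does not call buildkit: the `FS` type and the `diff` and `merge` helpers are hypothetical stand-ins that treat a filesystem as a path-to-content map, and they ignore file deletions, which a real llb.Diff() would also have to carry. The file paths and contents are made up for illustration.

```go
package main

import "fmt"

// FS is a toy filesystem: file path -> file content.
// Hypothetical type for illustration, not part of buildkit.
type FS map[string]string

// diff returns the files in upper that are new or changed relative to lower.
// Deletions are ignored in this sketch.
func diff(lower, upper FS) FS {
	out := FS{}
	for path, content := range upper {
		if old, ok := lower[path]; !ok || old != content {
			out[path] = content
		}
	}
	return out
}

// merge stacks layers in order; on a path collision the later layer wins.
func merge(layers ...FS) FS {
	out := FS{}
	for _, layer := range layers {
		for path, content := range layer {
			out[path] = content
		}
	}
	return out
}

func main() {
	// Built once from a fixed image and stored as a remote cache.
	fixStage := FS{"/etc/os-release": "fixed-image"}
	pythonStage := merge(fixStage, FS{"/opt/conda/bin/python": "python-3.9"})
	pythonLayer := diff(fixStage, pythonStage)

	// Reused for every base image without reinstalling Python.
	bases := []FS{
		{"/etc/os-release": "ubuntu-20.04"},
		{"/etc/os-release": "cuda-11.2"},
	}
	for _, base := range bases {
		env := merge(base, pythonLayer)
		fmt.Println(env["/etc/os-release"], env["/opt/conda/bin/python"])
	}
}
```

The point of the model is that `pythonLayer` contains only what the Python installation added, so it does not depend on which base image it is later merged onto.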

Potential problems
- I am not sure whether buildkit supports exporting the llb.Diff() layer. If not, we can work around it by caching pythonStage and doing the llb.Diff() manually.
- The above method modifies etc/passwd and etc/usergroup when creating the user group. When merging with the base image, conflicting entries in these files between different OSes might cause problems.
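The second problem can be checked before merging. The sketch below, in plain Go, compares two passwd-style files and reports users whose name or UID collides. The helper names `parsePasswd` and `passwdConflicts`, the `envd` user, and UID 1000 are assumptions made for illustration and are not taken from the envd code base. If the base file already has two names sharing one UID, only one of them is considered.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parsePasswd maps user name -> UID for every well-formed passwd line.
func parsePasswd(content string) map[string]string {
	users := map[string]string{}
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Split(strings.TrimSpace(line), ":")
		if len(fields) < 3 || strings.HasPrefix(fields[0], "#") {
			continue // skip blank, malformed, and comment lines
		}
		users[fields[0]] = fields[2]
	}
	return users
}

// passwdConflicts lists entries in cached that clash with base:
// the same user name with a different UID, or the same UID owned by
// a different user name.
func passwdConflicts(base, cached string) []string {
	baseUsers := parsePasswd(base)
	uidOwner := map[string]string{}
	for name, uid := range baseUsers {
		uidOwner[uid] = name
	}
	var conflicts []string
	for name, uid := range parsePasswd(cached) {
		if baseUID, ok := baseUsers[name]; ok && baseUID != uid {
			conflicts = append(conflicts,
				fmt.Sprintf("user %s: uid %s in base, %s in cache", name, baseUID, uid))
		} else if owner, ok := uidOwner[uid]; ok && owner != name {
			conflicts = append(conflicts,
				fmt.Sprintf("uid %s: user %s in base, %s in cache", uid, owner, name))
		}
	}
	sort.Strings(conflicts) // stable order for reporting
	return conflicts
}

func main() {
	base := "root:x:0:0:root:/root:/bin/bash\nubuntu:x:1000:1000::/home/ubuntu:/bin/bash"
	cached := "root:x:0:0:root:/root:/bin/bash\nenvd:x:1000:1000::/home/envd:/bin/bash"
	for _, c := range passwdConflicts(base, cached) {
		fmt.Println(c)
	}
}
```

A build could run a check like this and fall back to creating the user on top of the base image whenever a conflict is reported.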
Other thoughts
If exporting the llb.Diff() layer is possible, we might be able to pre-build caches for large packages such as pytorch and CUDA-related components, and use them as plug-ins for the base image. Since downloading packages takes significant time when building a Docker environment, this should speed up the build considerably.
Message from the maintainers:
Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.
Thanks for your contribution! I think the core problem here is on the buildkit side: how we can inspect the llb.Diff node, and whether it's possible to export it separately. Can you raise the question in the buildkit repo and link it here as well? Thanks!
- LLB Merge could be problematic when some directories overlap.
- Maintaining remote caches for different Python versions also needs to account for security updates.
You need to check the v1 graph. It should support Python with or without Conda/Mamba.
Thanks for the proposal!
We can optimize the workflow further. For example, we can investigate whether we could merge the pytorch/tensorflow packages into the environment image directly, instead of downloading and installing them from PyPI.
The tf/torch packages are too large. It may be faster to keep a remote cache for them.
The same goes for the starship package: it is hosted on a GitHub domain, which makes it hard to install without a cache when there are network issues.
Yep. starship. It is hard to install here in CN.