LocalAI
feat: migrate python backends from conda to uv
Description
Doing some work to try to speed up builds by migrating away from huge conda environments to a smaller venv setup, and by using uv instead of pip for faster installs.
This is a draft until I get the backends that use common-env/transformers migrated. That is when we will know if it's actually worth it to make the switch.
Notes for Reviewers
There is also a new feature for the Dockerfile that I added to allow me to build with only one/some of the "extras" backends to speed up testing cycle time. By default, if EXTRA_BACKENDS is not set to anything, it will build all of the backends just as before.
The logic behind the new feature is that if you set IMAGE_TYPE=extras on its own, you get all of the backends. If you set IMAGE_TYPE=extras and also set EXTRA_BACKENDS to a comma-separated string, such as EXTRA_BACKENDS="diffusers,bark", then it will build with only diffusers and bark.
If you do not set IMAGE_TYPE=extras, then you will get no extra backends, no matter what EXTRA_BACKENDS is set to.
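The selection rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the actual Dockerfile logic: the function name `select_backends` and the `ALL_EXTRA_BACKENDS` list are hypothetical, and only `diffusers` and `bark` are backend names taken from this description.

```python
# Hypothetical stand-in for the full set of "extras" backends.
# Only "diffusers" and "bark" are named in the PR description; the other
# entries are placeholders for illustration.
ALL_EXTRA_BACKENDS = ["diffusers", "bark", "placeholder-a", "placeholder-b"]


def select_backends(image_type, extra_backends=None):
    """Return the list of extra backends that would be built.

    image_type     -- value of the IMAGE_TYPE build argument
    extra_backends -- value of EXTRA_BACKENDS (comma-separated) or None/""
    """
    # Without IMAGE_TYPE=extras, no extra backends are built,
    # no matter what EXTRA_BACKENDS contains.
    if image_type != "extras":
        return []

    # IMAGE_TYPE=extras on its own: build everything, as before.
    if not extra_backends:
        return list(ALL_EXTRA_BACKENDS)

    # IMAGE_TYPE=extras plus EXTRA_BACKENDS: build only the named backends.
    return [name.strip() for name in extra_backends.split(",") if name.strip()]


if __name__ == "__main__":
    print(select_backends("extras"))
    print(select_backends("extras", "diffusers,bark"))
    print(select_backends("", "diffusers,bark"))
```

The three calls in the `__main__` block correspond to the three cases described above: all backends, a chosen subset, and none.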
Signed commits
- [x] Yes, I signed my commits.
Deploy Preview for localai canceled.
| Name | Link |
|---|---|
| Latest commit | 19790641191a29da26229239d50b6c7aa72ca231 |
| Latest deploy log | https://app.netlify.com/sites/localai/deploys/663ce7378d9ea500089eb039 |
Some benchmarks from before and after this change:
Times and sizes were measured between my desktop and the docker registry in my lab, with a 2.5GbE network link between them. I was frequently I/O bound rather than network bound. Push and pull times are for the entire process, including compression/decompression.
v2.14.0-cublas-cuda12-ffmpeg
- 46.2GB uncompressed size
- 22.57GB compressed
- 9.68GB max layer size pushed
- 5.93GB max layer size pulled
- 7m22s to push
- 10m47s to pull
New image (same config as the image above):
- 35.67GB uncompressed
- 16.41GB compressed
- 8.9GB max layer size pushed
- 4.71GB max layer size pulled
- 7m14s to push
- 10m16s to pull
So while the images are smaller, that does not save much time on the push/pull. What it does do is HEAVILY reduce the amount of time needed to build the images. For example, the hipblas-extras image that builds as part of the PR tests took roughly 55 minutes to 1 hour before this change, and with this change it takes about 30 minutes.
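To put the numbers above side by side, here is a small sketch that computes the relative reduction for each metric. The figures are copied from the two lists above; the helper names `to_seconds` and `percent_reduction` are just for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert a 'XmYs' timing into seconds."""
    return minutes * 60 + seconds


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# (before, after) pairs taken from the benchmark lists above.
metrics = {
    "uncompressed size (GB)": (46.2, 35.67),
    "compressed size (GB)": (22.57, 16.41),
    "push time (s)": (to_seconds(7, 22), to_seconds(7, 14)),
    "pull time (s)": (to_seconds(10, 47), to_seconds(10, 16)),
}

reductions = {
    name: percent_reduction(before, after)
    for name, (before, after) in metrics.items()
}

if __name__ == "__main__":
    for name, value in reductions.items():
        print(f"{name}: {value:.1f}% smaller")
```

This works out to roughly 23% less uncompressed size and 27% less compressed size, but only about 2% and 5% less push and pull time respectively, which matches the point that the real win is in build time rather than transfer time.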
Just making sure, this is hitting master soon? @cryptk / @golgeek (I have downstream things that will need to be changed if so)
I think that Mudler was talking about bringing it in shortly after the next release, and 2.15 is building now. Your downstream things shouldn't break when this is merged, though; you will just be able to do those downstream things a bit faster.
noted thank you sir!
@cryptk great work as always! I'd say it's safe to merge now as 2.15.0 is out, time to test this on master.
Hi @cryptk, this PR broke OpenVINO support since optimum is not installed. I'm opening an issue to track it. I can't work on this right now since I'm not at home, but I hope to get to it tomorrow.