LocalAI uses only one CPU core (100%) during rebuild and cannot release VRAM when a model is not in use
Is your feature request related to a problem? Please describe.
It may be a problem. I use Docker Desktop, and when LocalAI rebuilds, it uses only one CPU core at 100% even though my computer has 40 vCPUs. The rebuild takes a long time to finish, almost 8 hours.
After a model is loaded, if I want to release VRAM to load another model, I have to restart the server.
Describe the solution you'd like
Possible solutions:
- LocalAI should use more CPU cores so that the rebuild finishes sooner.
- A command, something like curl ...., to unload a model and release its VRAM.
Describe alternatives you've considered
- A parameter to use more CPU cores
- A command to release the model and VRAM
Additional context
Thank you for what is already a wonderful solution.
If you're using the docker container, set the environment variable BUILD_PARALLELISM to specify the number of CPU cores to use.
Thank you for your reply. However, I could not find this parameter "BUILD_PARALLELISM". Could you please show the syntax for this setting? Is it BUILD_PARALLELISM=true or BUILD_PARALLELISM=30?
You can see the code here: https://github.com/mudler/LocalAI/blob/1c708d21de87371bb17c27e2615aa352e9ac5790/entrypoint.sh#L18
It's the number of threads you want to use when building/rebuilding. You need to pass it as an environment variable when creating the container.
For instance, if your computer has 40 CPU cores and you want to use all of them during the rebuild, it would look like this:
BUILD_PARALLELISM=40
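To illustrate why the exact spelling matters, here is a minimal sketch. It assumes the entrypoint hands the value to the build through a shell default expansion such as `${BUILD_PARALLELISM:-1}`; check the linked line to confirm. The helper name `jobs_flag` is hypothetical and only exists for this example.

```shell
#!/bin/sh
# Hypothetical helper mirroring the common "${VAR:-default}" pattern.
# It is an assumption, not a copy of the real entrypoint script.
jobs_flag() {
    # Falls back to a single job when BUILD_PARALLELISM is unset or empty.
    printf '%s\n' "-j${BUILD_PARALLELISM:-1}"
}

# Variable not set: the build would run with one job.
unset BUILD_PARALLELISM
jobs_flag

# Variable set with the correct name: the value is picked up.
BUILD_PARALLELISM=40
jobs_flag
```

With a pattern like this, a misspelled name (for example BUILD_PALALLELISM) is ignored silently and the build stays on one core. When starting the container with Docker, the variable would typically be passed with the `-e` flag, as in `docker run -e BUILD_PARALLELISM=40 <image>`, where `<image>` is a placeholder for whichever LocalAI image you use.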
Thank you for your help.
Hi,
I tried this option, but the result is still the same: LocalAI uses only one CPU core.
I experienced the same. I went ahead and added BUILD_PARALLELISM=20 to the .env file, then double-checked that it was in fact the right copy of the file, and this cleared up the problem.
(This wasn't the first time I've observed that some of my attempts to set environment variables from outside the container don't make it all the way inside. I haven't done much to root-cause that, since it's easy enough to work around.)
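The double check described above can be sketched as a small script. It assumes a plain KEY=value .env file; the helper name `has_env_key` and the temporary file are hypothetical and only serve the example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if FILE contains a line "KEY=<non-empty value>".
has_env_key() {
    grep -Eq "^$2=.+" "$1"
}

# Stand-in for the real .env file, created only for this example.
env_file=$(mktemp)
printf 'BUILD_PARALLELISM=20\n' > "$env_file"

if has_env_key "$env_file" BUILD_PARALLELISM; then
    echo "BUILD_PARALLELISM is set in $env_file"
else
    echo "BUILD_PARALLELISM is missing or misspelled in $env_file"
fi

rm -f "$env_file"
```

To see whether the value actually reached a running container, `docker exec <container> printenv BUILD_PARALLELISM` prints it, or prints nothing if it is not set; `<container>` is a placeholder for your container name.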
Thanks, reedtaylor. The config was not working because the letter R had been mistyped as L in the variable name (BUILD_PALALLELISM instead of BUILD_PARALLELISM). Now it's working well.