LocalAI uses only one CPU core (100%) for rebuild and cannot release VRAM when a model is not in use

Is your feature request related to a problem? Please describe.

It could be a problem. I use Docker Desktop, and when LocalAI rebuilds, it uses only one CPU at 100% even though my computer has 40 vCPUs. The rebuild takes a long time to finish, almost 8 hours.

Also, once a model is loaded, if I want to release VRAM to load another model, I have to restart the server.

Describe the solution you'd like

Possible solutions:

LocalAI should use more CPUs so the rebuild finishes earlier.
A command, something like curl ...., to release the model and its VRAM.

Describe alternatives you've considered

A parameter to use more CPUs.
A command to release the model and VRAM.

Additional context

Thank you for a wonderful solution as it is.

Jul 28 '24 08:07 noblerboy2004

If you're using the docker container, set the environment variable BUILD_PARALLELISM to specify the number of CPU cores to use.

Aug 04 '24 17:08 aarseneau-idexx

Thank you for your reply. However, I could not find this parameter "BUILD_PARALLELISM". Could you please show the syntax of this setting? BUILD_PARALLELISM=true or BUILD_PARALLELISM=30?

Aug 06 '24 13:08 noblerboy2004

You can see the code here: https://github.com/mudler/LocalAI/blob/1c708d21de87371bb17c27e2615aa352e9ac5790/entrypoint.sh#L18

It's the number of threads you want to use when building/rebuilding. You need to pass it as an environment variable when creating the container.

For instance if your computer has 40 CPU cores and you want to use all of them during the rebuild it would look like this: BUILD_PALALLELISM=40

Aug 08 '24 04:08 aarseneau-idexx
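
As a rough illustration of passing the variable when the container is created, here is a minimal sketch. The helper name `start_localai`, the image tag `localai/localai:latest`, the port mapping, and the `REBUILD=true` setting are assumptions for illustration and are not confirmed in this thread; only `BUILD_PARALLELISM` comes from the reply above. Note the spelling of the variable name used here.

```shell
# Hypothetical helper: start a LocalAI container with N build jobs.
# Assumptions: image name, port 8080, and REBUILD=true are illustrative only.
start_localai() {
  docker run --rm -p 8080:8080 \
    -e REBUILD=true \
    -e BUILD_PARALLELISM="$1" \
    localai/localai:latest
}
```

Calling `start_localai 40` would request 40 parallel build jobs. Environment variables passed with `-e` are set when the container is created, so an existing container has to be recreated for a new value to take effect.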

Thank you for your help.

Aug 08 '24 12:08 noblerboy2004

Hi,

I tried this option, but the result is still the same: only one CPU is used by LocalAI.

Aug 13 '24 14:08 noblerboy2004

I experienced the same -- I went ahead and edited BUILD_PARALLELISM=20 into the .env file, then double-checked that it was in fact the right instance of the file, and this cleared up the problem.

(This wasn't the first instance I've observed here where some of my attempts to set environment vars from outside the container aren't making it all the way inside. I haven't done much to root-cause that since it's easy enough to work around.)

Aug 31 '24 21:08 reedtaylor

Thanks, reedtaylor. I had mistyped the letter R as L in the variable name (BUILD_PALALLELISM instead of BUILD_PARALLELISM), so the setting had no effect. Now it's working well.

Sep 01 '24 07:09 noblerboy2004
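
The silent failure in this thread is consistent with how shell default expansion behaves: a misspelled name just sets a different, unused variable, so the correctly named one stays unset and the default applies without any error. The sketch below assumes the entrypoint uses a `${BUILD_PARALLELISM:-1}`-style fallback, which would match the single busy core reported above but is not confirmed here; `build_jobs` is a hypothetical helper.

```shell
# Hypothetical helper mimicking a job count with a default of 1.
# Assumption: the real entrypoint uses a similar ${VAR:-default} expansion.
build_jobs() {
  echo "${BUILD_PARALLELISM:-1}"
}

unset BUILD_PARALLELISM
BUILD_PALALLELISM=40   # misspelled name: sets an unrelated variable
build_jobs             # prints 1

BUILD_PARALLELISM=40   # correct name
build_jobs             # prints 40
```

Running `env | grep BUILD_` inside the container is a quick way to confirm which name actually arrived.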