text-generation-webui
Startup memory usage
Describe the bug
On startup the program can use up to 64 GB of RAM. --cpu-memory does not fix this.
Is there an existing issue for this?
- [X] I have searched the existing issues
Reproduction
Run a model like gpt-j-6B and watch Task Manager.
Screenshot
No response
Logs
NA
System Info
RTX 3060 Ti
i7-11700
64 GB RAM
How are you loading a model at startup? What are your launch params?
I mean when the program starts up, not Windows startup.
That's what I'm asking about. How are you loading the model, as in what settings do you have?
python server.py --chat --model-menu --gpu-memory 6 --cpu-memory 32
Since you do not specify --wbits and --groupsize, it's not a GPTQ model (or it won't be loaded as one anyway). You might instead be loading a normal model at 16-bit, which could very easily overflow your set GPU memory, with the rest spilling into RAM. The other possibility is that you are loading a GGML model, which is loaded into RAM and meant to run on the CPU only.
What is the model you are trying to load? GGML models usually have that in their name.
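For reference, here is a rough sketch of how the launch command differs per model type. The model names below are placeholders, and --load-in-8bit is assumed to be available in your build of the webui (check python server.py --help); only --wbits and --groupsize are the flags already mentioned above.

# 4-bit GPTQ model: --wbits/--groupsize must match how the model was quantized
python server.py --chat --model some-model-4bit-128g --wbits 4 --groupsize 128 --gpu-memory 6

# Regular 16-bit HF model: 8-bit loading roughly halves the memory footprint
python server.py --chat --model gpt-j-6B --load-in-8bit --gpu-memory 6 --cpu-memory 32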
GPTj-6b
You know, you could give a bit more information. What is the exact filename? I need to know what type of model you are actually trying to load.