text-generation-webui

Start up mem

Open BennettFourr opened this issue 1 year ago • 5 comments

Describe the bug

On startup the program can use up to 64 GB of RAM. --cpu-memory does not fix this.

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

Run a model like gpt-j-6B and watch Task Manager

Screenshot

No response

Logs

NA

System Info

RTX 3060 Ti
i7-11700
64 GB RAM

BennettFourr avatar May 02 '23 22:05 BennettFourr

How are you loading a model at startup? What are your launch params?

LaaZa avatar May 02 '23 22:05 LaaZa

I mean when the program starts up, not Windows startup.

BennettFourr avatar May 03 '23 11:05 BennettFourr

That's what I'm asking about. How are you loading the model, as in what settings do you have?

LaaZa avatar May 03 '23 11:05 LaaZa

python server.py --chat --model-menu --gpu-memory 6 --cpu-memory 32

BennettFourr avatar May 03 '23 11:05 BennettFourr

Since you do not specify --wbits and --groupsize, it's not a GPTQ model (or it won't be loaded as one anyway). So you might be loading a normal model at 16-bit, which could very likely overflow the GPU memory you set, with the rest loaded into RAM. The other possibility is that you are loading a GGML model, which is loaded into RAM and is meant to run on the CPU only.
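As a rough sanity check on the overflow idea, here is a minimal back-of-the-envelope sketch. It assumes a "6B" model has roughly 6 billion parameters and that any weights beyond the GPU budget end up in RAM; the function names are made up for illustration and are not part of text-generation-webui. It only counts raw weights, so it ignores activations and loader overhead.

```python
GIB = 2**30  # bytes in one GiB


def weight_size_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the raw weights in GiB (weights only, no overhead)."""
    return n_params * bits_per_weight / 8 / GIB


def split_across_devices(total_gib: float, gpu_budget_gib: float) -> tuple[float, float]:
    """Return (on_gpu, spilled_to_ram), assuming the overflow is placed in RAM."""
    on_gpu = min(total_gib, gpu_budget_gib)
    return on_gpu, total_gib - on_gpu


# Assumption: roughly 6 billion parameters for a "6B" model.
N_PARAMS = 6e9

for bits in (32, 16, 4):
    total = weight_size_gib(N_PARAMS, bits)
    # 6 GiB mirrors the --gpu-memory 6 setting from the launch command.
    gpu, ram = split_across_devices(total, gpu_budget_gib=6)
    print(f"{bits:>2}-bit: {total:5.1f} GiB total, {gpu:4.1f} GiB on GPU, {ram:5.1f} GiB in RAM")
```

Peak usage while the checkpoint is being read can be higher than these steady-state figures, so treat them as a lower bound rather than a prediction.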

What is the model you are trying to load? GGML models usually have that in their name.

LaaZa avatar May 03 '23 12:05 LaaZa

GPTj-6b

BennettFourr avatar May 03 '23 21:05 BennettFourr

You know you could give a bit more information. What is the exact filename? I need to know what type of model you are actually trying to load.

LaaZa avatar May 03 '23 22:05 LaaZa