
Occupied GPU memory keeps growing as the conversation gets longer

Open Dtristone opened this issue 2 years ago • 4 comments

Is there any way to reduce the occupied GPU memory when I clear the current history?

Dtristone avatar Apr 05 '23 13:04 Dtristone

This is probably because the code keeps track of the outputs from the model. Try changing the code to skip this.
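The suggestion above can be sketched in plain Python (hypothetical helper names, not FastChat's actual code): keep only the small decoded string in the chat history, so the raw output tensors are not pinned in GPU memory across turns.

```python
def record_turn(history, output_ids, decode):
    """Append only the decoded string to the chat history.

    Keeping the raw output_ids (or logits) referenced in the history
    would hold those buffers alive across turns; storing just the
    decoded text lets the per-turn tensors be freed after generation.
    """
    text = decode(output_ids)
    history.append(text)
    return text

# Toy usage with a stand-in "decoder" (real code would use the
# model tokenizer's decode method):
history = []
record_turn(history, [72, 105], lambda ids: "".join(chr(i) for i in ids))
print(history)  # ['Hi']
```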

rahulvigneswaran avatar Apr 06 '23 11:04 rahulvigneswaran

I suppose you are using the web interface? This shouldn't happen when you clear the history. Could you provide more details?

zhisbug avatar Apr 07 '23 21:04 zhisbug

> I suppose you are using the web interface? This shouldn't happen when you clear the history. Could you provide more details?

Yes, I am using the web interface. Now the occupied memory is stable at 31.5 GB and no longer grows. Although the memory still does not go down when I clear the history, I don't think it's a problem now. Maybe this issue can be closed.

Dtristone avatar Apr 08 '23 05:04 Dtristone

This is still a problem in other configurations. For example, on Windows using the CLI interface (where you cannot clear the conversation): with a 16 GB GPU running a 13B model with 8-bit loading (to reduce size), it works for a while until the conversation reaches that limit and then breaks.

This is compounded by the fact that the web interface doesn't work on Windows, so the CLI is all we have at the moment.

I think it would be ideal to cap memory usage so it never exceeds what's allowed (other tools do this). In that case, erasing the earliest parts of the conversation to stay under the cap is acceptable, since otherwise the program breaks.
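The cap proposed above could be approximated by evicting the oldest turns until the prompt fits a budget. A minimal sketch (character count is a crude stand-in for tokens; a real implementation would count tokens with the model's tokenizer, and FastChat may handle this differently):

```python
def fit_to_budget(messages, max_chars):
    """Drop the oldest turns until the rendered history fits the budget.

    messages: list of turn strings, oldest first.
    max_chars: crude proxy for the model's context / memory limit.
    """
    messages = list(messages)
    while messages and sum(len(m) for m in messages) > max_chars:
        messages.pop(0)  # evict the oldest turn first
    return messages

turns = ["hello there", "hi!", "tell me a story", "once upon a time..."]
print(fit_to_budget(turns, 40))
# ['hi!', 'tell me a story', 'once upon a time...']
```

Evicting from the front keeps the most recent context, which is usually what the model needs to continue the conversation.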

MolotovCherry avatar Apr 08 '23 06:04 MolotovCherry

Is this solved yet? This is also a problem when using the OpenAI API in Linux environments. @zhisbug @merrymercy

Abhijit-2592 avatar Oct 04 '23 15:10 Abhijit-2592

@Abhijit-2592 this is not an issue, really. As the chat gets longer, it takes more memory, which is freed when you clean the conversation, no?

surak avatar Oct 21 '23 15:10 surak

This is expected behavior, as @surak explained. We've been serving models on chat.lmsys.org for a long time and haven't seen this issue. However, if you find evidence of a memory leak, let us know.

infwinston avatar Oct 21 '23 15:10 infwinston