FastChat
Occupied GPU memory keeps growing with more conversation turns
Is there any way to reduce the occupied memory as I clean current history?
This is probably because the code keeps track of the outputs from the model. Try changing the code to skip this.
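If the accumulated history is the cause, one way to release memory when the user clears the conversation is to drop the stored turns and let the backend release any cached tensors. This is only a sketch: `ChatState` and its methods are illustrative names, not FastChat's actual API, and the `torch.cuda.empty_cache()` call is just the standard way to return cached CUDA blocks to the driver.

```python
# Hypothetical sketch: free memory when the chat history is cleared.
# `ChatState` is an illustrative stand-in, not FastChat's real class.

class ChatState:
    def __init__(self):
        self.history = []  # (role, text) pairs fed back into each prompt

    def append(self, role, text):
        self.history.append((role, text))

    def clear(self):
        # Drop accumulated turns so the next prompt is short again.
        self.history.clear()
        # If the backend caches tensors (e.g. a KV cache), release them too.
        try:
            import torch
            torch.cuda.empty_cache()  # safe no-op when CUDA is unavailable
        except ImportError:
            pass  # torch not installed in this environment

state = ChatState()
state.append("user", "hello")
state.clear()
print(len(state.history))  # 0
```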
I suppose you are using the web interface? This shouldn't happen when you clear the history. Could you provide more details?
Yes, I am using the web interface. Now the occupied memory is stable at 31.5 GB and no longer grows. Although the memory still does not decrease when I clear the history, I think it's not a problem now. Maybe this issue can be closed.
This is still a problem in other configurations. For example, on Windows using the CLI interface (where you cannot clear the conversation), if you have a 16 GB GPU and run the 13B model with 8-bit loading (to reduce its size), it works for a while until the conversation reaches that limit and the program breaks.
This is compounded by the fact that the web interface doesn't work on Windows, so the CLI is all we have at the moment.
I think it would be ideal to enforce a cap on memory usage so it never exceeds what's allowed (other tools do this). In this case, erasing earlier parts of the conversation to stay under the cap is a valid approach, since otherwise the program would crash.
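The cap suggested above could be approximated with a sliding window over the history: drop the oldest turns once the rendered conversation exceeds a budget. A minimal sketch, where the whitespace-split word count is just a stand-in for a real tokenizer count:

```python
# Hypothetical sketch of capping conversation size by dropping old turns.

def truncate_history(history, max_tokens):
    """Keep only the most recent turns whose total size fits the budget."""
    kept, total = [], 0
    # Walk from the newest turn backwards, keeping turns while they fit.
    for role, text in reversed(history):
        cost = len(text.split())  # stand-in for a tokenizer's token count
        if total + cost > max_tokens:
            break
        kept.append((role, text))
        total += cost
    return list(reversed(kept))

history = [
    ("user", "one two three four"),
    ("assistant", "five six"),
    ("user", "seven eight nine"),
]
# With a budget of 6 "tokens", the oldest turn is dropped.
print(truncate_history(history, 6))
# → [('assistant', 'five six'), ('user', 'seven eight nine')]
```

Truncating from the oldest side keeps the most recent context, which is usually what matters for the next reply; a fancier version could always preserve the system prompt as well.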
Is this solved yet? This is also a problem when using the OpenAI API in Linux environments. @zhisbug @merrymercy
@Abhijit-2592 this is not an issue, really. As the chat gets longer, it takes more memory, which is freed when you clean the conversation, no?
This is expected behavior, as @surak explained. We've been serving models for a long time on chat.lmsys.org and have not seen this issue. However, if you find evidence of a memory leak, let us know.