RedShiva

I am running llama_model._model.__del__() per the above comment, but the process still holds CUDA memory afterwards. Has there been any progress on adding a proper close method?
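
For anyone hitting the same thing, here is a minimal sketch of one possible reason the explicit call does not help. In Python, calling `__del__()` by hand only runs the method body. The object stays alive for as long as something still references it. `FakeModel` and `release` below are hypothetical stand-ins, not part of llama-cpp-python, and the sketch makes no claim about what the library's own `__del__` frees on the GPU.

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a wrapper that owns a native resource."""

    def __init__(self):
        self.freed = False

    def __del__(self):
        # A real wrapper would release its native handle here (assumption).
        self.freed = True


def release(holder, name):
    """Drop the reference stored at holder[name], then force a collection."""
    holder.pop(name, None)
    gc.collect()


holder = {"model": FakeModel()}
ref = weakref.ref(holder["model"])

# Calling __del__() directly does not destroy the object.
holder["model"].__del__()
alive_after_manual_del = ref() is not None

# Removing the last reference lets the interpreter actually destroy it.
release(holder, "model")
alive_after_release = ref() is not None
```

If the memory is still held after every reference is gone, the remaining usage may belong to the CUDA context of the process itself, in which case loading the model in a separate worker process and letting that process exit is the usual fallback.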