RedShiva comments

Results 2 comments of


RedShiva

Models are failing to be properly unloaded and freeing up VRAM

I am running `llama_model._model.__del__()` per the above comment, and I am still seeing the process hold CUDA memory (VRAM). Has there been any movement on creating a proper close method?
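For anyone checking the Python side of this, here is a minimal sketch of how one might confirm that a model object has actually been garbage-collected after dropping it. It is plain Python, not the library's API: `FakeModel` and `unload` are hypothetical names made up for illustration, and a collected Python object does not by itself prove that the native backend released its VRAM.

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a loaded model; not a real library class."""

    def __init__(self):
        # Pretend this buffer is the memory owned by the model.
        self.buffer = bytearray(1024)


def unload(holder, key):
    """Drop the reference at holder[key]; return True if the object was collected."""
    ref = weakref.ref(holder[key])  # weak reference does not keep the object alive
    del holder[key]                 # remove the strong reference held by the dict
    gc.collect()                    # also clear any reference cycles
    return ref() is None            # None means no strong references remain


models = {"llm": FakeModel()}
collected = unload(models, "llm")
print("collected:", collected)
```

If `unload` returns `False`, something else (a variable, a cache, a closure) still holds a reference to the object, so its finalizer will not run on its own yet.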

Models are failing to be properly unloaded and freeing up VRAM

Thank you!!!