CUDA Out Of Memory issue
During some testing I kept getting CUDA OOM errors while running code under pyinstrument, where multiple models were run one after another. Even after making sure no references to the tensors were kept in the Python code, the CUDA OOM errors persisted while pyinstrument was enabled. Once I disabled it, the errors disappeared and my VRAM was released as expected after each reference was deleted.
Is there an option to ensure pyinstrument clears its references to ONNX and torch tensors, especially after calling del tensor?
I'd like to keep using pyinstrument, but it isn't feasible for me at the moment.
- Emil
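To illustrate why a profiler retaining references matters here: a minimal stdlib sketch (the Tensor class and samples list are hypothetical stand-ins, not pyinstrument internals) showing that del only frees an object once every hidden reference to it is gone, which is why VRAM would not be released while a profiler still holds a reference:

```python
import gc
import weakref

class Tensor:
    """Hypothetical stand-in for a large GPU tensor."""
    def __init__(self, name):
        self.name = name

samples = []              # stand-in for samples a profiler might retain

t = Tensor("model_output")
samples.append(t)         # a profiler-like component keeps a reference
ref = weakref.ref(t)

del t                     # the user's del is not enough on its own
gc.collect()
alive_after_del = ref() is not None    # still alive: memory not freed

samples.clear()           # only once the hidden reference is dropped...
gc.collect()
alive_after_clear = ref() is not None  # ...does the object get collected
```

In this sketch, `alive_after_del` is True and `alive_after_clear` is False: the user-side del had no effect until the second reference was released.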
I have a similar problem: a relatively heavy object is not garbage collected when I leave the context, even with del (Python 3.12, interval=0.1). The growth shows up starkly in tracemalloc, with the number of objects growing by exactly the number of instantiations (or a multiple of it). This results in an OOM of the whole process after a few minutes. The behavior only occurs when using pyinstrument; RAM usage stays stable with every other profiler I tried. I have been using pyinstrument for years and don't recall this problem before (perhaps it appeared when moving from 3.7 to 3.12?). Might be related to #296.
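For anyone else chasing this, tracemalloc's snapshot diff is a good way to localize that kind of growth to a source line. A self-contained sketch, with a deliberately leaking session() standing in for one profiled run (the retained list is a hypothetical stand-in, not pyinstrument's actual state):

```python
import tracemalloc

retained = []  # hypothetical hidden state that grows once per session

def session():
    """Stand-in for one profiled run that leaks into module state."""
    retained.append(bytearray(100_000))

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(10):
    session()
after = tracemalloc.take_snapshot()

# largest allocation-size differences, grouped by source line
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```

With a real workload, the top entries of the diff point at the file and line where the leaked allocations happen, which narrows down whether the references live in your code or in the profiler.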
I'm encountering a similar problem. I tracked it down to calls to output_html:
profiler.stop()
profiler.output_html()
profiler.reset()
Using 4.6.2, memory usage (max RSS) climbs by about 2 MB over 100 profiling sessions; using 5.0.0, it climbs by about 40 MB over the same number of sessions.
If I comment out the call to output_html, memory usage stays steady.
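One way to quantify growth like this without relying on max RSS is to checkpoint tracemalloc's traced memory every N sessions. A self-contained sketch with a deliberately leaking fake_output_html standing in for the suspect call (the cache list is hypothetical; the real retained state inside pyinstrument may be elsewhere):

```python
import tracemalloc

cache = []  # hypothetical retained state that the renderer never drops

def fake_output_html():
    """Stand-in for the suspect call, leaking deliberately for the demo."""
    page = "<html>" + "x" * 50_000 + "</html>"
    cache.append(page)
    return page

tracemalloc.start()
growth = []
for session in range(100):
    fake_output_html()
    if session % 25 == 24:
        current, _peak = tracemalloc.get_traced_memory()
        growth.append(current)

# a leak shows as steadily increasing traced memory across sessions
print(growth)
```

A steady workload should produce a flat series here; the monotonically increasing numbers are the signature of per-session retained state.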
I am facing the same issue. My torch GPU code runs fine under plain Python, but raises torch.OutOfMemoryError: CUDA out of memory when started with pyinstrument.