
Memory leak during inference

Open andreped opened this issue 2 years ago • 0 comments

Memory does not seem to be freed after inference, at least not properly. This was observed with an ONNX model using both the TensorRT and OpenVINO (CPU) inference engines. It also affects running a model in batch mode (across multiple WSIs): memory keeps increasing for every new WSI until OOM eventually occurs.

For TensorFlow, this has been a popular topic for quite some time. It happens because the session, where all inference and model graphs are created and live, is set globally for the entire process. A workaround in Python is therefore to perform all TensorFlow work in a child process (using multiprocessing) and kill the child process after inference, which keeps the main process clean, as in the sketch below.
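
For reference, a minimal sketch of that Python workaround; the model path and inputs are hypothetical placeholders, not part of FAST:

```python
# Sketch: isolate TensorFlow inference in a short-lived child process so
# all graph/session memory is reclaimed by the OS when the child exits.
import multiprocessing as mp

def _infer(model_path, inputs, queue):
    # Import TensorFlow inside the child so its global state never
    # touches the parent process.
    import tensorflow as tf
    model = tf.keras.models.load_model(model_path)  # hypothetical model path
    queue.put(model.predict(inputs))

def run_inference(model_path, inputs):
    queue = mp.Queue()
    p = mp.Process(target=_infer, args=(model_path, inputs, queue))
    p.start()
    result = queue.get()  # fetch the result before joining to avoid blocking
    p.join()              # child exits here; its memory is freed with it
    return result
```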

However, creating processes in C++ for this purpose is not viable. It is also surprising that the same (or at least a similar) memory leak was observed with ONNX Runtime. Maybe there is something that is not freed in FAST? Not sure.

andreped · Jan 07 '22 06:01