instance_stixels

ROS node: GPU memory leak

Open tomsal opened this issue 4 years ago • 0 comments

We have recently discovered a GPU memory leak when using the ROS node. I haven't found the time yet to fix this.

Symptoms: The GPU memory allocated by the ROS node grows with the number of messages processed. After a while (depending on GPU memory size), the node crashes with a cuDNN allocation error:

...
Error message:
[E] [TRT] ../rtSafe/safeContext.cpp (105) - Cudnn Error in initializeCommonContext: 4 (Could not initialize cudnn, please check cudnn installation.)
[E] [TRT] FAILED_ALLOCATION: std::exception

When playing rosbags at 10 fps, we observed GPU memory usage increasing from ~1.3GB over 15 minutes.
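For anyone who wants to reproduce or quantify the growth, here is a minimal sketch of how the memory could be sampled while a rosbag is playing. It is not part of Instance Stixels; the function names (`parse_used_mib`, `growth_mib_per_min`, `sample_gpu_memory`, `monitor`) are hypothetical, and it assumes `nvidia-smi` is on the `PATH`. Note that it reads the total memory in use on the GPU, not the ROS node's share alone, so other GPU processes should be idle while measuring.

```python
import subprocess
import time


def parse_used_mib(output):
    """Parse nvidia-smi CSV output: one integer MiB value per line."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def growth_mib_per_min(samples):
    """Least-squares slope of (seconds, used_mib) samples, in MiB per minute."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return 60.0 * num / den if den else 0.0


def sample_gpu_memory(gpu_index=0):
    """Return the memory currently in use on one GPU, in MiB (whole GPU, not per process)."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used",
            "--format=csv,noheader,nounits",
            "-i",
            str(gpu_index),
        ],
        text=True,
    )
    return parse_used_mib(out)[0]


def monitor(duration_s=900, interval_s=10, gpu_index=0):
    """Sample GPU memory for duration_s seconds and return the growth rate in MiB/min."""
    samples = []
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        if elapsed > duration_s:
            break
        samples.append((elapsed, sample_gpu_memory(gpu_index)))
        time.sleep(interval_s)
    return growth_mib_per_min(samples)
```

Calling `monitor()` while the node processes a rosbag should return a clearly positive rate if the leak is present, and a value near zero otherwise. The defaults (15 minutes, one sample every 10 seconds) are arbitrary choices for illustration.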

Possible problem: I suspect that the leak arises somewhere at the interface between NVIDIA TensorRT and Instance Stixels. We haven't experienced it with the PyTorch-based Cityscapes evaluation script.

tomsal, Apr 08 '21 07:04