yolov9e model allocates all available memory and fails
Hi.
When I try to train a yolov9e model, the program terminates because CUDA runs out of memory. It happens right when the first epoch starts.
I use two RTX 2080 Ti GPUs:
Ultralytics YOLOv8.1.23 🚀 Python-3.10.12 torch-2.1.2+cu121 CUDA:0 (NVIDIA GeForce RTX 2080 Ti, 11009MiB) CUDA:1 (NVIDIA GeForce RTX 2080 Ti, 11012MiB)
So I train on 2 GPUs with a batch size of 16 and an imgsz of 640, roughly as in the sketch below. Previously we trained yolov8x models on a dataset of 500k images without any problems.
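For reference, this is a minimal sketch of how the training is launched; the weights file and dataset config name are placeholders, not the exact script:

```python
from ultralytics import YOLO

# Minimal sketch of the training call (assumed setup; paths are placeholders).
model = YOLO("yolov9e.pt")    # the variant that exhausts GPU memory
# model = YOLO("yolov9c.pt")  # the smaller variant that trains fine (~50% memory)

model.train(
    data="data.yaml",   # placeholder dataset config
    imgsz=640,
    batch=16,           # total batch size, split across the two GPUs by DDP
    device=[0, 1],      # both RTX 2080 Ti (11 GB each)
)
```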
The traceback:
Traceback (most recent call last):
File "/home/rrt/.config/Ultralytics/DDP/_temp_g6fcpx_t139978855048832.py", line 12, in
Maybe also important: when I use the yolov9c model, everything works with ~50% GPU memory usage.