RuntimeError When Enabling Accuracy Checks in yolov3 Training on GPU.

Open scshtyk opened this issue 9 months ago • 0 comments

Issue Description
I encounter an error related to gradient computation when enabling accuracy checks during yolov3 training in a GPU Docker environment. The failure surfaces as a TypeError raised by Darknet.forward() (full message under Actual Behavior). Training runs without issues when the --accuracy flag is not used.

Steps to Reproduce
python install.py yolov3
python run.py yolov3 -d cuda -t train --accuracy

Expected Behavior
Training should complete and perform the accuracy checks without raising an error.

Actual Behavior
The script executes successfully without the --accuracy flag. However, when the accuracy check is enabled, it fails with the following error message:

TypeError: Darknet.forward() takes from 2 to 4 positional arguments but 6 were given
Running train method from yolov3 on cuda in eager mode with input batch size 4 and precision fp32.
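
For triage, here is a minimal sketch of how a message with these exact counts can arise. It assumes, without having confirmed it against the benchmark code, that the accuracy path unpacks a five-element input tuple into forward(), and that forward() accepts one required and two optional positional arguments besides self. The Darknet class below is a stand-in, and the parameter names augment and verbose and the contents of example_inputs are hypothetical.

```python
class Darknet:
    """Stand-in class; only the positional arity matches the reported error."""

    # self + 1 required + 2 optional arguments = "from 2 to 4 positional arguments"
    # (parameter names are assumptions, not taken from the issue)
    def forward(self, x, augment=False, verbose=False):
        return x


model = Darknet()

# Hypothetical five-element input tuple; the real benchmark inputs are not shown here.
example_inputs = ("imgs", "targets", "paths", "shapes", "extra")

try:
    # Unpacking five items plus the implicit self gives six positional arguments.
    model.forward(*example_inputs)
except TypeError as exc:
    message = str(exc)
    print(message)
```

If this is what happens, the fix would likely involve passing only the image tensor (or the subset of inputs forward() accepts) on the accuracy path, but that is a guess until the calling code is inspected.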

Environment: pytorch-cuda=12.1, python=3.11

scshtyk, Apr 26 '24 10:04