
GPU Performance Worse than CPU Performance

toita86 opened this issue 7 months ago · 11 comments

Hello,

I am currently using the YoloDotNet NuGet package to test the performance of YOLO models for my degree thesis. However, I have encountered an issue where GPU performance is significantly worse than CPU performance.

Environment:

YoloDotNet version: v2.0
GPU: 4090
CUDA/cuDNN version: CUDA 11.8 and cuDNN 8.9.7
.NET version: 8

Steps to Reproduce:

var sw = new Stopwatch();
var times = new List<double>();   // per-image inference times in milliseconds

for (var i = 0; i < 500; i++)
{
    var file = $@"C:\Users\Utente\Documents\assets\images\input\frame_{i}.jpg";

    using var image = SKImage.FromEncodedData(file);

    sw.Restart();
    var results = yolo.RunObjectDetection(image, confidence: 0.25, iou: 0.7);
    sw.Stop();

    image.Draw(results);
    image.Save(file.Replace("input", $"output_{yolo_version}{version}_{target}").Replace(".jpg", $"_detect_{yolo_version}{version}_{target}.jpg"),
        SKEncodedImageFormat.Jpeg);

    times.Add(sw.Elapsed.TotalMilliseconds);
    Console.WriteLine($"Time taken for image {i}: {sw.Elapsed.TotalMilliseconds:F2} ms");
}

This is how I am measuring the time for each detection.
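
For reference, a common benchmarking pitfall on the GPU is letting the first call (which pays for CUDA/cuDNN initialization and GPU memory allocation) dominate the averages. Below is a minimal sketch of how the measurement could separate warm-up from steady state; it assumes the same yolo instance, named arguments, and frame layout as above, and times only the inference call:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using SkiaSharp;

// Warm-up: the first GPU inference includes CUDA/cuDNN initialization and
// memory allocation, so run a few untimed calls first.
using (var warmup = SKImage.FromEncodedData(@"C:\Users\Utente\Documents\assets\images\input\frame_0.jpg"))
{
    for (var w = 0; w < 5; w++)
        yolo.RunObjectDetection(warmup, confidence: 0.25, iou: 0.7);
}

var sw = new Stopwatch();
var steadyTimes = new List<double>();

for (var i = 0; i < 500; i++)
{
    var file = $@"C:\Users\Utente\Documents\assets\images\input\frame_{i}.jpg";
    using var image = SKImage.FromEncodedData(file);   // decode outside the timed section

    sw.Restart();
    var results = yolo.RunObjectDetection(image, confidence: 0.25, iou: 0.7);
    sw.Stop();

    steadyTimes.Add(sw.Elapsed.TotalMilliseconds);      // inference only: no drawing, no saving
}

Console.WriteLine($"Steady-state average: {steadyTimes.Average():F2} ms per image");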

Expected behavior: inference on the GPU should be faster than inference on the CPU. However, performance does not improve when using the GPU.

To load the model, I use the following setup for the GPU case:

yolo = new Yolo(new YoloOptions
{
    OnnxModel = @$"C:\Users\Utente\Documents\assets\model\yolov{yolo_version}{version}_{target}.onnx",
    ModelType = ModelType.ObjectDetection,  // Model type
    Cuda = true,                           // Use CPU or CUDA for GPU accelerated inference. Default = true
    GpuId = 0,                               // Select Gpu by id. Default = 0
    PrimeGpu = true,                       // Pre-allocate GPU before first inference. Default = false
});
Console.WriteLine(yolo.OnnxModel.ModelType);
Console.WriteLine($"Using GPU for version {yolo_version}{version}");

Performance Metrics:

GPU inference (version m): total time 21124.52 ms, average time per image 42.25 ms

CPU inference (version m): total time 18869.73 ms, average time per image 37.74 ms

@NickSwardh I would appreciate any assistance or guidance in resolving this issue. Please let me know if you need any further information.

Thank you.

toita86 · Jul 24 '24 17:07