OpenCVForUnity
How to make the OpenCV for Unity YOLO sample faster on mobile phones
Respected sir, I have purchased this asset, but its YOLO sample is very slow on mobile devices. Is there any way to run it on the mobile phone's GPU?
Thank you
Hi, I used Roboflow to create my YOLO Darknet dataset from my annotated images and then trained models using Yolo v4 Tiny. Anything besides the Yolo v4 Tiny algorithm would produce very low FPS on mobile based on my testing. Yolo v4 Tiny produces about 20 fps on an iPhone 12 and about 15 FPS on a middle tier Android, which is a little laggy but completely feasible compared to other models.
https://blog.roboflow.com/train-yolov4-tiny-on-custom-data-lighting-fast-detection/
I'd eventually like to get Yolov5 working in OpenCV, but haven't seen a way to convert the PyTorch models to weights that are usable in this asset.
What phone are you using? On a Pixel 4, I only get about 5 FPS with Tiny-Yolov4. Are you optimizing your code in some significant way?
I tested on a Samsung S10 and an S20, but on both I achieved only 10 FPS. The mobile version also misdetects objects and does not detect them from far away, whereas the desktop version can detect from far away and its accuracy is good.
The reason is that OpenCV for Unity was copied from OpenCV for Java, which doesn't support GPU rendering. It only uses the CPU. This means that mobile apps built with this asset that use the dnn module will be very slow, because they are limited to the CPU. I honestly consider this false advertising by the asset creators and am personally very frustrated by it.
You need to use a model suited to mobile devices. I don't think the GPU is actually much help for this kind of thing, whereas a model optimized for mobile is, e.g. YOLOX-Nano. I've found you need well-trained custom models with reduced classes to get decent mobile performance, and then the model needs optimizing, e.g. quantizing. Free off-the-shelf models do not fit this profile, IMHO. You also need to make sure the code is very memory-efficient and parallelized, run detections on a crop of the image rather than the whole image, etc.
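To illustrate the crop suggestion above, here is a minimal Python sketch of the bookkeeping involved: pick a crop rectangle, run the detector on that region only, then shift the resulting boxes back into full-frame coordinates. The function names, the 1280x720 frame size, and the 416x416 crop are assumptions made up for the example, not part of the asset's API, and the detection is faked rather than produced by a real model.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def center_crop_rect(frame_w: int, frame_h: int, crop_w: int, crop_h: int) -> Box:
    """Return a crop rectangle centered in the frame, clamped to the frame size."""
    crop_w = min(crop_w, frame_w)
    crop_h = min(crop_h, frame_h)
    x = (frame_w - crop_w) // 2
    y = (frame_h - crop_h) // 2
    return (x, y, crop_w, crop_h)


def boxes_to_frame_coords(boxes: List[Box], crop: Box) -> List[Box]:
    """Shift boxes detected inside the crop back into full-frame coordinates."""
    crop_x, crop_y, _, _ = crop
    return [(x + crop_x, y + crop_y, w, h) for (x, y, w, h) in boxes]


# Hypothetical numbers: a 1280x720 camera frame and a 416x416 crop.
crop = center_crop_rect(1280, 720, 416, 416)

# A real detector would run on the pixels inside `crop`; here one box is faked.
detections_in_crop = [(10, 20, 100, 50)]
detections_in_frame = boxes_to_frame_coords(detections_in_crop, crop)

print(crop)
print(detections_in_frame)
```

The same offset logic should carry over to C# in Unity. The trade-off is that objects outside the crop are never seen, so the crop has to follow wherever you expect the object to be.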
Is there any solution? I get just 3-5 FPS on a Google Pixel 5.