Works perfectly on image input but inaccurate on video input

Open ghost opened this issue 3 years ago • 1 comments

Describe the bug

Detection is very accurate on input images but much less accurate on video input.

Code and Data

Code is identical to the Detecto Colab demo

Environment

Windows 7, but I'm on Colab
torch.version = 1.11.0+cu113
torchvision.version = 0.12.0+cu113

Additional context

It's kind of weird, since I've tried several Coca-Cola images aside from the ones I uploaded and detection was perfect on all of them. The model just doesn't recognize the object when the input is a video file.

Is there something wrong with the video input method?

Apr 29 '22 04:04 ghost

There shouldn't be much of a difference between image and video inference, since the underlying model is the same in both cases. Maybe try diversifying your set of training images to include more far-away or varied shots? From the image above, it looks like your test images are cleaner (less background, no shadows, etc.) than the video footage, which could explain the issue you're having.
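One way to check this is to run the model on individual frames pulled from the video and look at the scores directly, rather than only at the rendered output. Below is a rough sketch, not something from the library: `best_scores` and `detection_rate` are hypothetical helpers, and they assume you have already collected one `(labels, scores)` pair per frame, for example from `model.predict(frame)`, with the scores converted to plain floats. The label name and threshold values are placeholders.

```python
from typing import List, Sequence, Tuple

# One entry per video frame: (predicted labels, matching scores).
# Assumption: these were collected beforehand, e.g. by calling
# model.predict(frame) on each decoded frame and keeping labels and scores.
FramePrediction = Tuple[Sequence[str], Sequence[float]]


def best_scores(frame_predictions: Sequence[FramePrediction], label: str) -> List[float]:
    """Return the highest score for `label` in each frame (0.0 if absent)."""
    result = []
    for labels, scores in frame_predictions:
        matching = [float(s) for name, s in zip(labels, scores) if name == label]
        result.append(max(matching, default=0.0))
    return result


def detection_rate(frame_predictions: Sequence[FramePrediction], label: str, threshold: float) -> float:
    """Fraction of frames whose best score for `label` reaches `threshold`."""
    scores = best_scores(frame_predictions, label)
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


if __name__ == "__main__":
    # Made-up predictions for three frames, for illustration only.
    frames = [(["coke"], [0.9]), (["coke", "pepsi"], [0.4, 0.8]), ([], [])]
    print(best_scores(frames, "coke"))
    print(detection_rate(frames, "coke", 0.6))
```

If the rate jumps when you lower the threshold, the model is finding the object in the video but with low confidence, which points toward the training data rather than the video input method. If the scores are near zero on every frame, the frames themselves (resolution, blur, lighting) are worth a closer look.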

Apr 30 '22 15:04 alankbi