PyTorch-YOLOv3

how to detect video

Open WANG-1173 opened this issue 4 years ago • 39 comments

If I want to test local video or webcam video, how should I modify it?

WANG-1173 avatar Apr 20 '20 05:04 WANG-1173

I have the same question! I tried to add code for testing a local video or a webcam stream, but I ran into problems I couldn't solve.

PommesPeter avatar Apr 29 '20 06:04 PommesPeter

When I try to convert the frame image to a tensor and pass it to the model, I get this error: 'RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [480, 640, 3] instead'.

PommesPeter avatar Apr 29 '20 06:04 PommesPeter

I also tried to add test code, but I couldn't get video testing to work because of the weight problem, so in the end I switched to someone else's YOLOv3 repository on GitHub.

WANG-1173 avatar Apr 29 '20 06:04 WANG-1173

I think the way I modified the code is wrong. I'm also struggling with it.

I also found other YOLOv3 implementations on GitHub, but their results were not of good quality.

PommesPeter avatar Apr 29 '20 06:04 PommesPeter

Could you tell me which YOLOv3 you are using? Thanks a lot.

PommesPeter avatar Apr 29 '20 07:04 PommesPeter

git clone https://github.com/ultralytics/yolov3.git

WANG-1173 avatar Apr 29 '20 07:04 WANG-1173

Thanks for sharing this. One question: did you replace only the file that handles video detection?

PommesPeter avatar Apr 29 '20 07:04 PommesPeter

I think you may have misunderstood me. I couldn't get video testing to work with this repo, so I switched to a different YOLOv3 implementation, the one linked above, which includes a command for testing video directly.

WANG-1173 avatar Apr 29 '20 07:04 WANG-1173

Alright. Anyway, thank you very much.

PommesPeter avatar Apr 29 '20 07:04 PommesPeter

from __future__ import division

from models import *
from utils.utils import *
from utils.datasets import *

import os
import sys
import time
import datetime
import argparse
import cv2

from PIL import Image

import numpy as np
import torch
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision import datasets
from torch.autograd import Variable

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.ticker import NullLocator

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--image_folder", type=str, default="data/samples", help="path to dataset")
    parser.add_argument("--vedio_file", type=str, default="vedio_samples/2.mp4", help="path to video file")
    parser.add_argument("--model_def", type=str, default="config/yolov3-tiny.cfg", help="path to model definition file")
    parser.add_argument("--weights_path", type=str, default="model_trained/100-epoch-air.pth", help="path to weights file")
    parser.add_argument("--class_path", type=str, default="data/air.names", help="path to class label file")
    parser.add_argument("--conf_thres", type=float, default=0.8, help="object confidence threshold")
    parser.add_argument("--nms_thres", type=float, default=0.4, help="iou threshold for non-maximum suppression")
    parser.add_argument("--batch_size", type=int, default=1, help="size of the batches")
    parser.add_argument("--n_cpu", type=int, default=3, help="number of cpu threads to use during batch generation")
    parser.add_argument("--img_size", type=int, default=416, help="size of each image dimension")
    parser.add_argument("--checkpoint_model", type=str, help="path to checkpoint model")
    opt = parser.parse_args()
    print(opt)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = Darknet(opt.model_def, img_size=opt.img_size).to(device)
    if opt.weights_path.endswith(".weights"):
        # Load darknet weights
        model.load_darknet_weights(opt.weights_path)
    else:
        # Load checkpoint weights (map_location keeps this working on CPU-only machines)
        model.load_state_dict(torch.load(opt.weights_path, map_location=device))
    model.eval()  # Set in evaluation mode

    classes = load_classes(opt.class_path)

    if opt.vedio_file.endswith(".mp4"):
        cap = cv2.VideoCapture(opt.vedio_file)
        colors = np.random.randint(0, 255, size=(len(classes), 3), dtype="uint8")
        while cap.isOpened():
            ret, img = cap.read()
            if not ret:
                break  # End of the video, or the frame could not be read

            # OpenCV frames are BGR; convert to RGB before building the tensor
            rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            imgTensor = transforms.ToTensor()(rgb)
            imgTensor, _ = pad_to_square(imgTensor, 0)
            imgTensor = resize(imgTensor, opt.img_size)
            imgTensor = imgTensor.unsqueeze(0)  # Add the batch dimension: CHW -> NCHW
            imgTensor = imgTensor.to(device)

            with torch.no_grad():
                detections = model(imgTensor)
                detections = non_max_suppression(detections, opt.conf_thres, opt.nms_thres)

            # One entry per image in the batch; an entry is None when nothing was detected
            for det in detections:
                if det is None:
                    continue
                det = rescale_boxes(det, opt.img_size, rgb.shape[:2])
                for x1, y1, x2, y2, conf, cls_conf, cls_pred in det:
                    # OpenCV drawing functions need plain ints, not tensors
                    x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
                    color = [int(c) for c in colors[int(cls_pred)]]
                    print(cls_conf)
                    cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)
                    cv2.putText(img, classes[int(cls_pred)], (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
                    cv2.putText(img, "%.2f" % float(conf), (x2, y1), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

            cv2.imshow("frame", img)
            if cv2.waitKey(25) & 0xFF == ord("q"):
                break
        cap.release()
        cv2.destroyAllWindows()

I wrote a simple video detection script based on this repo.

Guardian-Li avatar Apr 29 '20 11:04 Guardian-Li
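
The original question also asked about webcam input, while the script above only accepts an .mp4 path. A minimal sketch of one way to handle both, relying on the fact that cv2.VideoCapture accepts either an integer device index or a file path. The helper names parse_source and open_capture are hypothetical and not part of the repo.

```python
def parse_source(source):
    """Return an int camera index for digit strings, else the path unchanged."""
    source = str(source).strip()
    return int(source) if source.isdigit() else source


def open_capture(source):
    """Open a webcam index or a video file with OpenCV."""
    import cv2  # imported lazily so parse_source can be used without OpenCV

    cap = cv2.VideoCapture(parse_source(source))
    if not cap.isOpened():
        raise IOError("Could not open video source: %r" % (source,))
    return cap


print(parse_source("0"))                    # 0
print(parse_source("vedio_samples/2.mp4"))  # vedio_samples/2.mp4
```

With something like this, passing "0" on the command line would select the default webcam, and any other string would be treated as a video file path.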

@Guardian-Li It looks similar to my code, but how did you solve this error: RuntimeError: Expected 4-dimensional input for 4-dimensional weight [32, 3, 3, 3], but got 3-dimensional input of size [480, 640, 3] instead?

PommesPeter avatar Apr 29 '20 12:04 PommesPeter

@Guardian-Li It's almost the same code. The difference is that I haven't resized the frame from the camera.

PommesPeter avatar Apr 29 '20 12:04 PommesPeter

@Guardian-Li Does the frame have to be resized?

PommesPeter avatar Apr 29 '20 12:04 PommesPeter

The tensor you create from the image needs a batch dimension before it goes into the model: imgTensor = imgTensor.unsqueeze(0)

Guardian-Li avatar Apr 29 '20 12:04 Guardian-Li

You need to add one dimension.

Guardian-Li avatar Apr 29 '20 12:04 Guardian-Li
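
A minimal sketch of the shape fix described above, assuming the frame arrives as an H×W×3 uint8 array, which is what cv2.VideoCapture.read() returns. The zero-filled array is a stand-in for a real frame, and frame_to_batch is a hypothetical helper name. The script posted earlier also pads and resizes the frame, which this sketch leaves out.

```python
import numpy as np
import torch


def frame_to_batch(frame):
    """Convert an HxWx3 uint8 frame into a 1x3xHxW float tensor in [0, 1]."""
    tensor = torch.from_numpy(frame).permute(2, 0, 1)  # HWC -> CHW
    tensor = tensor.float().div(255.0)                 # uint8 -> float in [0, 1]
    return tensor.unsqueeze(0)                         # add batch dim: CHW -> NCHW


# Stand-in for a 480x640 webcam frame (assumption: real frames come from OpenCV)
frame = np.zeros((480, 640, 3), dtype=np.uint8)
batch = frame_to_batch(frame)
print(tuple(batch.shape))  # (1, 3, 480, 640)
```

The first convolution's weight has shape [32, 3, 3, 3], so it expects input laid out as batch × 3 channels × height × width. A raw [480, 640, 3] frame has the channels last and no batch dimension, which is what the RuntimeError is reporting.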

Also, cv2.imread returns images in BGR channel order, while PIL reads them as RGB.

Guardian-Li avatar Apr 29 '20 12:04 Guardian-Li
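
A short sketch of the channel-order point, using NumPy only. Reversing the last axis of an H×W×3 array swaps BGR and RGB, which is the same conversion cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) performs. The helper name bgr_to_rgb and the 2×2 test image are illustrative assumptions.

```python
import numpy as np


def bgr_to_rgb(frame):
    """Reverse the channel axis of an HxWx3 array (BGR <-> RGB)."""
    # ascontiguousarray makes a copy with positive strides, which
    # torch.from_numpy requires (it rejects negatively strided views)
    return np.ascontiguousarray(frame[..., ::-1])


bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR order
rgb = bgr_to_rgb(bgr)
print(rgb[0, 0].tolist())  # [0, 0, 255]
```

If the conversion is skipped, the model still runs, but it sees red and blue swapped relative to its training images, which can lower detection quality.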

Just add one dimension to your tensor before you pass it into the model.

Guardian-Li avatar Apr 29 '20 12:04 Guardian-Li

@Guardian-Li Oh, so you're Chinese too 😂 OK, thank you.

PommesPeter avatar Apr 29 '20 12:04 PommesPeter

I saw it in your profile. No problem, as long as it runs.

Guardian-Li avatar Apr 29 '20 12:04 Guardian-Li

OK, I'll go fix my code.

PommesPeter avatar Apr 29 '20 12:04 PommesPeter

@Guardian-Li It works now. I can run video detection. Thanks for the help!

PommesPeter avatar Apr 29 '20 13:04 PommesPeter

No problem.

Guardian-Li avatar Apr 29 '20 13:04 Guardian-Li

@Guardian-Li @PommesPeter Did you solve it? I need the detect.py file for this repo to detect video.

impravin22 avatar May 05 '20 04:05 impravin22

@Guardian-Li @PommesPeter is there a working detect.py that works for video? The python code posted above by @Guardian-Li, is that working code?

aditjha avatar May 10 '20 10:05 aditjha

@aditjha It works. I followed his/her code, combined it with my own understanding, and got it working.

PommesPeter avatar May 10 '20 10:05 PommesPeter

@impravin22 Yes, I solved this problem. I followed @Guardian-Li's code and got it working.

PommesPeter avatar May 10 '20 10:05 PommesPeter

@impravin22 It's working code.

PommesPeter avatar May 10 '20 10:05 PommesPeter

@PommesPeter Great. Could you please send your detect.py, if you don't mind?

Thanks.

impravin22 avatar May 10 '20 10:05 impravin22

@PommesPeter Thank you for replying! I'm new to all this and am using YOLO for a project. So this detect.py lets me run inference on a recorded video, correct?

aditjha avatar May 10 '20 10:05 aditjha

@PommesPeter Yeah, I found it. It's video.py in @Guardian-Li's repo, isn't it?

Thank you very much

impravin22 avatar May 10 '20 10:05 impravin22