
Video and detections do not match.

Open WilliamJuel opened this issue 5 years ago • 19 comments

It seems like the video linked to in the README.md (https://drive.google.com/open?id=1h2Wnb98tDVB6JlCDNQXCeZpG20x6AiZ2) does not match with the detections in Nanonets_object_tracking/det/.

In each of the det_*.txt files there are 1955 frames, while the video consists of 2110 frames. This is also confirmed visually (the bounding boxes/detections do not match where the cars actually are) when using either the given model640.pt or a self-trained feature extractor on the given data, and the program crashes when trying to process frame 1956 (for good reason).

Is there a new video or what is going on here?
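For anyone hitting this, a quick way to confirm the mismatch is to compare the video's frame count (e.g. via cv2.VideoCapture(...).get(cv2.CAP_PROP_FRAME_COUNT)) against the highest frame index in a detection file. The sketch below only shows the detection-file side; the assumption that the first CSV column is the frame number follows the det format discussed later in this thread, and the sample lines are made up.

```python
# Minimal check of a det_*.txt file: assuming the first CSV column is the
# frame number, the max over all lines is the last annotated frame.
def max_frame_index(lines):
    return max(int(line.split(",")[0]) for line in lines if line.strip())

# Hypothetical sample lines in the det format used by this repo:
sample = [
    "1954,-1,10.0,20.0,30.0,40.0,0.9,-1,-1,-1",
    "1955,-1,15.0,25.0,35.0,45.0,0.8,-1,-1,-1",
]
print(max_frame_index(sample))  # → 1955
```

If the video's frame count is higher than this number (here 2110 vs 1955), the tracker runs out of detections before the video ends.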

WilliamJuel avatar Oct 03 '19 11:10 WilliamJuel

Same issue XD

johnnylord avatar Oct 04 '19 04:10 johnnylord

Same problem, is this model working?

mswiniars avatar Oct 22 '19 21:10 mswiniars

I ran a detector on vdo.avi and dumped out detection result which matches the video clip.

https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77

using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

yuntai avatar Nov 02 '19 08:11 yuntai

I ran a detector on vdo.avi and dumped out detection result which matches the video clip.

https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77

using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

Dear @yuntai, thank you for sharing. It is now working. By the way, how did you get the detections? I mean, can you be more specific?

I saw your detection results, such as this first line: 1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1

Could you share what all these numbers mean?

Many thanks

Ujang24 avatar Dec 02 '19 10:12 Ujang24

Thanks @yuntai, that txt file works, and none of the others provided in the repo do.

aniketvartak avatar Dec 03 '19 15:12 aniketvartak

@yuntai Hey, can you tell me how you generated the detections text file from the detectron2? I know we can generate output videos or images with detectron2 but not sure that it can generate the detections text file? Any help would be appreciated, thank you!

anzy0621 avatar Feb 10 '20 03:02 anzy0621

1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1

- I think the first number represents the frame number; when it is repeated, it means multiple instances were detected in that frame.
- The -1s are just placeholders, as the code selects only [2:6] of each line. You could either keep them like that or modify the code to select [1:5] and remove the rest of the -1s.
- The next 4 numbers represent x1, y1, w, h, the bounding box of the detected object. You can get the width and height from x2 - x1 and y2 - y1, which you can get from the bounding box info: outputs["instances"].pred_boxes gives you the tensor, and you can read the values with outputs["instances"].pred_boxes[i].tensor[0, 0].data.cpu().numpy() (tensor[0, 0] for x1). You can find more about the data types in the detectron2 documentation: https://detectron2.readthedocs.io/tutorials/models.html#model-input-format
- The last number (0.996153), I think, represents the confidence score.

You can basically write the numbers in that format to a text file, give the detections and the input video to the deepsort tracker, and it should work fine. :)
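The field layout described above can be sketched as a tiny parser. This is a best-guess reading of the det line format discussed in this thread, not an official spec:

```python
def parse_det_line(line):
    """Split one detection line into (frame, bbox, confidence)."""
    vals = line.strip().split(",")
    frame = int(vals[0])                # frame number (repeats per instance)
    x, y, w, h = map(float, vals[2:6])  # top-left x, top-left y, width, height
    conf = float(vals[6])               # detector confidence
    return frame, (x, y, w, h), conf

frame, bbox, conf = parse_det_line(
    "1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1")
print(frame, bbox, conf)  # → 1 (126.682, 445.587, 489.079, 205.913) 0.996153
```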

MinaAbdElMassih avatar Feb 17 '20 07:02 MinaAbdElMassih

1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1

- I think the first number represents the frame number; when it is repeated, it means multiple instances were detected in that frame.
- The second number, I think, is a non-existent class (let's say human); I also think every -1 represents a non-existent class.
- The next 4 numbers represent x1, y1, w, h, the bounding box of the detected object. You can get the width and height from x2 - x1 and y2 - y1, which you can get from the bounding box info: outputs["instances"].pred_boxes gives you the tensor, and you can read the values with outputs["instances"].pred_boxes[i].tensor[0, 0].data.cpu().numpy() (tensor[0, 0] for x1). You can find more about the data types in the detectron2 documentation: https://detectron2.readthedocs.io/tutorials/models.html#model-input-format
- The last number (0.996153), I think, represents the accuracy that it is the said class (let's say car).
- The rest of the -1s represent non-existent classes in the frame.

You can basically write the numbers in that format to a text file, give the detections and the input video to the deepsort tracker, and it should work fine. :)

Hey, @MinaAbdElMassih thank you for the input! I appreciate it. I'm currently trying to output the text file. Have you tried doing this before?

anzy0621 avatar Feb 22 '20 21:02 anzy0621

@anzy0621 I hadn't done this before, but I managed to modify the code of the detectors in the detectron2 API to write the detection info in that format to a .txt file, and it worked. It's quite simple once you manage to get the needed values. :)

MinaAbdElMassih avatar Feb 26 '20 11:02 MinaAbdElMassih

If I have only two classes to detect, what do the columns become?

AntonioMarsella avatar Mar 30 '20 17:03 AntonioMarsella

None of the models in the repository is working for me either. @yuntai That worked for me. Thanks. Btw, good explanation @MinaAbdElMassih.

utkutpcgl avatar Apr 10 '20 12:04 utkutpcgl

1,-1,126.682,445.587,489.079,205.913,0.996153,-1,-1,-1

- I think the first number represents the frame number; when it is repeated, it means multiple instances were detected in that frame.
- The second number, I think, is a non-existent class (let's say human); I also think every -1 represents a non-existent class.
- The next 4 numbers represent x1, y1, w, h, the bounding box of the detected object. You can get the width and height from x2 - x1 and y2 - y1, which you can get from the bounding box info: outputs["instances"].pred_boxes gives you the tensor, and you can read the values with outputs["instances"].pred_boxes[i].tensor[0, 0].data.cpu().numpy() (tensor[0, 0] for x1). You can find more about the data types in the detectron2 documentation: https://detectron2.readthedocs.io/tutorials/models.html#model-input-format
- The last number (0.996153), I think, represents the accuracy that it is the said class (let's say car).
- The rest of the -1s represent non-existent classes in the frame.

You can basically write the numbers in that format to a text file, give the detections and the input video to the deepsort tracker, and it should work fine. :)

Thanks mina🤗

MohamedMostafaSoliman avatar Apr 14 '20 19:04 MohamedMostafaSoliman

@AntonioMarsella I modified my answer above i think it better answers your question as the -1s don't represent classes.

MinaAbdElMassih avatar Apr 14 '20 19:04 MinaAbdElMassih

I ran a detector on vdo.avi and dumped out detection result which matches the video clip.

https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77

using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

The video has been removed from Google Drive. Can you share yours, please?

pooya-mohammadi avatar Mar 18 '21 10:03 pooya-mohammadi

I ran a detector on vdo.avi and dumped out detection result which matches the video clip. https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77 using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

The video has been removed from Google Drive. Can you share yours, please?

find . ... on my HD got me this one. Can you check whether this is the correct one? https://drive.google.com/file/d/1PTBXBfCKuSCNk6wUGZ7pAQyj4rcKRkql/view?usp=sharing

yuntai avatar Jul 16 '21 16:07 yuntai

I ran a detector on vdo.avi and dumped out detection result which matches the video clip.

https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77

using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

Found this thread continued only now, but my local git repo is still there! In demo/predictor.py in the detectron2 repo:

import numpy as np

outf = None  # det.txt file handle, opened lazily on the first processed frame

def process_detected_instance(predictions, frame_no):
    global outf
    boxes = predictions.pred_boxes.tensor.numpy()
    scores = predictions.scores.numpy()
    classes = predictions.pred_classes.numpy()
    # keep only COCO classes 0,1,2,3,5,7 (person, bicycle, car, motorcycle, bus, truck)
    mask = np.isin(classes, [0, 1, 2, 3, 5, 7])
    boxes = boxes[mask]
    scores = scores[mask]
    classes = classes[mask]
    if outf is None:
        outf = open('det.txt', 'w')

    for i in range(len(classes)):
        x1, y1, x2, y2 = list(boxes[i])
        w = x2 - x1
        h = y2 - y1
        assert w > 0 and h > 0
        # one line per detection: frame,-1,x,y,w,h,score,-1,-1,-1
        print(','.join(
            list(map(str, [frame_no, -1])) +
            list(map(str, [x1, y1, w, h])) + [str(scores[i])] + ['-1'] * 3),
            file=outf, flush=True)
    print("frame_no({}) num({})".format(frame_no, len(classes)))

and added process_detected_instance(predictions, frame_no) under elif "instance" in predictions:... further down below.
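The same write-out logic, pulled out of the detectron2 hook above into a standalone helper so the line format can be checked in isolation (the box and score values below are made up for illustration):

```python
def det_line(frame_no, x1, y1, x2, y2, score):
    """Format one detection as frame,-1,x,y,w,h,score,-1,-1,-1."""
    w, h = x2 - x1, y2 - y1
    assert w > 0 and h > 0, "boxes must have positive width and height"
    fields = [frame_no, -1, x1, y1, w, h, score, -1, -1, -1]
    return ",".join(str(f) for f in fields)

print(det_line(1, 126.7, 445.6, 615.8, 651.5, 0.996))
```

Writing one such line per detection, per frame, reproduces the det.txt layout the deepsort tracker expects.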

yuntai avatar Jul 16 '21 16:07 yuntai

I ran a detector on vdo.avi and dumped out detection result which matches the video clip. https://gist.github.com/yuntai/d0eb58b0eab620db65ac51e326be4c77 using detectron2 (COCO trained faster_rcnn_X_101_32x8d_FPN_3x) from https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md

The video has been removed from Google Drive. Can you share yours, please?

find . ... on my HD got me this one. Can you check whether this is the correct one? https://drive.google.com/file/d/1PTBXBfCKuSCNk6wUGZ7pAQyj4rcKRkql/view?usp=sharing

Hi, could you please share your video again if you still have it? The link doesn't work anymore.

dvrbanic avatar May 06 '22 09:05 dvrbanic

https://drive.google.com/file/d/1ADVZyR3BdWUm-saeM6GcFtbw6E2lUcKk/view?usp=sharing

yuntai avatar May 07 '22 08:05 yuntai

https://drive.google.com/file/d/1ADVZyR3BdWUm-saeM6GcFtbw6E2lUcKk/view?usp=sharing

Thanks a lot!

dvrbanic avatar May 08 '22 11:05 dvrbanic