VNext
How to reproduce visualization results in README?
Thanks for your wonderful work!
I'd like to reproduce the visualization results in your README.
I tried to add the following 2 lines before demo/
from detectron2.projects.idol import add_idol_config
But it returns this:
appuser@0916140fb4f2:~/VNext/demo$ python --config-file ../projects/IDOL/configs/ytvis19_swinL.yaml --video-input ../0b6db1c6fd.mp4 --output ../out --opts MODEL.WEIGHTS ../YTVIS19_SWINL_643AP.pth
[08/05 06:23:21 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../projects/IDOL/configs/ytvis19_swinL.yaml', input=None, opts=['MODEL.WEIGHTS', '../YTVIS19_SWINL_643AP.pth'], output='../out', video_input='../0b6db1c6fd.mp4', webcam=False)
/home/appuser/.local/lib/python3.7/site-packages/torch/ UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
[08/05 06:23:28 fvcore.common.checkpoint]: [Checkpointer] Loading from ../YTVIS19_SWINL_643AP.pth ...
[ERROR:[email protected]] global /io/opencv/modules/videoio/src/cap_ffmpeg_impl.hpp (2927) open Could not find encoder for codec_id=27, error: Encoder not found
[ERROR:[email protected]] global /io/opencv/modules/videoio/src/cap_ffmpeg_impl.hpp (3002) open VIDEOIO/FFMPEG: Failed to initialize VideoWriter
[ERROR:[email protected]] global /io/opencv/modules/videoio/src/cap.cpp (595) open VIDEOIO(CV_IMAGES): raised OpenCV exception:
OpenCV(4.6.0) /io/opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): /tmp/video_format_test3zylu0ek/test_file.mkv in function 'icvExtractPattern'
0%| | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
File "", line 178, in <module>
for vis_frame in tqdm.tqdm(demo.run_on_video(video), total=num_frames):
File "/home/appuser/.local/lib/python3.7/site-packages/tqdm/", line 1195, in __iter__
for obj in iterable:
File "~/VNext/demo/", line 129, in run_on_video
yield process_predictions(frame, self.predictor(frame))
File "~/VNext/detectron2/engine/", line 317, in __call__
predictions = self.model([inputs])[0]
File "/home/appuser/.local/lib/python3.7/site-packages/torch/nn/modules/", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "~/VNext/projects/IDOL/idol/", line 249, in forward
video_len = len(batched_inputs[0]['file_names'])
KeyError: 'file_names'
Could you give me a hint on how to pass the right batched_inputs to the forward function?
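For context, the traceback shows the IDOL forward function calling len() on batched_inputs[0]['file_names'], which suggests it expects one dict per video rather than the per-frame dict the stock demo predictor passes. Below is a minimal sketch of grouping frames into such a dict. Only the file_names key is confirmed by the traceback; the image, height and width keys, the file-name pattern, and the helper name build_video_input are assumptions for illustration and should be checked against the repo's dataset mapper.

```python
def build_video_input(frames, height, width, video_name="video"):
    """Group per-frame data into one video-level dict (hypothetical helper)."""
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame")
    return {
        # Confirmed by the traceback: forward() calls len() on this entry.
        "file_names": [f"{video_name}/{i:05d}.jpg" for i in range(len(frames))],
        # Assumed keys: one entry per frame plus the original frame size.
        "image": frames,
        "height": height,
        "width": width,
    }


# Placeholder objects stand in for decoded frames here.
frames = ["frame0", "frame1", "frame2"]
video_input = build_video_input(frames, height=4, width=6)
batched_inputs = [video_input]
video_len = len(batched_inputs[0]["file_names"])
```

With a dict of this shape, the line from the traceback no longer raises KeyError; whether the rest of the forward pass accepts it depends on the keys the model actually reads.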
same error
Same error. Please can you help us?
I am using ytvis19_swinL.yaml with YTVIS19_SWINL_643AP.pth as the model weights. We can get past the 'file_names' error by removing ['file_names'] from this line: video_len = len(batched_inputs[0]['file_names']). But a problem remains with the predictions output by the model. They contain these keys: dict_keys(['image_size', 'pred_scores', 'pred_labels', 'pred_masks']) and none of the ones detectron2 expects ("panoptic_seg", "instances" or "sem_seg"), which are used to visualize the output. Do we have to modify the detectron2 functions somehow for the IDOL configuration?
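One way to bridge the gap without touching detectron2 might be to regroup the video-level prediction dict into per-frame lists before drawing. The sketch below uses only the key names quoted above; the layout pred_masks[instance][frame], one score and one label per instance, and the helper name predictions_to_frames are assumptions, so please verify them against the actual model output.

```python
def predictions_to_frames(predictions, score_threshold=0.5):
    """Regroup a video-level prediction dict into one list of instances per frame."""
    scores = predictions["pred_scores"]
    labels = predictions["pred_labels"]
    masks = predictions["pred_masks"]  # assumed layout: masks[instance][frame]
    num_frames = len(masks[0]) if masks else 0
    per_frame = [[] for _ in range(num_frames)]
    for instance_id, (score, label, inst_masks) in enumerate(zip(scores, labels, masks)):
        if score < score_threshold:
            continue  # drop low-confidence instances for the whole video
        for t, mask in enumerate(inst_masks):
            per_frame[t].append(
                {"id": instance_id, "score": score, "label": label, "mask": mask}
            )
    return per_frame


# Toy input: two instances over two frames; strings stand in for mask arrays.
predictions = {
    "image_size": (2, 2),
    "pred_scores": [0.9, 0.2],
    "pred_labels": [3, 7],
    "pred_masks": [["m00", "m01"], ["m10", "m11"]],
}
frames = predictions_to_frames(predictions)
```

Each entry of frames can then be drawn onto the matching video frame, using the instance id to keep colors consistent across frames.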
Thanks a lot
Did you find any solution for that?
I'm struggling with the same error:
video_len = len(batched_inputs[0]['file_names']) KeyError: 'file_names'
Did you find a solution, please?
Not yet solved. Can the authors help here?!
Hi, I have the same problem. Have you solved it? Thanks.
@unihornWwan sadly not yet. @aylinaydincs can you give us more detail on how you managed to train: which model and config file did you use, and with which parameters? I'm struggling to launch the training.
I simply created a conda environment following the, and after that I did what they say in
@unihornWwan thanks for answering.
@assia855 @unihornWwan I have the same issue running the on a video. Have you solved it?
I've updated the from lalalafloat to visualize on videos. I've set is_multi_cls to False to match the IDs to the pred_scores. My forked repo is over here. The command to run inference on videos is: python projects/IDOL/ --config-file projects/IDOL/configs/ovis_swin.yaml --video-input input.mp4 --output output1.mp4
Hi, currently there's no mAP output since only a video is passed in. You'll need to provide the ground-truth bounding-box JSON file and implement your own function in to read the file and compute mAP from the IoU between the ground-truth and the inferred bboxes.
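As a starting point for that function, here is a minimal sketch of the IoU step and a greedy match of predictions to ground truth for a single frame. It assumes boxes in (x1, y1, x2, y2) pixel format and predictions already sorted by descending score; the names box_iou and match_boxes are illustrative and not part of the repo. If the annotation file stores boxes as (x, y, width, height), they would need converting first.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(gt_boxes, pred_boxes, iou_threshold=0.5):
    """Return one true-positive flag per prediction (greedy, one match per GT box)."""
    matched = set()
    flags = []
    for pred in pred_boxes:  # assumed sorted by descending score
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(gt_boxes):
            if idx in matched:
                continue
            iou = box_iou(pred, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            flags.append(True)
        else:
            flags.append(False)
    return flags


iou = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
flags = match_boxes([(0, 0, 2, 2)], [(0, 0, 2, 2), (5, 5, 6, 6)])
```

The true-positive flags, accumulated over all frames and sorted by score, give the precision-recall curve from which AP per class and then mAP can be computed.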