Bug: `get_video_frames_generator` does not produce any frames.
Foreword by @LinasKo: We'd appreciate some help from the community! There's some code to run a super-simple test on your videos. If you have the time, please check out my comment in the PR.
@likith1908:
So as of now, my feed was at a resolution of 2868 x 1104, for which it worked well enough, but I have changed my feed to 3072 x 1080 and now it doesn't process the frames. I have debugged with a bunch of print statements; the program exits right after initialising the ByteTrack.
My source_video: https://drive.google.com/file/d/1FLE-YKUBQa70XBM3e2wsecp0cT0nwzur/view?usp=drive_link
token.pt: https://drive.google.com/file/d/1vkj-Wax7PzfHsNCnmcqmEWXxKkstoAUp/view?usp=drive_link
My code:
```python
import argparse
from collections import defaultdict, deque

import cv2
import numpy as np
import supervision as sv
from ultralytics import YOLO

# Region of interest in the source video (pixel coordinates)
# and its real-world target dimensions.
SOURCE = np.array([[0, 0], [3070, 0], [3070, 1080], [0, 1080]])
TARGET_WIDTH = 0.81227083231
TARGET_HEIGHT = 0.28574999964
TARGET = np.array([
    [0, 0],
    [TARGET_WIDTH, 0],
    [TARGET_WIDTH, TARGET_HEIGHT],
    [0, TARGET_HEIGHT],
])


class ViewTransformer:
    def __init__(self, source: np.ndarray, target: np.ndarray):
        source = source.astype(np.float32)
        target = target.astype(np.float32)
        self.m = cv2.getPerspectiveTransform(source, target)

    def transform_points(self, points: np.ndarray) -> np.ndarray:
        if points.size == 0:
            print("Warning: No points to transform.")
            return np.array([])
        reshaped_points = points.reshape(-1, 1, 2).astype(np.float32)
        transformed_points = cv2.perspectiveTransform(reshaped_points, self.m)
        return transformed_points.reshape(-1, 2)


def parse_arguments() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Speed Estimation using Ultralytics and Supervision"
    )
    parser.add_argument(
        "--source_video_path",
        required=False,
        default="/home/harvestedlabs/Desktop/Codes/39.mp4",
        help="Path to the source video file",
        type=str,
    )
    return parser.parse_args()


def main():
    args = parse_arguments()
    print(f"Source video path: {args.source_video_path}")

    video_info = sv.VideoInfo.from_video_path(args.source_video_path)
    print(f"Video info: {video_info}")

    model = YOLO("/home/harvestedlabs/Desktop/Codes/Likith/token.pt")
    print("YOLO model loaded.")

    byte_track = sv.ByteTrack(frame_rate=video_info.fps)
    print("ByteTrack initialized.")

    thickness = sv.calculate_optimal_line_thickness(resolution_wh=video_info.resolution_wh)
    text_scale = sv.calculate_optimal_text_scale(resolution_wh=video_info.resolution_wh)
    bounding_box_annotator = sv.BoundingBoxAnnotator(thickness=thickness)
    label_annotator = sv.LabelAnnotator(text_scale=text_scale, text_thickness=thickness)

    frame_generator = sv.get_video_frames_generator(args.source_video_path)
    polygon_zone = sv.PolygonZone(SOURCE)
    view_transformer = ViewTransformer(SOURCE, TARGET)

    # Per-tracker history of transformed y-coordinates, one second deep.
    coordinates = defaultdict(lambda: deque(maxlen=video_info.fps))

    frame_count = 0
    for frame in frame_generator:
        try:
            frame_count += 1
            print(f"Processing frame {frame_count}/{video_info.total_frames}")

            # Ensure the frame is valid
            if frame is None:
                print(f"Frame {frame_count} is None, skipping.")
                continue

            result = model(frame)
            print("Frame processed by model.")
            if not result:
                print(f"No result for frame {frame_count}, skipping.")
                continue

            detections = sv.Detections.from_ultralytics(result[0])
            print(f"Detections: {detections}")

            detections = detections[polygon_zone.trigger(detections)]
            detections = byte_track.update_with_detections(detections=detections)

            points = detections.get_anchors_coordinates(anchor=sv.Position.BOTTOM_CENTER)
            if points.size > 0:
                points = view_transformer.transform_points(points=points)
            else:
                print("No points detected in the frame.")

            labels = []
            for tracker_id, [_, y] in zip(detections.tracker_id, points):
                coordinates[tracker_id].append(y)
                if len(coordinates[tracker_id]) < video_info.fps / 2:
                    labels.append(f"#{tracker_id}")
                else:
                    coordinates_start = coordinates[tracker_id][-1]
                    coordinates_stop = coordinates[tracker_id][0]
                    distance = abs(coordinates_start - coordinates_stop)
                    time = len(coordinates[tracker_id]) / video_info.fps
                    speed = (distance / time) * 3.6
                    print(f"Tracker ID: {tracker_id}")
                    print(f"Coordinates Start: {coordinates_start}")
                    print(f"Coordinates Stop: {coordinates_stop}")
                    print(f"Distance: {distance}")
                    print(f"Time: {time}")
                    print(f"Speed: {speed} km/h")
                    labels.append(f"#{tracker_id}, {float(speed)} kmph")

            annotated_frame = frame.copy()
            annotated_frame = bounding_box_annotator.annotate(scene=annotated_frame, detections=detections)
            annotated_frame = sv.draw_polygon(annotated_frame, polygon=SOURCE, color=sv.Color.RED)
            annotated_frame = label_annotator.annotate(scene=annotated_frame, detections=detections, labels=labels)

            cv2.namedWindow("Annotated Frame", cv2.WINDOW_NORMAL)
            cv2.imshow("Annotated Frame", annotated_frame)
            if cv2.waitKey(1) == ord("q"):
                break
        except Exception as e:
            print(f"Error processing frame {frame_count}: {e}")

    cv2.destroyAllWindows()


if __name__ == "__main__":
    main()
```
Output:
```
Source video path: /home/harvestedlabs/Desktop/Codes/39.mp4
Video info: VideoInfo(width=3072, height=1080, fps=30, total_frames=261)
YOLO model loaded.
ByteTrack initialized.
```
Can you help me with this issue? @LinasKo @skylargivens @iurisilvio @sberan, can you suggest a way?
Thanks, Likith
Originally posted by @likith1908 in https://github.com/roboflow/supervision/discussions/1344#discussioncomment-10025628
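The output above stops right after ByteTrack is initialized, which means the for loop over frame_generator never yields a single frame. A minimal sketch to isolate the generator from the rest of the pipeline (the path is a stand-in for the problem video) is to compare it against a raw cv2.VideoCapture read:

```python
import cv2
import supervision as sv

VIDEO_PATH = "39.mp4"  # stand-in path to the problem video

# Count frames yielded by the supervision generator.
sv_count = sum(1 for _ in sv.get_video_frames_generator(VIDEO_PATH))

# Count frames read directly with OpenCV as a baseline.
cap = cv2.VideoCapture(VIDEO_PATH)
cv2_count = 0
while True:
    ok, _ = cap.read()
    if not ok:
        break
    cv2_count += 1
cap.release()

print(f"supervision generator: {sv_count} frames")
print(f"cv2.VideoCapture:      {cv2_count} frames")
```

If OpenCV reads frames but the generator yields none, the problem is in the generator rather than in the video file or the model.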
Confirming the bug where no frames are produced by sv.get_video_frames_generator.
I'll look into it; I have a hunch it's something I've dealt with on a camera I had.
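The thread doesn't state the root cause, but if the hunch is codec-related, a quick hedged sketch for inspecting what the file reports (the path is a placeholder):

```python
import cv2

cap = cv2.VideoCapture("39.mp4")  # placeholder path to the problem video
fourcc = int(cap.get(cv2.CAP_PROP_FOURCC))
# Unpack the FOURCC integer into its four-character codec tag.
codec = "".join(chr((fourcc >> (8 * i)) & 0xFF) for i in range(4))
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f"codec: {codec}, resolution: {width}x{height}")
cap.release()
```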
@LinasKo Did you find any solution for this issue?
Yes, give me 10 min.
Hi @likith1908,
Try this: pip install git+https://github.com/roboflow/supervision.git@fix/no-frames-generated-for-some-videos
With that installed, get_video_frames_generator should generate the frames correctly.
Tell me if it does not fix your problem. Let's keep the issue open until we merge the PR.
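A quick way to verify the branch before re-running the full pipeline (the path is a placeholder) is to count what the generator yields:

```python
import supervision as sv

count = sum(1 for _ in sv.get_video_frames_generator("39.mp4"))  # placeholder path
print(f"{count} frames yielded")  # should be non-zero (261 for the video above)
```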
Meanwhile, may I use your videos in our tests in the future?
We'd appreciate some help from the community! There's some code to run a super-simple test on your videos. If you have the time, please check out my comment in the PR.
Hi @LinasKo
The pip install is taking too long; I don't think I have access to the link you provided (attached image).
Can you check if the repo/folder contents are public?
Sure! I will keep the issue open, and you can use my videos for future tests! NOTE: Please download the video and upload it to your drive, as I might delete it sometime in the future.
Thanks, Likith
This should not be reachable as a normal URL - it's only for pip. It's expected to take longer, as you are using a custom branch of supervision.
Ok, let me try again and I'll let you know if that works!
Just so I know, how long might it take, @LinasKo?
Thanks, Likith
It took me 2 minutes with fast internet, but I have many packages cached on my system. I'd leave it for 15 min or so the first time.
Hi @LinasKo, do you need any help with this? I can help.
Sure, @Bhavay-2001.
The majority of the work is done, but it would be useful to test on a range of different videos. Check out the PR - I left the instructions and one example response there.
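The exact test script lives in the PR; as a rough stand-in (the folder name is hypothetical), checking a batch of videos could look like:

```python
from pathlib import Path

import supervision as sv

VIDEO_DIR = Path("test_videos")  # hypothetical folder of sample videos

for path in sorted(VIDEO_DIR.glob("*.mp4")):
    info = sv.VideoInfo.from_video_path(str(path))
    yielded = sum(1 for _ in sv.get_video_frames_generator(str(path)))
    status = "OK" if yielded else "NO FRAMES"
    print(f"{path.name}: expected ~{info.total_frames}, yielded {yielded} -> {status}")
```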
Sure, will check it out and test it.
@LinasKo It's working now! I am able to see the frames being processed! I'll let you know if there are any other issues.
Thanks, Likith
Fantastic to hear that @likith1908! 🔥 I'm closing this issue.
Sure!