
Bug: `get_video_frames_generator` does not produce any frames.

Open likith1908 opened this issue 1 year ago • 13 comments

Foreword by @LinasKo: We'd appreciate some help from the community! There's some code to run a super-simple test on your videos. If you have the time, please check out my comment in the PR.


@likith1908: As of now, my feed was at a resolution of 2868 x 1104, for which it works well enough, but I have changed my feed to 3072 x 1080 and now it doesn't process any frames. I have debugged with a bunch of print statements; the program exits right after initialising the ByteTrack.

my source_video : https://drive.google.com/file/d/1FLE-YKUBQa70XBM3e2wsecp0cT0nwzur/view?usp=drive_link token.pt : https://drive.google.com/file/d/1vkj-Wax7PzfHsNCnmcqmEWXxKkstoAUp/view?usp=drive_link

My code

import argparse
import supervision as sv
import cv2
from ultralytics import YOLO
import numpy as np
from collections import defaultdict, deque

SOURCE = np.array([[0, 0], [3070, 0], [3070, 1080], [0, 1080]])

TARGET_WIDTH = 0.81227083231
TARGET_HEIGHT = 0.28574999964

TARGET = np.array([
    [0, 0],
    [TARGET_WIDTH, 0],
    [TARGET_WIDTH, TARGET_HEIGHT],
    [0, TARGET_HEIGHT]
])

class ViewTransformer:
    def __init__(self, source: np.ndarray, target: np.ndarray):
        source = source.astype(np.float32)
        target = target.astype(np.float32)
        self.m = cv2.getPerspectiveTransform(source, target)

    def transform_points(self, points: np.ndarray) -> np.ndarray:
        if points.size == 0:
            print("Warning: No points to transform.")
            return np.array([])
        reshaped_points = points.reshape(-1, 1, 2).astype(np.float32)
        transformed_points = cv2.perspectiveTransform(reshaped_points, self.m)
        return transformed_points.reshape(-1, 2)

def parse_arguments() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Speed Estimation using Ultralytics and Supervision"
    )
    parser.add_argument(
        "--source_video_path",
        required=False,
        default="/home/harvestedlabs/Desktop/Codes/39.mp4",
        help="Path to the source video file",
        type=str
    )
    return parser.parse_args()

def main():
    args = parse_arguments()
    print(f"Source video path: {args.source_video_path}")
    
    video_info = sv.VideoInfo.from_video_path(args.source_video_path)
    print(f"Video info: {video_info}")
    
    model = YOLO("/home/harvestedlabs/Desktop/Codes/Likith/token.pt")
    print("YOLO model loaded.")
    
    byte_track = sv.ByteTrack(frame_rate=video_info.fps)
    print("ByteTrack initialized.")
    
    thickness = sv.calculate_optimal_line_thickness(resolution_wh=video_info.resolution_wh)
    text_scale = sv.calculate_optimal_text_scale(resolution_wh=video_info.resolution_wh)
    bounding_box_annotator = sv.BoundingBoxAnnotator(thickness=thickness)
    label_annotator = sv.LabelAnnotator(text_scale=text_scale, text_thickness=thickness)
    
    frame_generator = sv.get_video_frames_generator(args.source_video_path)
    polygon_zone = sv.PolygonZone(SOURCE)
    view_transformer = ViewTransformer(SOURCE, TARGET)
    coordinates = defaultdict(lambda: deque(maxlen=video_info.fps))

    frame_count = 0
    for frame in frame_generator:
        try:
            frame_count += 1
            print(f"Processing frame {frame_count}/{video_info.total_frames}")
            
            # Ensure the frame is valid
            if frame is None:
                print(f"Frame {frame_count} is None, skipping.")
                continue
            
            result = model(frame)
            print("Frame processed by model.")
            
            if not result:
                print(f"No result for frame {frame_count}, skipping.")
                continue
            
            detections = sv.Detections.from_ultralytics(result[0])
            print(f"Detections: {detections}")
            
            detections = detections[polygon_zone.trigger(detections)]
            detections = byte_track.update_with_detections(detections=detections)
            
            points = detections.get_anchors_coordinates(anchor=sv.Position.BOTTOM_CENTER)
            if points.size > 0:
                points = view_transformer.transform_points(points=points)
            else:
                print("No points detected in the frame.")
            
            labels = []
            for tracker_id, [_, y] in zip(detections.tracker_id, points):
                coordinates[tracker_id].append(y)
                if len(coordinates[tracker_id]) < video_info.fps / 2:
                    labels.append(f"#{tracker_id}")
                else:
                    coordinates_start = coordinates[tracker_id][-1]
                    coordinates_stop = coordinates[tracker_id][0]
                    distance = abs(coordinates_start - coordinates_stop)
                    time = len(coordinates[tracker_id]) / video_info.fps
                    speed = (distance / time) * 3.6

                    print(f"Tracker ID: {tracker_id}")
                    print(f"Coordinates Start: {coordinates_start}")
                    print(f"Coordinates Stop: {coordinates_stop}")
                    print(f"Distance: {distance}")
                    print(f"Time: {time}")
                    print(f"Speed: {speed} km/h")

                    labels.append(f"#{tracker_id} {speed:.1f} km/h")

            annotated_frame = frame.copy()
            annotated_frame = bounding_box_annotator.annotate(scene=annotated_frame, detections=detections)
            annotated_frame = sv.draw_polygon(annotated_frame, polygon=SOURCE, color=sv.Color.RED)
            annotated_frame = label_annotator.annotate(scene=annotated_frame, detections=detections, labels=labels)

            cv2.namedWindow("Annotated Frame", cv2.WINDOW_NORMAL)
            cv2.imshow("Annotated Frame", annotated_frame)

            if cv2.waitKey(1) == ord("q"):
                break
        except Exception as e:
            print(f"Error processing frame {frame_count}: {e}")
    
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()

Output

Source video path: /home/harvestedlabs/Desktop/Codes/39.mp4
Video info: VideoInfo(width=3072, height=1080, fps=30, total_frames=261)
YOLO model loaded.
ByteTrack initialized.

Can you help me with this issue or suggest a way forward? @LinasKo @skylargivens @iurisilvio @sberan

Thanks Likith

Originally posted by @likith1908 in https://github.com/roboflow/supervision/discussions/1344#discussioncomment-10025628

likith1908 avatar Jul 11 '24 23:07 likith1908

Confirming the bug where no frames are produced by sv.get_video_frames_generator.

I'll look into it, I have a hunch it's something I've dealt with on a camera I had.

LinasKo avatar Jul 12 '24 08:07 LinasKo

@LinasKo Did you find any solution for this issue?

likith1908 avatar Jul 12 '24 09:07 likith1908

Yes, give me 10 min.

LinasKo avatar Jul 12 '24 09:07 LinasKo

Hi @likith1908,

Try this: pip install git+https://github.com/roboflow/supervision.git@fix/no-frames-generated-for-some-videos The get_video_frames_generator should generate the frames correctly.

Tell me if it does not fix your problem. Let's keep the issue open until we merge the PR.

Meanwhile, may I use your videos in our tests in the future?

LinasKo avatar Jul 12 '24 09:07 LinasKo

We'd appreciate some help from the community! There's some code to run a super-simple test on your videos. If you have the time, please check out my comment in the PR

LinasKo avatar Jul 12 '24 09:07 LinasKo

Hi @LinasKo

The pip install is taking too long, and I don't think I have access to the link you provided (attached image).

Can you check if the repo/folder contents are public?

[Screenshot 2024-07-12 at 3:16:41 PM]

Sure, I'll keep the issue open, and you can use my videos for future tests! NOTE: Please download the video and upload it to your own drive, since I might delete it sometime in the future!

Thanks Likith

likith1908 avatar Jul 12 '24 09:07 likith1908

This is not reachable as a normal browser URL; it's only for pip. It's also expected to take longer, as you are installing a custom branch of supervision.

LinasKo avatar Jul 12 '24 09:07 LinasKo

Ok let me try again and I'll let you know if that works!

Just so I know, roughly how long might it take, @LinasKo?

Thanks Likith

likith1908 avatar Jul 12 '24 09:07 likith1908

It took me 2 minutes with fast internet, but I have many packages cached on my system. I'd leave it for 15 minutes or so the first time.

LinasKo avatar Jul 12 '24 09:07 LinasKo

We'd appreciate some help from the community! There's some code to run a super-simple test on your videos. If you have the time, please check out my comment in the PR

Hi @LinasKo, do you need any help in this? I can help.

Bhavay-2001 avatar Jul 12 '24 18:07 Bhavay-2001

Sure @Bhavay-2001

The majority of the work is done, but it would be useful to test on a range of different videos. Check out the PR - I left the instructions and one example response there

LinasKo avatar Jul 12 '24 18:07 LinasKo

Sure, will check it out and test it.

Bhavay-2001 avatar Jul 12 '24 18:07 Bhavay-2001

@LinasKo It's working now! I am able to see the frames processing!! I'll let you know if there are any other issues

Thanks Likith

likith1908 avatar Jul 12 '24 20:07 likith1908

Fantastic to hear that @likith1908! 🔥 I'm closing this issue.

SkalskiP avatar Jul 17 '24 10:07 SkalskiP

Sure!

likith1908 avatar Jul 17 '24 10:07 likith1908