Allow mask to be propagated within the box during tracking

Open AntonioConsiglio opened this issue 5 months ago • 8 comments

Description:

This pull request addresses the following changes:

  1. ByteTrack Enhancement:

    • Modified the Detection updating functions to retain information about the mask (#754).

  2. Base Code Modification:

    • Made adjustments to the base code to resolve the issue.

Changes Made:

  • The byte_tracker.update_with_detections method updates the tracker ID of the detections based on the ByteTrack tracker.

Minor Changes Made:

  • The box_annotator.annotate method gains a flag (only_tracked) to show only the bounding boxes that have a tracker ID assigned.

Testing:

Colab: LinkToNotebook

Successfully tested the code with both object-detection and segmentation models using the following testing code:

import supervision as sv
from tqdm import tqdm
from ultralytics import YOLO

VIDEO_ROOT = "./walking_people.mp4"

class CustomDetections(sv.Detections):
    def __init__(self,*args,**kwargs):
        super().__init__(*args,**kwargs)
        self.class_name:list = []


if __name__ == "__main__":

    model = YOLO("yolov8x-seg.pt").cuda()  
    #model = YOLO("yolov8x.pt").cuda() 

    mask_annotator = sv.MaskAnnotator()
    box_annotator = sv.BoxAnnotator(thickness=1,
                                text_scale=0.2,
                                text_padding=4)
    track_annotator = sv.TraceAnnotator()

    video_info = sv.VideoInfo.from_video_path(video_path=VIDEO_ROOT)
    frame_generator = sv.get_video_frames_generator(source_path=VIDEO_ROOT)
    byte_tracker = sv.ByteTrack()

    with sv.VideoSink("test_box_and_mask_tracking.mp4", video_info=video_info) as sink:

        for frame in tqdm(frame_generator, total=video_info.total_frames):

            result = model.track(frame, imgsz=640, verbose=False)[0]
            detections = CustomDetections.from_ultralytics(result)
            detections = byte_tracker.update_with_detections(detections)

            annotated_frame = box_annotator.annotate(
                scene=frame.copy(),
                detections=detections)
            annotated_frame = track_annotator.annotate(
                scene=annotated_frame.copy(),
                detections=detections)
            annotated_frame = mask_annotator.annotate(
                scene=annotated_frame.copy(),
                detections=detections)
           
            sink.write_frame(frame=annotated_frame)

AntonioConsiglio avatar Feb 05 '24 01:02 AntonioConsiglio

Hi @AntonioConsiglio 👋🏻 I am not sure if my instructions are unclear, so I decided to prepare a Colab for you illustrating what we need.

As far as I know, this is all we need 👇🏻 Am I missing something?

from typing import List

import numpy as np
import supervision as sv
# Assumed import path for the tracker internals at the time of this thread.
from supervision.tracker.byte_tracker.core import STrack, detections2boxes


def tracks2boxes(tracks: List[STrack]) -> np.ndarray:
    # Stack each track's (x1, y1, x2, y2) box into an (N, 4) array.
    return np.array([
        track.tlbr
        for track
        in tracks
    ], dtype=float)


def match_detections_with_tracks(
    detections: sv.Detections,
    tracks: List[STrack]
) -> np.ndarray:
    # Nothing to match: every detection stays untracked (-1).
    if not np.any(detections.xyxy) or len(tracks) == 0:
        return np.full(len(detections), -1, dtype=int)

    tracks_boxes = tracks2boxes(tracks=tracks)
    iou = sv.box_iou_batch(tracks_boxes, detections.xyxy)
    # For each track, the index of the detection it overlaps most.
    track2detection = np.argmax(iou, axis=1)

    tracker_ids = [-1] * len(detections)

    for tracker_index, detection_index in enumerate(track2detection):
        if iou[tracker_index, detection_index] != 0:
            tracker_ids[detection_index] = tracks[tracker_index].track_id

    return np.array(tracker_ids)


def update_with_detections(tracker: sv.ByteTrack, detections: sv.Detections, keep_all: bool = False) -> sv.Detections:
    tensors = detections2boxes(detections=detections)
    # Use the tracker passed in, not a global byte_tracker.
    tracks = tracker.update_with_tensors(tensors=tensors)
    detections.tracker_id = match_detections_with_tracks(detections=detections, tracks=tracks)

    if not keep_all:
        detections = detections[detections.tracker_id != -1]

    return detections
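
To make the matching step concrete, here is a small dependency-free sketch of the same idea: for each track, pick the detection with the highest IoU and copy the track ID over, leaving unmatched detections at -1. The helper names box_iou and assign_tracker_ids are hypothetical and are not part of supervision; boxes are plain (x1, y1, x2, y2) tuples and the numbers are made up for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_tracker_ids(
    detection_boxes: Sequence[Box],
    track_boxes: Sequence[Box],
    track_ids: Sequence[int],
) -> List[int]:
    """Give each detection the ID of the track that overlaps it best, else -1."""
    tracker_ids = [-1] * len(detection_boxes)
    if not detection_boxes:
        return tracker_ids
    for track_box, track_id in zip(track_boxes, track_ids):
        ious = [box_iou(track_box, det_box) for det_box in detection_boxes]
        best = max(range(len(ious)), key=ious.__getitem__)
        if ious[best] > 0:  # ignore tracks that overlap no detection
            tracker_ids[best] = track_id
    return tracker_ids


detections = [(0, 0, 10, 10), (50, 50, 60, 60), (100, 100, 110, 110)]
tracks = [(51, 51, 61, 61), (1, 1, 11, 11)]
ids = assign_tracker_ids(detections, tracks, track_ids=[7, 3])
print(ids)  # [3, 7, -1]
```

The third detection overlaps no track, so it keeps -1 and would be dropped when keep_all is False.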

SkalskiP avatar Feb 05 '24 23:02 SkalskiP

Hi @SkalskiP, this work is already done in the function I've modified. ByteTrack already performs the matching, using the IoU between the tracklets and the new detections. In my opinion, it doesn't make sense to do it a second time after the new tracklets are created (which, in your code, are the output of the tracker).

Just to be clear, this is the ByteTrack pipeline with my change:

  • Input: new detections.
  • Get tensors from the detections.
  • Split the detections into high-score and second (lower-score) detections.
  • Convert the detections into new tracklets (STrack objects).
  • Match the old tracklets (tracked and lost ones) against the new STracks (which are the new detections).
  • Assign tracker IDs to the detections based on the previous step.

In addition, I can filter the detections as you have done in these lines:

if not keep_all:
    detections = detections[detections.tracker_id != -1]

Why do you want to split the process? Instead of the step "Assign tracker IDs to the detections based on the previous step", you want to perform another IoU matching between the new tracklets and the detections to assign the IDs, but that matching has already been done.
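
A minimal sketch of the bookkeeping described above, under stated assumptions: the helper names split_by_score and propagate_ids, the threshold values, and the matches are all hypothetical, and the association step itself is replaced by a hard-coded result. It only illustrates that keeping the detection index through the pipeline lets the tracker ID, and therefore the mask, stay aligned with the original detections without a second IoU pass.

```python
from typing import List, Tuple


def split_by_score(
    scores: List[float],
    high_thresh: float = 0.5,  # assumed value, for illustration only
    low_thresh: float = 0.1,   # assumed value, for illustration only
) -> Tuple[List[int], List[int]]:
    """Return indices of high-score and second (lower-score) detections."""
    high = [i for i, s in enumerate(scores) if s > high_thresh]
    second = [i for i, s in enumerate(scores) if low_thresh < s <= high_thresh]
    return high, second


def propagate_ids(
    num_detections: int, matches: List[Tuple[int, int]]
) -> List[int]:
    """Turn (detection_index, track_id) matches into one ID per detection."""
    tracker_ids = [-1] * num_detections
    for detection_index, track_id in matches:
        tracker_ids[detection_index] = track_id
    return tracker_ids


scores = [0.9, 0.05, 0.3, 0.7]
masks = ["mask_0", "mask_1", "mask_2", "mask_3"]  # placeholders for real masks

high, second = split_by_score(scores)
# Pretend the association step matched detections 0 and 2 to tracks 11 and 12.
tracker_ids = propagate_ids(len(scores), matches=[(0, 11), (2, 12)])

# Same filter as `detections[detections.tracker_id != -1]`, on plain lists.
kept = [i for i, tid in enumerate(tracker_ids) if tid != -1]
kept_masks = [masks[i] for i in kept]
print(high, second, tracker_ids, kept_masks)
```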

AntonioConsiglio avatar Feb 06 '24 07:02 AntonioConsiglio

Why do you want to split the process?

Most importantly, because you changed the API. The update_with_tensors method is a public API and must stay unchanged.

SkalskiP avatar Feb 06 '24 09:02 SkalskiP

update_with_tensors

OK, now your point of view is clear to me. The question is, does it make sense to have the update_with_tensors API public?

@SkalskiP what do you think about leaving the update_with_tensors API as it is (I didn't realize I had deleted the method) and keeping the proposed solution, with update_detections (which performs the update_with_tensors process) as a private method?

In this case we will not redo the IoU matching and will still get the result we want.

Basically, people who were using the update_with_tensors API will not be affected, and neither will people who are using update_with_detections. Ideally, whoever is using update_with_detections is not also using update_with_tensors (I hope, because it wouldn't make sense), or am I wrong?

AntonioConsiglio avatar Feb 06 '24 10:02 AntonioConsiglio

The question is, does it make sense to have the update_with_tensors API public?

update_with_tensors is public because it is used internally by the Roboflow team.

what do you think about leaving the update_with_tensors API as it is

I would need to see a specific solution.

SkalskiP avatar Feb 06 '24 11:02 SkalskiP

@SkalskiP I've pushed the solution.

AntonioConsiglio avatar Feb 06 '24 14:02 AntonioConsiglio

Hi @AntonioConsiglio 👋🏻 As usual, thanks for all the time and effort you put into making this PR.

This PR significantly increases the complexity of the ByteTrack code. It adds about 300 lines of code, most of them copied. In your solution, update_with_detections no longer uses update_with_tensors. If you want to preserve this architecture, then we need to extract the copied code into a separate method and use it in both update_with_tensors and update_with_detections.
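
One possible shape for that refactor, shown as a toy sketch and not as the actual ByteTrack code: a single private method does the association once and returns (detection_index, track_id) pairs, and both public methods are thin wrappers around it. ToyTracker, _update, _close, and the tolerance-based association are all hypothetical stand-ins chosen to keep the example short.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def _close(a: Box, b: Box, tol: float = 5.0) -> bool:
    """Toy association rule: every coordinate moved by at most `tol`."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


class ToyTracker:
    """Stand-in for a tracker; NOT the real sv.ByteTrack implementation."""

    def __init__(self) -> None:
        self._next_id = 1
        self._tracks: Dict[int, Box] = {}

    def _update(self, boxes: List[Box]) -> List[Tuple[int, int]]:
        """Shared core: associate once, return (detection_index, track_id)."""
        pairs: List[Tuple[int, int]] = []
        new_tracks: Dict[int, Box] = {}
        for det_index, box in enumerate(boxes):
            # Reuse an existing, not yet claimed track that is close enough.
            track_id = next(
                (tid for tid, prev in self._tracks.items()
                 if tid not in new_tracks and _close(prev, box)),
                None,
            )
            if track_id is None:  # no match: start a new track
                track_id = self._next_id
                self._next_id += 1
            new_tracks[track_id] = box
            pairs.append((det_index, track_id))
        self._tracks = new_tracks
        return pairs

    def update_with_tensors(self, boxes: List[Box]) -> List[Tuple[int, Box]]:
        """Public method with an unchanged contract: returns the tracks."""
        return [(tid, boxes[i]) for i, tid in self._update(boxes)]

    def update_with_detections(self, boxes: List[Box]) -> List[int]:
        """One tracker ID per input box, with no second matching pass."""
        tracker_ids = [-1] * len(boxes)
        for i, tid in self._update(boxes):
            tracker_ids[i] = tid
        return tracker_ids


tracker = ToyTracker()
first = tracker.update_with_detections([(0, 0, 10, 10), (50, 50, 60, 60)])
second = tracker.update_with_tensors([(52, 51, 62, 61), (1, 0, 11, 10)])
print(first)                        # [1, 2]
print([tid for tid, _ in second])   # [2, 1]
```

With this layout the association logic lives in one place, so neither public method needs a copy of it.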

SkalskiP avatar Feb 08 '24 12:02 SkalskiP

@SkalskiP have you checked the proposed solution?

AntonioConsiglio avatar Feb 13 '24 13:02 AntonioConsiglio

Hi @AntonioConsiglio 👋🏻 I'm closing this issue as the solution is now implemented via https://github.com/roboflow/supervision/pull/1035. Thanks a lot for all the effort @AntonioConsiglio 🙏🏻

SkalskiP avatar Mar 25 '24 15:03 SkalskiP