supervision
Allow mask to be propagated within the box during tracking
Description:
This pull request makes the following changes:
- ByteTrack enhancement: modified the detection-updating functions to retain information about the mask (#754).
- Base code modification: made adjustments to the base code to resolve the issue.
Changes Made:
- The byte_tracker.update_with_detections method now updates the tracker ID of the detections based on the ByteTrack tracker.
Minor Changes Made:
- The box_annotator.annotate method gained a flag (only_tracked) to show only the bounding boxes that have a tracker ID assigned.
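A minimal sketch of what an only_tracked style filter could do, assuming that "no tracker ID" is represented by either None or -1 (the PR text does not say which). The function name indices_to_draw is hypothetical and is not part of supervision.

```python
from typing import List, Optional


def indices_to_draw(
    tracker_ids: List[Optional[int]],
    only_tracked: bool = False,
) -> List[int]:
    """Return the indices of the detections an annotator would draw.

    Assumption: a detection counts as untracked when its id is None or -1.
    """
    if not only_tracked:
        return list(range(len(tracker_ids)))
    return [
        index
        for index, tracker_id in enumerate(tracker_ids)
        if tracker_id is not None and tracker_id != -1
    ]


all_boxes = indices_to_draw([3, None, -1, 8])
tracked_boxes = indices_to_draw([3, None, -1, 8], only_tracked=True)
```

With the flag off every box is drawn; with it on, only the first and last detections remain.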
Testing:
Colab: LinkToNotebook
Successfully tested the code with both object-detection and segmentation models using the following testing code:
import supervision as sv
from tqdm import tqdm
from ultralytics import YOLO

VIDEO_ROOT = "./walking_people.mp4"


class CustomDetections(sv.Detections):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.class_name: list = []


if __name__ == "__main__":
    # Segmentation model; swap in the commented line to test plain detection.
    model = YOLO("yolov8x-seg.pt").cuda()
    # model = YOLO("yolov8x.pt").cuda()

    mask_annotator = sv.MaskAnnotator()
    box_annotator = sv.BoxAnnotator(thickness=1,
                                    text_scale=0.2,
                                    text_padding=4)
    track_annotator = sv.TraceAnnotator()

    video_info = sv.VideoInfo.from_video_path(video_path=VIDEO_ROOT)
    frame_generator = sv.get_video_frames_generator(source_path=VIDEO_ROOT)
    byte_tracker = sv.ByteTrack()

    with sv.VideoSink("test_box_and_mask_tracking.mp4", video_info=video_info) as sink:
        for frame in tqdm(frame_generator, total=video_info.total_frames):
            result = model.track(frame, imgsz=640, verbose=False)[0]
            detections = CustomDetections.from_ultralytics(result)
            # Tracker IDs are assigned here; masks should survive this call.
            detections = byte_tracker.update_with_detections(detections)
            annotated_frame = box_annotator.annotate(
                scene=frame.copy(),
                detections=detections)
            annotated_frame = track_annotator.annotate(
                scene=annotated_frame.copy(),
                detections=detections)
            annotated_frame = mask_annotator.annotate(
                scene=annotated_frame.copy(),
                detections=detections)
            sink.write_frame(frame=annotated_frame)
Hi @AntonioConsiglio 👋🏻 I am not sure if my instructions are unclear, so I decided to prepare a Colab for you illustrating what we need.
As far as I know, this is all we need 👇🏻 Am I missing something?
from typing import List

import numpy as np
import supervision as sv
# Import path assumed from the supervision version discussed here; it may differ in other releases.
from supervision.tracker.byte_tracker.core import STrack, detections2boxes


def tracks2boxes(tracks: List[STrack]) -> np.ndarray:
    return np.array([
        track.tlbr
        for track
        in tracks
    ], dtype=float)


def match_detections_with_tracks(
    detections: sv.Detections,
    tracks: List[STrack]
) -> np.ndarray:
    if not np.any(detections.xyxy) or len(tracks) == 0:
        return np.empty(0)

    tracks_boxes = tracks2boxes(tracks=tracks)
    iou = sv.box_iou_batch(tracks_boxes, detections.xyxy)
    # For every track, pick the detection with the highest IoU.
    track2detection = np.argmax(iou, axis=1)

    tracker_ids = [-1] * len(detections)
    for tracker_index, detection_index in enumerate(track2detection):
        if iou[tracker_index, detection_index] != 0:
            tracker_ids[detection_index] = tracks[tracker_index].track_id
    return np.array(tracker_ids)


def update_with_detections(tracker: sv.ByteTrack, detections: sv.Detections, keep_all: bool = False) -> sv.Detections:
    tensors = detections2boxes(detections=detections)
    # Use the tracker passed in as an argument rather than a global byte_tracker.
    tracks = tracker.update_with_tensors(tensors=tensors)
    detections.tracker_id = match_detections_with_tracks(detections=detections, tracks=tracks)
    if not keep_all:
        detections = detections[detections.tracker_id != -1]
    return detections
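To make the matching step above concrete, here is a dependency-free sketch of the same idea on three hand-made boxes: each track takes the detection it overlaps most, and detections with no overlapping track keep -1. The names box_iou and match_ids are illustrative, not supervision functions, and unlike the snippet above this version always returns one ID per detection, even when there are no tracks.

```python
from typing import List, Sequence

Box = Sequence[float]  # (x_min, y_min, x_max, y_max)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_ids(
    track_boxes: List[Box],
    track_ids: List[int],
    detection_boxes: List[Box],
) -> List[int]:
    """Give each track's id to the detection it overlaps most (argmax over IoU)."""
    tracker_ids = [-1] * len(detection_boxes)
    if not detection_boxes:
        return tracker_ids
    for box, track_id in zip(track_boxes, track_ids):
        ious = [box_iou(box, det) for det in detection_boxes]
        best = max(range(len(ious)), key=ious.__getitem__)
        if ious[best] != 0:  # a track with no overlap assigns nothing
            tracker_ids[best] = track_id
    return tracker_ids


track_boxes = [(0, 0, 10, 10), (50, 50, 60, 60)]
track_ids = [7, 9]
detection_boxes = [(51, 51, 61, 61), (100, 100, 110, 110), (1, 1, 11, 11)]

tracker_ids = match_ids(track_boxes, track_ids, detection_boxes)
# Equivalent of detections[detections.tracker_id != -1]
kept = [i for i, tid in enumerate(tracker_ids) if tid != -1]
```

Because the IDs are written back by detection index, any per-detection data (such as a mask) stays aligned with its ID.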
Hi @SkalskiP, this work is already done in the function I've modified. ByteTrack already does the matching using the IoU between the tracklets and the new detections. In my opinion it doesn't make sense to do it a second time after the new tracklets have been created (which, in your code, are the output of the tracker).
Just to be clear, this is the ByteTrack pipeline with my change:
- Input: new detections
- Get tensors from the detections
- Split the detections into high-score and second (lower-score) detections
- Convert the detections into new tracklets (STrack objects)
- Match the old tracklets (tracked and lost ones) against the new STracks (which are the new detections)
- Assign the tracker ID to the detections based on the previous step
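The steps above can be sketched in plain Python as follows. This is a simplified illustration, not the ByteTrack implementation: it uses greedy IoU matching as a stand-in for linear assignment, and it omits Kalman prediction, creation of new tracks, and lost-track handling. The names greedy_match and assign_tracker_ids and the thresholds are assumptions chosen for the example.

```python
from typing import Dict, List, Sequence, Tuple

Box = Sequence[float]  # (x_min, y_min, x_max, y_max)


def box_iou(a: Box, b: Box) -> float:
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], detections: Dict[int, Box], min_iou: float
) -> List[Tuple[int, int]]:
    """Pair tracks and detections, best IoU first (stand-in for linear assignment)."""
    pairs = sorted(
        (
            (box_iou(t_box, d_box), track_id, det_index)
            for track_id, t_box in tracks.items()
            for det_index, d_box in detections.items()
        ),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for iou, track_id, det_index in pairs:
        if iou < min_iou or track_id in used_tracks or det_index in used_dets:
            continue
        used_tracks.add(track_id)
        used_dets.add(det_index)
        matches.append((track_id, det_index))
    return matches


def assign_tracker_ids(
    tracks: Dict[int, Box],
    boxes: List[Box],
    scores: List[float],
    high_thresh: float = 0.5,
    min_iou: float = 0.3,
) -> List[int]:
    """Return one tracker id per detection (-1 when unmatched)."""
    tracker_ids = [-1] * len(boxes)
    # Split detections by score, keeping each detection's original index.
    high = {i: b for i, b in enumerate(boxes) if scores[i] >= high_thresh}
    low = {i: b for i, b in enumerate(boxes) if scores[i] < high_thresh}
    remaining = dict(tracks)
    # First association on high-score detections, second on the rest.
    for group in (high, low):
        for track_id, det_index in greedy_match(remaining, group, min_iou):
            tracker_ids[det_index] = track_id
            del remaining[track_id]
    return tracker_ids


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
boxes = [(21, 21, 31, 31), (0, 0, 9, 9), (80, 80, 90, 90)]
scores = [0.9, 0.2, 0.95]
ids = assign_tracker_ids(tracks, boxes, scores)
```

Since every detection carries its index through both association rounds, the ID can be written straight back to the detection without a second IoU pass after tracking.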
In addition, I can filter the detections as you have done in these lines:
if not keep_all:
    detections = detections[detections.tracker_id != -1]
Why do you want to split the process? Basically, instead of this step:
- Assign the tracker ID to the detections based on the previous step
you want to perform another IoU matching to assign IDs from these new tracklets to the detections, but that is already done.
Why do you want to split the process?
Most importantly, because you changed the API. The update_with_tensors method is a public API and must stay unchanged.
OK, now your point of view is clear to me. The question is: does it make sense to have the update_with_tensors API public?
@SkalskiP what do you think about leaving the update_with_tensors API as it is (I didn't realize I had deleted the method) and keeping the proposed solution with update_detections (which does the update_with_tensors processing) as a private method?
In this case we will not redo the IoU matching and will still get the result we want.
Basically, people who were using the update_with_tensors API will not be affected, nor will people who are using update_with_detections. Ideally, whoever is using update_with_detections is not also using update_with_tensors (I hope, because it wouldn't make sense), or am I wrong?
The question is: does it make sense to have the update_with_tensors API public?
update_with_tensors is public as it is used internally by the Roboflow team.
what do you think about leaving the update_with_tensors API as it is
I would need to see a specific solution.
@SkalskiP I've pushed the solution.
Hi @AntonioConsiglio 👋🏻 As usual, thanks for all the time and effort you put into making this PR.
This PR significantly increases the complexity of the ByteTrack code. We added about 300 lines of code (most of them copied). In your solution, update_with_detections no longer uses update_with_tensors. If you want to preserve this architecture, then we need to extract the copied code into a separate method and use it in both update_with_tensors and update_with_detections.
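One way to picture the refactor requested above is a toy class in which both public methods delegate to a single private helper, so no logic is copied. This is only a sketch: ToyTracker, _associate, the dict-based detections, and the return types are all hypothetical and do not mirror supervision's ByteTrack, and the "tracking" here is a placeholder that simply hands out a fresh ID to every confident row.

```python
from typing import Dict, List, Tuple

Row = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


class ToyTracker:
    """Hypothetical tracker with two public entry points and one shared core."""

    def __init__(self) -> None:
        self._next_id = 1

    def _associate(self, rows: List[Row]) -> List[int]:
        """Shared core: one id per input row, -1 when the row gets no track.

        Placeholder logic: every row with score >= 0.5 receives a new id.
        """
        ids = []
        for *_, score in rows:
            if score >= 0.5:
                ids.append(self._next_id)
                self._next_id += 1
            else:
                ids.append(-1)
        return ids

    def update_with_tensors(self, rows: List[Row]) -> List[Tuple[int, Row]]:
        """Public method whose signature stays as it was; only its body delegates."""
        ids = self._associate(rows)
        return [(tid, row) for tid, row in zip(ids, rows) if tid != -1]

    def update_with_detections(self, detections: List[Dict]) -> List[Dict]:
        """Attach ids to the original detections, so extra fields (e.g. mask) survive."""
        rows = [(*d["xyxy"], d["confidence"]) for d in detections]
        ids = self._associate(rows)
        return [
            {**d, "tracker_id": tid}
            for d, tid in zip(detections, ids)
            if tid != -1
        ]


tracker = ToyTracker()
detections = [
    {"xyxy": (0, 0, 10, 10), "confidence": 0.9, "mask": "mask-a"},
    {"xyxy": (5, 5, 8, 8), "confidence": 0.1, "mask": "mask-b"},
]
tracked = tracker.update_with_detections(detections)
rows = tracker.update_with_tensors([(0, 0, 10, 10, 0.9)])
```

The point of the design is that callers of either public method see no change, while the association logic lives in exactly one place.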
@SkalskiP have you checked the proposed solution?
Hi @AntonioConsiglio 👋🏻 I'm closing this issue as the solution is now implemented via https://github.com/roboflow/supervision/pull/1035. Thanks a lot for all the effort @AntonioConsiglio 🙏🏻