Using box prompt for video
Greetings,
Thank you for open sourcing this project. I noticed that the video example did not contain a sample on how to add box prompts. From what I can tell, the Sam3VideoPredictor allows for this type of input but I seem to be having difficulty getting it to work.
For instance, at the end of the "Removing objects" section of the video example notebook, I added this code:
    obj_id = 2
    boxes = [out["out_boxes_xywh"][2].tolist()]
    response = predictor.handle_request(
        request={
            "type": "add_prompt",
            "session_id": session_id,
            "frame_index": frame_idx,
            "bounding_boxes": boxes,
            "bounding_box_labels": [1],
        }
    )
The variable out comes from the "Video promptable concept segmentation with text" section. It is the predictor's response to adding a text prompt, so I pulled one of its boxes out as an example input for a box prompt.
The resulting visualization, though, does not seem to have the associated box. Would you mind helping me see where I am going wrong? Thanks!
Hi, have you solved this problem?
I have not yet solved it.
The issue is that the hotstart heuristic kills the track when you provide a box prompt on a single frame. A text prompt is global in the sense that it is automatically run on every frame, i.e. there will be raw detections on most frames to be matched to existing tracks. However, with a box prompt on a single frame, there won't be any raw detections in the other frames for matching. The hotstart heuristic notices the track goes unmatched for a long time and deletes it. A workaround which worked for me is to disable the heuristic altogether, e.g. by setting predictor.model.hotstart_delay to 0.
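A minimal sketch of that workaround, combined with the request from the original post. The helper name add_box_prompt and its arguments are made up for illustration; the only things taken from this thread are the predictor.model.hotstart_delay attribute and the request keys. It assumes predictor is an already constructed video predictor with an open session.

```python
def add_box_prompt(predictor, session_id, frame_index, boxes_xywh, labels=None):
    """Turn off the hotstart heuristic, then send a box prompt for one frame.

    `add_box_prompt` is a hypothetical helper, not part of the library.
    """
    # Workaround from this thread: a delay of 0 disables the heuristic,
    # so the track is not deleted for going unmatched on other frames.
    predictor.model.hotstart_delay = 0

    # Default to one label of 1 per box, as in the original post.
    if labels is None:
        labels = [1] * len(boxes_xywh)

    # Same request shape as in the original post.
    request = {
        "type": "add_prompt",
        "session_id": session_id,
        "frame_index": frame_index,
        "bounding_boxes": boxes_xywh,
        "bounding_box_labels": labels,
    }
    return predictor.handle_request(request=request)
```

With the variables from the notebook, this would be called as response = add_box_prompt(predictor, session_id, frame_idx, boxes). Note that the attribute is changed on the model itself, so it presumably stays disabled for later requests unless it is restored.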
Thanks! That didn't seem to work for me unfortunately but I'll keep looking at it.
Thanks @bolgarbe, setting predictor.model.hotstart_delay = 0 works.
I wanted to add that I was running into the same problem and can confirm that setting predictor.model.hotstart_delay = 0 worked.
It works for me too. Thanks @bolgarbe