ReferFormer
Any ideas about modifying the model to detect multiple objects described by one query?
Hi,
I want to extend this model to handle the following situation:
Given one text query, e.g. "a person skateboarding", I want to search the video clip for all objects that match it. For example, suppose person 1 is skateboarding from frame 5 to frame 15 and person 2 is skateboarding from frame 7 to frame 24. The output should then show person 1 alone in frames 5 to 6, persons 1 and 2 in frames 7 to 15, and person 2 alone in frames 16 to 24.
Currently, inference_ytvos.py uses
max_scores, _ = pred_scores.max(-1) # [q,]
_, max_ind = max_scores.max(-1) # [1,]
to keep only the single highest-scoring query. How should these lines and the related files be modified to keep every matching object instead?
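One direction I have considered is replacing the argmax with a per-frame score threshold, so that every query scoring above the threshold in a given frame is kept. Below is a rough sketch of that idea only, not the repository's code. It assumes per-frame, per-query sigmoid scores of shape [t, q] are available before any averaging over time; the function name `active_frames_per_query` and the parameter `score_thresh` are made up for illustration. It uses NumPy so that it runs on its own, and the same operations should carry over to torch tensors. Whether the queries actually separate different instances depends on how the model was trained, which this sketch does not address.

```python
import numpy as np

def active_frames_per_query(frame_scores, score_thresh=0.5):
    """Map each query index to the frames in which its score passes the threshold.

    frame_scores: array-like of shape [t, q] (hypothetical input: one score
    per frame and per query, e.g. sigmoid outputs).
    """
    frame_scores = np.asarray(frame_scores, dtype=float)
    active = frame_scores > score_thresh          # [t, q] boolean mask
    result = {}
    # Visit only queries that are active in at least one frame.
    for q in np.nonzero(active.any(axis=0))[0]:
        result[int(q)] = np.nonzero(active[:, q])[0].tolist()
    return result

# Toy scores: 6 frames, 3 queries (made-up numbers, not model output).
scores = np.array([
    [0.9, 0.1, 0.2],
    [0.8, 0.2, 0.1],
    [0.9, 0.7, 0.1],
    [0.7, 0.8, 0.2],
    [0.2, 0.9, 0.1],
    [0.1, 0.9, 0.3],
])
print(active_frames_per_query(scores))
# {0: [0, 1, 2, 3], 1: [2, 3, 4, 5]}
```

With the toy scores, query 0 is kept for frames 0 to 3 and query 1 for frames 2 to 5, so both appear together in frames 2 and 3. The masks of the kept queries could then be saved per frame instead of the single mask at max_ind.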
Any ideas?
Thank you in advance!