rvos
Zero-shot model results: 10 highly redundant instances
When I run the zero-shot pre-trained model on my own videos, it outputs 10 instances that are highly redundant (usually heavily overlapping on a single object). How can I choose the best object result?
In that case, the reason could be that the model is not able to find any other relevant instance in the image. I would choose the first mask given by the decoder.
Best regards,
Carles
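
A minimal sketch of that selection, not taken from the rvos code: it assumes the decoder outputs have already been binarized into a boolean array of shape `(N, H, W)` in decoder order. The names `mask_iou`, `select_masks` and `iou_threshold` are hypothetical. Beyond simply taking the first mask, it also drops later masks that overlap an already kept one, which is a greedy IoU filter added here for illustration.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0


def select_masks(masks, iou_threshold=0.5):
    """Return indices of masks to keep, visiting them in decoder order.

    Empty masks are skipped, and a mask is dropped when its IoU with any
    already kept mask reaches iou_threshold (an assumed, tunable value).
    """
    kept = []
    for idx, mask in enumerate(masks):
        if not mask.any():
            continue
        if all(mask_iou(mask, masks[k]) < iou_threshold for k in kept):
            kept.append(idx)
    return kept


# Toy data standing in for binarized decoder outputs (assumption).
masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, 0:2, 0:2] = True  # first mask on an object
masks[1, 0:2, 0:3] = True  # mostly overlaps the first mask
masks[2, 2:4, 2:4] = True  # a separate region

kept = select_masks(masks)
first_mask = masks[kept[0]]  # the first mask given by the decoder
```

If only one object is of interest, `first_mask` alone follows the suggestion above; the overlap filter only matters when several distinct objects may be present.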