How's the performance on RefCOCO?

Open yxchng opened this issue 2 years ago • 2 comments

yxchng avatar Apr 07 '23 03:04 yxchng

It is taking a while to run; I'll probably check the results sometime over the weekend.

The initial results of this approach are fairly poor. I think this is because many of the RefCOCO text prompts involve spatial relations like "the man to the left of the ...". CLIP does not have the ability to contextualize local regions within an image.
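To illustrate the point about spatial relations, here is a minimal sketch in plain Python. It assumes each mask is cropped to its bounding box before being scored, which may not match the actual pipeline. The helper names (`bounding_box`, `crop_to_mask`, `leftmost`) and the toy list-of-lists image are hypothetical and not part of any library. Two identical objects at different positions produce identical crops, so a scorer that only sees the crop has no way to tell "left" from "right". That information has to come from the mask geometry instead.

```python
from typing import List, Tuple

Grid = List[List[int]]  # toy stand-in for an image or a binary mask (assumption)


def bounding_box(mask: Grid) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right), inclusive, of the nonzero mask pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask is empty")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1], cols[-1]


def crop_to_mask(image: Grid, mask: Grid) -> Grid:
    """Crop the image to the mask's bounding box, zeroing pixels outside the mask."""
    top, left, bottom, right = bounding_box(mask)
    return [
        [image[r][c] if mask[r][c] else 0 for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


def leftmost(masks: List[Grid]) -> int:
    """Return the index of the mask whose bounding-box center is furthest left."""
    def center_x(mask: Grid) -> float:
        _, left, _, right = bounding_box(mask)
        return (left + right) / 2

    return min(range(len(masks)), key=lambda i: center_x(masks[i]))


# Toy scene: two identical 2x2 objects, one on the left and one on the right.
image = [
    [7, 7, 0, 0, 7, 7],
    [7, 7, 0, 0, 7, 7],
]
left_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
right_mask = [
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]

left_crop = crop_to_mask(image, left_mask)
right_crop = crop_to_mask(image, right_mask)

# The crops are pixel-for-pixel identical, so position is lost after cropping.
crops_identical = left_crop == right_crop

# The position is still recoverable from the masks themselves.
leftmost_index = leftmost([right_mask, left_mask])
```

One possible workaround, again only a sketch, would be to use the crop score to decide what the object is and a geometric rule such as `leftmost` to resolve the spatial part of the prompt.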

helblazer811 avatar Apr 07 '23 03:04 helblazer811

Hello, I also use the CLIP model to classify the masks from SAM, and I also find that the performance is poor. Increasing the input image size of the CLIP model may improve the recognition accuracy for each mask.

PengtaoJiang avatar Apr 10 '23 02:04 PengtaoJiang