natural-language-youtube-search

Strange CLIP results

Open • smithee77 opened this issue 3 years ago • 1 comment

Hi, first of all, many thanks for this wonderful script :) I've been trying some searches and found some strange results. It's probably just how CLIP works, but I'm not sure.

If I search for "CAR" (there are a lot of cars in the video) and look at the similarity value of the best-matching frame, I get, e.g., 26.65. Then I search for something nonsensical like "sdfsdflksdfj" and check the same value. I was expecting a near-zero value, but instead I get, e.g., 21.55.
Is this a bug, or is this just the way CLIP works? Is there a way to tell how good the prediction is? Many thanks!

smithee77 · Apr 11 '21 17:04

I think this is how CLIP works. I've observed similar behavior: you don't really know when CLIP doesn't know :)

I guess the reason is that CLIP was not trained for that, so the scores may not be easy to interpret, beyond saying which one is higher (= better match).

haltakov · Apr 11 '21 20:04
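
Since the scores are mainly useful for comparison, one possible workaround is to compare the query score against the score of a deliberately meaningless prompt for the same frame, and turn the pair into a relative probability with a softmax. The sketch below is only an illustration, not part of the script: the helper names `softmax` and `relative_confidence` are hypothetical, and it assumes the reported values are scaled cosine similarities (logits), which is how CLIP scores are commonly computed. The two numbers are the ones from the question.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relative_confidence(query_score, baseline_scores):
    """Probability of the query relative to one or more baseline prompts.

    Hypothetical helper: this is a relative measure only, not a calibrated
    estimate of whether the query really appears in the frame.
    """
    return softmax([query_score] + list(baseline_scores))[0]


# Best-frame scores reported in the question (assumed to be CLIP logits).
car_score = 26.65       # query: "CAR"
nonsense_score = 21.55  # query: "sdfsdflksdfj"

confidence = relative_confidence(car_score, [nonsense_score])
print(f"{confidence:.3f}")  # prints 0.994
```

Read this way, a gap of about 5 points between 26.65 and 21.55 is large, even though neither value is close to zero. The result still depends on which baseline prompts are chosen, so it only says how much better one prompt matches than another.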