
Multi-object caption has negative effect on detection results.

Open • hotelll opened this issue 1 year ago • 1 comment

I am using GroundingDINO to detect objects in an image. However, I found that an object is detected with the caption "ping pong." but is missed with the caption "man. ping pong.", even after lowering the box threshold. The results are as follows:

  1. caption: "ping pong" box_threshold=0.3 image

  2. caption: "man. ping pong." box_threshold=0.3 image

  3. caption: "man. ping pong." box_threshold=0.2 image

Why does this happen, and how can I solve or mitigate it? Thanks!
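One way to narrow this down is to compare the combined caption against one query per phrase. The snippet below is only a minimal sketch of that comparison, not a confirmed fix. `Detector` is a hypothetical callable standing in for a wrapper around your existing GroundingDINO inference call; its signature, the `Detection` tuple layout, and the helper names are all assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple

# Assumed layout: (box, score, phrase). The box format is an assumption.
Detection = Tuple[Tuple[float, float, float, float], float, str]
# Hypothetical detector signature: (caption, box_threshold) -> detections.
Detector = Callable[[str, float], List[Detection]]


def split_caption(caption: str) -> List[str]:
    """Split a '.'-separated caption such as 'man. ping pong.' into phrases."""
    return [p.strip() for p in caption.split(".") if p.strip()]


def detect_per_phrase(
    detect: Detector, caption: str, box_threshold: float
) -> List[Detection]:
    """Run the detector once per phrase and concatenate the results."""
    results: List[Detection] = []
    for phrase in split_caption(caption):
        # Each phrase is sent as its own single-object caption.
        results.extend(detect(phrase + ".", box_threshold))
    return results


def phrases_found(detections: List[Detection]) -> Set[str]:
    """Collect the distinct phrases that produced at least one box."""
    return {phrase for _, _, phrase in detections}
```

Comparing `phrases_found(detect(caption, t))` with `phrases_found(detect_per_phrase(detect, caption, t))` shows whether a phrase disappears only when it shares a caption with others. Querying per phrase costs one forward pass per phrase, and boxes from different passes may overlap, so the merged output may still need de-duplication.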

hotelll • May 08 '24 11:05

I am having a similar issue. Has anyone found a solution?

yunbinmo • Feb 21 '25 03:02