Grounded-Segment-Anything

When the image contains no object matching the prompt, the largest object is treated as the prompted object

Open tarepanda1024 opened this issue 2 years ago • 7 comments

The input prompts were sun / cat / dog, etc. The regions outlined in the images are all wrong:

image

image

image

Below is the original image: 91ac6a31f5c5354578133315dab708ed

tarepanda1024 avatar Dec 04 '23 13:12 tarepanda1024

@rentainhe can you help me or give me some advice?

tarepanda1024 avatar Dec 05 '23 07:12 tarepanda1024

There does appear to be an issue with how the Grounding-DINO model handles counterexamples (prompts for objects that are not in the image). This may be due to the model's weights. It might be worth trying better weights to see if that alleviates the problem.

rentainhe avatar Dec 05 '23 16:12 rentainhe

Thanks, I will try a different model weight.

tarepanda1024 avatar Dec 06 '23 01:12 tarepanda1024

Sorry, could you please confirm again if you are referring to replacing the model or adjusting the parameters in GroundingDINO_SwinB.cfg.py?

Do I need to change the models below, or adjust the config? image

image

tarepanda1024 avatar Dec 06 '23 02:12 tarepanda1024

My text prompt is 1cat . Recognition failed on all of the photos below.

image image image image image

tarepanda1024 avatar Dec 06 '23 02:12 tarepanda1024

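From my own experience using Grounding-DINO for open-set object detection, here are a few observations:

  1. Test as many text prompts as you can, and use idiomatic English (even I can't really make sense of 1cat). The boxes are usually quite accurate, but they may not match the text.
  2. The box threshold can be raised somewhat, but a text threshold that is too high causes the matched phrases to be cut off mid-word.
  3. Filter out cases where the box covers too much of the whole image.
  4. Add some heuristic joint filters, e.g. a clothing detection must come with a face.
  5. Open-set zero-shot detection inevitably produces false positives. Over a large dataset the accuracy is only a little over 60%, and the rest still needs a double check.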
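
Points 2 to 4 above could be sketched roughly as the post-processing step below. This is only an illustration, not code from Grounded-Segment-Anything or Grounding-DINO: the `Detection` class, `filter_detections`, the `requires` mapping, and all threshold values are hypothetical, and it assumes the boxes have already been converted to normalized (cx, cy, w, h) in [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container; box is assumed to be normalized (cx, cy, w, h)."""
    label: str
    score: float
    box: tuple


def box_area_fraction(box):
    # With normalized coordinates, w * h is the fraction of the image covered.
    _, _, w, h = box
    return w * h


def contains_center(outer, inner):
    # True if the center of `inner` lies inside `outer`.
    ocx, ocy, ow, oh = outer
    icx, icy, _, _ = inner
    return abs(icx - ocx) <= ow / 2 and abs(icy - ocy) <= oh / 2


def filter_detections(dets, box_threshold=0.4, max_area=0.9, requires=None):
    """Apply score, area, and co-occurrence filters (thresholds are illustrative)."""
    requires = requires or {}
    # Point 2: drop low-confidence boxes.
    kept = [d for d in dets if d.score >= box_threshold]
    # Point 3: drop boxes that cover almost the whole image.
    kept = [d for d in kept if box_area_fraction(d.box) <= max_area]
    # Point 4: a label listed in `requires` survives only if a detection of
    # the required label has its center inside the box.
    result = []
    for d in kept:
        needed = requires.get(d.label)
        if needed is None or any(
            o.label == needed and contains_center(d.box, o.box) for o in kept
        ):
            result.append(d)
    return result


dets = [
    Detection("cat", 0.45, (0.5, 0.5, 0.98, 0.98)),    # nearly full-image box
    Detection("dog", 0.62, (0.3, 0.6, 0.2, 0.3)),
    Detection("clothes", 0.55, (0.7, 0.6, 0.2, 0.4)),  # no face inside it
    Detection("sun", 0.21, (0.8, 0.1, 0.1, 0.1)),      # low score
]
print([d.label for d in filter_detections(dets, requires={"clothes": "face"})])
```

With this sample input only the dog detection survives: the cat box is rejected for covering nearly the whole image, the clothes box for lacking a face, and the sun box for its low score. The right thresholds depend on the data and would need tuning, and none of this replaces the manual double check mentioned in point 5.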

NormanBeta avatar Dec 06 '23 04:12 NormanBeta

OK, thanks~

tarepanda1024 avatar Dec 06 '23 05:12 tarepanda1024