Grounded-Segment-Anything
Is there any way to detect everything the model knows in a picture?
Like a conventional detector such as yolov5 does, detecting every object at once.
In theory, the model can detect anything described by the language input. For the common case, such as the 80 categories in COCO, we evaluate GroundingDINO by concatenating all the category names, separated by a period (.). You can input the language prompt as:
person. cat. dog. ...
Then see whether the model detects them correctly. You should also be careful with the box threshold and the text threshold, since both can influence the output.
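As a rough sketch of the prompt format described above, the category names can be joined into one caption string. The helper name `build_caption` and the lowercasing step are assumptions made for illustration, not part of the repository's API.

```python
def build_caption(categories):
    """Join category names into one period-separated text prompt.

    Hypothetical helper: lowercasing and whitespace stripping are
    illustrative choices, not requirements stated in this thread.
    """
    cleaned = [name.strip().lower() for name in categories if name.strip()]
    # Each category is terminated by a period, e.g. "person. cat. dog."
    return " ".join(f"{name}." for name in cleaned)


# A small subset of COCO category names, used here only as an example.
coco_subset = ["person", "cat", "dog"]
caption = build_caption(coco_subset)
print(caption)  # person. cat. dog.
```

The resulting string would then be passed to the model as its text prompt, together with the box threshold and text threshold.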
Thanks for your reply. How does the text threshold work? What is affected by it?
Hi @aixiaodewugege, hope you have already found the answer. For others, here is a helpful link explaining the inputs and outputs: https://github.com/IDEA-Research/GroundingDINO#star-explanationstips-for-grounding-dino-inputs-and-outputs
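For a quick intuition, here is a toy sketch of how the two thresholds interact, based on my reading of the linked explanation: a box is kept when its highest text similarity exceeds the box threshold, and the words whose similarity exceeds the text threshold become that box's label. The function `filter_predictions`, the similarity scores, and the threshold values are made up for illustration; this is not the library's actual code.

```python
def filter_predictions(similarities, tokens, box_threshold=0.35, text_threshold=0.25):
    """Toy filter over per-box, per-token similarity scores.

    similarities: one list of scores per candidate box, aligned with tokens.
    Returns (box_index, label, best_score) for every box that is kept.
    """
    results = []
    for box_index, scores in enumerate(similarities):
        best = max(scores)
        # Box threshold: drop boxes whose best-matching word is too weak.
        if best <= box_threshold:
            continue
        # Text threshold: words scoring above it form the predicted label.
        label = " ".join(t for t, s in zip(tokens, scores) if s > text_threshold)
        results.append((box_index, label, best))
    return results


tokens = ["person", "cat", "dog"]
similarities = [
    [0.80, 0.05, 0.10],  # clearly matches "person"
    [0.10, 0.40, 0.30],  # "cat" and "dog" both pass the text threshold
    [0.20, 0.15, 0.10],  # nothing passes the box threshold, so it is dropped
]
kept = filter_predictions(similarities, tokens)
print(kept)
```

In this toy setup, raising the text threshold to 0.35 would shorten the second label to "cat" only, while raising the box threshold would remove whole boxes.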