camel icon indicating copy to clipboard operation
camel copied to clipboard

[Feature Request] Interface for users to include their coordinate detection algorithm

Open Wendong-Fan opened this issue 8 months ago • 1 comments

Required prerequisites

  • [X] I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
  • [ ] Consider asking first in a Discussion.

Motivation

GPT-4V can not identify coordinates correctly, so we need to allow user set their own algorithm

see discussion: https://community.openai.com/t/getting-gpt-vision-to-return-coordinates/671669/2

Solution

can transfer solution from Crab to CAMEL

https://huggingface.co/facebook/detr-resnet-50 https://huggingface.co/ShilongLiu/GroundingDINO

Alternatives

No response

Additional context

No response

Wendong-Fan avatar Jun 13 '24 16:06 Wendong-Fan