anthropic-cookbook Bounding Box Detection

Open batu opened this issue 2 months ago • 0 comments

Hello!

After seeing that Sonnet is trained for computer use (with exact pixel coordinates), I tried using it for bounding box detection, both open-vocabulary with text input and few-shot with image input. However, my results have been worse than I expected given Claude's performance with computer use. I tried following the best practices outlined in this repo.

My question to you is:

Can you share which coordinate normalization and origin location Claude was trained on for computer use, so I can use the same setup?
Are there any bounding-box grounding suggestions I should try beyond what is given in the cookbooks?
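
To make the first question concrete, here is a minimal sketch of the conventions I am asking about. It assumes boxes are given as (x_min, y_min, x_max, y_max) and shows how the same normalized box maps to different pixel coordinates depending on the origin. The function names and the `origin` parameter are hypothetical, purely for illustration; this does not claim to be the convention Claude actually uses.

```python
from typing import Tuple

# Assumed box layout for illustration: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def normalized_to_pixels(
    box: Box, width: int, height: int, origin: str = "top-left"
) -> Tuple[int, int, int, int]:
    """Map a box in [0, 1] coordinates to top-left-origin pixel coordinates."""
    x0, y0, x1, y1 = box
    if origin == "bottom-left":
        # Flip the y axis; min and max swap roles after the flip.
        y0, y1 = 1.0 - y1, 1.0 - y0
    elif origin != "top-left":
        raise ValueError(f"unsupported origin: {origin!r}")
    return (
        round(x0 * width),
        round(y0 * height),
        round(x1 * width),
        round(y1 * height),
    )


def pixels_to_normalized(
    box: Tuple[int, int, int, int], width: int, height: int
) -> Box:
    """Map a top-left-origin pixel box to [0, 1] coordinates."""
    x0, y0, x1, y1 = box
    return (x0 / width, y0 / height, x1 / width, y1 / height)


if __name__ == "__main__":
    box = (0.25, 0.5, 0.75, 1.0)
    print(normalized_to_pixels(box, 200, 100))                 # top-left origin
    print(normalized_to_pixels(box, 200, 100, "bottom-left"))  # y axis flipped
```

If the model expects one convention and the prompt or post-processing assumes another, the boxes come out shifted or mirrored, which is why I would like to match the training setup.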

Thank you very much!

Dec 19 '24 20:12 batu
