dspy
Add GPT-4 Vision API wrapper
Introduce a new GPT4Vision class in visionopenai.py that wraps the GPT-4 Vision API. This abstraction layer simplifies the process of making requests to the API for analyzing images. Key functionality includes:
- Encoding images to base64
- Preparing image metadata in the required format
- Calculating token costs for images
- Making API requests with both text and image prompts
- Error handling and validation
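For readers who want a feel for the first three items, here is a minimal standalone sketch. It is not the code in visionopenai.py: the helper names (`encode_image`, `image_content_part`, `image_token_cost`) are hypothetical, and the payload shape and token arithmetic follow my reading of OpenAI's published vision guidance (85 base tokens plus 170 per 512px tile in high detail), so treat the numbers as assumptions to verify against the current docs.

```python
import base64
import math


def encode_image(image_bytes: bytes) -> str:
    """Return the base64 text form of raw image bytes."""
    return base64.b64encode(image_bytes).decode("utf-8")


def image_content_part(image_bytes: bytes, mime_type: str = "image/jpeg",
                       detail: str = "auto") -> dict:
    """Build one image entry for a chat message's content list.

    Hypothetical helper: the dict mirrors the data-URL form of the
    "image_url" content part as I understand the OpenAI chat format.
    """
    if detail not in {"low", "high", "auto"}:
        raise ValueError(f"unsupported detail level: {detail!r}")
    return {
        "type": "image_url",
        "image_url": {
            "url": f"data:{mime_type};base64,{encode_image(image_bytes)}",
            "detail": detail,
        },
    }


def image_token_cost(width: int, height: int, detail: str = "high") -> int:
    """Estimate the token cost of one image.

    Assumptions: low detail is a flat 85 tokens; anything else is priced
    as high detail, and images are only ever scaled down, never up.
    """
    if detail == "low":
        return 85
    # Step 1: shrink to fit inside a 2048 x 2048 square.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: shrink so the shortest side is at most 768 px.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count 512 px tiles; 170 tokens each, plus 85 base tokens.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles


if __name__ == "__main__":
    part = image_content_part(b"abc", mime_type="image/png", detail="low")
    print(part["image_url"]["url"])
    print(image_token_cost(1024, 1024))
```

A text prompt and one or more of these image parts would then go into the same user message's content list before the request is sent; how GPT4Vision does that internally is not shown in this thread.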
Also is there any way to add tests for this?
I have a unittest created.
Where?
Adding GPT-4 Vision capabilities would be super useful! Any progress on fixing the failing check here? 🙂
Have you tested whether this actually works?
What does the Signature look like to pass an image?
Thanks @jmanhype for the contribution! Could you resolve https://github.com/stanfordnlp/dspy/pull/682#pullrequestreview-1950064068?
Also, it would be great if you add documentation for this under the documentation folder as with the other LMs. Thanks!
I just made those updates
@jmanhype
Also run "ruff check . --fix-only"
add documentation for this under the documentation folder as with the other LMs. Thanks!
Done
@jmanhype did you push? don't see new commits.
ok everything has been updated
Hi @jmanhype, this still needs ruff check . --fix-only to be run.
Also, to properly call dspy.GPT4Vision, you need to import it the same way as the other LM modules:
The ruff check failure is a false positive; it passes locally.
@arnavsinghvi11 Is there any update on this? cc : @jmanhype
It just needs to be merged; the ruff error is a false positive.
Hey @jmanhype, I did actually find a ruff error, and I added GPT4Vision to the init files in dsp and dspy as @arnavsinghvi11 mentioned.
See this branch: https://github.com/stanfordnlp/dspy/tree/jmanhype/main