Add support for GPT-4 with Vision

Open ghillert opened this issue 1 year ago • 5 comments

Having experimented with OpenAI's GPT-4 with Vision API, it would be amazing if Spring AI added support for image-based input data (e.g. photos). The API lets you post:

  • One or more images per request
  • Each image either as base64-encoded data or as an image URL

API reference: https://platform.openai.com/docs/guides/vision
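
For illustration, the request shape described here (one or more images, each passed as an image URL or as base64 data) can be sketched in plain Java. This is only a rough sketch under stated assumptions, not Spring AI code: the class name `VisionRequestSketch` and its helper methods are hypothetical, the model name is a placeholder, and the JSON layout follows my reading of the vision guide linked above, where each image is an `image_url` content part in a chat completions request.

```java
import java.util.Base64;
import java.util.List;

// Hypothetical helper, not part of Spring AI or any OpenAI client library.
public class VisionRequestSketch {

    // Minimal JSON string escaping; enough for this sketch, not a full JSON encoder.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Content part that references an image by URL.
    static String imagePartFromUrl(String url) {
        return "{\"type\":\"image_url\",\"image_url\":{\"url\":\"" + escape(url) + "\"}}";
    }

    // Content part that embeds raw image bytes as a base64 data URL.
    static String imagePartFromBytes(byte[] data, String mimeType) {
        String b64 = Base64.getEncoder().encodeToString(data);
        return imagePartFromUrl("data:" + mimeType + ";base64," + b64);
    }

    // Builds a request body with one user message: a text part followed by the image parts.
    static String buildRequest(String model, String prompt, List<String> imageParts) {
        StringBuilder content = new StringBuilder();
        content.append("{\"type\":\"text\",\"text\":\"").append(escape(prompt)).append("\"}");
        for (String part : imageParts) {
            content.append(',').append(part);
        }
        return "{\"model\":\"" + escape(model) + "\",\"messages\":[{\"role\":\"user\",\"content\":["
                + content + "]}]}";
    }

    public static void main(String[] args) {
        // Placeholder model name and image URL; substitute real values.
        String body = buildRequest(
                "gpt-4-vision-preview",
                "What is in these images?",
                List.of(
                        imagePartFromUrl("https://example.com/photo.jpg"),
                        imagePartFromBytes(new byte[] {1, 2, 3}, "image/png")));
        System.out.println(body);
    }
}
```

The resulting string would be sent as the JSON body of a POST to the chat completions endpoint. A real implementation would use a proper JSON library rather than string concatenation.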

An implementation could also provide pre-processing to support video data, as described here:

https://cookbook.openai.com/examples/gpt_with_vision_for_video_understanding

ghillert avatar Dec 04 '23 20:12 ghillert

Pull requests welcome.

markpollack avatar Dec 04 '23 20:12 markpollack

It looks like this feature would depend on https://github.com/TheoKanning/openai-java/issues/397

ghillert avatar Dec 05 '23 20:12 ghillert

We have moved away from depending on that library and will provide our own implementations. At the moment, we are brainstorming how to have the API incorporate more than just chat models. I will be able to share a spike branch shortly. I was just circling back to issues filed with respect to other AI models. And happy new year, Gunnar!

markpollack avatar Jan 14 '24 16:01 markpollack

How is the spike branch going? Is there a way others could help contribute?

arnokoehler avatar Apr 16 '24 11:04 arnokoehler

Same issue. Is there any progress?

Mr-LiuDC avatar Apr 25 '24 13:04 Mr-LiuDC

It should be possible to close this issue since multimodality support has been introduced in Spring AI 1.0.0-M1, including the possibility to pass images as input to OpenAI: https://docs.spring.io/spring-ai/reference/api/multimodality.html

Example: https://github.com/ThomasVitale/llm-apps-java-spring-ai/tree/main/02-prompts/prompts-multimodality-openai

ThomasVitale avatar Jul 10 '24 12:07 ThomasVitale

Closing the issue. Thanks, Thomas. Issue pruning has been overdue; I'm working through it now.

markpollack avatar Jul 22 '24 14:07 markpollack