matrix-chatgpt-bot icon indicating copy to clipboard operation
matrix-chatgpt-bot copied to clipboard

Support `gpt-4-vision-preview`

Open PaarthShah opened this issue 2 years ago • 5 comments
trafficstars

https://platform.openai.com/docs/guides/vision

It seems like uploading base64-encoded images may be a generic viable strategy for passing images through the API.

Alternatively/for speed, and from unencrypted rooms, it may instead be possible/desirable to pass an image URL by transforming the image mxc url to an https url via the image_url key.

PaarthShah avatar Nov 07 '23 06:11 PaarthShah

As far as I can tell we're limited by the library we use for API communication, which does not yet support vision. Although I'm very interested and will check what we can do as soon as the library adds support.

max298 avatar Nov 07 '23 08:11 max298

I open up a request.

Dual-0 avatar Nov 07 '23 10:11 Dual-0

I think we might consider dropping the third party SDK and switch to the official node package from openai: https://github.com/openai/openai-node#readme which seems to support vision

max298 avatar Nov 08 '23 08:11 max298

Going for the official node library seems like the best option for long-term sustainability and rapid adoption of new features

PaarthShah avatar Nov 09 '23 10:11 PaarthShah

Yes but we would then be responsible for handling context, which is fine if someone is willing to write the code.

bertybuttface avatar Nov 09 '23 18:11 bertybuttface