unsaged icon indicating copy to clipboard operation
unsaged copied to clipboard

[Feature] Add GPT Vision Models

Open sebiweise opened this issue 6 months ago • 1 comments

Describe the bug Add GPT 4 Vision https://platform.openai.com/docs/guides/vision

sebiweise avatar Dec 19 '23 19:12 sebiweise

Any opinions on what this feature should look like?

I imagine that we agree that if the gpt-4-vision model is selected, we show an "upload images" icon on the left in the message bar.

But beyond that?

The API accepts both image URLs and base64 encoded images. Should we present this choice to the user?

The API permits that the image's detail be set to low/high/auto. Should we present this choice to the user?

If the user uploads an image (rather than provides a URL), should we use supabase storage (S3-equivalent) to hold onto it? The gpt-4-vision docs recommend ...

For long running conversations, we suggest passing images via URL's instead of base64. The latency of the model can also be improved by downsizing your images ahead of time to be less than the maximum size they are expected them to be. For low res mode, we expect a 512px x 512px image. For high res mode, the short side of the image should be less than 768px and the long side should be less than 2,000px.

Just glancing at supabase storage, it looks like it could offer both URLs and resizing, which would be advantageous in long running conversations.

johnnymo87 avatar Jan 22 '24 14:01 johnnymo87