OpenAI models does not support multi modal?

Open MiladInk opened this issue 1 year ago • 3 comments

I tried the multimodal code example with gpt-40 and the output has no relation to the provided image while the code does not raise any error.

Jul 11 '24 00:07 MiladInk

Hi @MiladInk, yes multimodal inputs are currently disabled, but we're in the process of revamping our multimodal support (including images) in the near future! We'll update this issue when we have images (and other input modalities) working on OpenAI and other providers again :).

@nking-1

Jul 21 '24 06:07 Harsha-Nori

Hi @Harsha-Nori . Please, can you tell if this is effectively abandoned, and if yes, what was it abandoned in favor of? Is there another library that can do it better?

Nov 15 '24 03:11 dchichkov

I've implemented a workaround that enables image support when using vLLM/OpenAI inference - https://github.com/guidance-ai/guidance/issues/1077

Nov 15 '24 22:11 dchichkov