
image raises subscript exception

Open robmck-ms opened this issue 1 year ago • 3 comments

The bug

Using the multi-modal code from the README results in TypeError: 'GoogleAIChatEngine' object is not subscriptable:

<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
<IPython.core.display.HTML object>
Traceback (most recent call last):
  File "/Users/robmck/git/me/ai-experiments/design-critique/guidance-image-test.py", line 19, in <module>
    lm += gen("answer")
  File "/Users/robmck/git/me/ai-experiments/guidance/guidance/models/_model.py", line 1159, in __add__
    out = lm._run_stateless(value)
  File "/Users/robmck/git/me/ai-experiments/guidance/guidance/models/_model.py", line 1364, in _run_stateless
    for chunk in gen_obj:
  File "/Users/robmck/git/me/ai-experiments/guidance/guidance/models/_model.py", line 760, in __call__
    logits = self.get_logits(token_ids, forced_bytes, current_temp)
  File "/Users/robmck/git/me/ai-experiments/guidance/guidance/models/_grammarless.py", line 338, in get_logits
    raise new_bytes
  File "/Users/robmck/git/me/ai-experiments/guidance/guidance/models/_grammarless.py", line 165, in _start_generator_stream
    for chunk in generator:
  File "/Users/robmck/git/me/ai-experiments/guidance/guidance/models/_googleai.py", line 211, in _start_generator
    mime_type="image/jpeg", data=self[raw_parts[i + 1]]
TypeError: 'GoogleAIChatEngine' object is not subscriptable

Looking through the code, image() saves the binary image data to Model._variables. GoogleAIChatEngine seems to expect that data to be available as self[image_id], but neither GoogleAIChatEngine nor any of its parent classes defines __getitem__(). Perhaps it was written in an earlier factoring in which the engine and model objects were one and the same?
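The failure mode described above can be shown without guidance or an API key. This is a minimal sketch only: the `Model` and `Engine` classes, the `lookup_image` method, and the `img-1` key are hypothetical stand-ins, not the library's real classes. It assumes the model object owns the variables and is subscriptable, while the engine object is not.

```python
class Model:
    """Hypothetical stand-in for the model object, which owns stored variables."""

    def __init__(self):
        self._variables = {}

    def __getitem__(self, key):
        # The model is subscriptable because it defines __getitem__.
        return self._variables[key]


class Engine:
    """Hypothetical stand-in for the engine object; it defines no __getitem__."""

    def lookup_image(self, image_id):
        # Mirrors the failing pattern from the traceback: indexing the engine itself.
        return self[image_id]


model = Model()
model._variables["img-1"] = b"\xff\xd8fake-jpeg-bytes"

engine = Engine()
try:
    engine.lookup_image("img-1")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(model["img-1"][:2])
```

The lookup succeeds on the object that holds the data and fails on the one that does not, which matches the explanation that the engine code predates the model/engine split.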

To Reproduce

from guidance import (
    models,
    user,
    assistant,
    gen,
    image,
)
import os

google_key = os.environ.get("GEMINI_API_KEY")
gemini = models.GoogleAIChat(
    "gemini-pro-vision",
    api_key=google_key,
)
with user():
    lm = gemini + "What is this a picture of?" + image("chairs.jpg")

with assistant():
    lm += gen("answer")

System info (please complete the following information):

  • macOS
  • Guidance version 0.1.15

robmck-ms, May 28 '24 22:05

Yes, I think our image support is broken right now. You're exactly right that it was written when models and engines were identical :). @nking-1 is looking into how best to re-enable this and also bring image support to more models.

Harsha-Nori, May 28 '24 22:05

Thanks!

In the meantime, I hacked it to work by plumbing Model._variables down through to the _generator function for images (and added OpenAI support too via that hack). It's not very elegant, but it unblocks me for now so I can play with images.

Here it is if you're curious: https://github.com/robmck-ms/guidance/tree/hack_image_support
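For readers who want the general shape of such a workaround without reading the branch, here is a minimal sketch. It is not the code in the linked branch: the `Model` and `Engine` classes, `build_parts`, and `generate_parts` are hypothetical names, and the layout of `raw_parts` (text alternating with image ids) is an assumption inferred from the `raw_parts[i + 1]` lookup in the traceback. The idea is simply that the model passes its variables to the engine explicitly instead of the engine indexing itself.

```python
import base64


class Engine:
    """Hypothetical engine that receives the image store as an argument."""

    def build_parts(self, raw_parts, variables):
        # Assumed layout: raw_parts alternates text and image ids,
        # e.g. [text, image_id, text, image_id, ..., text].
        parts = []
        for i in range(0, len(raw_parts), 2):
            if raw_parts[i]:
                parts.append({"text": raw_parts[i]})
            if i + 1 < len(raw_parts):
                # Look the image up in the passed-in store, not in self.
                image_bytes = variables[raw_parts[i + 1]]
                parts.append(
                    {
                        "mime_type": "image/jpeg",
                        "data": base64.b64encode(image_bytes).decode("ascii"),
                    }
                )
        return parts


class Model:
    """Hypothetical model that owns the variables and hands them to the engine."""

    def __init__(self, engine):
        self.engine = engine
        self._variables = {}

    def generate_parts(self, raw_parts):
        return self.engine.build_parts(raw_parts, self._variables)


model = Model(Engine())
model._variables["img-1"] = b"abc"
parts = model.generate_parts(["What is this a picture of?", "img-1", ""])
print(parts)
```

Passing the store as a parameter keeps the engine free of any dependency on the model object, at the cost of threading one more argument through each call.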

robmck-ms, May 29 '24 19:05

Any update on this?

kklemon, Jun 25 '24 22:06