Return usage metadata in VertexAIGeminiGenerator
Is your feature request related to a problem? Please describe.
Unlike VertexAIGeminiChatGenerator, the VertexAIGeminiGenerator class does not return any usage metadata (prompt tokens, completion tokens, etc.) in its response. It discards all of the metadata produced by the underlying Google SDK and returns only the generated text.
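For context, here is a minimal sketch of where that metadata lives on the raw SDK response (assumptions: the google-cloud-aiplatform SDK is installed; project, location, and model name are placeholders):

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder values

# A direct SDK call, roughly analogous to what the generator does internally.
model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Hello")

# usage_metadata is part of the SDK's GenerationResponse; these are the
# fields the generator currently drops.
usage = response.usage_metadata
print(usage.prompt_token_count, usage.candidates_token_count, usage.total_token_count)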
Describe the solution you'd like
Update VertexAIGeminiGenerator to return usage metadata.
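Purely as an illustration (the "meta" key and its shape here are hypothetical; this issue does not prescribe a schema), the returned dict could be extended like this:

result = generator.run(parts=parts)
result["replies"]  # list of generated strings (current behavior)
result["meta"]     # e.g. [{"usage": {"prompt_token_count": 123, "candidates_token_count": 45}}] (requested)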
Describe alternatives you've considered
A custom generator that extends VertexAIGeminiGenerator and overrides run() and _get_response() works, but I would like the base class to support this out of the box.
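For reference, a rough sketch of that workaround (assumptions: _get_response() receives the raw GenerationResponse from the vertexai SDK; exact method signatures vary across integration versions):

from haystack_integrations.components.generators.google_vertex import VertexAIGeminiGenerator

class MeteredGeminiGenerator(VertexAIGeminiGenerator):
    # Capture the SDK usage metadata before the base class discards it.
    # (The parameter name is an assumption about the private API.)
    def _get_response(self, response_body):
        self._last_usage = getattr(response_body, "usage_metadata", None)
        return super()._get_response(response_body)

    # Surface the captured usage alongside the normal output.
    def run(self, parts):
        result = super().run(parts=parts)
        result["usage"] = self._last_usage
        return result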
Hello @antrix, thank you for opening this issue. I would like to better understand whether you see any reason against using the VertexAIGeminiChatGenerator for your use case. If you use a str as input to your VertexAIGeminiGenerator, you could just wrap it in a ChatMessage as in this example, or switch from PromptBuilder to ChatPromptBuilder in a pipeline. Here is a documentation page explaining the differences between a Generator and a ChatGenerator. In general, we recommend using a ChatGenerator.
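For instance, a minimal sketch of that wrap for a plain-string prompt (constructor arguments such as project settings are omitted):

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_vertex import VertexAIGeminiChatGenerator

# Wrap the plain string in a ChatMessage and use the chat variant instead.
chat_generator = VertexAIGeminiChatGenerator()
result = chat_generator.run(messages=[ChatMessage.from_user("your prompt here")])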
Hi @julian-risch. It's a good question! My use case is multi-modal: I'm passing binary content (a PDF bytestream, to be specific) and getting text back.
Here's roughly what I am doing now:
from haystack.dataclasses import ByteStream

# Wrap the raw PDF bytes so the generator can forward them to Gemini.
pdf_stream = ByteStream(data=pdf_bytes, mime_type="application/pdf")
parts = [pdf_stream, system_prompt]
result = generator.run(parts=parts)
As far as I can tell, the ChatGenerator relies on ChatMessage instances, and ChatMessage.from_user(...) only accepts a str as the text. I didn't find a way to provide binary input.
Is there a way to do multi-modal input/output with ChatGenerators?