haystack-core-integrations

Support prompt caching in Google Vertex AI generators

Open julian-risch opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe.
Google Vertex AI, in particular the Gemini 1.5 Flash and Gemini 1.5 Pro models, supports prompt caching (also called context caching). We should enable users to access this feature through Haystack to reduce costs and latency. https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview
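For reference, the underlying feature is exposed in the Vertex AI Python SDK (google-cloud-aiplatform) through its preview caching module. A minimal sketch, assuming the preview API shape available at the time of writing (project, location, and document text are placeholders):

```python
import datetime

import vertexai
from vertexai.preview import caching
from vertexai.preview.generative_models import Content, GenerativeModel, Part

# Assumes a GCP project with the Vertex AI API enabled.
vertexai.init(project="my-gcp-project", location="us-central1")

# Create a context cache holding a large document that many later requests
# will reuse. Note: Vertex AI enforces a minimum cacheable size, so the
# cached contents need to be substantial.
cached_content = caching.CachedContent.create(
    model_name="gemini-1.5-pro-001",
    system_instruction="You answer questions about the attached report.",
    contents=[
        Content(
            role="user",
            parts=[Part.from_text("<large report text goes here>")],
        )
    ],
    ttl=datetime.timedelta(minutes=60),
)

# Build a model bound to the cache and send a prompt that references it.
model = GenerativeModel.from_cached_content(cached_content=cached_content)
response = model.generate_content("Summarize section 2 of the report.")
print(response.text)
```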

Describe the solution you'd like
We need a way to first create a context cache and then reference its contents in a subsequent prompt request, e.g. along the lines of the sketch below.
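One possible shape for the Haystack side (purely illustrative; `cached_content` is not an existing parameter of `VertexAIGeminiGenerator`, and the exact integration point is open for discussion):

```python
from haystack_integrations.components.generators.google_vertex import VertexAIGeminiGenerator

# Hypothetical: a new `cached_content` init parameter that takes the resource
# name of a previously created context cache and is forwarded to the Vertex AI
# SDK when the underlying GenerativeModel is built.
generator = VertexAIGeminiGenerator(
    model="gemini-1.5-pro-001",
    project_id="my-gcp-project",
    location="us-central1",
    cached_content="projects/my-gcp-project/locations/us-central1/cachedContents/...",  # hypothetical
)

result = generator.run(parts=["Which risks are listed in the cached report?"])
print(result["replies"][0])
```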

julian-risch · Aug 19 '24