Support prompting with media files

Open longseespace opened this issue 1 year ago • 1 comments

Description of the feature request:

The Gemini API supports uploading media files separately from the prompt input, allowing your media to be reused across multiple requests and multiple prompts.

https://ai.google.dev/gemini-api/docs/prompting_with_media?lang=python
https://ai.google.dev/api/files

What problem are you trying to solve with this feature?

Add the ability to prompt with a document from a client app.

Any other information you'd like to share?

No response

longseespace avatar Aug 09 '24 03:08 longseespace

Hi @longseespace, the Swift SDK can reference media files that have already been uploaded with the server-side SDKs (Python, Go, Node.js) or the REST API by passing a fileData part, e.g.:

// Reference a previously uploaded file by its URI and pair it with a text prompt.
let response = try await model.generateContent(
  ModelContent.Part.fileData(
    mimetype: "image/jpeg",
    uri: "https://generativelanguage.googleapis.com/v1beta/files/some-hash"
  ),
  "What is in this image?"
)
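For anyone calling the REST API directly instead, here is a minimal sketch of the request body that pairs an uploaded file with a text prompt. The Codable types and the `makeRequestBody` helper are hypothetical and not part of the Swift SDK. The snake_case field names (`file_data`, `mime_type`, `file_uri`) are assumed from the public REST examples and should be checked against the current API reference.

```swift
import Foundation

// Hypothetical types mirroring the assumed REST request shape; not SDK types.
struct FileData: Codable, Equatable {
  let mimeType: String
  let fileURI: String

  enum CodingKeys: String, CodingKey {
    case mimeType = "mime_type"
    case fileURI = "file_uri"
  }
}

// A part carries either text or a file reference; nil fields are omitted from the JSON.
struct Part: Codable, Equatable {
  var text: String? = nil
  var fileData: FileData? = nil

  enum CodingKeys: String, CodingKey {
    case text
    case fileData = "file_data"
  }
}

struct Content: Codable, Equatable {
  let parts: [Part]
}

struct GenerateContentRequest: Codable, Equatable {
  let contents: [Content]
}

/// Builds a JSON body that references an already-uploaded file alongside a prompt.
func makeRequestBody(fileURI: String, mimeType: String, prompt: String) throws -> Data {
  let request = GenerateContentRequest(contents: [
    Content(parts: [
      Part(fileData: FileData(mimeType: mimeType, fileURI: fileURI)),
      Part(text: prompt),
    ])
  ])
  let encoder = JSONEncoder()
  encoder.outputFormatting = [.sortedKeys]
  return try encoder.encode(request)
}
```

Because the file is referenced by URI only, the same URI can be passed to `makeRequestBody` with different prompts without uploading the media again.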

Unfortunately, based on our current engineering plan and product backlog, there is no plan to support uploading files from the Swift SDK in the near term. As a possible alternative, the Vertex AI for Firebase SDK, a similar product, supports media uploaded with the Cloud Storage for Firebase SDK. This guide shows how to use the two SDKs together: https://firebase.google.com/docs/vertex-ai/solutions/cloud-storage

andrewheard avatar Aug 09 '24 22:08 andrewheard

Closing. If this feature is still needed, please open a new issue for Firebase AI Logic at https://github.com/firebase/firebase-ios-sdk/issues.

paulb777 avatar May 28 '25 23:05 paulb777