[FR]: Add Vertex AI Vision & Audio Sample Code for iOS

Open 1998code opened this issue 1 year ago • 3 comments

Description

@andrewheard

  1. Since we upgraded to 1.5 Flash (https://github.com/firebase/firebase-ios-sdk/pull/12979), is it possible to achieve something like Project Astra now?
  2. Currently, the sample only provides text output.
  3. The AI can hold a conversation using audio and vision.

Thank you :)

API Proposal

N/A

Firebase Product(s)

Vertex AI

1998code avatar May 21 '24 12:05 1998code

I couldn't figure out how to label this issue, so I've labeled it for a human to triage. Hang tight.

google-oss-bot avatar May 21 '24 12:05 google-oss-bot

Thanks for the feature request, @1998code. It is already possible to add video and audio input to the sample apps, but the API currently supports only text output.

I think we'd probably want to add this feature to the multi-modal sample but would need to refactor it a bit since it stores a list of PhotosPickerItem.
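A rough sketch of what that refactor could look like, assuming the sample moved from a list of PhotosPickerItem to a list of raw bytes tagged with a MIME type. The names `MediaAttachment`, `mimeTypesByExtension`, and `makeAttachment` are hypothetical and are not taken from the sample app or the SDK; the MIME table is illustrative, not exhaustive.

```swift
import Foundation

/// Hypothetical replacement for a `[PhotosPickerItem]` store: raw bytes plus a
/// MIME type, so images, audio, and video can share one list.
struct MediaAttachment {
    enum Kind { case image, audio, video }

    let data: Data
    let mimeType: String

    /// Broad media category derived from the MIME type prefix.
    var kind: Kind? {
        if mimeType.hasPrefix("image/") { return .image }
        if mimeType.hasPrefix("audio/") { return .audio }
        if mimeType.hasPrefix("video/") { return .video }
        return nil
    }
}

/// Illustrative extension-to-MIME table (an assumption, not the sample's list).
let mimeTypesByExtension: [String: String] = [
    "png": "image/png",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "mp3": "audio/mpeg",
    "wav": "audio/wav",
    "mp4": "video/mp4",
    "mov": "video/quicktime",
]

/// Builds an attachment from file bytes, or returns nil for unknown extensions.
func makeAttachment(data: Data, fileExtension: String) -> MediaAttachment? {
    guard let mimeType = mimeTypesByExtension[fileExtension.lowercased()] else {
        return nil
    }
    return MediaAttachment(data: data, mimeType: mimeType)
}
```

With a model like this, the view model would hold a `[MediaAttachment]`, and each entry's bytes and MIME type could be handed to the SDK as inline data alongside the text prompt; how exactly that call looks depends on the SDK version in use.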

andrewheard avatar May 21 '24 17:05 andrewheard

Awesome! Looking forward :)

1998code avatar May 30 '24 15:05 1998code

  1. is it possible to achieve something like Project Astra now?

Note: This is now publicly referred to as the Multimodal Live API, but it is not yet supported by the SDKs.

  2. Currently, the sample only provides text output.

@1998code, PR #14545 adds image generation to the sample (using Imagen, not Gemini, though).

andrewheard avatar Mar 07 '25 20:03 andrewheard

Cool! Thanks for the update 🤩

1998code avatar Mar 07 '25 20:03 1998code