feat: Implement core real-time interaction capabilities for Gemini API

Open vizvasrj opened this issue 11 months ago • 2 comments

This commit introduces the fundamental features for real-time communication with the Gemini API, encompassing:

  • Real-time audio streaming: Enables bidirectional audio interaction using microphone input and API responses.
  • Real-time video streaming (camera and screen capture): Allows streaming video from either a webcam or the screen, configurable via the MODE environment variable. Includes image resizing and encoding for efficient transmission.
  • Text-based prompt functionality: Implements the ability to send text prompts to the Gemini API via the command line.
  • Graceful shutdown mechanism: Ensures proper cleanup of audio and video resources, including closing streams and websocket connections, upon application exit.
  • Initial README documentation: Provides a comprehensive guide on project setup, configuration, and usage examples for various modes.
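As a rough illustration of the MODE-based source selection described above, a minimal sketch might look like the following. The accepted values `camera` and `screen`, the default to the camera, and the names `captureMode` and `parseMode` are assumptions for illustration only; they are not taken from the actual commit.

```go
package main

import (
	"fmt"
	"os"
)

// captureMode is a hypothetical type naming the video source.
type captureMode string

const (
	modeCamera captureMode = "camera" // assumed value for webcam capture
	modeScreen captureMode = "screen" // assumed value for screen capture
)

// parseMode maps the raw MODE value to a capture mode.
// Treating an empty value as "camera" is an assumption, not
// confirmed behavior of the pull request.
func parseMode(v string) (captureMode, error) {
	switch v {
	case "", "camera":
		return modeCamera, nil
	case "screen":
		return modeScreen, nil
	default:
		return "", fmt.Errorf("unsupported MODE %q", v)
	}
}

func main() {
	mode, err := parseMode(os.Getenv("MODE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("video source:", mode)
}
```

Under these assumptions, running with `MODE=screen` would select screen capture, and an unrecognized value would fail fast before any stream is opened.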

vizvasrj, Dec 28 '24 15:12

Is anyone going to review this?

vizvasrj, Jan 05 '25 08:01

Thank you for your contribution. This repository is not yet open for external contributions. We will update once we have established a process for accepting pull requests.

qiaodev, Jan 05 '25 18:01