deepgram-rust-sdk

Feat/flux support

Open • dpmishler opened this issue 1 month ago • 0 comments

Add Flux Conversational Speech Recognition Support

This PR adds support for Deepgram's Flux model (flux-general-en), which Deepgram describes as the first conversational speech recognition model built specifically for voice agents.

Features Added

  • New flux_request() and flux_request_with_options() methods for Flux streaming
  • Support for turn-based conversation detection with FluxResponse types
  • Configurable end-of-turn detection parameters:
    • eot_threshold - Confidence required for EndOfTurn events
    • eager_eot_threshold - Confidence for early EagerEndOfTurn events
    • eot_timeout_ms - Maximum silence before forcing turn end
  • New TurnEvent enum with variants: StartOfTurn, EndOfTurn, EagerEndOfTurn, TurnResumed, Update
  • Uses /v2/listen endpoint for Flux API
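
To make the list above concrete, here is a minimal sketch of how the turn events and end-of-turn parameters could be modeled and turned into a request URL. It is not the code from this PR: `FluxOptionsSketch` and `to_url` are hypothetical names, the `wss://api.deepgram.com` base in the usage note is an assumption, and the wire spelling of each event is assumed to match the variant name. Only the parameter names, the `flux-general-en` model, the `/v2/listen` path, and the `TurnEvent` variants come from the description.

```rust
use std::str::FromStr;

/// Turn events listed in the PR description. The wire spelling is
/// assumed (not confirmed) to match the Rust variant name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnEvent {
    StartOfTurn,
    EndOfTurn,
    EagerEndOfTurn,
    TurnResumed,
    Update,
}

impl FromStr for TurnEvent {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "StartOfTurn" => Ok(Self::StartOfTurn),
            "EndOfTurn" => Ok(Self::EndOfTurn),
            "EagerEndOfTurn" => Ok(Self::EagerEndOfTurn),
            "TurnResumed" => Ok(Self::TurnResumed),
            "Update" => Ok(Self::Update),
            other => Err(format!("unknown turn event: {}", other)),
        }
    }
}

/// Hypothetical options holder; not the SDK's actual type.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct FluxOptionsSketch {
    /// Confidence required for EndOfTurn events.
    pub eot_threshold: Option<f64>,
    /// Confidence for early EagerEndOfTurn events.
    pub eager_eot_threshold: Option<f64>,
    /// Maximum silence (milliseconds) before forcing turn end.
    pub eot_timeout_ms: Option<u32>,
}

impl FluxOptionsSketch {
    /// Builds a `/v2/listen` URL on top of `base`, appending only the
    /// options that are set. Values are numeric, so no percent-encoding
    /// is needed in this sketch.
    pub fn to_url(&self, base: &str) -> String {
        let mut url = format!("{}/v2/listen?model=flux-general-en", base);
        if let Some(v) = self.eot_threshold {
            url.push_str(&format!("&eot_threshold={}", v));
        }
        if let Some(v) = self.eager_eot_threshold {
            url.push_str(&format!("&eager_eot_threshold={}", v));
        }
        if let Some(v) = self.eot_timeout_ms {
            url.push_str(&format!("&eot_timeout_ms={}", v));
        }
        url
    }
}
```

With `eot_threshold: Some(0.7)` and `eot_timeout_ms: Some(5000)`, calling `to_url("wss://api.deepgram.com")` yields a URL ending in `/v2/listen?model=flux-general-en&eot_threshold=0.7&eot_timeout_ms=5000`. Keeping every parameter optional lets the server's defaults apply when a caller sets nothing.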

Examples

  • simple_flux - File streaming example
  • microphone_flux - Real-time microphone streaming example

Implementation Details

  • Follows existing websocket.rs patterns for consistency
  • Comprehensive error handling and edge case coverage
  • Partial frame handling for fragmented JSON messages
  • Proper connection state tracking and cleanup
  • Tests for URL construction and query parameter encoding
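
As an illustration of the partial frame handling mentioned above, the following sketch buffers incoming text fragments and releases a message only once a complete top-level JSON object has arrived. It is one possible approach under stated assumptions, not the implementation in this PR: `JsonFrameBuffer` and `find_object_end` are hypothetical names, and messages are assumed to be JSON objects rather than arrays or bare values.

```rust
/// Hypothetical helper: accumulates text fragments and yields complete
/// top-level JSON objects. Not the SDK's implementation.
#[derive(Debug, Default)]
pub struct JsonFrameBuffer {
    buf: String,
}

impl JsonFrameBuffer {
    pub fn new() -> Self {
        Self::default()
    }

    /// Appends a fragment and returns every complete object now available.
    /// Any trailing partial object stays buffered for the next call.
    pub fn push(&mut self, fragment: &str) -> Vec<String> {
        self.buf.push_str(fragment);
        let mut out = Vec::new();
        while let Some(end) = find_object_end(&self.buf) {
            let rest = self.buf.split_off(end);
            let complete = std::mem::replace(&mut self.buf, rest);
            out.push(complete.trim().to_string());
        }
        out
    }
}

/// Returns the byte index just past the first complete top-level object,
/// or `None` if the buffer ends mid-object. Braces inside string literals
/// (including after escaped quotes) are ignored.
fn find_object_end(s: &str) -> Option<usize> {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for (i, b) in s.bytes().enumerate() {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' => depth += 1,
            b'}' if depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    // `}` is ASCII, so `i + 1` is a valid char boundary.
                    return Some(i + 1);
                }
            }
            _ => {}
        }
    }
    None
}
```

The scanner only finds object boundaries; it does not validate JSON, so each returned string would still be handed to a real parser before use. Tracking string and escape state matters because transcripts can legitimately contain `{` or `}` characters.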

Documentation

  • Updated CHANGELOG.md with Flux feature details
  • Updated examples/README.md with Flux examples
  • Inline documentation with links to Deepgram Flux API Reference

Testing

  • All existing tests pass (131 tests)
  • New tests for Flux URL construction and query encoding
  • Verified with real API using both file and microphone examples

dpmishler, Nov 08 '25 07:11