openai postAndExpectFileResponse with chunk support for real time audio streaming

Hi,

I'd like to use OpenAI's TTS with chunks so we can start playing the speech before the full audio file has been created.

Is there any information on whether this will be implemented in this Dart library? If not, I was hoping to get some pointers on how to implement it in my own app.

My current approach is to feed the TTS sentence by sentence, but each generated clip has about 500 to 1000 ms of silence at the end, so chaining the clips sounds really unnatural. The application is quite time sensitive.

Edit: link to OpenAI docs

Thanks, Harmen

Nov 26 '23 23:11 PeperMarkreel

Thank you for pointing this out. I will check and get back to you.

Feb 21 '24 23:02 anasfik

Since this is mostly about audio playback, the most I can do is offer a Stream<List<int>> of the speech audio instead of a file. I made this simple Flutter app with raw code that plays the speech in real time as a stream:

import 'dart:async';
import 'dart:convert';

import 'package:flutter/material.dart';
import 'package:just_audio/just_audio.dart';
import 'package:http/http.dart' as http;

final StreamController<List<int>> _controller = StreamController<List<int>>();

void main() {
  WidgetsFlutterBinding.ensureInitialized();

  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  MyApp({Key? key}) : super(key: key);
  final player = AudioPlayer();

  // Fields of a StatelessWidget should be final.
  final MyStreamAudioSource myStreamSource = MyStreamAudioSource();
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'Flutter Demo',
      theme: ThemeData(
        primarySwatch: Colors.blue,
      ),
      home: Scaffold(
          body: Center(
        child: ElevatedButton(
          onPressed: () async {
            _listenToStream();

            player.setAudioSource(myStreamSource);
            player.play();
          },
          child: const Text('Press Me'),
        ),
      )),
    );
  }

  void _listenToStream() async {
    try {
      // Request the speech audio over HTTP and forward each chunk to the
      // audio source as soon as it arrives.

      final uri = Uri.parse("https://api.openai.com/v1/audio/speech");

      final headers = {
        "Authorization":
            "Bearer YOUR-KEY",
        "Content-Type": "application/json"
      };

      final req = http.Request("POST", uri);

      req.headers.addAll(headers);

      req.body = jsonEncode({
        "model": "tts-1",
        "input": "Hi, I am a somebody. I am testing the audio stream.",
        "voice": "echo"
      });

      final res = await req.send();

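      res.stream.listen(
        (List<int> chunk) {
          myStreamSource.addAudioData(chunk);
        },
        // Close the controller once the response ends so the player
        // can tell that the audio is complete.
        onDone: () => _controller.close(),
      );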
    } catch (e) {
      rethrow;
    }
  }
}

class MyStreamAudioSource extends StreamAudioSource {
  void addAudioData(List<int> data) {
    _controller.add(data);
  }

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    return StreamAudioResponse(
      sourceLength: null,
      contentLength: null,
      offset: start ?? 0,
      stream: _controller.stream,
      contentType: 'audio/mpeg',
    );
  }

  @override
  Future<void> close() async {
    await _controller.close();
  }
}

Note: if you want to run this Flutter code, configure the just_audio package for the platform you are running the app on, and set your API key in the Authorization header.

This needs many changes, but as a demo, it should reflect what you want to achieve.

I am thinking of exposing a method called createSpeechBytes that returns a Stream<List<int>>, which you can pipe to an audio player in your Flutter app.
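
As a rough sketch of that idea, and not the package's actual implementation, such a method could look something like the code below. It assumes package:http, as in the demo above. Only the name createSpeechBytes comes from this thread; the buildSpeechRequest helper, the parameter names, and the defaults are hypothetical.

```dart
import 'dart:convert';

import 'package:http/http.dart' as http;

/// Builds the POST request for the speech endpoint used in the demo above.
/// Hypothetical helper: parameter names and defaults are assumptions.
http.Request buildSpeechRequest({
  required String apiKey,
  required String input,
  String model = 'tts-1',
  String voice = 'echo',
}) {
  final request = http.Request(
    'POST',
    Uri.parse('https://api.openai.com/v1/audio/speech'),
  );
  request.headers['Authorization'] = 'Bearer $apiKey';
  request.headers['Content-Type'] = 'application/json';
  request.body = jsonEncode({'model': model, 'input': input, 'voice': voice});
  return request;
}

/// Sketch of the proposed createSpeechBytes: yields the audio bytes chunk by
/// chunk as the server sends them, instead of waiting for the whole file.
Stream<List<int>> createSpeechBytes({
  required String apiKey,
  required String input,
  String model = 'tts-1',
  String voice = 'echo',
}) async* {
  final client = http.Client();
  try {
    final request = buildSpeechRequest(
      apiKey: apiKey,
      input: input,
      model: model,
      voice: voice,
    );
    final response = await client.send(request);
    if (response.statusCode != 200) {
      throw http.ClientException(
        'Speech request failed with status ${response.statusCode}',
        request.url,
      );
    }
    // Forward every chunk to the listener as soon as it arrives.
    yield* response.stream;
  } finally {
    // Runs when the stream completes, fails, or the listener cancels.
    client.close();
  }
}
```

With something like this, the manual request in _listenToStream could shrink to createSpeechBytes(apiKey: ..., input: ...).listen(myStreamSource.addAudioData). Whether playback starts before the download finishes still depends on how the audio player buffers the stream.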

Feb 22 '24 00:02 anasfik

Thank you, @anasfik. Do you have any timeline for when we can expect this feature to be live in the package?

Mar 06 '24 13:03 decisionslab2
