flutter-sound-stream

voice is really bad when streaming (ish)

Open letisoft opened this issue 11 months ago • 1 comments

Hello,

I am trying to get this plugin to work with the OpenAI speech API. Please note that I got the same thing working in my native Android app, so I believe I know what I am doing.

Test1

  _player.initialize(sampleRate: 24000);
  await  _player.initialize();
  await _player.start();

  Stream<List<int>> resp = await EncodeDecode.getSpeechStream("key",
      "United States, country in North America, a federal republic of 50 states. Besides the 48 conterminous states that occupy the middle latitudes of the continent, the United States includes the state of Alaska, at the northwestern extreme of North America, and the island state of Hawaii, in the mid-Pacific Ocean. The conterminous states are bounded on the north by Canada, on the east by the Atlantic Ocean, on the south by the Gulf of Mexico and Mexico, and on the west by the Pacific Ocean. The United States is the fourth largest country in the world in area (after Russia, Canada, and China). The national capital is Washington, which is coextensive with the District of Columbia, the federal capital region created in 1790.",
    true
  );

  List<int> buffer = [];
  StreamSubscription<List<int>> subscription = resp.listen(
          (List<int> chunk) async{
        buffer.addAll(chunk);
        if(buffer.length > 5000){
          await _player.writeChunk(Uint8List.fromList(chunk));
          buffer = [];
        }
      },
      onDone: (){
        debugPrint("done");
      },
      onError: (Error err,StackTrace){
        debugPrint("error");
      }
  );

Result: no voice, just noise. You can hardly recognize some words. Please also note that on native Android (Java) I had a similar issue: each buffer written needs to be at least the minimum buffer size: int minBuffer = AudioTrack.getMinBufferSize(sampleRateHz, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);

But this didn't help.
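For a sense of scale, the buffer sizes mentioned in this thread can be converted to playback time. This is a minimal sketch, assuming the stream is raw 16-bit mono PCM at 24000 Hz (the sample rate passed to `initialize` and the format named in the Java call). The helper name `pcmMillis` is hypothetical and not part of the plugin.

```dart
/// Playback time, in milliseconds, covered by [byteCount] bytes of raw PCM.
/// Assumption: interleaved integer PCM. The defaults describe 24 kHz,
/// 16-bit (2 bytes per sample), mono audio.
double pcmMillis(
  int byteCount, {
  int sampleRate = 24000,
  int bytesPerSample = 2,
  int channels = 1,
}) {
  // Bytes consumed by the player for each second of audio.
  final bytesPerSecond = sampleRate * bytesPerSample * channels;
  return byteCount * 1000 / bytesPerSecond;
}
```

Under these assumptions, the 5000-byte threshold used in Test1 corresponds to roughly 104 ms of audio, and a 2048-byte chunk to roughly 43 ms.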

Test2

I assumed the problem was the List to Uint8List conversion, so:

  _player.initialize();
  await  _player.initialize();
  await _player.start();

  Uint8List resp = await EncodeDecode.getSpeechStream2("key",
      "United States, country in North America, a federal republic of 50 states. Besides the 48 conterminous states that occupy the middle latitudes of the continent, the United States includes the state of Alaska, at the northwestern extreme of North America, and the island state of Hawaii, in the mid-Pacific Ocean. The conterminous states are bounded on the north by Canada, on the east by the Atlantic Ocean, on the south by the Gulf of Mexico and Mexico, and on the west by the Pacific Ocean. The United States is the fourth largest country in the world in area (after Russia, Canada, and China). The national capital is Washington, which is coextensive with the District of Columbia, the federal capital region created in 1790.",
      true
  );

  await _player.writeChunk(resp);
}

-> Read everything before writing. This worked, but the voice quality is horrible!

I did check your Android code and I can see nothing wrong.

Any ideas?

Best regards: V

letisoft avatar Dec 12 '24 20:12 letisoft

I had the same problem. @alnitak figured out that the chunks coming from OpenAI are too small to be played smoothly; the best option is to buffer the chunks before playing them.

prototype

    /// Since the chunks coming from OpenAI can be really small and can
    /// contain an odd number of bytes, we use a buffer here. When the buffer
    /// reaches [chunkSize], we yield the bytes, so we are sure that we
    /// deliver an even number of bytes of a consistent size.
    final buffer = BytesBuilder();
    var remainder = Uint8List(0);
    const chunkSize = 1024 * 2; // 2 KB of audio data
    var count = 0;
    // Read and yield chunks from the response stream
    await for (final chunk in response.stream) {
      buffer.add(chunk);
      count++;
      debugPrint('YIELD count: $count  buffer: ${buffer.length} bytes');

      while (buffer.length >= chunkSize) {
        final bufferBytes = buffer.toBytes();
        final chunk = Uint8List.sublistView(bufferBytes, 0, chunkSize);
        debugPrint('Chunk: ${chunk.length} bytes');
        yield chunk;

        remainder = Uint8List.sublistView(bufferBytes, chunkSize);
        buffer
          ..clear()
          ..add(remainder);
      }
    }
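    // Flush whatever is still buffered. Yielding `remainder` here would miss
    // bytes that were added after the last full chunk was emitted.
    if (buffer.isNotEmpty) yield buffer.takeBytes();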
  }
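The same buffering idea can be written as a standalone stream transformer. This is a sketch under stated assumptions, not the plugin's or OpenAI's API: the function name `rechunk` is hypothetical, the audio is assumed to be 16-bit PCM (2 bytes per sample), and `chunkSize` is assumed to be even. It regroups an incoming byte stream into fixed-size chunks and drops a trailing odd byte so that no sample is ever split.

```dart
import 'dart:async';
import 'dart:typed_data';

/// Regroups [source] into chunks of exactly [chunkSize] bytes.
/// Assumption: 16-bit PCM, so [chunkSize] must be even to keep samples whole.
/// The last chunk may be shorter; a trailing odd byte is dropped.
Stream<Uint8List> rechunk(
  Stream<List<int>> source, {
  int chunkSize = 2048,
}) async* {
  assert(chunkSize > 0 && chunkSize.isEven);
  final pending = BytesBuilder();

  await for (final part in source) {
    pending.add(part);
    if (pending.length < chunkSize) continue;

    // takeBytes() returns the buffered bytes and empties the builder.
    final bytes = pending.takeBytes();
    var offset = 0;
    while (bytes.length - offset >= chunkSize) {
      yield Uint8List.sublistView(bytes, offset, offset + chunkSize);
      offset += chunkSize;
    }
    // Keep the incomplete tail for the next round.
    if (offset < bytes.length) {
      pending.add(Uint8List.sublistView(bytes, offset));
    }
  }

  // End of stream: emit what is left, rounded down to whole samples.
  final tail = pending.takeBytes();
  final evenLength = tail.length - (tail.length % 2);
  if (evenLength > 0) yield Uint8List.sublistView(tail, 0, evenLength);
}
```

Each chunk produced this way could then be passed to `_player.writeChunk(...)` from an `await for` loop. Taking the bytes once per incoming part, instead of copying the whole buffer on every loop iteration, keeps the copying cost proportional to the data received.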

callmephil avatar Dec 18 '24 20:12 callmephil

Thank you so much. This answer really helped me a lot.

erkamkavak avatar Apr 10 '25 20:04 erkamkavak