flutter-sound-stream
voice is really bad when streaming (ish)
Hello,
I am trying to get this plugin to work with the OpenAI speech API. Please note that I got the same thing working in my native Android app, so I guess I know what I am doing.
Test1
_player.initialize(sampleRate: 24000);
await _player.initialize();
await _player.start();
Stream<List<int>> resp = await EncodeDecode.getSpeechStream("key",
"United States, country in North America, a federal republic of 50 states. Besides the 48 conterminous states that occupy the middle latitudes of the continent, the United States includes the state of Alaska, at the northwestern extreme of North America, and the island state of Hawaii, in the mid-Pacific Ocean. The conterminous states are bounded on the north by Canada, on the east by the Atlantic Ocean, on the south by the Gulf of Mexico and Mexico, and on the west by the Pacific Ocean. The United States is the fourth largest country in the world in area (after Russia, Canada, and China). The national capital is Washington, which is coextensive with the District of Columbia, the federal capital region created in 1790.",
true
);
List<int> buffer = [];
StreamSubscription<List<int>> subscription = resp.listen(
(List<int> chunk) async{
buffer.addAll(chunk);
if(buffer.length > 5000){
await _player.writeChunk(Uint8List.fromList(chunk));
buffer = [];
}
},
onDone: (){
debugPrint("done");
},
onError: (Error err,StackTrace){
debugPrint("error");
}
);
Result: no voice, just noise. You can hardly recognize some words. Please also note that on native Android (Java) I had a similar issue, where the buffer written needs to be at least the min buffer size: int minBuffer = AudioTrack.getMinBufferSize( sampleRateHz, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT );
but this didn't help here.
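For a sense of scale, here is a rough sketch that converts a byte threshold such as the 5000 used in Test1 into playback time, and checks whether a byte count ends on a whole sample. It assumes raw mono 16-bit PCM at the 24000 Hz passed to initialize() above; pcmDuration and isFrameAligned are hypothetical helper names, not part of the plugin.

```dart
/// Playback time covered by [byteCount] bytes of raw PCM audio.
/// Defaults are assumptions taken from the post: 24 kHz, mono, 16-bit.
Duration pcmDuration(
  int byteCount, {
  int sampleRate = 24000,
  int channels = 1,
  int bytesPerSample = 2,
}) {
  final bytesPerSecond = sampleRate * channels * bytesPerSample;
  return Duration(microseconds: byteCount * 1000000 ~/ bytesPerSecond);
}

/// True when [byteCount] ends on a whole sample frame, i.e. no 16-bit
/// sample would be split between two writes.
bool isFrameAligned(
  int byteCount, {
  int channels = 1,
  int bytesPerSample = 2,
}) =>
    byteCount % (channels * bytesPerSample) == 0;

void main() {
  print(pcmDuration(5000).inMilliseconds); // prints 104
  print(isFrameAligned(5000)); // prints true
  print(isFrameAligned(5001)); // prints false
}
```

Under these assumptions one second of audio is 48000 bytes, so a 5000-byte write covers only about a tenth of a second, and a write with an odd byte count would split a sample in two.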
Test2
I assume the problem is List
_player.initialize();
await _player.initialize();
await _player.start();
Uint8List resp = await EncodeDecode.getSpeechStream2("key",
"United States, country in North America, a federal republic of 50 states. Besides the 48 conterminous states that occupy the middle latitudes of the continent, the United States includes the state of Alaska, at the northwestern extreme of North America, and the island state of Hawaii, in the mid-Pacific Ocean. The conterminous states are bounded on the north by Canada, on the east by the Atlantic Ocean, on the south by the Gulf of Mexico and Mexico, and on the west by the Pacific Ocean. The United States is the fourth largest country in the world in area (after Russia, Canada, and China). The national capital is Washington, which is coextensive with the District of Columbia, the federal capital region created in 1790.",
true
);
await _player.writeChunk(resp);
}
-> Read everything before writing: this worked, but the voice quality is horrible!
I checked your Android code and can see nothing wrong.
Any ideas?
Best regards: V
I had the same problem. @alnitak figured out that the chunks coming from OpenAI are too small to be played smoothly; the best option is to buffer the chunks before playing them.
/// Since the chunks coming from OpenAI can be very small and can have an
/// odd length, we use a buffer here. When the buffer reaches [chunkSize]
/// bytes, we yield them, so that every chunk we deliver has a consistent,
/// even size.
final buffer = BytesBuilder();
var remainder = Uint8List(0);
const chunkSize = 1024 * 2; // 2 KB of audio data
var count = 0;
// Read and yield chunks from the response stream
await for (final chunk in response.stream) {
buffer.add(chunk);
count++;
debugPrint('YIELD count: $count buffer: ${buffer.length} bytes');
while (buffer.length >= chunkSize) {
final bufferBytes = buffer.toBytes();
final chunk = Uint8List.sublistView(bufferBytes, 0, chunkSize);
debugPrint('Chunk: ${chunk.length} bytes');
yield chunk;
remainder = Uint8List.sublistView(bufferBytes, chunkSize);
buffer
..clear()
..add(remainder);
}
}
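// Flush whatever is still buffered. `remainder` can be stale here if more
// data arrived after the last full chunk was yielded.
if (buffer.isNotEmpty) yield buffer.toBytes();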
}
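The buffering idea above can also be sketched as a standalone function that runs without a network call. This is only an illustration under stated assumptions: rechunk is a hypothetical name, the source is any Stream<List<int>> of raw bytes, and chunkSize is even so that 16-bit samples are never split between full chunks.

```dart
import 'dart:async';
import 'dart:typed_data';

/// Regroups a byte stream into chunks of exactly [chunkSize] bytes.
/// Only the last chunk may be shorter: it carries whatever is left over
/// when the source stream ends.
Stream<Uint8List> rechunk(
  Stream<List<int>> source, {
  int chunkSize = 2048,
}) async* {
  assert(chunkSize > 0 && chunkSize.isEven);
  final pending = BytesBuilder();
  await for (final chunk in source) {
    pending.add(chunk);
    if (pending.length < chunkSize) continue;

    // Take everything buffered so far and slice off full chunks.
    final bytes = pending.takeBytes();
    var offset = 0;
    while (bytes.length - offset >= chunkSize) {
      yield Uint8List.sublistView(bytes, offset, offset + chunkSize);
      offset += chunkSize;
    }
    // Keep the incomplete tail for the next round.
    if (offset < bytes.length) {
      pending.add(Uint8List.sublistView(bytes, offset));
    }
  }
  // Flush the tail once the source is done.
  if (pending.isNotEmpty) yield pending.takeBytes();
}

Future<void> main() async {
  // Simulated network chunks of uneven sizes: 3 + 5 + 1 + 4 = 13 bytes.
  final source = Stream<List<int>>.fromIterable([
    List.filled(3, 1),
    List.filled(5, 2),
    List.filled(1, 3),
    List.filled(4, 4),
  ]);
  final sizes =
      await rechunk(source, chunkSize: 4).map((c) => c.length).toList();
  print(sizes); // prints [4, 4, 4, 1]
}
```

Each yielded chunk could then be handed to the player's writeChunk. Note that the final chunk still has an odd length if the total byte count is odd; this sketch does not pad or trim it.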
Thank you so much. This answer really helped me a lot.