DPP: Audio artifacts when streaming uncompressed audio w/ send_audio_raw

Running tip, 45c509fb763f0b6d34494d546c802986ac15d648.

I'm trying to stream uncompressed PCM audio into DPP in this manner:

const int kAudioBufferCount = 4; // length of the audio buffer chain in DPP

bot.on_voice_buffer_send([&](const dpp::voice_buffer_send_t& event) {
	int buffer_size = event.buffer_size;

	while (buffer_size < kAudioBufferCount) {
		// Get exactly 11520 bytes of 48 kHz, stereo, signed 16-bit PCM
		PCM block = PopAudio();

		event.voice_client->send_audio_raw((uint16_t*)block.data, block.size);

		++buffer_size;
	}
});

This nearly works - I hear the streamed sound from Discord Web - but there are frequent glitches.

I added some code to measure the rate at which I'm passing samples into DPP, and I'm measuring ~47364 Hz, not 48 kHz. I'm definitely not underflowing -- event.buffer_size doesn't drop below 3.

The artifacts sound like skipped chunks of the waveform to me, not resampling.

Any idea why DPP would be requesting at less than the full rate? Or is the above call pattern incorrect for a streaming use case?
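For reference, the numbers in this post can be checked with a little arithmetic. The sketch below is only an illustration: the constants come from the format stated above (48 kHz, stereo, signed 16-bit, 11520-byte blocks), while the function names are made up for this example and are not part of DPP. It shows that one block holds 2880 sample frames, or 60 ms of audio, and that a measured 47364 Hz is roughly 1.3% short of the nominal rate.

```cpp
#include <cassert>
#include <cstddef>

// Format stated in the post: 48 kHz, stereo, signed 16-bit PCM.
constexpr int kSampleRateHz = 48000;
constexpr int kChannels = 2;
constexpr int kBytesPerSample = 2;
constexpr std::size_t kBlockBytes = 11520;

// Number of sample frames (one sample per channel) in a block of PCM bytes.
// Hypothetical helper for illustration only.
constexpr std::size_t FramesPerBlock(std::size_t block_bytes) {
    return block_bytes / (kChannels * kBytesPerSample);
}

// Playback duration of one block in milliseconds at the nominal rate.
constexpr double BlockDurationMs(std::size_t block_bytes) {
    return 1000.0 * FramesPerBlock(block_bytes) / kSampleRateHz;
}

// Fraction of the nominal rate that is missing at a measured rate.
constexpr double RateShortfall(double measured_hz) {
    return 1.0 - measured_hz / kSampleRateHz;
}
```

With these helpers, FramesPerBlock(kBlockBytes) is 2880, BlockDurationMs(kBlockBytes) is 60.0, and RateShortfall(47364.0) is about 0.013.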

Jun 29 '22 00:06 jritts

Don't send audio inside on_voice_buffer_send; it is going to be near impossible to get the timing right. Calling send_audio_raw does Opus encoding and sodium encryption of the block, and doing this continually with really small amounts of audio will cause artifacts. Instead, encode larger amounts, sending bigger buffers where possible, and let the library stream it.
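One way to read this advice is to collect small blocks on the capture side and hand them over in larger batches from outside the callback. The sketch below is a rough illustration under that assumption: PcmAccumulator, Push, and TakeBatch are hypothetical names invented here, not DPP API, and the threshold is whatever batch size the caller chooses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper (not part of DPP): collects small PCM blocks and hands
// them back as one larger batch once at least `threshold_bytes` are queued.
class PcmAccumulator {
public:
    explicit PcmAccumulator(std::size_t threshold_bytes)
        : threshold_bytes_(threshold_bytes) {}

    // Append a block of raw PCM bytes. Guarded by a mutex so the capture
    // thread and the sending thread can use the same instance.
    void Push(const std::uint8_t* data, std::size_t size) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.insert(pending_.end(), data, data + size);
    }

    // If enough audio is queued, move all of it into `out` and return true.
    // Otherwise leave the queue untouched and return false.
    bool TakeBatch(std::vector<std::uint8_t>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.size() < threshold_bytes_) {
            return false;
        }
        out = std::move(pending_);
        pending_.clear();  // put the moved-from vector back in a known state
        return true;
    }

private:
    std::size_t threshold_bytes_;
    std::mutex mutex_;
    std::vector<std::uint8_t> pending_;
};
```

A batch returned by TakeBatch would then be passed to send_audio_raw in a single call, outside on_voice_buffer_send.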

Jun 29 '22 01:06 braindigitalis

The loop at the top of discord_voice_client::send_audio_raw recursively calls itself, breaking up longer buffers into 11520-byte chunks. I'm already sending the maximum 11520 bytes per call to avoid the extra buffering and possible silence insertion on line 1102.

Given the output is buffered and not underflowing, why would there be timing considerations?

const size_t max_frame_bytes = 11520;
if (length > max_frame_bytes) {
	std::string s_audio_data((const char*)audio_data, length);
	while (s_audio_data.length() > max_frame_bytes) {
		std::string packet(s_audio_data.substr(0, max_frame_bytes));
		s_audio_data.erase(s_audio_data.begin(), s_audio_data.begin() + max_frame_bytes);
		if (packet.size() < max_frame_bytes) {
			packet.resize(max_frame_bytes, 0);
		}
		send_audio_raw((uint16_t*)packet.data(), max_frame_bytes);
	}

	return *this;
}
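To make the chunking idea concrete outside of DPP, here is a standalone sketch of splitting a byte buffer into fixed-size frames and zero-padding the final partial frame. It is an illustration only, not DPP's implementation: SplitIntoFrames is a hypothetical name, it iterates instead of recursing, and it assumes frame_bytes is greater than zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not DPP code): split `length` bytes at `data` into
// frames of exactly `frame_bytes` bytes. A final partial frame is padded
// with zeros, which is silence for signed PCM. Assumes frame_bytes > 0.
std::vector<std::vector<std::uint8_t>> SplitIntoFrames(
    const std::uint8_t* data, std::size_t length, std::size_t frame_bytes) {
    std::vector<std::vector<std::uint8_t>> frames;
    for (std::size_t offset = 0; offset < length; offset += frame_bytes) {
        // Bytes of real audio available for this frame.
        const std::size_t n = std::min(frame_bytes, length - offset);
        // Start from an all-zero frame so any unused tail is already padded.
        std::vector<std::uint8_t> frame(frame_bytes, 0);
        std::copy(data + offset, data + offset + n, frame.begin());
        frames.push_back(std::move(frame));
    }
    return frames;
}
```

For example, a 25000-byte buffer with frame_bytes = 11520 yields three frames, the last holding 1960 bytes of audio followed by zeros.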

Jun 29 '22 04:06 jritts

Never mind, I read through the implementation and see what you mean by timing.

I tried calling send_audio_raw from my streaming audio input thread, and I'm getting a repeatable crash in discord_voice_client::encode(). I've attached the stack.
Next I tried encoding to Opus myself and passing it to send_audio_opus, and it finally appears to be working, with no glitches out of Discord Web.

[Attached screenshot: crash_discord]

Jun 29 '22 07:06 jritts

You're writing inside the write event, causing a recursion. The error is a stack overflow; it's the way your code is written.

Jun 29 '22 18:06 braindigitalis

No, as I said, now I am calling send_audio_raw from my streaming audio input thread, not a DPP callback. There are no DPP frames above my call to send_audio_raw. As you can see in the screenshot, the overflow is happening within the call to opus_encode and it's not a recursion between DPP and client code.

Jun 29 '22 18:06 jritts

This issue has had no activity and is being marked as stale. If you still wish to continue with this issue please comment to reopen it.

Aug 29 '22 03:08 github-actions[bot]
