Unexpected Audio Format Change After Opening Default Recording Device in SDL

Open osenberg-x opened this issue 7 months ago • 2 comments

  1. Before opening SDL_AUDIO_DEVICE_DEFAULT_RECORDING, calling SDL_GetAudioDeviceFormat returns the default microphone format as SDL_AUDIO_S16LE.
  2. After executing: stream_in_ = SDL_OpenAudioDeviceStream(SDL_AUDIO_DEVICE_DEFAULT_RECORDING, nullptr, nullptr, nullptr); the microphone format changes to SDL_AUDIO_F32LE.
  3. Forcing the stream to use SDL_AUDIO_S16LE causes a crash at SDL_audiotypecvt.c:427 during SDL_GetAudioStreamData.

Reproduction Steps:

// log_debug, sample_frame, buffer_ and buffer_size belong to the reporter's application and are not shown in the report.

// Step 1: Query the default recording device before opening it (reports SDL_AUDIO_S16LE)
auto id_ = SDL_AUDIO_DEVICE_DEFAULT_RECORDING;
SDL_AudioSpec spec;
SDL_GetAudioDeviceFormat(id_, &spec, nullptr);
log_debug("list device freq: {}, channels: {}, format: {}", spec.freq,
          spec.channels, (int)spec.format);

// Step 2: Open stream (format changes to SDL_AUDIO_F32LE)
SDL_AudioSpec stream_spec_;
stream_spec_.freq = 48000;
stream_spec_.channels = 2;
stream_spec_.format = SDL_AUDIO_S16LE;
SDL_AudioStream* stream_in_ = SDL_OpenAudioDeviceStream(id_, &stream_spec_, nullptr, nullptr);

SDL_AudioSpec dev_spec_;
SDL_GetAudioDeviceFormat(id_, &dev_spec_, nullptr);
log_debug("device freq: {}, channels: {}, format: {}", dev_spec_.freq,
          dev_spec_.channels, (int)dev_spec_.format);

// Passing stream_spec_ as the source spec overwrites it with the stream's input-side format.
SDL_GetAudioStreamFormat(stream_in_, &stream_spec_, nullptr);
log_debug("origin stream freq: {}, channels: {}, format: {}, sample frame: {}", stream_spec_.freq, stream_spec_.channels,
          (int)stream_spec_.format, sample_frame);

// Step 3: Force S16LE and read data → CRASH
SDL_GetAudioStreamData(stream_in_, buffer_, buffer_size);

Environment: SDL version 3.2.10; OS: Windows 11

osenberg-x avatar Apr 28 '25 02:04 osenberg-x

(Info dump incoming...)

Okay, so the way this works on Windows is that we find all the audio devices at startup and report the format the hardware claims to want...but then WASAPI comes along, when opening the device, and says "I want everything in float32 format" because the device is in shared mode.

We get told "give me float data" by IAudioClient::GetMixFormat, which says this in the docs:

The mix format is the format that the audio engine uses internally for digital processing of shared-mode streams. This format is not necessarily a format that the audio endpoint device supports. Thus, the caller might not succeed in creating an exclusive-mode stream with a format obtained by calling GetMixFormat.

For example, to facilitate digital audio processing, the audio engine might use a mix format that represents samples as floating-point values. If the device supports only integer PCM samples, then the engine converts the samples to or from integer PCM values at the connection between the device and the engine.

And since the format might be different, we take this moment during device open to update SDL with the new information.
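To make the conversion mentioned in the quoted docs concrete, here is a minimal sketch of turning float32 samples into signed 16-bit PCM. It is not the code WASAPI or SDL actually uses; the function names are hypothetical, and the scale factor of 32767 with clamping is just one common convention (other implementations scale by 32768 or add dithering).

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert one float32 sample to signed 16-bit PCM.
// Input is expected in [-1.0, 1.0]; anything outside is clamped first.
// Requires C++17 for std::clamp.
inline std::int16_t float_to_s16(float sample)
{
    const float clamped = std::clamp(sample, -1.0f, 1.0f);
    // Scale to the symmetric range [-32767, 32767] and round to nearest.
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Hypothetical helper: convert a whole buffer of interleaved float32 samples.
// The channel layout is untouched because each sample is converted on its own.
inline std::vector<std::int16_t> convert_f32_to_s16(const std::vector<float>& in)
{
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (float s : in) {
        out.push_back(float_to_s16(s));
    }
    return out;
}
```

The point of the sketch is only that the sample type on one side of the engine does not have to match the other side: the same audio can be handed over as float32 or as Sint16, at the cost of a per-sample conversion.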

This isn't ideal, to be sure. There are two ways to resolve this:

  • When using WASAPI, we could mark every device as float32 and assume this will be correct (although maybe not if it decides to make other changes during device open! But it's probably more correct than now).
  • Open the device in exclusive mode, which will likely give you the same format as we originally reported (I assume), and a little lower latency, too, but it will make all your background audio--including the music player you're running in the background during the game, or the YouTube video you're listening to during a tedious part of a game--stop working.

As for the crash: it shouldn't be crashing in these cases, even though the format changes. So in this code:

// Step 3: Force S16LE and read data → CRASH
SDL_GetAudioStreamData(stream_in_, buffer_, buffer_size);

...what does "force S16LE" look like? Where is the crash happening?

You should absolutely be able to set the output of a recording audiostream to whatever you want and get data in that format, whether the device decided to feed float32 or Sint16 into the other end of the stream.

icculus avatar Apr 30 '25 20:04 icculus

“...what does "force S16LE" look like? Where is the crash happening?”
Thank you very much for your reply.

  1. In the second step of the code example, when opening the device using SDL_OpenAudioDeviceStream, I specified Sint16 as the stream format:

     stream_spec_.freq = 48000;
     stream_spec_.channels = 2;
     stream_spec_.format = SDL_AUDIO_S16LE;
     SDL_AudioStream* stream_in_ = SDL_OpenAudioDeviceStream(SDL_AUDIO_DEVICE_DEFAULT_RECORDING, &stream_spec_, nullptr, nullptr);

  2. The crash occurs at SDL_audiotypecvt.c:427 during SDL_GetAudioStreamData.

osenberg-x avatar May 06 '25 08:05 osenberg-x

In SDL 3.2.18, the WASAPI backend will report devices with the correct (float32) format right from the start.

icculus avatar Jul 11 '25 20:07 icculus

I haven't been able to reproduce a crash here, but I'll move this to 3.4.0 for more consideration.

icculus avatar Jul 11 '25 20:07 icculus

Closing this one, but please open a new bug (with steps to reproduce/simple example program that triggers it, please!) if this is still happening.

icculus avatar Aug 27 '25 00:08 icculus