Sound is distorted on WASAPI only
On Windows 11, using wasapi we observed that sounds play back with some distortion.
Spectrogram comparison made by my colleague: Top image is playback with dsound, bottom is with wasapi. You can see some clear dark bands in the bottom image which I think means frequencies getting removed.
Backend was changed using alsoft.ini. On my machine the sound is correct with all backends except wasapi.
Console log with ALSOFT_LOGLEVEL=3:
[ALSOFT] (II) Initializing library v1.24.3-dc7d7054 master
[ALSOFT] (II) Supported backends: wasapi, dsound, winmm, jack, null, wave
[ALSOFT] (II) Loading config C:\Users\Dave\AppData\Roaming\alsoft.ini...
[ALSOFT] (II) setting 'drivers' = 'wasapi,winmm,jack,dsound,wave,null'
[ALSOFT] (II) setting 'decoder/hq-mode' = 'true'
[ALSOFT] (II) Got binary: C:\myth2, Myth II 1.8.5.exe
[ALSOFT] (II) Loading config C:\myth2\alsoft.ini...
[ALSOFT] (II) Vendor ID: "AuthenticAMD"
[ALSOFT] (II) Name: "AMD Ryzen 7 PRO 5875U with Radeon Graphics"
[ALSOFT] (II) Extensions: +SSE +SSE2 +SSE3 +SSE4.1
[ALSOFT] (II) Found option drivers = "wasapi,winmm,jack,dsound,wave,null"
[ALSOFT] (II) Starting watcher thread
[ALSOFT] (II) Initialized backend "wasapi"
[ALSOFT] (II) Added "wasapi" for playback
[ALSOFT] (II) Added "wasapi" for capture
[ALSOFT] (II) Got device "Speakers (Realtek(R) Audio)", "{1521687A-A2FA-435F-8636-13820691DFC5}", "{0.0.0.00000000}.{1521687a-a2fa-435f-8636-13820691dfc5}"
[ALSOFT] (II) Got device "1 - DELL S2722QC (AMD High Definition Audio Device)", "{4D442B3F-F140-4E56-979C-0EF9A23D4E95}", "{0.0.0.00000000}.{4d442b3f-f140-4e56-979c-0ef9a23d4e95}"
[ALSOFT] (II) Got device "Realtek HD Audio 2nd output (Realtek(R) Audio)", "{A31D68DB-866A-4CF0-889F-2A2D94B13797}", "{0.0.0.00000000}.{a31d68db-866a-4cf0-889f-2a2d94b13797}"
[ALSOFT] (II) Got device "Microphone Array (Realtek(R) Audio)", "{18801BEE-F487-4B09-B019-84DB62FECA4E}", "{0.0.1.00000000}.{18801bee-f487-4b09-b019-84db62feca4e}"
[ALSOFT] (II) Got device "Stereo Mix (Realtek(R) Audio)", "{95A92A4C-D024-4A68-8A24-B47675B301F5}", "{0.0.1.00000000}.{95a92a4c-d024-4a68-8a24-b47675b301f5}"
[ALSOFT] (II) Watcher thread started
[ALSOFT] (II) Opening playback device "OpenAL Soft"
[ALSOFT] (II) Created device 0xa6b9e80, "OpenAL Soft on Realtek HD Audio 2nd output (Realtek(R) Audio)"
[ALSOFT] (II) Pre-reset: Stereo, Float32, 48000hz, 960 / 2880 buffer
[ALSOFT] (II) Device mix format: FormatTag = 0xfffe Channels = 2 SamplesPerSec = 48000 AvgBytesPerSec = 384000 BlockAlign = 8 BitsPerSample = 32 Size = 22 Samples = 32 ChannelMask = 0x3 SubFormat = {00000003-0000-0010-8000-00aa00389b71}
[ALSOFT] (II) Requesting playback format: FormatTag = 0xfffe Channels = 2 SamplesPerSec = 48000 AvgBytesPerSec = 384000 BlockAlign = 8 BitsPerSample = 32 Size = 22 Samples = 32 ChannelMask = 0x3 SubFormat = {00000003-0000-0010-8000-00aa00389b71}
[ALSOFT] (II) Post-reset: Stereo, Float32, 48000hz, 512 / 2880 buffer
[ALSOFT] (II) Searching C:\myth2 for *.mhr
[ALSOFT] (II) Adding built-in entry "!1_Built-In HRTF"
[ALSOFT] (II) Loading !1_Built-In HRTF...
[ALSOFT] (II) Detected data set format v3
[ALSOFT] (II) Loaded HRTF Built-In HRTF for sample rate 48000hz, 64-sample filter
[ALSOFT] (II) 1st order + Full HRTF rendering enabled, using "Built-In HRTF"
[ALSOFT] (II) Channel config, Main: 4, Real: 2
[ALSOFT] (II) Allocating 6 channels, 24576 bytes
[ALSOFT] (II) Min delay: 7.75, max delay: 33.50, FIR length: 64
[ALSOFT] (II) New max delay: 25.75, FIR length: 90
[ALSOFT] (II) Max sources: 256 (255 + 1), effect slots: 64, sends: 4
[ALSOFT] (II) Dithering disabled
[ALSOFT] (II) Output limiter disabled
[ALSOFT] (II) Fixed device latency: 0ns
[ALSOFT] (II) Post-start: Stereo, Float32, 48000hz, 512 / 2880 buffer
[ALSOFT] (II) Increasing allocated voices to 256
[ALSOFT] (II) Created context 0xa6d5150
[ALSOFT] (II) Increasing allocated context properties to 2
[ALSOFT] (II) Increasing allocated voice properties to 32
Looks like it's because HRTF is being auto-enabled, since the device is being detected as headphones. The default HRTF does have a lower bass response, which would account for darker bands in the lower frequencies. You can disable it in alsoft.ini by setting:
[general]
stereo-encoding = basic # or uhj
When running with [general] stereo-encoding = basic, the sound is normal again, so I guess your theory is correct. Why is HRTF not getting enabled on other backends though?
In any case, I feel this behavior is not as it should be. With the default settings, the sound is just bad. To me it sounds like the quality has been shot, like a low bit rate mp3 or something. I think that either the default processing is buggy somehow, or if it's working as intended then it should not be a default.
As-is, since this is a user setting with no way for applications to disable it, I would hesitate to make OpenAL the default for our app because I wouldn't want users getting distorted sound by default.
HRTF can be controlled by the app using the ALC_SOFT_HRTF extension. It allows the app to enable or disable HRTF (or let it be selected or not automatically based on what the system reports the device is), and enumerate/select available HRTFs. There's also the AL_SOFT_direct_channels/AL_SOFT_direct_channels_remix extension(s) allowing individual non-3D sounds to skip panning virtualization/HRTF mixing (generally discouraged for surround formats, as HRTF will give a virtual surround effect for them, but is good for stereo sources that are already designed to work on headphones).
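As a rough illustration of the app-side control described above, the sketch below builds the context attribute list from a three-way setting, so that "automatic" stays available alongside "off" and "on". The `hrtf_pref` enum and `build_hrtf_attrs` function are hypothetical names made up for this example. The token values are repeated from AL/alext.h only so the snippet compiles without the OpenAL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Token values from the ALC_SOFT_HRTF extension, as listed in AL/alext.h.
 * Repeated here only so the sketch compiles on its own; a real app would
 * include <AL/alc.h> and <AL/alext.h> instead. */
#ifndef ALC_HRTF_SOFT
#define ALC_HRTF_SOFT      0x1992
#define ALC_DONT_CARE_SOFT 0x0002
#endif
#ifndef ALC_FALSE
#define ALC_FALSE 0
#define ALC_TRUE  1
#endif

/* Hypothetical app-side setting, e.g. backed by a "Headphones" option. */
enum hrtf_pref { HRTF_PREF_AUTO, HRTF_PREF_OFF, HRTF_PREF_ON };

/* Fills attrs (room for 3 ints) with a zero-terminated attribute list
 * that can be passed to alcCreateContext or alcResetDeviceSOFT. */
void build_hrtf_attrs(enum hrtf_pref pref, int attrs[3])
{
    assert(attrs != NULL);
    attrs[0] = ALC_HRTF_SOFT;
    switch (pref) {
    case HRTF_PREF_OFF: attrs[1] = ALC_FALSE; break;          /* force off */
    case HRTF_PREF_ON:  attrs[1] = ALC_TRUE;  break;          /* force on */
    default:            attrs[1] = ALC_DONT_CARE_SOFT; break; /* let the library decide */
    }
    attrs[2] = 0; /* list terminator */
}
```

With the real headers, the filled array would be used as `alcCreateContext(device, attrs)`, or passed to `alcResetDeviceSOFT` to change the setting on an already open device.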
HRTF can be somewhat subjective. It's designed to simulate sounds coming from some external point around you, rather than sound as if it's directly on/in your ears, which can be perceived as "wrong" or "off" if that's not expected (e.g. listening to music with some audio app that completely ignores the difference between headphones and speakers, then listening to the same music with OpenAL Soft which simulates stereo speakers for stereo sources on headphones, making it more consistent between playing on headphones and speakers). And since it's a filter modeling the interaction with a head and ears, for some people it can sound more "off" depending on how different the modeled head and ears are compared to the listener. The built-in one should be a good generic HRTF; not the greatest, but not bad for most people. OpenAL Soft has the ability to use custom HRTFs, so a user can look for and pick one that works best for them individually, with config options to select it for all apps using OpenAL Soft (and the aforementioned ALC_SOFT_HRTF extension allowing individual apps to enable or select available HRTFs for themselves).
I continue to look for other HRTFs that can work as a better built-in default. The attenuated bass is something that's not great with the current default and is better in others. But between licensing (there's not many that allow unencumbered redistribution in commercial and non-commercial uses as OpenAL Soft allows; the Viking dataset being the only other ones I've found that seem to have a suitable license), and the inability to really gauge on my own how well a given one will work for most listeners (which of the 20 it provides would be a good default, if any?), I'm hesitant to change it without consulting with people that can give more insight.
Not enabling with auto-detection by default is also something I think about from time to time. Though since most OpenAL apps don't provide options to enable it, most people probably wouldn't realize it's an option they can enable for improved 3D audio on headphones, and the built-in HRTF should be okay or good for most people, it seems a good way to give an improved out-of-the-box experience with headphones.
As for why it's not getting enabled on some backends, some backends can't check that information (e.g. the PipeWire or WinMM backends can't). DirectSound technically can, but it depends on the IDirectSound::GetSpeakerConfig function returning DSSPEAKER_HEADPHONE. The WASAPI backend in contrast queries the PKEY_AudioEndpoint_FormFactor property on the IMMDevice to match Headphones or Headset.
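The two checks mentioned above can be sketched as plain predicates. This is only an illustration under stated assumptions, not the backends' actual code: the function names are made up, and the numeric values are local copies of the `EndpointFormFactor` enumeration (mmdeviceapi.h) and `DSSPEAKER_HEADPHONE` (dsound.h) so the snippet compiles without the Windows SDK.

```c
#include <stdbool.h>

/* Local copies of Windows SDK values, prefixed to avoid clashing with the
 * real headers. EndpointFormFactor: Speakers = 1, Headphones = 3, Headset = 5. */
enum { SKETCH_FF_SPEAKERS = 1, SKETCH_FF_HEADPHONES = 3, SKETCH_FF_HEADSET = 5 };
#define SKETCH_DSSPEAKER_HEADPHONE 0x1u

/* WASAPI-style check: form_factor is the value read from the
 * PKEY_AudioEndpoint_FormFactor property of the endpoint. */
bool wasapi_reports_headphones(unsigned form_factor)
{
    return form_factor == SKETCH_FF_HEADPHONES || form_factor == SKETCH_FF_HEADSET;
}

/* DirectSound-style check: speaker_config is the DWORD returned by
 * IDirectSound::GetSpeakerConfig. The configuration lives in the low byte;
 * the upper bits may carry speaker geometry. */
bool dsound_reports_headphones(unsigned long speaker_config)
{
    return (speaker_config & 0xFFul) == SKETCH_DSSPEAKER_HEADPHONE;
}
```

A device whose driver reports a headphone form factor to WASAPI but leaves the DirectSound speaker configuration at stereo would pass the first check and fail the second, which is consistent with HRTF being auto-enabled on only one of the two backends.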
Thank you for the in depth reply. I understand HRTF is by definition subjective but I'm skeptical about this harsh bass attenuation improving things for most people.
Anyways, I used ALC_HRTF_SOFT, ALC_FALSE at context creation and that indeed disables the HRTF.
However, is it possible for the user to override this in alsoft.ini? [General]stereo-encoding=hrtf and [General]hrtf=on both seem to get overridden by the app's choice.
Ideally I don't want to make it impossible for users to use HRTF if they desire it.
> Thank you for the in depth reply. I understand HRTF is by definition subjective but I'm skeptical about this harsh bass attenuation improving things for most people.
Too much bass does drown out the ability to localize a sound. But I do agree that the current default suppresses low frequencies more than is desirable, which is why I'm always on the lookout for a better one. The aforementioned Viking dataset is promising, being more modern and having an improved bass response, it's largely just a question of picking one that works well enough for most people (or finding some way to make one work better for more people).
> However, is it possible for the user to override this in alsoft.ini? [General]stereo-encoding=hrtf and [General]hrtf=on both seem to get overridden by the app's choice.
> Ideally I don't want to make it impossible for users to use HRTF if they desire it.
In older versions, the alsoft.ini options for HRTF would override the app request. But other developers had discussions about that, saying they didn't like global options overriding the app options, leading to their games having options that do nothing for users that may have unknowingly set ini options in the past. Since an app making use of these properties would be expected to provide the settings to users, it doesn't make much sense for the global ini to override them since the user can set what they want in the app.
That said, if I ever find games that do force specific settings without providing appropriate options to users, I won't hesitate to add override config options.
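The precedence described in this thread (an explicit app request first, then the ini setting, then headphone auto-detection) can be modelled roughly as follows. This is a simplified sketch based only on the behaviour reported here, not OpenAL Soft's implementation; the enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical tri-state used only in this sketch. */
enum hrtf_choice { HRTF_UNSPECIFIED, HRTF_DISABLE, HRTF_ENABLE };

/* Returns whether HRTF ends up enabled under the simplified model. */
bool resolve_hrtf(enum hrtf_choice app, enum hrtf_choice ini,
                  bool headphones_detected)
{
    if (app != HRTF_UNSPECIFIED)
        return app == HRTF_ENABLE;   /* explicit app request wins */
    if (ini != HRTF_UNSPECIFIED)
        return ini == HRTF_ENABLE;   /* otherwise the user's ini preference */
    return headphones_detected;      /* otherwise follow auto-detection */
}
```

Under this model, an app that passes ALC_FALSE shadows an ini request for HRTF, while an app that leaves the attribute unspecified keeps both the ini setting and auto-detection in play.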
In my opinion the way things work right now is all backwards.
a) HRTF shouldn't be on by default because it results in non-standard sounds. Maybe this is more obvious in a different setting: Imagine if OpenGL and DirectX produced totally different images by default because one used an "eye related transfer function" and the other didn't. Without reproducibility it would be chaos.
b) The ini file should override the application because you can't in general rely on apps providing settings.
The way I see it, HRTF is a power user / audiophile thing that most users and developers shouldn't have to deal with. As a user I don't want it, and as a developer I don't want to write special code to disable it and then additionally have to add an obscure option to my UI just to support the one person (or possibly zero) who wants to use it.
The way things are set up currently it is being forced on us. I spent the majority of a working day trying to track down why things sounded wrong. I started out assuming it was a bug in my code and basically eliminated every possible cause of the problem before luckily discovering it only happened with one OpenAL backend.
> Maybe this is more obvious in a different setting: Imagine if OpenGL and DirectX produced totally different images by default because one used an "eye related transfer function" and the other didn't.
That's not a good analogy since it's not that OpenAL Soft is inherently different, only that it recognizes when a specific device is different and tries to accommodate it. A closer analogy would be a VR headset being treated exactly like a normal monitor in Direct3D, while OpenGL gave you properly spaced and oriented per-eye renders automatically on VR headsets. In this case, an "eye related transfer function" would be equivalent to each eye getting an appropriate render, with correct offsets and focal points with overlapping fields of vision. They both know you have a VR headset and both can do proper per-eye rendering and both can enable/disable it as desired, but one does it automatically when outputting on a VR headset and the other requires the app or user to specify manually. Of course, OpenGL/Vulkan and Direct3D can't actually render proper VR automatically like that (it'd be nice if they could), but OpenAL can do headphone/HRTF rendering automatically. And it makes sense to me that if the user has the device configured as headphones, it should be accommodated as headphones instead of acting like they're plain speakers despite knowing better.
I'm not sure I agree that HRTF is a power user / audiophile thing. It's not just about quality or consistency, listening to audio meant for speakers on headphones is known to cause listening fatigue, making it tiring or uncomfortable to listen to for extended periods, as it creates an undesirable on-the-ear effect for sounds panned left or right and an abnormal in-the-head effect for sounds panned toward the center, from the lack of any natural crossfeed and filtering the brain expects to hear. It's similar to how users get eyestrain when graphics don't correctly account for VR displays needing proper per-eye renders, as the eyes can't focus as they expect to. It's not what our brains expect and causes a physical strain on the senses to deal with it.
Most games don't call out HRTF specifically as that's a more technical term, it's often just labelled as "Headphones" to make it clear for people to select it if that's what they have. In other cases, a game may simply use Windows Sonic or Dolby Atmos, and those can provide an HRTF mix over headphones (and similarly not mention "HRTF" explicitly, and just call it "... for Headphones" or whatever).
That's my view of it anyway. I am just one person, and while I have gotten positive and supportive feedback from others, I won't say I know I'm among the majority. If enough people feel the same as you do, it's worth taking into account. Maybe a better built-in HRTF with more bass won't be perceived as poorly when auto-enabled, and it would be worth just trying an alternative. There are options other than HRTF for headphones too, like BS2B (it's technically an HRTF, but not in the sense that is generally understood). It's subtler and less exact, only aiming to reduce the issues caused by listening to plain stereo over headphones rather than provide a proper headphone mix, but it may be a better compromise. OpenAL Soft has an option for that.
> b) The ini file should override the application because you can't in general rely on apps providing settings.
That apps generally don't provide settings is exactly why the ini shouldn't override apps. Since most apps don't provide settings and just use defaults, users will rely on using the ini to change what they want for those apps. But then by setting those options in the ini to handle those apps, other apps that do provide options would find their options overridden, even if the user wants to change it in the app. That would make it pointless to have the option in-game, thus further necessitating using the ini to change anything, and make me question why the app has any ability to configure options if you'll end up needing to use the ini for everything anyway.
An external override should be seen as a sledgehammer, a last resort to knock a misbehaving app into shape. A user preference shouldn't be conflated with a forced override. For instance, OpenAL Soft's default resampler is Cubic Spline, as it's quite efficient and does a good enough job for most people. However, I like to use the BSinc48 resampler by default, a pretty demanding but high quality resampler. So I set that in my global ini (conf, on Linux) and apps that use OpenAL will use BSinc48 by default. But some games have particularly low samplerate sounds that were designed around decades old hardware, so I can select Nearest or Linear resampling in game to make them sound less muffled and more like they originally did. Or a game may use an abnormally high number of sound sources that start having a noticeable performance impact, so for that game I can select Cubic Spline to trade some quality back for performance. If the global ini was treated as an override, in-game resampler options would be useless and I would have to use per-executable ini files to override the global ini (which may not be easy to set up and use, depending on the game), rather than simply select what I want in each game when it differs from my normal preference.
Thanks for the detailed reply again. I have a better idea of where you are coming from.
I still don't agree regarding (a) though. I think it's reasonable to expect that all audio libraries will sound identical by default. I don't think OpenAL should be applying its own judgments just because it detects a device. The current behavior would make sense if HRTF was an operating system concept and there was a system wide HRTF that all sounds were expected to use. We don't live in that world, which is why I see HRTF as something that should be only by opt-in.
OpenAL sounding different by default has consequences. It is an unpleasant surprise. It means the difference needs to be tracked down and debugged.
This wouldn't be so bad if:
- it was easy to figure out the reason for the difference and
- there was a simple solution to get the normal behavior
Neither is true here. In this case it is particularly difficult to track down because the difference comes from an obscure extension, ALC_SOFT_HRTF. (Perhaps the existence of this GitHub issue will at least help make it more searchable.) The solution given, passing ALC_HRTF_SOFT, ALC_FALSE, then creates an additional problem: people who want HRTF can no longer enable it.
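For anyone debugging a similar difference, the ALC_SOFT_HRTF extension also exposes a status query that reports whether HRTF is active and why. The helper below is a minimal sketch that turns that status into readable text. `hrtf_status_text` is a hypothetical name, and the token values are repeated from AL/alext.h only so the snippet compiles without the OpenAL headers.

```c
/* Status tokens from the ALC_SOFT_HRTF extension (AL/alext.h), repeated
 * here so the sketch compiles without the OpenAL headers. */
#ifndef ALC_HRTF_STATUS_SOFT
#define ALC_HRTF_STATUS_SOFT              0x1993
#define ALC_HRTF_DISABLED_SOFT            0x0000
#define ALC_HRTF_ENABLED_SOFT             0x0001
#define ALC_HRTF_DENIED_SOFT              0x0002
#define ALC_HRTF_REQUIRED_SOFT            0x0003
#define ALC_HRTF_HEADPHONES_DETECTED_SOFT 0x0004
#define ALC_HRTF_UNSUPPORTED_FORMAT_SOFT  0x0005
#endif

/* Maps an ALC_HRTF_STATUS_SOFT value to a short description.
 *
 * Usage with a real device (requires the OpenAL headers and library):
 *   ALCint status = 0;
 *   alcGetIntegerv(device, ALC_HRTF_STATUS_SOFT, 1, &status);
 *   printf("HRTF: %s\n", hrtf_status_text(status));
 */
const char *hrtf_status_text(int status)
{
    switch (status) {
    case ALC_HRTF_DISABLED_SOFT:            return "disabled";
    case ALC_HRTF_ENABLED_SOFT:             return "enabled";
    case ALC_HRTF_DENIED_SOFT:              return "denied (not allowed on this device)";
    case ALC_HRTF_REQUIRED_SOFT:            return "required (forced on for this device)";
    case ALC_HRTF_HEADPHONES_DETECTED_SOFT: return "enabled automatically (headphones detected)";
    case ALC_HRTF_UNSUPPORTED_FORMAT_SOFT:  return "unavailable (unsupported device format)";
    default:                                return "unknown status";
    }
}
```

Logging this once after context creation would have shown "headphones detected" on the WASAPI backend in the case reported here, assuming the status reflects the auto-detection described earlier in the thread.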