Popping when using default HRTF
System Information
- Steam Audio version: 4.5.3
- Operating System and version: Windows 10 22H2 19045.4355
- (Optional) CPU architecture (e.g. x86-64, armv7): x86_64
Issue Description
When using IPLBinauralEffect and IPLDirectEffect in distance attenuation mode, I occasionally get crackling and popping artifacts. When I dump the resulting in-flight buffers and graph them in Desmos, I find that some samples are well beyond the [-1,1] range.
Link to desmos graph: https://www.desmos.com/calculator/xlpsejnitp
This is the code I am using:
const auto nchannels = AudioPlayer::GetNChannels();
// render it
IPLfloat32* inputChannels[]{ monoSourceData.data() };
static_assert(std::size(inputChannels) == 1, "Input must be mono!");
IPLAudioBuffer inBuffer{
    .numChannels = 1,
    .numSamples = IPLint32(monoSourceData.GetNumSamples()),
    .data = inputChannels,
};
Debug::Assert(buffer.GetNChannels() == 2, "Non-stereo output is not supported");
IPLfloat32* outputChannels[]{
    buffer[0].data(),
    buffer[1].data()
};
IPLAudioBuffer outputBuffer{
    .numChannels = nchannels,
    .numSamples = IPLint32(buffer.GetNumSamples()),
    .data = outputChannels
};
auto sourcePosInListenerSpace = vector3(invListenerTransform * vector4(sourcePos,1));
auto normalizedPos = glm::normalize(sourcePosInListenerSpace);
IPLBinauralEffectParams params{
    .direction = { normalizedPos.x,normalizedPos.y,normalizedPos.z },
    .interpolation = IPL_HRTFINTERPOLATION_BILINEAR,
    .spatialBlend = 1.0f,
    .hrtf = GetApp()->GetAudioPlayer()->GetSteamAudioHRTF(),
    .peakDelays = nullptr
};
auto result = iplBinauralEffectApply(effects.binauralEffect, &params, &inBuffer, &outputBuffer);
// do distance attenuation in-place
IPLDistanceAttenuationModel distanceAttenuationModel{
    .type = IPL_DISTANCEATTENUATIONTYPE_DEFAULT
};
IPLDirectEffectParams directParams{
    .flags = IPL_DIRECTEFFECTFLAGS_APPLYDISTANCEATTENUATION,
    .distanceAttenuation = iplDistanceAttenuationCalculate(
        state.context,
        { sourcePosInListenerSpace.x, sourcePosInListenerSpace.y, sourcePosInListenerSpace.z },
        { 0, 0, 0 },
        &distanceAttenuationModel)
};
result = iplDirectEffectApply(effects.directEffect, &directParams, &outputBuffer, &outputBuffer);
I am using the default SteamAudio HRTF. The source samples are within the [-1,1] range. Is it expected for SteamAudio to produce samples out of bounds?
@Ravbug If the input samples are already close to +/- 1, then it's possible that the HRTF will cause the output samples to be slightly outside the [-1, 1] range, but based on the graph, that's not the issue here. Can you try creating the context with validation enabled (see here) and see if that indicates any invalid/NaN inputs?
I passed IPL_CONTEXTFLAGS_VALIDATION, and I didn't get any asserts or additional logs. The out-of-range values are in the ballpark of 10-20 so it doesn't seem like UB to me, but I could be wrong.
Output sample values in the 10-20 range are definitely not expected. Do you happen to know what the source and listener positions were when these out-of-range sample values were generated? I want to make sure I'm able to reproduce the conditions under which you're encountering the issue. Thanks!
I reproduced it with these positions, in world space:
source pos: (0,0,0)
listener pos: (2.50609, 9.20201, 0)
listener rotation (quaternion xyzw): (-0.395396, 0.586227, 0.395396, 0.586227)
I noticed that the source audio (before SteamAudio sees it) has a couple of values in the [-20,20] range, so this could partially be a case of garbage in -> garbage out. However, the buffers that SteamAudio produces have many more out-of-range samples than the input data. These samples also don't always align with the input out-of-range samples. Is this expected?
Here is another Desmos graph with the source buffer and resulting SteamAudio output buffers for this capture: https://www.desmos.com/calculator/14biausk1g
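For anyone debugging something similar, a quick way to rule out bad input is to scan the source buffer before it reaches the effect. Below is a minimal sketch of such a check. ScanSamples and RangeReport are hypothetical names, not part of Steam Audio, and the default limit of 1.0 simply mirrors the [-1,1] range discussed above.

```cpp
#include <cmath>
#include <cstddef>

// Summary of a scan over one channel of float samples.
struct RangeReport {
    std::size_t outOfRange = 0;  // finite samples with |x| > limit
    std::size_t nonFinite = 0;   // NaN or infinity
    float peak = 0.0f;           // largest finite |x| seen
};

// Hypothetical helper (not a Steam Audio API): counts samples outside
// [-limit, limit] and non-finite samples, and records the peak magnitude.
RangeReport ScanSamples(const float* data, std::size_t count, float limit = 1.0f) {
    RangeReport report;
    for (std::size_t i = 0; i < count; ++i) {
        const float x = data[i];
        if (!std::isfinite(x)) {
            ++report.nonFinite;
            continue;
        }
        const float mag = std::fabs(x);
        if (mag > limit) {
            ++report.outOfRange;
        }
        if (mag > report.peak) {
            report.peak = mag;
        }
    }
    return report;
}
```

In the code from the first post, this could be called on monoSourceData.data() right before iplBinauralEffectApply and again on each output channel afterwards, to compare where the out-of-range samples come from.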
This all seems reasonable. The input is in the range [-30, 20], and the output from Steam Audio is in a similar range. The samples don't align because an HRTF is basically an FIR filter: it adds delay, and each output sample is a weighted sum of several input samples.
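To illustrate the point about FIR filters, here is a small sketch of a direct-form FIR filter. It is not Steam Audio's implementation, and the filter taps in the example that follows are invented for illustration. Real HRTF filters are much longer.

```cpp
#include <cstddef>
#include <vector>

// Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k].
// The output has the same length as the input; the filter tail is truncated.
std::vector<float> ApplyFir(const std::vector<float>& x, const std::vector<float>& h) {
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n) {
        // k <= n keeps the index x[n - k] inside the input buffer.
        for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
            y[n] += h[k] * x[n - k];
        }
    }
    return y;
}
```

With the made-up taps h = {0, 0.5, 0.25, 0.125} and an input that is zero everywhere except a single sample of 20 at index 2, the output is 10, 5 and 2.5 at indices 3, 4 and 5. One out-of-range input sample therefore becomes three out-of-range output samples, none of them at the original index, which matches the pattern of more numerous and misaligned out-of-range samples described above.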