SDL_sound Decoding a particular OGG file appends a bit of silence at the end

There's a problem with decoding a particular 48 kHz OGG file: SDL_Sound appears to append a small bit of silence at the end (around 5-6 ms). This would not be a concern if a sound would play once, but when looping this extra bit causes an audible crackling noise.

Decoding is done without any requested conversion, passing null as desired format parameter.

If I convert the file itself to 44100 Hz, then it is loaded fine, without extra appends.

The original file: VacuumNoise.zip

I used a small program to write SDL_Decode results to a file "raw". This "raw" file may be imported in a software like "Audacity", and a trailing silence observed: test.raw.zip Properties: 32-bit float, 48 kHz, 2 Channels, Little-endian.

The code used to decode and write this file is under the spoiler.

#include <stdio.h>
#include <SDL.h>
#include <SDL_sound.h>

const size_t SampleDefaultBufferSize = 64 * 1024;

int main(int argc, char *argv[])
{
    Sound_Init();

    Sound_Sample * sample = Sound_NewSampleFromFile("VacuumNoise.ogg", NULL, SampleDefaultBufferSize);
    if (!sample)
        return -1;

    FILE *f = fopen("test.raw", "wb");
    if (!f)
    {
        Sound_FreeSample(sample);
        return -2;
    }

    do
    {
        size_t sz = Sound_Decode(sample);
        if (sz > 0)
        {
            fwrite(sample->buffer, 1, sz, f);
        }
    } while ((sample->flags & SOUND_SAMPLEFLAG_EOF) == 0);

    fclose(f);
    Sound_FreeSample(sample);

    return 0;
}

Dec 01 '23 02:12 ivan-mogilko

Here's pair AGS issue: https://github.com/adventuregamestudio/ags/issues/2244

Made a little repository to ease to reproduce this

https://github.com/ericoporto/sdl_sound_issue_92

In this image, above track is the original file and on the bottom is the track from the decoded raw from SDL_sound.

Uhm, ~~I wonder if this right here should be an end of file flag~~, tested and it seems changing this to EOF flag doesn't make a difference.

https://github.com/icculus/SDL_sound/blob/c5639414c1bb24fb4eef5861c13adb42a4aab950/src/SDL_sound_vorbis.c#L176-177

Sound_GetDuration also gives just internal->total_time; which doesn't appear to use anything like length in ms reading of this from the sound file header and some way to know where in this length the sound is. So it just gives a value that is wrong and not of use...

I think then it's in stb_vorbis itself in stb_vorbis_get_samples_float_interleaved function.

Jan 18 '24 01:01 ericoporto

Not sure yet what is the issue, but I noticed this truncation while debugging

https://github.com/icculus/SDL_sound/blob/fdcecafbfdcbdd2b77c5b591948c237df19c864e/src/stb_vorbis.h#L3603-L3604

The last time we get here in the debugger, len is 0 and right is 128, and left is 0. Not sure yet what this means... But it feels like something wrong somewhere in stb_vorbis. Looking at the raw file in an hexeditor I don't see an obvious sequence of zeroes there, so I don't know exactly how that is appearing when seeing in audacity.

Stb vorbis has a special treatment when it opens the first part of the vorbis sample that contain it's header parts, I wonder if it could get it's length there and maintain it, at least as a sanity check.

Also I start to think it would be better to have an option to use Xiph vorbis library instead of stb. :/

Way too many revised PRs lingering in stb without being merged.

Jan 19 '24 13:01 ericoporto

OK, I tried throwing a stb_vorbis_stream_length_in_seconds and the results match correctly the original file and not the raw. I think SDL_sound should use stb_vorbis_stream_length_in_samples (edit: it does but doesn't use the value see below) somewhere inside to keep track of things and make it not write more than it really exists in samples.

I noticed this is used in SDL_mixer, so I think this may have been noticed in the past there, but probably a long time ago.

Here in the repo there's a similar call for sanity check for the function similar to SDL_mixer, but the difference here is the returned total length in samples is not stored anywhere where it could be used to check later.

If the length that is used is instead in seconds it's better to use a double instead of the current milliseconds as int, or it will lose samples and loop poorly.

Jan 19 '24 14:01 ericoporto

Also I start to think it would be better to have an option to use Xiph vorbis library instead of stb

I noticed SDL_mixer has this as an option, to use xiph libraries instead for ogg, would it be feasible to add this to SDL_sound?

Probably by building on top of the old v1.0 https://github.com/icculus/SDL_sound/blob/stable-1.0/decoders/ogg.c

Feb 17 '24 19:02 ericoporto

Should be fixed now that #97 is in

Jan 12 '25 22:01 sezero

I can confirm that the reported issue is fixed, thank you.

Jan 13 '25 19:01 ivan-mogilko

You're welcome

Jan 13 '25 19:01 sezero