Inaccurate silence calculation for MP3 files
While testing our audio chain with different file formats, I found that silan returns different results depending on the file format. I was expecting small disparities, but the result for an MP3 file was off by more than half a second.
The test audio starts with 200 ms of silence, followed by 600 ms of noise, followed by 1200 ms of silence. I used FLAC, Ogg Vorbis and MP3 as file formats. This was the output from silan:
$ silan padded.mp3
0.367868 Sound On
1.677347 Sound Off
$ silan padded.ogg
0.197891 Sound On
0.825397 Sound Off
$ silan padded.flac
0.200023 Sound On
0.825374 Sound Off
As you can see, the Ogg Vorbis and FLAC results are okay, but the MP3 result is off.
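To put numbers on "off", here is a small sketch that subtracts the nominal transition times from the values silan printed above. The nominal times (0.2 s for Sound On, 0.8 s for Sound Off) are my assumption, taken directly from the 200 ms / 600 ms layout of the test file; a detector may legitimately report Sound Off a little later than that. The names `NOMINAL`, `REPORTED` and `deviations` are made up for this illustration.

```python
# Nominal transition times in seconds, assumed from the test file layout:
# 200 ms silence, then 600 ms noise (so the noise ends at 0.8 s).
NOMINAL = {"on": 0.200, "off": 0.800}

# Values copied from the silan output in this report.
REPORTED = {
    "mp3": {"on": 0.367868, "off": 1.677347},
    "ogg": {"on": 0.197891, "off": 0.825397},
    "flac": {"on": 0.200023, "off": 0.825374},
}


def deviations(reported, nominal=NOMINAL):
    """Return reported minus nominal time, per format and per transition."""
    return {
        fmt: {key: times[key] - nominal[key] for key in nominal}
        for fmt, times in reported.items()
    }


if __name__ == "__main__":
    for fmt, dev in deviations(REPORTED).items():
        print(f"{fmt:5s} on: {dev['on']:+.6f} s   off: {dev['off']:+.6f} s")
```

Ogg Vorbis and FLAC stay within a few tens of milliseconds of the nominal values, while the MP3 Sound Off is late by well over half a second.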
I checked the individual files with a spectrum analyzer, and they seem to be okay.
[Screenshots: spectrum analyzer views of the MP3, Ogg Vorbis and FLAC files]
Here is a zip with the three audio files: padded-audio.zip
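For anyone who would rather regenerate a comparable signal than download the zip, this is a minimal sketch using only the Python standard library. It is not how the attached files were made: the mono 16-bit WAV output, the noise amplitude, the seed and the file name `padded.wav` are all assumptions for illustration, and the result would still need to be encoded to FLAC, Ogg Vorbis and MP3 with an encoder of your choice.

```python
import random
import struct
import wave

SAMPLE_RATE = 44100  # Hz; matches the 44.1 kHz that `file` reports for the MP3s


def make_padded_samples(rate=SAMPLE_RATE, seed=0):
    """Return 16-bit mono samples: 200 ms silence, 600 ms noise, 1200 ms silence."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    lead = [0] * (rate * 200 // 1000)
    noise = [rng.randint(-16000, 16000) for _ in range(rate * 600 // 1000)]
    tail = [0] * (rate * 1200 // 1000)
    return lead + noise + tail


def write_wav(path, samples, rate=SAMPLE_RATE):
    """Write the samples as a mono, 16-bit little-endian PCM WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # bytes per sample
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


if __name__ == "__main__":
    write_wav("padded.wav", make_padded_samples())
```

Running it writes a two-second `padded.wav` in the current directory.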
I tried out different silan options, but no luck so far.
The following versions were used:
$ silan --version
silan version 0.4.0
$ pkg-config --modversion sndfile
1.0.28
$ ffmpeg -version
ffmpeg version 4.0.3 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 8 (GCC)
configuration: --prefix=/usr --bindir=/usr/bin --datadir=/usr/share/ffmpeg --docdir=/usr/share/doc/ffmpeg --incdir=/usr/include/ffmpeg --libdir=/usr/lib64 --mandir=/usr/share/man --arch=x86_64 --optflags='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' --extra-ldflags='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld ' --extra-cflags=' ' --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-version3 --enable-bzlib --disable-crystalhd --enable-fontconfig --enable-frei0r --enable-gcrypt --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libcdio --enable-libdrm --enable-indev=jack --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libmp3lame --enable-nvenc --enable-openal --enable-opencl --enable-opengl --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libv4l2 --enable-libvidstab --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-libzvbi --enable-avfilter --enable-avresample --enable-postproc --enable-pthreads --disable-static --enable-shared --enable-gpl --disable-debug --disable-stripping --shlibdir=/usr/lib64 --enable-libmfx --enable-runtime-cpudetect
libavutil 56. 14.100 / 56. 14.100
libavcodec 58. 18.100 / 58. 18.100
libavformat 58. 12.100 / 58. 12.100
libavdevice 58. 3.100 / 58. 3.100
libavfilter 7. 16.100 / 7. 16.100
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 1.100 / 5. 1.100
libswresample 3. 1.100 / 3. 1.100
libpostproc 55. 1.100 / 55. 1.100
I can't reproduce this here on Debian stable with ffmpeg 3.2.12, libavcodec 57.64.101:
$ ./src/silan /tmp/padded/padded.mp3
0.191927 Sound On
0.830272 Sound Off
$ ./src/silan /tmp/padded/padded.ogg
0.197891 Sound On
0.825397 Sound Off
$ ./src/silan /tmp/padded/padded.flac
0.200023 Sound On
0.825374 Sound Off
This seems to be an ffmpeg/libavcodec-related issue.
Hm. I'll see if I can dig a bit deeper. Thanks for trying it out.
It seems the avcodec_decode_audio4() API has already been deprecated again in recent ffmpeg. Perhaps the wrapper function that ffmpeg 4.x provides for the old API does not handle stereo or joint stereo correctly!?
I guess silan's ffmpeg audio decoder needs to use the new avcodec_receive_frame() API with ffmpeg 4.x / libavcodec 58.x.
I tried both full stereo and joint stereo, and got strange values in both cases:
$ file padded-jointstereo.mp3 padded-stereo.mp3
padded-jointstereo.mp3: Audio file with ID3 version 2.4.0, extended header, contains:MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, JntStereo
padded-stereo.mp3: Audio file with ID3 version 2.4.0, extended header, contains:MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
$ silan padded-jointstereo.mp3
0.367868 Sound On
1.677347 Sound Off
$ silan padded-stereo.mp3
0.093583 Sound On
1.677687 Sound Off
Here are the two MP3 files: padded-mp3s.zip
@x42 when I tried using the audio_decoder library on my own, I found decoding to be broken with FFmpeg 4.1.2: the MP3 decoder generates junk output.
For example, here is the sample recording /usr/share/sounds/alsa/Front_Center.wav compared with a LAME conversion of the same file.
