gstdec backend adds delay at the start when decoding a .mp3

Open MZehren opened this issue 5 years ago • 2 comments

Hello,

When I use the decode.py script to decode an .mp3 to .wav, the generated .wav file has 33 ms of silence prepended at the start, so the original .mp3 and the decoded .wav do not match.

The output of the script is:

Input file: 2 channels at 44100 Hz; 452.7 seconds. Backend: gstdec

A ffprobe of the original .mp3 returns:

$ ffprobe Black\ Blood\ -\ Aiea\ Mwana\ (T.Kolai\ Special\ Edit).mp3
ffprobe version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2007-2019 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
  configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, mp3, from 'Black Blood - Aiea Mwana (T.Kolai Special Edit).mp3':
  Metadata:
    NITR            : NTKB?&?
    title           : Aiea Mwana
    artist          : Black Blood
    comment         : Afro-latino; xHD
    encoder         : Lavf57.83.100
    TKEY            : 11m
    TBPM            : 122
  Duration: 00:07:32.68, start: 0.025057, bitrate: 128 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 128 kb/s
    Metadata:
      encoder         : Lavc57.10

I don't have the same issue when reading files with the Madmom library, which I believe relies on the ffmpeg backend. Where does this silence come from?
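One way to put a number on the offset for each backend is to count the near-silent samples at the start of the decoded PCM stream. The following is a minimal sketch, assuming the buffers hold whole 16-bit little-endian signed samples (the format audioread documents for its buffers). The function name `leading_silence_seconds` and the `threshold` parameter are hypothetical and not part of audioread.

```python
import struct


def leading_silence_seconds(buffers, channels, samplerate, threshold=0):
    """Estimate how long the stream stays silent before the first audible sample.

    buffers    -- iterable of bytes-like objects with 16-bit little-endian PCM
    channels   -- number of interleaved channels
    samplerate -- frames per second
    threshold  -- absolute sample value still treated as silence (assumption)
    """
    silent_samples = 0
    for buf in buffers:
        # Each buffer is assumed to contain a whole number of 2-byte samples.
        for (value,) in struct.iter_unpack("<h", buf):
            if abs(value) > threshold:
                # Round down to whole frames (one sample per channel).
                return (silent_samples // channels) / samplerate
            silent_samples += 1
    # The whole stream was silent (or empty).
    return (silent_samples // channels) / samplerate
```

With audioread, the object returned by `audioread.audio_open(path)` can be passed directly as `buffers`, together with its `channels` and `samplerate` attributes. Running this once per backend would show whether the extra silence is specific to gstdec.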

MZehren, Feb 13 '20 13:02

Huh, that's interesting! It would be worth digging into more deeply, but at the moment, I don't have an intuition for why GStreamer might be doing this. It might require some deeper GStreamer knowledge than I have to help explain it…

sampsyo, Feb 13 '20 19:02

GStreamer doesn't support gapless decoding for MP3, which means there can be (very short) extra silence at the beginning and end of streams compared to other tools.
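If the extra silence at the start matters, one possible workaround is to drop a known number of frames from the decoded stream. This is only a sketch under stated assumptions: the helper `skip_leading_frames` is hypothetical, the number of frames to skip has to be measured per file (the 33 ms reported above would be roughly 1455 frames at 44100 Hz), and it does nothing about padding at the end of the stream.

```python
def skip_leading_frames(buffers, channels, frames_to_skip, sample_width=2):
    """Yield PCM buffers with the first `frames_to_skip` frames removed.

    buffers        -- iterable of bytes-like PCM buffers
    channels       -- number of interleaved channels
    frames_to_skip -- leading frames to drop (measured, not a fixed constant)
    sample_width   -- bytes per sample; 2 assumes 16-bit PCM
    """
    bytes_left = frames_to_skip * channels * sample_width
    for buf in buffers:
        if bytes_left >= len(buf):
            # This buffer lies entirely inside the region to drop.
            bytes_left -= len(buf)
            continue
        # Drop the remaining leading bytes, then pass everything else through.
        yield buf[bytes_left:]
        bytes_left = 0


# Frame count for the 33 ms offset reported in this thread, at 44100 Hz.
frames_for_33_ms = round(0.033 * 44100)
```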

lazka, Jul 16 '21 05:07