audioread Backend producing padded byte arrays

When working with MP3 files of indeterminate length, or ones that end on an odd number of PCM frames when decoded (as can be the case for variable bit rate MP3 streams), the blocks yielded by the iterator are occasionally padded with zero bytes.

import audioread

with audioread.audio_open('workfile.mp3') as input_file:
    for data in input_file:
        # the yielded block can end in zero padding
        print(repr(data))

Sample output for the above loop might look like the following:

\xecD\xe4\xe2\xee\xd7\xe6\xd3\xf1\x9b\xe8\x99\xf4K\xe8S\xf5X\xe7\x8b\xf3\xe0\xe5\xd7\xf0\xf5\xe3\xf5\xee\xc8\xe3\xb3\xee\xa6\xe6\x0f\x0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00

This issue either needs documentation, or consideration for any "streaming" solution.
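To put a number on the padding in output like the sample above, something along these lines could be used. It is only a rough sketch: the helper name `trailing_zero_frames` is made up for illustration, and it assumes the blocks are 16-bit PCM (which is what audioread yields) with a length that is a whole number of PCM frames. It also cannot tell padding apart from genuine digital silence at the end of a track.

```python
def trailing_zero_frames(block, channels=2, sample_width=2):
    """Count whole PCM frames at the end of `block` that are entirely zero.

    Hypothetical helper: assumes len(block) is a multiple of
    channels * sample_width, so frames are aligned with the end of the block.
    """
    frame_size = channels * sample_width
    # Number of consecutive zero bytes at the end of the block.
    zero_bytes = len(block) - len(block.rstrip(b"\x00"))
    # Only count frames whose every byte is zero.
    return zero_bytes // frame_size


# Three stereo 16-bit frames: one with audio, two all-zero.
example = b"\x01\x02\x03\x00" + b"\x00" * 8
padded_frames = trailing_zero_frames(example)
```

Logging this value for the last block of each file would show how the padding varies with bitrate and length.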

Nov 17 '16 23:11 jmercouris

Huh; that's interesting. Do you know which backend produced that result?

Also, do you know whether this is avoidable? Correct me if I'm wrong, but it seems like this is the inherent quality of compressed audio that "gapless playback" features are supposed to avoid. So it's possible we can't sidestep this effect without doing more heavyweight analysis.

Nov 18 '16 03:11 sampsyo

I'm not sure which backend produced it. I installed one but don't remember which. I can also provide a sample mp3 file that produces this behavior for demonstration.

Nov 18 '16 11:11 jmercouris

A sample would be great!

You can also use print(input_file) to see which backend is being used.

Nov 18 '16 17:11 sampsyo

Thank you for the tip. It appears that I’m using the Macca backend. The behavior is hard to explain: the offset depends on the bitrate and length of the mp3 file. I’m not sure if it has something to do with incomplete frames being persisted and then converted, but occasionally the file is padded.

I’ve been trying to figure out how to recreate it or to fix it. As you alluded to earlier, this may be unavoidable without additional processing. I believe the problem is that the audio decoding happens on a file-by-file basis. This means I have to be sure I am writing a file that has a complete mp3 frame at the end, or I will get this weird padding. In order to know where the frame ends, I have to do some audio decoding; in order to do audio decoding, I have to have a file. You can see how the problem is circular.
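One possible way out of the circle is that the frame boundaries can be read from the 4-byte frame headers without decoding any audio. Below is a minimal sketch for MPEG-1 Layer III only, using the standard header layout and the usual frame length formula (144 * bitrate / sample rate + padding). The function name and tables are my own for illustration, not part of audioread; it ignores ID3 tags, free-format streams, and MPEG-2/2.5.

```python
# Lookup tables for MPEG-1 Layer III headers (None marks invalid/unsupported).
MPEG1_L3_BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                          128, 160, 192, 224, 256, 320, None]
MPEG1_SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def mpeg1_layer3_frame_length(header):
    """Return the frame length in bytes for a 4-byte header, or None.

    Hypothetical helper: only MPEG-1 Layer III headers are recognised.
    """
    if len(header) < 4:
        return None
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None  # 11-bit frame sync not present
    if (b1 >> 3) & 0x03 != 0x03:
        return None  # not MPEG-1
    if (b1 >> 1) & 0x03 != 0x01:
        return None  # not Layer III
    bitrate_kbps = MPEG1_L3_BITRATES_KBPS[b2 >> 4]
    sample_rate = MPEG1_SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    if bitrate_kbps is None or sample_rate is None:
        return None  # free-format or reserved values
    padding = (b2 >> 1) & 0x01
    return 144 * bitrate_kbps * 1000 // sample_rate + padding


# 128 kbps, 44.1 kHz, no padding.
length = mpeg1_layer3_frame_length(b"\xff\xfb\x90\x00")
```

Walking the file header by header would then give the offset where the last complete frame ends, so a partial file could be cut there before it is handed to a backend.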

I’ll work on getting a sample mp3 file that exhibits this behavior.

Nov 18 '16 19:11 jmercouris

Yeah, that sounds like a good description of what's going on. This paragraph on Wikipedia describes the same phenomenon: https://en.wikipedia.org/wiki/Gapless_playback#Compression_artifacts

Sounds like it is a pretty complicated issue!

Nov 18 '16 21:11 sampsyo

I found an MP3 frame parser: https://github.com/kirkeby/python-mp3/blob/master/src/mp3/init.py

I’m going to try to use it to build a streaming solution that segments streams into files of the correct length to avoid empty or invalid frames.

This is less than ideal, but imagine that you could use AudioRead with all existing backends by taking an mp3 file or stream and segmenting it into intervals of whole frames that can be processed, ideally something like 4096 at a time. You could then output PCM data, or whatever else, through pyaudio or anything else you like.
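Roughly, the segmenting step could look like the sketch below. This is just an illustration of the idea, not working streaming code: `segment_complete_frames` is a name I made up, and it expects a `frame_length` callable (for example one built on an MP3 frame parser) that returns the length in bytes of the frame starting at a 4-byte header, or None if the header is not valid. It does not try to resynchronise after a bad header. The toy framing at the bottom is NOT real MP3 and is only there to make the example runnable.

```python
def segment_complete_frames(buffer, frame_length, frames_per_segment=4096):
    """Split `buffer` into segments that each hold whole frames only.

    Returns (segments, remainder). The remainder holds any frames that did
    not fill a segment plus a trailing partial frame, and would be prepended
    to the next chunk read from the stream.
    """
    segments = []
    start = 0   # offset where the current segment begins
    pos = 0     # offset of the next frame header to inspect
    count = 0   # complete frames collected in the current segment
    while pos + 4 <= len(buffer):
        length = frame_length(buffer[pos:pos + 4])
        if not length or pos + length > len(buffer):
            break  # invalid header or incomplete frame: stop here
        pos += length
        count += 1
        if count == frames_per_segment:
            segments.append(buffer[start:pos])
            start = pos
            count = 0
    return segments, buffer[start:]


# Toy framing for demonstration only: byte 0 is a 0xFF marker and
# byte 1 holds the total frame length in bytes.
def toy_frame_length(header):
    return header[1] if header[0] == 0xFF else None


frame_a = b"\xff\x05abc"
frame_b = b"\xff\x04ab"
frame_c = b"\xff\x06abcd"
partial = b"\xff\x08ab"  # claims 8 bytes, but only 4 have arrived

segments, remainder = segment_complete_frames(
    frame_a + frame_b + frame_c + partial,
    toy_frame_length,
    frames_per_segment=2,
)
```

Each segment could then be written to a temporary file and opened with a backend, while the remainder waits for more data to arrive.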

The correct way to do this involves actually streaming into Core Audio, and I did find code that shows how to do this: https://github.com/mattgallagher/AudioStreamer

Nov 18 '16 22:11 jmercouris

Huh! That sounds like an interesting project. Keep us updated!

About streaming data into Core Audio: this is related to #35, where we're exploring mechanisms to decode streaming data from memory instead of from the filesystem.

Nov 18 '16 22:11 sampsyo

Right, all of these issues are intertwined. The whole reason I am doing this is that AudioRead doesn’t support streaming for all backends, and this could be an interim quick-and-dirty solution.

Nov 18 '16 23:11 jmercouris
