audioread icon indicating copy to clipboard operation
audioread copied to clipboard

Compressed audio (continuation of #36)

Open sampsyo opened this issue 8 years ago • 16 comments

This is @jksinton's work on making the FFmpeg backend work on data from a stream instead of from disk. Addresses #34 and #35.

I changed around the top-level interface a bit. There's a new audioread.decode function that works on streams instead of filenames. The included demo script, decode.py, can now work on data from stdin. So this works, for example:

$ python decode.py < foo.mp3

and produces a WAV file in out.wav. (Eventually, it would be nice to write that to stdout too to demonstrate that we're actually streaming!)

sampsyo avatar Oct 26 '16 22:10 sampsyo

Cool! This is looking great.

Do you think we should ship this with just one backend enabled (and document that fact), or should we endeavor to add the streaming mode to more backends? (I don't have a strong feeling either way.)

If the consensus is to merge, then all that's left is documentation.

sampsyo avatar Oct 27 '16 02:10 sampsyo

I was going to suggest working on other backends, just so that there's parity in backend support for decoding a file and compressed audio. I only have Linux and Mac OS X environments for testing. And so far, I've only been testing with Linux/Python 2.7.

jksinton avatar Oct 27 '16 02:10 jksinton

That sounds good. Some backends will probably more challenging than others to adapt—for example, the Core Audio backend will require fiddling with callbacks to read data from the file.

sampsyo avatar Oct 27 '16 11:10 sampsyo

Processing audio from pipe doesn't work for me with python3 on Ubuntu 16.04.1

$ python3 decode.py < test_cbr80_mono.mp3
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/user/Desktop/folder/audioread/ffdec.py", line 93, in run
    data = self.readfile.read(self.blocksize)
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 94: invalid start byte

With python2 it works, but print out wrong length:

$ python2 decode.py < test_cbr80_mono.mp3
Input file: 1 channels at 44100 Hz; 0.0 seconds.
Backend: ffdec

With filename as input both python's versions works correct (same file and ffmpeg as backend):

$ python2 decode.py test_cbr80_mono.mp3
Input file: 1 channels at 44100 Hz; 10.0 seconds.
Backend: ffdec
$ python3 decode.py test_cbr80_mono.mp3
Input file: 1 channels at 44100 Hz; 10.0 seconds.
Backend: ffdec

  Also with python2 and pipe as input I've got different output file size than with filename mode:
Pipe-input mode: 10.006 secs, 882570 bytes
Filename-input mode (python2 and python3): 10.000 secs, 882044 bytes

strayge avatar Nov 14 '16 22:11 strayge

Good point; it looks like we'll need to read from stdin in binary mode instead of text mode. If you're interested, any chance you could give that a shot? This SO answer might help: http://stackoverflow.com/a/32282458

sampsyo avatar Nov 14 '16 22:11 sampsyo

Just adding .buffer to stdout hasn't fixed problem:

Traceback (most recent call last):
  File "decode.py", line 65, in <module>
    decode()
  File "decode.py", line 38, in decode
    f = audioread.decode(sys.stdin)
  File "/home/user/Desktop/folder/audioread/__init__.py", line 126, in decode
    return ffdec.FFmpegAudioFile(audio=audio)
  File "/home/user/Desktop/folder/audioread/ffdec.py", line 195, in __init__
    self.stdout_reader = QueueReaderThread(self.proc.stdout.buffer, block_size)
AttributeError: '_io.BufferedReader' object has no attribute 'buffer

strayge avatar Nov 14 '16 22:11 strayge

After changing

data = self.readfile.read(self.blocksize)

to

try:
    data = self.readfile.buffer.read(self.blocksize)
except AttributeError:
    data = self.readfile.read(self.blocksize)

in ffdec.py it works with python2 & python3. Both versions generates file with length = 10.006 secs and size = 882570 bytes.

strayge avatar Nov 14 '16 22:11 strayge

Cool! Thanks for sorting this out.

I think the right fix here is to use an if PY2: check in decode.py to get a bytes stream from stdin. Then we should document the behavior in the main module: the input stream needs to be a bytes stream, not a text stream.

sampsyo avatar Nov 14 '16 23:11 sampsyo

Does this stem from decode.py calling audioread.decode(sys.stdin)?

jksinton avatar Nov 14 '16 23:11 jksinton

Does this stem from decode.py calling audioread.decode(sys.stdin)?

Yes  

I think the right fix here is to use an if PY2: check in decode.py to get a bytes stream from stdin. Then we should document the behavior in the main module: the input stream needs to be a bytes stream, not a text stream.

I didn't found PY2 variable (just ugly if sys.version_info[0] > 2), and docs says:

Inevitably you will have code that has to choose what to do based on what version of Python is running. The best way to do this is with feature detection of whether the version of Python you’re running under supports what you need.

so, it can be rewritten as:

if 'buffer' in dir(self.readfile):
    data = self.readfile.buffer.read(self.blocksize)
else:
    data = self.readfile.read(self.blocksize)

strayge avatar Nov 14 '16 23:11 strayge

Also reproduced bug with different sizes. It seems like ffmpeg bug:

$ ffmpeg -i test_cbr80_mono.mp3 -f s16le - > out.wav
  ...
[mp3 @ 0x1b1a400] Skipping 0 bytes of junk at 495.
Input #0, mp3, from 'test_cbr80_mono.mp3':
  Metadata:
    ...
  Duration: 00:00:10.03, start: 0.025057, bitrate: 80 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, mono, s16p, 80 kb/s
    Metadata:
      encoder         : LAME3.99r
Output #0, s16le, to 'pipe:':
  Metadata:
    ...
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
    Metadata:
      encoder         : Lavc56.60.100 pcm_s16le
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
size=     861kB time=00:00:10.00 bitrate= 705.6kbits/s    
video:0kB audio:861kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
$ cat test_cbr80_mono.mp3 | ffmpeg -i - -f s16le - > out.wav
  ...
[mp3 @ 0x17fa400] invalid concatenated file detected - using bitrate for duration
[mp3 @ 0x17fa400] Skipping 0 bytes of junk at 495.
Input #0, mp3, from 'pipe:':
  Metadata:
    ...
  Duration: N/A, start: 0.025057, bitrate: 80 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, mono, s16p, 80 kb/s
    Metadata:
      encoder         : LAME3.99r
Output #0, s16le, to 'pipe:':
  Metadata:
    ...
    Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
    Metadata:
      encoder         : Lavc56.60.100 pcm_s16le
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (native) -> pcm_s16le (native))
size=     862kB time=00:00:10.00 bitrate= 705.6kbits/s    
video:0kB audio:862kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%

strayge avatar Nov 15 '16 01:11 strayge

Yeah, seems like ffmpeg might work slightly differently depending on where its data comes from.

Thanks for looking into this!

sampsyo avatar Nov 15 '16 01:11 sampsyo

Adding this to decode.py, accounts for the correct sys.stdin input:

        if filename:
            f = audioread.audio_open(filename)
        elif sys.version_info >= (3, 0):
            f = audioread.decode(sys.stdin.buffer)
        else:
            f = audioread.decode(sys.stdin)

I tried catching the UnicodeDecodeError in decode.py with an additional except UnicodeDecodeError: block, but it wasn't working. I suspect this is because the error is generated in ffdec.py. Perhaps we can catch the UnicodeDecodeError in ffdec.py, provide a more pertinent error message, and raise an error that can be caught, like the DecodeError.

jksinton avatar Nov 15 '16 03:11 jksinton

Perhaps we can catch the UnicodeDecodeError in ffdec.py, provide a more pertinent error message, and raise an error that can be caught, like the DecodeError.

As I wrote above

if 'buffer' in dir(self.readfile):
    data = self.readfile.buffer.read(self.blocksize)
else:
    data = self.readfile.read(self.blocksize)

is more pythonic ways than if sys.version_info >= (3, 0)

Also if we adding this check to ffdec.py it works with both python's versions and allowed to call audioread without decode.py

strayge avatar Nov 15 '16 04:11 strayge

I don't have a strong opinion about feature detection (if 'buffer' in dir(self.readfile)) vs. version detection (if sys.version_info >= (3, 0)). Either one seems fine from here.

But I do think this check should go in decode.py. Specifically, I would like to avoid requiring each backend to check whether the stream it was passed was a bytes stream or a text stream and, if the latter, if it's the special kind of text stream that happens to have an underlying buffer. Instead, I think documentation is the way to go: we should just make it as clear as possible what should be provided.

sampsyo avatar Nov 15 '16 05:11 sampsyo

Status?

carlthome avatar Jun 07 '17 14:06 carlthome