Loading a BytesIO opus file does not seem to work
🐛 Bug
Loading a BytesIO opus file does not seem to work.
To Reproduce
Steps to reproduce the behavior:
import torchaudio
import io
print(torchaudio.__version__)
# samples from https://docs.espressif.com/projects/esp-adf/en/latest/design-guide/audio-samples.html
torchaudio.load("ff-16b-2c-44100hz.mp3")
torchaudio.load("ff-16b-2c-44100hz.opus")
def file_like(filepath):
    return io.BytesIO(open(filepath, "rb").read())
torchaudio.load(file_like("ff-16b-2c-44100hz.mp3"), format="mp3")
# this crashes with
# formats: can't open input file `': Input not an Ogg Opus audio stream
# Traceback (most recent call last):
# File "test.py", line 14, in <module>
# torchaudio.load(file_like("ff-16b-2c-44100hz.opus"), format="opus")
# File "/Users/csh/miniconda3/lib/python3.7/site-packages/torchaudio/backend/sox_io_backend.py", line 150, in load
# filepath, frame_offset, num_frames, normalize, channels_first, format)
# RuntimeError: Error loading audio file: failed to open file <in memory buffer>
torchaudio.load(file_like("ff-16b-2c-44100hz.opus"), format="opus")
Expected behavior
It seems like it should work as well as it does for the mp3 case.
Environment
- What commands did you use to install torchaudio (conda/pip/build from source)?
- pip
- If you are building from source, which commit is it?
- What does torchaudio.__version__ print? (If applicable)
- 0.9.0
Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
- PyTorch Version (e.g., 1.0): 1.9.0
- OS (e.g., Linux): mac os x
- How you installed PyTorch (conda, pip, source): pip
- Build command you used (if compiling from source):
- Python version: 3.7.3
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
Additional context
Hi @christopherhesse
Can you try increasing the buffer size?
torchaudio.utils.sox_utils.set_buffer_size(16000)
This happens because the header of this OPUS file is larger than the default buffer size that torchaudio uses to read the header. The OPUS format is tricky in that it allows a header of arbitrary size. The officially recommended header size is below 6k, and torchaudio uses 4k as its default buffer size, but this file seems to have a 16k header.
@hwangjeff I think we can issue a warning if it fails to load an opus file from a byte buffer.
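To check how large the header of a given file actually is before choosing a buffer size, something like the following might help. This is only a rough sketch, not part of torchaudio: the function name `ogg_opus_header_size` is hypothetical, and it assumes the usual Ogg Opus layout in which the first two packets (OpusHead and OpusTags) are the headers and audio packets start on a fresh page after them. It walks the Ogg pages using only the page header fields and does not verify checksums.

```python
def ogg_opus_header_size(data: bytes) -> int:
    """Return the number of bytes occupied by the Ogg pages that carry
    the first two packets of the stream (assumed to be OpusHead and OpusTags).

    Hypothetical helper for illustration; it does not validate page CRCs.
    """
    offset = 0
    packets_done = 0
    # Each Ogg page starts with a 27-byte fixed header.
    while offset + 27 <= len(data):
        if data[offset:offset + 4] != b"OggS":
            raise ValueError("no Ogg page at offset %d" % offset)
        # Byte 26 holds the number of entries in the segment (lacing) table.
        n_segments = data[offset + 26]
        lacing = data[offset + 27:offset + 27 + n_segments]
        if len(lacing) < n_segments:
            raise ValueError("truncated segment table")
        # A lacing value below 255 marks the end of a packet.
        packets_done += sum(1 for value in lacing if value < 255)
        # Advance past the fixed header, the segment table, and the payload.
        offset += 27 + n_segments + sum(lacing)
        if packets_done >= 2:
            return offset
    raise ValueError("did not find two complete header packets")
```

With the file from the report, one could read the bytes with open("ff-16b-2c-44100hz.opus", "rb").read(), pass them to this function, and then call torchaudio.utils.sox_utils.set_buffer_size with a value at least as large as the result. Whether that exact value is sufficient for libsox is an assumption; the 16000 suggested above worked for the loading error in this case.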
Thanks @mthrok, that does immediately fix the loading error. However, I now get a different issue, shown here:
import torchaudio
import io
torchaudio.utils.sox_utils.set_buffer_size(16000)
print(torchaudio.__version__)
# samples from https://docs.espressif.com/projects/esp-adf/en/latest/design-guide/audio-samples.html
torchaudio.load("ff-16b-2c-44100hz.mp3")
data, sample_rate = torchaudio.load("ff-16b-2c-44100hz.opus")
print(data.shape, sample_rate)
def file_like(filepath):
    return io.BytesIO(open(filepath, "rb").read())
torchaudio.load(file_like("ff-16b-2c-44100hz.mp3"), format="mp3")
data, sample_rate = torchaudio.load(file_like("ff-16b-2c-44100hz.opus"), format="opus")
print(data.shape, sample_rate)
The output of this script is:
0.9.0
torch.Size([2, 8980158]) 48000
torch.Size([2, 143688]) 48000
The odd thing is that this is the same file each time, so the data shape should be the same both times.
Thanks @christopherhesse for the report. That indeed looks strange, and I confirm that I observed the same issue on my env. I will look into it.
We have removed file-like object support from libsox. It is now handled by the ffmpeg backend, which seems to work fine. I will close this issue.