python-soundfile WAV format with MS_ADPCM subtype always reads back a longer array than what was written


We are running into a funky little issue using soundfile in a different project: https://github.com/justinsalamon/scaper/issues/94. Basically, when an audio file is written as WAV with subtype MS_ADPCM, the numpy array we read back from disk is longer than the numpy array we wrote.

import tempfile

import numpy as np
import soundfile as sf

_format = 'WAV'
_subtype = 'MS_ADPCM'

sr = 16000

for n_samples in [1600, 16000, 32000, 64000, 100000, 128000]:
    original_shape = (n_samples,)
    audio = np.zeros(original_shape)

    with tempfile.NamedTemporaryFile(suffix='.wav', delete=True) as tmpfile:
        # Write silence as MS_ADPCM, then read it straight back.
        sf.write(tmpfile.name, audio, sr, subtype=_subtype, format=_format)
        # Use separate names so the written array and sample rate are not overwritten.
        audio_read, sr_read = sf.read(tmpfile.name)
        print('What I got back from sf.read\t', audio_read.shape)
        print('What I meant to write to disk\t', original_shape)
        print()

Running the code above results in the following bizarre set of input and output shapes that I can't make heads or tails of:

What I got back from sf.read     (2024,)
What I meant to write to disk    (1600,)

What I got back from sf.read     (16192,)
What I meant to write to disk    (16000,)

What I got back from sf.read     (32384,)
What I meant to write to disk    (32000,)

What I got back from sf.read     (64768,)
What I meant to write to disk    (64000,)

What I got back from sf.read     (100188,)
What I meant to write to disk    (100000,)

What I got back from sf.read     (128524,)
What I meant to write to disk    (128000,)

I thought I'd open an issue about it. Let me know if you need any more information!

Mar 05 '20 05:03 pseeth

From your examples, I would assume that there is a built-in block size in MS_ADPCM, likely 1012 frames: every length you read back is a whole multiple of 1012 (2024 = 2 × 1012, 16192 = 16 × 1012, and so on). This is either a built-in assumption of MS_ADPCM, or a limitation of libsndfile. Either way, there is probably nothing you can do within SoundFile short of using a different subtype.
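To illustrate the block-size guess, here is a minimal sketch that rounds each written length up to the next multiple of 1012 and compares it with the shapes reported above. The 1012-frame figure is an assumption inferred from those numbers, not something confirmed from libsndfile or the MS_ADPCM specification, and `padded_length` is a hypothetical helper name.

```python
# Assumption: 1012 frames per block, inferred only from the reported shapes.
BLOCK_FRAMES = 1012


def padded_length(n_frames, block_frames=BLOCK_FRAMES):
    """Round n_frames up to the next whole multiple of block_frames."""
    n_blocks = -(-n_frames // block_frames)  # ceiling division
    return n_blocks * block_frames


# Lengths written and lengths read back, copied from the report above.
written = [1600, 16000, 32000, 64000, 100000, 128000]
read_back = [2024, 16192, 32384, 64768, 100188, 128524]

for n, observed in zip(written, read_back):
    print(n, padded_length(n), observed, padded_length(n) == observed)
```

All six reported cases match the rounded-up value. If the original length is known, slicing the array returned by sf.read to that length would drop the padding.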

(What process requires MS_ADPCM files? I have never heard of that format.)

Mar 05 '20 07:03 bastibe

Interesting - sounds about right. Neither had I until the bug in the linked issue was reported. Tracking it down led me to this behavior in SoundFile/libsndfile, so I thought I'd report it to keep a record of it somewhere.

We are using audio files from https://freesound.org/. Freesound places essentially no restrictions on the formats or types of audio files you can upload. So I'm guessing the file in question came from some sort of niche field recorder, or perhaps an old phone, got encoded with that subtype, and was then uploaded. Reformatting the file into a more common format using ffmpeg fixed the issue.
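For reference, a re-encoding step like the one described could be scripted roughly as follows. This is only a sketch: the thread does not say which target format was used, so 16-bit PCM WAV (ffmpeg's `pcm_s16le` codec) is an assumption, and `build_ffmpeg_command` and `convert_to_pcm16` are hypothetical helper names.

```python
import shutil
import subprocess


def build_ffmpeg_command(src, dst):
    """Build an ffmpeg call that re-encodes src as 16-bit PCM audio in dst."""
    # -y overwrites dst if it already exists; -c:a selects the audio codec.
    # Assumption: pcm_s16le is the desired "more common" target encoding.
    return ["ffmpeg", "-y", "-i", src, "-c:a", "pcm_s16le", dst]


def convert_to_pcm16(src, dst):
    """Run the conversion, failing loudly if ffmpeg is missing or errors out."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst), check=True)


# Show the command that would run, using placeholder file names.
print(" ".join(build_ffmpeg_command("input_adpcm.wav", "output_pcm16.wav")))
```

Calling `convert_to_pcm16("input_adpcm.wav", "output_pcm16.wav")` would then produce a plain PCM file, which should not be subject to the ADPCM block padding.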

Thanks!

Mar 05 '20 07:03 pseeth
