
libsndfile (soundfile) for mp3 not float32 but float64

Open magicse opened this issue 3 years ago • 16 comments

https://github.com/librosa/librosa/issues/1584 For mp3 files, the audio returned by libsndfile (soundfile) is float64, not float32. Because of this, unless we force dtype=float64, we get an empty array.

Printing the mp3 loaded with the default dtype (float32):
audio_test, _ = librosa.load('./g.mp3', mono=False, res_type='kaiser_fast', sr=sr)

[]
0
(2, 0)
tensor([], size=(2, 0))

Printing the mp3 loaded with dtype=float64:
audio_test, _ = librosa.load('./g.mp3', mono=False, res_type='kaiser_fast', sr=sr, dtype=np.float64)

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
21182464
(2, 10591232)
tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])

magicse avatar Sep 29 '22 13:09 magicse

mp3 file for test abba

If we read directly with soundfile using dtype=float32, we get an empty array. With dtype=float64 we get a filled array.

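import soundfile as sf

audio_path = "./g.mp3"  # assumed: path to the affected mp3 file

with sf.SoundFile(audio_path, "r") as file_:
    # _prepare_read is a private soundfile helper that returns the frame count to read
    frames = file_._prepare_read(0, None, -1)
    # dtype = "float64"  # gives a filled array
    dtype = "float32"  # gives an empty array for this file
    waveform = file_.read(frames, dtype, always_2d=True)
    sample_rate = file_.samplerate
    print(sf.info(audio_path, verbose=True))
    print(sf.available_subtypes(format=None))
    print(sf.available_formats())
    print(waveform, sample_rate)
    print(file_.subtype)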

magicse avatar Sep 29 '22 13:09 magicse

I can confirm this behavior for this particular file. It does not happen for other files, however. Indeed it does not happen if you save the float64 data as a new MP3 file and try to read that.

I'm afraid this is a libsndfile problem, and there's nothing soundfile can do about it. But please let me know if I'm mistaken on that.

bastibe avatar Sep 30 '22 12:09 bastibe

Hi, thank you for your answer. I've already come across a bunch of mp3 files with the same behavior, and I downloaded these mp3 files from different sources. I even exported an audio file to mp3 from a video editor (Sony Vegas) and got the same problem. Maybe force dtype=float64 until this problem is fixed? This often causes problems for libraries that use soundfile for mp3 loading (for example librosa, PyTorch audio, etc.).
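The workaround suggested above could be sketched roughly as follows: try the float32 read first and, if it returns no frames, read as float64 and cast. This is only a sketch; `read_audio_float32` and its `read_fn` parameter are hypothetical names introduced here, and treating "zero frames" as the sign of the bug is an assumption (a genuinely empty file would simply be read twice).

```python
def read_audio_float32(path, read_fn=None):
    """Read `path` as float32, falling back to a float64 read plus a cast
    when the float32 read returns no frames.

    `read_fn` is a hypothetical hook so the logic can be exercised without
    an audio file; by default soundfile.read is used.
    """
    if read_fn is None:
        import soundfile as sf  # imported lazily; only needed for real files
        read_fn = sf.read

    data, sample_rate = read_fn(path, dtype="float32", always_2d=True)
    if len(data) == 0:
        # Assumption: an empty float32 result means we hit the bug described
        # in this issue, so retry with the dtype that is reported to work.
        data, sample_rate = read_fn(path, dtype="float64", always_2d=True)
        data = data.astype("float32")
    return data, sample_rate
```

Calling `read_audio_float32("./g.mp3")` would then return a float32 array of shape (frames, channels) in either case, at the cost of temporarily holding the float64 data in memory when the fallback is taken.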

magicse avatar Sep 30 '22 16:09 magicse

https://github.com/libsndfile/libsndfile/issues/880#issuecomment-1264130930

magicse avatar Oct 01 '22 00:10 magicse

That's good to hear! As soon as libsndfile releases a new version (and the build systems catch up, so we can actually use it), we'll push an update to soundfile as well.

bastibe avatar Oct 04 '22 13:10 bastibe

I ran into this issue as well when I realized it was the root cause for a librosa issue. Looking forward to its resolution.

sammlapp avatar Jan 10 '23 15:01 sammlapp

@bastibe Do you have any news for this open issue?

Barabazs avatar Jan 29 '23 14:01 Barabazs

Not yet, sorry. If you want to help, head on over to https://github.com/bastibe/libsndfile-binaries/tree/manylinux-binaries and help me adjust the CI scripts to build updated binaries.

bastibe avatar Jan 30 '23 11:01 bastibe

> Not yet, sorry. If you want to help, head on over to https://github.com/bastibe/libsndfile-binaries/tree/manylinux-binaries and help me adjust the CI scripts to build updated binaries.

I just had a look at the branch, but it's not clear what exactly you want adjusted.

I tried bumping the version of libsndfile and manually triggered the GH action. Everything ran smoothly and the binaries were saved as an artifact. Do you want the workflow to automatically commit the new binaries to the branch?

Barabazs avatar Jan 30 '23 12:01 Barabazs

If that's all that it takes, I'll gladly merge a pull request with the new version numbers. Thank you!

Sorry, my life has been terribly busy lately, not much time left for OSS work.

bastibe avatar Jan 30 '23 14:01 bastibe

> If that's all that it takes, I'll gladly merge a pull request with the new version numbers. Thank you!
>
> Sorry, my life has been terribly busy lately, not much time left for OSS work.

No need to apologize, I'm happy to help you out. Let's continue here: https://github.com/bastibe/libsndfile-binaries/pull/20

Barabazs avatar Jan 30 '23 18:01 Barabazs

A new release of python-soundfile is in testing now. Please check out https://github.com/bastibe/python-soundfile/pull/364 and see if it fixes your issue.

bastibe avatar Feb 06 '23 13:02 bastibe

I'm trying to use the whisperX application, which depends on pyannote-audio and, underneath it, torchaudio. It goes without saying that soundfile is a dependency. Earlier I had an issue where mp3 files were not being recognized, and that was because my soundfile version was out of date. I updated to version 0.12.0 (pyannote-audio currently caps the soundfile requirement at no more than 0.12) and I was able to process mp3s with no issue.

However, most of my audio is in the m4a format. Currently I'm converting my audio files to mp3 so that I can process them with whisperX; however, it would be nice to have support for m4a without needing to convert my audio files every time. Is there anything I can do to help mitigate this?

Infinitay avatar Mar 03 '23 17:03 Infinitay

Format support in soundfile is entirely up to libsndfile. As far as I know, there are currently no plans to support AAC/MP4/M4A audio files in libsndfile, as they seem encumbered by patents.
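Since format support comes from the bundled libsndfile, one way to see what a given installation can read is to inspect `soundfile.available_formats()`, which returns a dict keyed by format name. The helper below is a small sketch of that check; `is_format_supported` and its `formats` parameter are hypothetical names introduced here for illustration.

```python
def is_format_supported(fmt, formats=None):
    """Return True if `fmt` (e.g. "MP3") is among the major formats that the
    installed libsndfile reports.

    `formats` is a hypothetical override so the check can be tried without
    soundfile installed; by default soundfile.available_formats() is used.
    """
    if formats is None:
        import soundfile as sf  # imported lazily; only needed for the real query
        formats = sf.available_formats()
    # available_formats() maps upper-case format names to descriptions.
    return fmt.upper() in formats
```

With soundfile installed, `is_format_supported("mp3")` reports whether the bundled libsndfile was built with MP3 support, and `soundfile.__libsndfile_version__` shows which libsndfile version is in use.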

bastibe avatar Mar 05 '23 08:03 bastibe

Please follow https://github.com/libsndfile/libsndfile/issues/389 for more info on support for AAC in M4A (MP4) containers. Patents should not be an issue (anymore).

cbenhagen avatar Sep 23 '23 10:09 cbenhagen

That's terrific news! Thank you for sharing.

bastibe avatar Sep 24 '23 14:09 bastibe