python-soundfile

Converting Wav to Flac introduces strange artifacts.

Open JohnTravolski opened this issue 5 years ago • 4 comments

I started using soundfile to convert .wav files to .flac files, but on occasion, it seems to introduce undesirable artifacts. For example:

import soundfile

data, samprate = soundfile.read(filename)
soundfile.write(filename.replace(".wav", ".flac"), data, samplerate=samprate, format="FLAC")

results in a flac file with a few popping noises. Here's a video showcasing the waveform: https://www.youtube.com/watch?v=1Yc20C8FLwk&feature=youtu.be

Here's a pictorial comparison: [images: test wav, test flac]

test.wav is the source .wav file. Lights.flac was the original .wav converted to .flac using Adobe Audition. out.flac was the original .wav converted to .flac using ffmpeg. test.flac was the original .wav converted to .flac using soundfile (the code above).

As you can see, there are some parts of the song that have additional green sections of the waveform introduced (these are the popping noises that I hear). I don't know what's causing this. I'm on Windows 10 and Python 3.8.2. Same issue in Python 3.7.0. The original .wav file I used for this can be downloaded here: https://soundcloud.com/yumecollective/kamska-lights-yume?in=yumecollective/sets/kamska-lights-yume

JohnTravolski, Apr 23 '20 18:04

Does this happen only when converting to FLAC, or does it affect other formats as well?

If it only happens with FLAC, I am inclined to say that it is caused by the underlying C library libsndfile.

bastibe, Apr 24 '20 08:04

I tried it again with AIFF and there were no artifacts. The problem seems to be exclusive to FLAC, although I did not try every other format.

I recall having the same issue when I used pydub with libav, but I don't know if that's relevant.

I also found this, but I do not know if this is related. Please let me know what you think. https://github.com/erikd/libsndfile/issues/504

Please let me know if you find out anything else based on this information. Thanks.

JohnTravolski, May 02 '20 15:05

Could be the same issue. I don't know. If so, a cursory reading of the issue seems to suggest that flac has issues with data <= -1 or > 1. Is that the case for you?
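
A quick way to answer that question is to scan the array returned by soundfile.read for samples at or beyond the limits mentioned above. The following is only a minimal sketch: the helper name find_out_of_range is hypothetical, and the condition (<= -1 or > 1) is taken from the cursory reading of the libsndfile issue, not from a confirmed analysis of the bug.

```python
import numpy as np


def find_out_of_range(data):
    """Return the frame indices that contain a sample <= -1.0 or > 1.0.

    `data` is expected to look like the array returned by soundfile.read:
    shape (frames,) for mono or (frames, channels) for multi-channel audio.
    The thresholds are an assumption based on the linked libsndfile issue.
    """
    data = np.asarray(data, dtype=np.float64)
    mask = (data <= -1.0) | (data > 1.0)
    if mask.ndim > 1:
        # A frame counts as suspicious if any of its channels is out of range.
        mask = mask.any(axis=1)
    return np.flatnonzero(mask)
```

If the returned array is empty, the signal never reaches those limits, and the artifacts probably have a different cause.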

Either way, though, it is unlikely that there is a solution to the problem this side of libsndfile. Sorry.

bastibe, May 04 '20 13:05

Is there any update on this bug with FLAC audio file I/O in soundfile?

I just got bitten by this one too: once in a while, I observed very strange blocks of corrupted sound samples in .flac files written by soundfile: [image: all_detail] (top is .wav and bottom is .flac, written using soundfile from the exact same audio samples)

I then found this issue comment: https://github.com/libsndfile/libsndfile/issues/504#issuecomment-570667456 and I believe this is exactly what I am observing, since at the moment where that corruption happens, there is indeed a value of -1.0 in the sound signal in the Python array. If I set that sample to -32767/32768, or if I scale my entire signal with a factor 32767/32768 (in both cases eliminating the -1.0 value), the issue disappears.
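
The two workarounds described above could be sketched roughly as follows. The function names are hypothetical, and the constant 32767/32768 is the one quoted in this comment (it corresponds to 16-bit samples); whether the same factor is appropriate for other FLAC subtypes is an assumption that this thread does not confirm.

```python
import numpy as np

# Largest-magnitude value used in the comment above in place of -1.0.
SAFE_FULL_SCALE = 32767 / 32768


def scale_below_full_scale(data):
    """Scale the whole signal by 32767/32768 so that -1.0 no longer occurs."""
    return np.asarray(data, dtype=np.float64) * SAFE_FULL_SCALE


def replace_negative_full_scale(data):
    """Set only the samples at (or below) -1.0 to -32767/32768.

    All other samples are returned unchanged, so the round-trip stays
    exact everywhere except at the replaced samples.
    """
    data = np.asarray(data, dtype=np.float64)
    return np.where(data <= -1.0, -SAFE_FULL_SCALE, data)
```

Either result can then be passed to soundfile.write in place of the original array. Scaling changes every sample slightly, while the replacement variant only touches the samples that sit at -1.0.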

I think I would prefer it if the occasional -1.0 values ended up being -32767/32768 (thus giving up a perfect round-trip), rather than experiencing this sound corruption for a block of several samples in the written FLAC file. Unless there is already a proper fix for this in a newer version of libsndfile, of course (I had this with soundfile.__version__ == '0.10.3').

KoenT-SS, Jul 20 '22 14:07