[Question] Working with compressed sounds

Open T145 opened this issue 1 year ago • 4 comments

Hey! I tried denoising an audio file that was extracted from a tightly compressed archive, using both this program and your wavelets-ext example, and got the following results:

Original: og

Audio-Denoising: denoise

wavelets-ext: wavelets

Granted, this audio doesn't have any background noise and computes an SNR of 100, but to me the denoising performed by this package seems to be an improvement over the original, and it doesn't have static crackles like the other option. The audio sounds slightly better than the original as well. Why, then, is this solution so much worse than the other? Is there a different algorithm I should try when working with wavelets?

T145 avatar Dec 15 '23 03:12 T145

wavelets-ext uses Cython to speed up computation. This repo and the ext version have a few implementation differences, and I don't remember which wavelet is used in denoise, but when I used this for my other app, wavelets-ext worked best for me.

ghost avatar Dec 15 '23 03:12 ghost

Which app is that? Is it on GitHub?

T145 avatar Dec 15 '23 06:12 T145

torpido

ap-atul avatar Dec 15 '23 06:12 ap-atul

After some testing I did find some cases where the audio got messed up a bit. My primary sound collection is a bunch of mono WAV files that have variable length. Here's my touchup:

import warnings
import numpy as np
import soundfile as sf
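import pywt


def mad(values):
	# Assumed helper: `mad` is called below but was not defined in the original post.
	# Median absolute deviation, used as a robust estimate of the noise level.
	values = np.asarray(values)
	return np.median(np.abs(values - np.median(values)))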

# https://pywavelets.readthedocs.io/en/latest/index.html
def denoise(in_wav: str, out_wav: str):
	info = sf.info(in_wav)  # getting info of the audio
	rate = info.samplerate

	warnings.simplefilter('ignore')
	warnings.simplefilter('error', RuntimeWarning)
	warnings.simplefilter('error', UserWarning)

	with sf.SoundFile(out_wav, "w", samplerate=rate, channels=info.channels) as of:
		for block in sf.blocks(in_wav, int(rate * info.duration * 0.10)):
			try:
				# Check fixes "UserWarning: Level value of 2 is too high: all coefficients will experience boundary effects."
				# All zero blocks seem safe to ignore since there's no visible spectrogram difference and the audio sounds slightly better.
				# The only concern is that longer pauses may be cut shorter.
				if not np.all(block == 0):
					# soundfile yields blocks shaped (frames,) for mono and (frames, channels) otherwise,
					# so the time axis is always axis 0
					axis = 0
					coefficients = pywt.wavedec(block, 'db4', mode='per', level=2, axis=axis)

					# estimating the noise level from the finest detail coefficients
					sigma = mad(coefficients[-1])

					# VisuShrink: apply the universal threshold proposed by Donoho and Johnstone
					thresh = sigma * np.sqrt(2 * np.log(len(block)))

					# thresholding using the noise threshold generated
					coefficients[1:] = (pywt.threshold(i, value=thresh, mode='soft') for i in coefficients[1:])

					# getting the clean signal as in original form and writing to the file
					clean = pywt.waverec(coefficients, 'db4', mode='per', axis=axis)
					of.write(clean)
				else:
					of.write(block)
			except RuntimeWarning:
				# Caused by "RuntimeWarning: invalid value encountered in divide"
				# more than likely b/c the block is mostly quiet anyway. Write it as-is.
				of.write(block)
			except UserWarning:
				# With the check above, ending here means it's similar to the RuntimeWarning.
				# Therefore write the block as-is.
				of.write(block)
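
For anyone following along, here is a minimal, dependency-free sketch of the two ideas the snippet above leans on: the MAD-based universal threshold with soft shrinkage, and the level check behind the "Level value of 2 is too high" warning. The function names are my own and purely illustrative. `soft_threshold` is meant to mirror what `pywt.threshold(..., mode='soft')` does, and `max_dwt_level` follows the formula documented for `pywt.dwt_max_level`, assuming a filter length of 8 for `db4`. Note that many references divide the MAD by 0.6745 to estimate sigma; this sketch leaves it unscaled.

```python
import math
from statistics import median


def mad(values):
    """Median absolute deviation: a robust spread estimate."""
    values = list(values)
    center = median(values)
    return median([abs(v - center) for v in values])


def universal_threshold(sigma, n):
    """Donoho-Johnstone universal threshold: sigma * sqrt(2 * ln(n))."""
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(values, thresh):
    """Shrink every value toward zero by thresh; anything within +/-thresh becomes 0."""
    return [math.copysign(max(abs(v) - thresh, 0.0), v) for v in values]


def max_dwt_level(data_len, filter_len=8):
    """Deepest useful decomposition level: floor(log2(data_len / (filter_len - 1))).

    filter_len=8 is an assumption matching the db4 wavelet.
    """
    if data_len < filter_len - 1:
        return 0
    return int(math.floor(math.log2(data_len / (filter_len - 1))))


# Toy detail coefficients: mostly small "noise" plus two large "signal" spikes.
details = [0.1, -0.2, 0.05, 3.0, -0.15, 0.0, -2.5, 0.1]
sigma = mad(details)
thresh = universal_threshold(sigma, len(details))
shrunk = soft_threshold(details, thresh)
print(thresh, shrunk)

# A 16-sample block cannot support level=2 with an 8-tap filter; a 28-sample block can.
print(max_dwt_level(16), max_dwt_level(28))
```

Checking `max_dwt_level(len(block)) >= 2` before calling `pywt.wavedec` would be one way to skip short trailing blocks explicitly instead of relying on the warning being raised as an error.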

T145 avatar Dec 15 '23 16:12 T145