audio-reactive-led-strip
Performance issue: gaussian_filter1d
Context
I spent a lot of time studying this part of the code (visualization.py:206-222):
# Transform audio input into the frequency domain
N = len(y_data)
N_zeros = 2**int(np.ceil(np.log2(N))) - N
# Pad with zeros until the next power of two
y_data *= fft_window
y_padded = np.pad(y_data, (0, N_zeros), mode='constant')
YS = np.abs(np.fft.rfft(y_padded)[:N // 2])
# Construct a Mel filterbank from the FFT data
mel = np.atleast_2d(YS).T * dsp.mel_y.T
# Scale data to values more suitable for visualization
# mel = np.sum(mel, axis=0)
mel = np.sum(mel, axis=0)
mel = mel**2.0
# Gain normalization
mel_gain.update(np.max(gaussian_filter1d(mel, sigma=1.0)))
mel /= mel_gain.value
mel = mel_smoothing.update(mel)
I then profiled each line of this code.
I found that the call to gaussian_filter1d accounts for 40% of the time spent in this piece of code, even though the mel variable is a 1D array with only 24 items (!).
gaussian_filter1d performance is poor
My guess: gaussian_filter1d does some expensive setup to compute the filter coefficients, and that setup is repeated on every call.
Since gaussian_filter1d is a linear function (gaussian_filter1d(a+b) == gaussian_filter1d(a) + gaussian_filter1d(b)), its effect on arrays of a fixed size can be captured once as a matrix by filtering an identity matrix, and that matrix can be cached for future calls.
So I suggest the following class:
class GaussianFilter1D():
    def __init__(self, size, sigma):
        self._arr = gaussian_filter1d( np.identity( size ), sigma = sigma )
    def filter(self, x):
        if x.ndim == 1:
            return np.atleast_2d( x ).dot( self._arr )[0]
        else:
            return x.dot( self._arr )
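As a sanity check, the cached matrix should give the same output as calling gaussian_filter1d directly. The sketch below is a lightly simplified restatement of the class above (a single dot call handles both the 1D and 2D cases), compared against the SciPy function on random data. It assumes the default boundary mode and filtering along the last axis; the `rng`, `pixels`, `same_1d` and `same_2d` names are only for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d


class GaussianFilter1D:
    """Cached matrix form of gaussian_filter1d for a fixed size and sigma."""

    def __init__(self, size, sigma):
        # Row i is the filter's response to a unit impulse at index i.
        self._arr = gaussian_filter1d(np.identity(size), sigma=sigma)

    def filter(self, x):
        # Simplified: one dot call covers 1D input and 2D input
        # (2D input is filtered along its last axis).
        return np.asarray(x).dot(self._arr)


rng = np.random.default_rng(0)
mel = rng.random(24)          # same length as the mel array in the issue
pixels = rng.random((3, 24))  # illustrative 2D input

g = GaussianFilter1D(24, sigma=1.0)

# Compare the cached version with the direct SciPy call.
same_1d = np.allclose(g.filter(mel), gaussian_filter1d(mel, sigma=1.0))
same_2d = np.allclose(g.filter(pixels), gaussian_filter1d(pixels, sigma=1.0))
print(same_1d, same_2d)
```

Note that the cached matrix is only valid for one array length and one sigma, so each distinct (size, sigma) pair needs its own instance.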
Bench results:
Before:
>>> mel = np.random.rand(24)
>>> ref = time.time()
>>> for _ in range(100000):
...     _ = gaussian_filter1d( mel, sigma = 1.0 )
...
>>> print( time.time()-ref )
16.78048849105835
After:
>>> mel = np.random.rand(24)
>>> g = GaussianFilter1D( mel.shape[-1], sigma = 1.0 )
>>> ref = time.time()
>>> for _ in range(100000):
...     _ = g.filter( mel )
...
>>> print( time.time()-ref )
0.577545166015625
Conclusion: about 29x faster (16.78 s vs 0.58 s for 100,000 calls).
Note: gaussian_filter1d is used in several places in this project: in the mel computation and in the visualize_* functions.
Would you be willing to submit a PR for this?
Any news on this? I am running 502 LEDs in total but have limited the number to 256 in case of performance issues on an RP3B. I am looking for any performance boost I can get to get rid of the lag. Thanks for all the support, you all do an outstanding job :-)