pyAudioAnalysis
silenceremoval ValueError: shape mismatch when nChroma.max() > nChroma.shape[0]
I get an error for an 8 kHz WAV file when I run this simple example. It works for 16 kHz recordings.
from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import audioSegmentation as aS

[Fs, x] = aIO.readAudioFile("recs/Wallet1.wav")
#print (Fs)
print (x)
segments = aS.silenceRemoval(x, Fs, 0.020, 0.020, smoothWindow = 0.6, weight = 0.3, plot = True)
#segments = aS.silenceRemoval(x, Fs, 0.020, 0.020)
ValueError Traceback (most recent call last)
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioSegmentation.py in silenceRemoval(x, fs, st_win, st_step, smoothWindow, weight, plot)
646 x = audioBasicIO.stereo2mono(x)
647 st_feats, _ = aF.stFeatureExtraction(x, fs, st_win * fs,
--> 648 st_step * fs)
649
650 # Step 2: train binary svm classifier of low vs high energy frames
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioFeatureExtraction.py in stFeatureExtraction(signal, fs, win, step)
590 curFV[n_time_spectral_feats:n_time_spectral_feats+n_mfcc_feats, 0] = \
591 stMFCC(X, fbank, n_mfcc_feats).copy() # MFCCs
--> 592 chromaNames, chromaF = stChromaFeatures(X, fs, nChroma, nFreqsPerChroma)
593 curFV[n_time_spectral_feats + n_mfcc_feats:
594 n_time_spectral_feats + n_mfcc_feats + n_chroma_feats - 1] = \
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioFeatureExtraction.py in stChromaFeatures(X, fs, nChroma, nFreqsPerChroma)
269 I = numpy.nonzero(nChroma>nChroma.shape[0])[0][0]
270 C = numpy.zeros((nChroma.shape[0],))
--> 271 C[nChroma[0:I-1]] = spec
272 C /= nFreqsPerChroma
273 finalC = numpy.zeros((12, 1))
ValueError: shape mismatch: value array of shape (80,) could not be broadcast to indexing result of shape (56,)
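This does not seem directly related to the sample frequency, but to the if statement in stChromaFeatures: if nChroma.max()<nChroma.shape[0]: When that condition is false, the other branch runs and fails at line 271 of the traceback, because nChroma[0:I-1] is shorter than spec.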
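To see when that condition fails, here is a rough sketch that rebuilds the bin-to-chroma mapping outside the library. The formula round(12 * log2(f / 27.5)) and the use of half the window length as the number of FFT bins are my assumptions about how the library builds nChroma; they are not taken from the traceback, so check them against your installed version. The helper names chroma_bins and takes_failing_branch are made up for this example.

```python
import numpy as np

def chroma_bins(fs, win_sec):
    """Approximate the FFT-bin -> chroma-index mapping (assumed formula)."""
    nfft = int(round(win_sec * fs)) // 2  # assumed: half the window, in samples
    # Centre frequency of each FFT bin, from fs / (2 * nfft) up to fs / 2.
    freqs = np.array([(f + 1) * fs / (2.0 * nfft) for f in range(nfft)])
    # Assumed mapping: semitones above 27.5 Hz, rounded to an integer index.
    return np.round(12.0 * np.log2(freqs / 27.5)).astype(int)

def takes_failing_branch(fs, win_sec):
    """True when nChroma.max() < nChroma.shape[0] would be False."""
    bins = chroma_bins(fs, win_sec)
    return bool(bins.max() >= bins.shape[0])

for fs, win_sec in [(8000, 0.020), (16000, 0.020), (8000, 0.040)]:
    bins = chroma_bins(fs, win_sec)
    print(fs, win_sec, "bins:", bins.shape[0], "max index:", bins.max(),
          "fails:", takes_failing_branch(fs, win_sec))

# Mimic lines 269-271 of the traceback for the 8 kHz / 20 ms case:
bins = chroma_bins(8000, 0.020)
first_over = int(np.nonzero(bins > bins.shape[0])[0][0])
index = bins[0:first_over - 1]   # shorter than the spectrum
spec_len = bins.shape[0]         # the spectrum has one value per FFT bin
print("index length:", index.shape[0], "spectrum length:", spec_len)
```

Under these assumptions, the problem is that a short window at a low sample rate leaves fewer FFT bins than the highest chroma index, so the sample rate only matters in combination with the window length.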
I had the same problem and got stuck here.
I had the same issue. In my case, I was using silenceRemoval() from audioSegmentation.py.
I solved it by changing st_win and st_step (window size and step in seconds) from 0.020 to 0.040.
This is the line:
self.segments = aS.silenceRemoval(self.audio_x, self.Fs, 0.040, 0.040, weight=0.9, smoothWindow=0.9 , plot=False)
Hope this is useful.
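For anyone who wants to automate this workaround, here is a minimal sketch of a retry loop that doubles the window whenever the call raises ValueError. The function segment_with_retry and its max_win limit are hypothetical and not part of pyAudioAnalysis; segmenter is any callable that takes (x, fs, st_win, st_step), which matches the first four arguments of silenceRemoval in the traceback above.

```python
def segment_with_retry(segmenter, x, fs, win=0.020, max_win=0.160):
    """Call segmenter(x, fs, win, win), doubling win after each ValueError.

    Returns (window_used, result). Re-raises the error once doubling
    would exceed max_win, so unrelated ValueErrors still surface.
    """
    while True:
        try:
            return win, segmenter(x, fs, win, win)
        except ValueError:
            if win * 2 > max_win:
                raise
            win *= 2

# Stand-in segmenter for demonstration only: it fails for windows under 40 ms.
def fake_segmenter(x, fs, st_win, st_step):
    if st_win < 0.040:
        raise ValueError("shape mismatch")
    return [[0.0, 1.0]]

used_win, segments = segment_with_retry(fake_segmenter, [0.0] * 8000, 8000)
print(used_win, segments)

# With the real library, the call might look like this (untested here):
# segment_with_retry(
#     lambda x, fs, w, s: aS.silenceRemoval(x, fs, w, s, weight=0.9, smoothWindow=0.9),
#     x, Fs)
```

Keep in mind that a longer window also makes the segment boundaries coarser, and that catching ValueError this broadly can hide other problems, so it is worth logging which window was finally used.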