pyAudioAnalysis

silenceremoval ValueError: shape mismatch when nChroma.max() > nChroma.shape[0]

Open giusarno opened this issue 6 years ago • 3 comments

I get an error for an 8 kHz WAV file when I run this simple example. It works for 16 kHz recordings.

    from pyAudioAnalysis import audioBasicIO as aIO
    from pyAudioAnalysis import audioSegmentation as aS
    [Fs, x] = aIO.readAudioFile("recs/Wallet1.wav")
    #print (Fs)
    print (x)
    segments = aS.silenceRemoval(x, Fs, 0.020, 0.020, smoothWindow = 0.6, weight = 0.3, plot = True)
    #segments = aS.silenceRemoval(x, Fs, 0.020, 0.020)


    ValueError Traceback (most recent call last) in ()
          4 #print (Fs)
          5 print (x)
    ----> 6 segments = aS.silenceRemoval(x, Fs, 0.020, 0.020, smoothWindow = 0.6, weight = 0.3, plot = True)
          7 #segments = aS.silenceRemoval(x, Fs, 0.020, 0.020)

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioSegmentation.py in silenceRemoval(x, fs, st_win, st_step, smoothWindow, weight, plot)
        646 x = audioBasicIO.stereo2mono(x)
        647 st_feats, _ = aF.stFeatureExtraction(x, fs, st_win * fs,
    --> 648 st_step * fs)
        649
        650 # Step 2: train binary svm classifier of low vs high energy frames

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioFeatureExtraction.py in stFeatureExtraction(signal, fs, win, step)
        590 curFV[n_time_spectral_feats:n_time_spectral_feats+n_mfcc_feats, 0] = \
        591 stMFCC(X, fbank, n_mfcc_feats).copy() # MFCCs
    --> 592 chromaNames, chromaF = stChromaFeatures(X, fs, nChroma, nFreqsPerChroma)
        593 curFV[n_time_spectral_feats + n_mfcc_feats:
        594 n_time_spectral_feats + n_mfcc_feats + n_chroma_feats - 1] = \

    ~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/pyAudioAnalysis/audioFeatureExtraction.py in stChromaFeatures(X, fs, nChroma, nFreqsPerChroma)
        269 I = numpy.nonzero(nChroma>nChroma.shape[0])[0][0]
        270 C = numpy.zeros((nChroma.shape[0],))
    --> 271 C[nChroma[0:I-1]] = spec
        272 C /= nFreqsPerChroma
        273 finalC = numpy.zeros((12, 1))

    ValueError: shape mismatch: value array of shape (80,) could not be broadcast to indexing result of shape (56,)
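
For anyone who wants to reproduce the two shapes in the error without the WAV file, here is a rough standalone sketch. It assumes the chroma bin index is computed as round(12 * log2(f / 27.5)) over window/2 frequency bins; that formula does not appear in the traceback and is my assumption about the library, although it does give the same (80,) and (56,) shapes for an 8 kHz signal with a 0.020 s window. Only the last three statements before the print mirror lines 269-271 of the traceback.

```python
import numpy as np

fs = 8000                        # sampling rate from the report
win = int(round(0.020 * fs))     # 160 samples per short-term window
nfft = win // 2                  # assumed: half the window is used as spectrum length

# Assumed bin mapping (not shown in the thread): semitone index relative to 27.5 Hz.
freqs = (np.arange(nfft) + 1) * fs / (2 * nfft)
n_chroma = np.round(12.0 * np.log2(freqs / 27.5)).astype(int)

spec = np.ones(nfft)             # stand-in for the squared spectrum X**2

# Same three steps as lines 269-271 in the traceback.
first_over = np.nonzero(n_chroma > n_chroma.shape[0])[0][0]
target = n_chroma[0:first_over - 1]
C = np.zeros((n_chroma.shape[0],))

error_message = None
try:
    C[target] = spec             # index array is shorter than spec, so this raises
except ValueError as exc:
    error_message = str(exc)

print(spec.shape, target.shape)
print(error_message)
```

The assignment fails because the index array is truncated while the value array keeps its full length.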

giusarno, Jul 12 '19 15:07

This does not seem to be directly related to the sampling frequency, but to the if statement in stChromaFeatures: if nChroma.max()<nChroma.shape[0]:
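
A quick way to check which side of that if statement a given setup lands on is sketched below. The helper names chroma_bins and uses_direct_branch are made up for illustration, and the bin formula (round(12 * log2(f / 27.5)) over window/2 bins) is an assumption about how nChroma is built, not something quoted from the library.

```python
import numpy as np

def chroma_bins(fs, win_seconds, ref_freq=27.5):
    """Hypothetical re-creation of the nChroma index array (assumed formula)."""
    nfft = int(round(win_seconds * fs)) // 2
    freqs = (np.arange(nfft) + 1) * fs / (2 * nfft)
    return np.round(12.0 * np.log2(freqs / ref_freq)).astype(int)

def uses_direct_branch(fs, win_seconds):
    """True when nChroma.max() < nChroma.shape[0], i.e. the branch that does not fail."""
    bins = chroma_bins(fs, win_seconds)
    return bool(bins.max() < bins.shape[0])

# The two cases from the original report, both with a 0.020 s window.
for fs in (8000, 16000):
    bins = chroma_bins(fs, 0.020)
    print(fs, bins.shape[0], bins.max(), uses_direct_branch(fs, 0.020))
```

Under this assumption the largest bin index depends only on fs / 2, while the array length depends on the window length in samples, so a low sampling rate combined with a short window is what pushes the code into the else branch.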

giusarno, Jul 13 '19 20:07

I had the same problem. I am stuck here...

IvanEvan, Jul 23 '19 11:07

I had the same issue. In my case, I was using silenceRemoval() from audioSegmentation.py.

I solved it by changing st_win and st_step (window size and step in seconds) from 0.020 to 0.040.

This is the line:

    self.segments = aS.silenceRemoval(self.audio_x, self.Fs, 0.040, 0.040, weight=0.9, smoothWindow=0.9, plot=False)
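
As a rough guide to how long the window has to be, here is a small sketch that estimates the minimum window length in samples for a given sampling rate. It rests on two assumptions that are not confirmed anywhere in this thread: that the top chroma bin index is round(12 * log2((fs / 2) / 27.5)), and that the number of bins is half the window length in samples. The function names are hypothetical.

```python
import math

REF_FREQ_HZ = 27.5  # assumed reference frequency for the chroma bin mapping

def min_window_samples(fs):
    """Smallest window (in samples) for which the top bin index stays below window/2."""
    top_bin = round(12.0 * math.log2((fs / 2.0) / REF_FREQ_HZ))
    return 2 * (top_bin + 1)

def window_is_long_enough(fs, win_seconds):
    """Check a window length in seconds against the estimate above."""
    return int(round(win_seconds * fs)) >= min_window_samples(fs)

print(min_window_samples(8000))            # minimum window in samples at 8 kHz
print(window_is_long_enough(8000, 0.020))  # the failing setting from the report
print(window_is_long_enough(8000, 0.040))  # the setting that worked for me
```

This agrees with what was observed above (0.020 fails at 8 kHz, 0.040 works), but treat it as an estimate and not as documented library behaviour.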

Hope this is useful.

nofi-sys, Feb 11 '20 04:02