autovc Hyperparameters for generating mel spectrogram from training .wav files

Open sroutray opened this issue 6 years ago • 3 comments

Could you please tell us how you generated mel spectrograms for training from .wav files? What were the parameters used?

Aug 27 '19 17:08 sroutray

Aug 28 '19 16:08 auspicious3000

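import librosa
import numpy as np

# Load the utterance at 16 kHz.
y, sr = librosa.load('p225_001.wav', sr=16000)
# 80-band mel spectrogram, 1024-point FFT, hop of 256 samples.
S = librosa.feature.melspectrogram(y=y, sr=16000, n_mels=80, fmin=90, fmax=7600, n_fft=1024, hop_length=256)
# Convert to dB, subtract ref_level_db (16), then normalize to [0, 1] using min_level_db (-100).
S_r0 = 20 * np.log10(np.maximum(1e-5, S))
S_r0 = S_r0 - 16
S_r0 = np.clip((S_r0 + 100.0) / 100.0, 0, 1)
print(np.min(S_r0), np.max(S_r0), S_r0.shape)

# wavegen and model come from the provided WaveNet vocoder.
waveform = wavegen(model, c=S_r0.T)
librosa.output.write_wav('test_r0.wav', waveform, sr=16000)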

I am using the above code to generate the mel spectrogram of the file p225_001.wav, with the following parameters:

num_mels: 80
fmin: 90
fmax: 7600
fft_size: 1024
hop_size: 256
min_level_db: -100
ref_level_db: 16

However, the generated spectrogram is not the same as the one provided in metadata.pkl. I also tried passing both spectrograms through the provided WaveNet vocoder model, but the audio generated from my spectrogram is inferior in quality to the audio generated from the spectrogram in metadata.pkl.

Aug 28 '19 18:08 sroutray
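For readers following along, here is a minimal sketch of how the min_level_db and ref_level_db values listed in the comment above map onto the constants in the posted snippet (the "- 16" and "+ 100.0) / 100.0" steps). It uses only the standard library and works on plain lists. The helper names normalize_mel_value and normalize_mel are hypothetical, and this mirrors the posted snippet only; it is not confirmed to be the preprocessing used to produce metadata.pkl.

```python
import math

# Values quoted in the comment above; the names follow that parameter list.
MIN_LEVEL_DB = -100.0
REF_LEVEL_DB = 16.0
AMPLITUDE_FLOOR = 1e-5  # lower bound applied before taking the log


def normalize_mel_value(s, min_level_db=MIN_LEVEL_DB, ref_level_db=REF_LEVEL_DB):
    """Map one non-negative mel value to [0, 1] (hypothetical helper)."""
    # Convert to decibels, flooring the input to avoid log10(0).
    db = 20.0 * math.log10(max(AMPLITUDE_FLOOR, s))
    # Shift by the reference level (the "- 16" step in the snippet).
    db -= ref_level_db
    # Rescale so min_level_db maps to 0 and 0 dB maps to 1, then clip.
    scaled = (db - min_level_db) / (-min_level_db)
    return min(1.0, max(0.0, scaled))


def normalize_mel(S):
    """Apply normalize_mel_value to every entry of a 2-D list of mel values."""
    return [[normalize_mel_value(v) for v in row] for row in S]
```

With these defaults, a mel value of 1.0 maps to 0.84, and anything at or below the 1e-5 floor is clipped to 0.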

See the last few comments in #4.

Aug 29 '19 11:08 auspicious3000
