
Inconsistent librosa versions PyTorch/SpeechSynthesis/All and CUDA-Optimized/FastSpeech

Open xvdp opened this issue 1 year ago • 0 comments

PyTorch/SpeechSynthesis/All and CUDA-Optimized/FastSpeech

librosa is used throughout the audio projects, although only a few of its functions are called. The requirements files refer to different versions, and not all of the code is consistent with the versions 'required'.

The main change in librosa > 0.7 is that many of the functions require keyword arguments; typically the only positional argument allowed is the data, e.g. librosa.core.resample(y: 'np.ndarray', *, orig_sr: 'float', target_sr: 'float', ...)
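To make the effect of the bare `*` concrete, here is a minimal sketch using a stand-in function with the same signature shape as the one quoted above. The stand-in is hypothetical and is not librosa code; it only shows which call style Python accepts.

```python
# Stand-in with the same signature shape as the resample signature quoted above.
# It is NOT librosa code; it only illustrates what the bare `*` enforces.
def resample(y, *, orig_sr, target_sr):
    return (len(y), orig_sr, target_sr)


samples = [0.0] * 8

# Keyword form: accepted.
ok = resample(samples, orig_sr=44100, target_sr=22050)

# Old positional form: Python rejects it with a TypeError before the body runs.
try:
    resample(samples, 44100, 22050)
    positional_accepted = True
except TypeError:
    positional_accepted = False
```

Any call site that still passes the sample rates positionally fails in the same way once the installed library uses a keyword-only signature.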

  1. The PyTorch/SpeechSynthesis/ project requirements ask for different versions:
  • PyTorch/SpeechSynthesis/Tacotron2/requirements.txt requires librosa (no version given)
  • PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp/src/trt/requirements.txt requires librosa==0.7.0
  • PyTorch/SpeechSynthesis/HiFiGAN/requirements.txt requires librosa==0.9.0
  • PyTorch/SpeechSynthesis/FastPitch/requirements.txt requires librosa==0.9.0

For consistency they should all require the same version. All but one function (listed below) can run on librosa 0.10.
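A quick way to spot this kind of drift is to group the requirements files by their librosa specifier. The sketch below is only an illustration: the helper names are hypothetical, the file contents are reduced to the librosa lines quoted above, and the regex ignores extras, environment markers, and inline comments.

```python
import re
from collections import defaultdict

# Matches a librosa requirement line with an optional version specifier.
# Simplification: extras, environment markers and inline comments are not handled.
_LIBROSA_RE = re.compile(
    r"^\s*librosa\s*(?P<spec>(?:==|>=|<=|~=|!=|<|>)\s*[\w.*]+)?\s*$",
    re.IGNORECASE,
)


def librosa_pin(requirements_text):
    """Return the librosa specifier ('' if unpinned), or None if librosa is absent."""
    for line in requirements_text.splitlines():
        match = _LIBROSA_RE.match(line)
        if match:
            return (match.group("spec") or "").replace(" ", "")
    return None


def group_by_pin(files):
    """Map each distinct librosa specifier to the files that use it."""
    groups = defaultdict(list)
    for path, text in files.items():
        pin = librosa_pin(text)
        if pin is not None:
            groups[pin or "(unpinned)"].append(path)
    return dict(groups)


# Only the librosa lines reported in this issue; the real files contain more.
files = {
    "PyTorch/SpeechSynthesis/Tacotron2/requirements.txt": "librosa\n",
    "PyTorch/SpeechSynthesis/Tacotron2/trtis_cpp/src/trt/requirements.txt": "librosa==0.7.0\n",
    "PyTorch/SpeechSynthesis/HiFiGAN/requirements.txt": "librosa==0.9.0\n",
    "PyTorch/SpeechSynthesis/FastPitch/requirements.txt": "librosa==0.9.0\n",
}
pins = group_by_pin(files)
```

More than one key in `pins` means the projects disagree on the version.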

  2. In the projects requiring the newer librosa, some files still use the old positional syntax.
  • PyTorch/SpeechSynthesis/FastPitch/hifigan/data_function.py line 72 librosa_mel_fn(sampling_rate, n_fft, num_mels, fmin, fmax)
  • PyTorch/SpeechSynthesis/Tacotron2/notebooks/conversationalai/client/speech_ai_demo/utils/jasper/speech_utils.py lines 386 & 389 samples = librosa.core.resample(samples, sample_rate, target_sr) and librosa.effects.trim(samples, trim_db)

  • CUDA-Optimized/FastSpeech/generate.py uses the deprecated librosa.output.write_wav(path, wav, hp.sr); see https://github.com/librosa/librosa/issues/1062

  • CUDA-Optimized/FastSpeech/tacotron2/audio_processing.py line 82 win_sq = librosa_util.pad_center(win_sq, n_fft)
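The calls listed above could be rewritten in keyword form roughly as follows. This is a sketch, not a patch: the wrapper names are hypothetical, the keyword names (sr, n_fft, n_mels, fmin, fmax, orig_sr, target_sr, top_db, size) are the ones I believe current librosa uses and should be checked against the installed version, and swapping librosa.output.write_wav for soundfile.write is an assumption based on common practice.

```python
# Hypothetical wrappers, not repository code. librosa and soundfile are
# imported inside each function so this file loads even without them installed.

def mel_basis(sampling_rate, n_fft, num_mels, fmin, fmax):
    import librosa
    # was: librosa_mel_fn(sampling_rate, n_fft, num_mels, fmin, fmax)
    return librosa.filters.mel(
        sr=sampling_rate, n_fft=n_fft, n_mels=num_mels, fmin=fmin, fmax=fmax
    )


def resample_and_trim(samples, sample_rate, target_sr, trim_db):
    import librosa
    # was: librosa.core.resample(samples, sample_rate, target_sr)
    samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=target_sr)
    # was: librosa.effects.trim(samples, trim_db); trim returns (audio, index)
    trimmed, _ = librosa.effects.trim(samples, top_db=trim_db)
    return trimmed


def pad_window(win_sq, n_fft):
    import librosa
    # was: librosa_util.pad_center(win_sq, n_fft)
    return librosa.util.pad_center(win_sq, size=n_fft)


def write_wav(path, wav, sr):
    import soundfile
    # was: librosa.output.write_wav(path, wav, hp.sr)
    soundfile.write(path, wav, sr)
```

The keyword forms should also be accepted by the older pinned releases, since the parameter names appear unchanged, but that is worth verifying before settling on one version.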

Several of those functions will fail on the newer librosa versions. It is simple enough to clean up the code.
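To find the remaining call sites, a small AST scan can flag watched calls that pass more positional arguments than the keyword-only signatures allow. This is a heuristic sketch: the function names and the WATCHED table are hypothetical, matching is done on the bare function name only, and the allowed positional counts reflect my reading of the newer signatures.

```python
import ast

# Call name -> number of positional arguments assumed to be allowed.
# These counts are assumptions; adjust them to the installed librosa.
WATCHED = {
    "resample": 1,
    "trim": 1,
    "pad_center": 1,
    "mel": 0,
    "librosa_mel_fn": 0,
}


def _callee_name(func):
    """Return the last component of the called name, e.g. 'trim' for librosa.effects.trim."""
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return None


def find_positional_calls(source):
    """Return sorted (lineno, name, n_positional) for watched calls with too many positional args."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = _callee_name(node.func)
            if name in WATCHED and len(node.args) > WATCHED[name]:
                hits.append((node.lineno, name, len(node.args)))
    return sorted(hits)


example = (
    "samples = librosa.core.resample(samples, sample_rate, target_sr)\n"
    "librosa.effects.trim(samples, trim_db)\n"
    "win_sq = librosa_util.pad_center(win_sq, size=n_fft)\n"
)
hits = find_positional_calls(example)
```

Because it matches on the name alone, unrelated functions that happen to be called `trim` or `mel` would also be reported, so the output is a list of candidates to review rather than a list of confirmed errors.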

Environment

  • Driver Version: 535.129.03
  • NVIDIA GeForce RTX 3080
  • github cloned over docker image nvidia/cuda:12.1.0-devel-ubuntu22.04

xvdp, Jan 16 '24 16:01