wavenet_vocoder
Input data and targets while training WaveNet using MOL
Hi, I'm trying to implement TTS from scratch, but while training the WaveNet with MOL the loss shoots up to 500K. I think there is an issue with the input and target data I'm sending into the network. What should the input to the WaveNet be when using MOL? Should it be the raw waveform scaled to [-1, 1], or should it be quantized values ranging from 0-256 which are then normalized to [-1, 1]? Right now I am training the WaveNet using the raw waveform shifted by 1 and scaled between -1 and 1, and the target is also a raw waveform.
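To make the two candidate setups concrete, here is a rough sketch of how I understand each one. The function names `raw_pairs` and `quantized_pairs` and the sine test signal are made up for illustration and are not taken from this repository. It assumes the waveform is already a float array in [-1, 1], and that 256 quantization levels are indexed 0 to 255.

```python
import numpy as np


def raw_pairs(wav):
    """Option A: float waveform in [-1, 1], input and target offset by one sample."""
    wav = np.clip(np.asarray(wav, dtype=np.float32), -1.0, 1.0)
    # The input at step t is sample t, the target is sample t + 1.
    return wav[:-1], wav[1:]


def quantized_pairs(wav, levels=256):
    """Option B: quantize to integer bins 0..levels-1, then map back to [-1, 1]."""
    wav = np.clip(np.asarray(wav, dtype=np.float32), -1.0, 1.0)
    # Linear quantization (no mu-law companding in this sketch).
    bins = np.round((wav + 1.0) / 2.0 * (levels - 1)).astype(np.int64)
    requant = (2.0 * bins / (levels - 1) - 1.0).astype(np.float32)
    return requant[:-1], requant[1:]


# Hypothetical test signal: one second of a 220 Hz sine sampled at 16 kHz.
t = np.arange(16000) / 16000.0
wav = 0.5 * np.sin(2 * np.pi * 220.0 * t)

x_raw, y_raw = raw_pairs(wav)
x_q, y_q = quantized_pairs(wav)
```

In both cases the input and target are the same signal offset by one sample. The only difference is whether the values are continuous or snapped to one of 256 levels before being rescaled.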