ramyoogesh
Hi, I trained multi-speaker FastPitch for 2200 epochs (2 speakers, one male and one female, with 15,000 sentences each, 50 hours of voice data in total). But the output...
Thank you! I'll check it out. Just to clarify, did you use an external speaker embedding because the output from FastPitch was **purely noise**, or did you use it to...
Thanks @alancucki & @adrianastan. I retrained FastPitch (2 speakers: 13,000 sentences per speaker, 4500 epochs) and the output is still purely noise [**Debug**: I tried loading different checkpoints and it...
- @adrianastan It's just pure noise. Yes, the symbol list is the same during inference and training. Yes, the transcriptions are aligned as well.
- What audio pre-processing parameters did you use?...
Hi @adrianastan, thank you. I trained multi-speaker FastPitch with the same dataset (the issue was related to pre-processing: downsampling with ffmpeg rather than librosa solved it). Also when...
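
For anyone hitting the same noise-only output, here is a minimal sketch of the ffmpeg-based downsampling described above. The thread does not state the exact command or parameters, so everything here is an assumption: the 22050 Hz target rate, the mono downmix, the directory layout, and the helper names `build_ffmpeg_resample_cmd` and `resample_dir` are all hypothetical. It also assumes `ffmpeg` is installed and on the `PATH`.

```python
import subprocess
from pathlib import Path

# Assumption: 22050 Hz is the rate the training config expects.
# Check your own config; the thread does not state the value used.
TARGET_SR = 22050


def build_ffmpeg_resample_cmd(src, dst, sample_rate=TARGET_SR):
    """Build the ffmpeg argument list that resamples one file to mono."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(src),            # input file
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", "1",                # downmix to a single channel
        str(dst),
    ]


def resample_dir(in_dir, out_dir, sample_rate=TARGET_SR):
    """Resample every .wav in in_dir into out_dir; return the file count."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    count = 0
    for wav in sorted(Path(in_dir).glob("*.wav")):
        cmd = build_ffmpeg_resample_cmd(wav, out_path / wav.name, sample_rate)
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
        count += 1
    return count
```

Calling `resample_dir("wavs_raw", "wavs_22k")` (hypothetical directory names) would write the converted files to a separate folder, so the originals stay untouched and the two pre-processing paths can be compared.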