Rishikesh (ऋषिकेश)
The speed of a GPU mostly depends on program optimisation, GPU architecture, memory clock, type of memory (not memory size), memory bandwidth, PCIe bandwidth, number of CUDA cores for parallel...
Hi @keonlee9420, DelightfulTTS is similar to [Phone Level Mixture Density Network](https://github.com/rishikksh20/Phone-Level-Mixture-Density-Network-for-TTS), but here, instead of using a complicated GMM-based model, the authors directly used a latent representation for the Prosody Predictor and...
DelightfulTTS learns phoneme-level prosody implicitly, whereas `Emphasis control for parallel neural TTS` learns the same explicitly by extracting features with this [repo](https://github.com/asuni/wavelet_prosody_toolkit).
I think DelightfulTTS is an all-in-one solution: it uses a non-autoregressive architecture with conformer blocks, as well as both an utterance-level and a phoneme-level predictor.
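For intuition, a phoneme-level prosody predictor in this family of models can be sketched roughly as below. This is a minimal PyTorch sketch under my own assumptions (layer sizes, kernel width, and the class/parameter names are illustrative, not the paper's exact configuration): it maps encoder outputs to a per-phoneme latent prosody vector.

```python
# Hypothetical sketch of a phoneme-level prosody predictor, assuming the
# common conv-stack-plus-projection design; all hyperparameters here are
# illustrative choices, not taken from the DelightfulTTS paper.
import torch
import torch.nn as nn


class PhonemeProsodyPredictor(nn.Module):
    """Predicts one latent prosody vector per phoneme position."""

    def __init__(self, d_model: int = 256, d_latent: int = 16, kernel: int = 5):
        super().__init__()
        pad = kernel // 2
        self.conv1 = nn.Conv1d(d_model, d_model, kernel, padding=pad)
        self.conv2 = nn.Conv1d(d_model, d_model, kernel, padding=pad)
        self.proj = nn.Linear(d_model, d_latent)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, phoneme_len, d_model) text-encoder output
        h = x.transpose(1, 2)                 # (batch, d_model, phoneme_len)
        h = torch.relu(self.conv1(h))
        h = torch.relu(self.conv2(h))
        return self.proj(h.transpose(1, 2))   # (batch, phoneme_len, d_latent)


predictor = PhonemeProsodyPredictor()
latents = predictor(torch.randn(2, 7, 256))   # shape (2, 7, 16)
```

An utterance-level predictor would look similar but pool over the time axis to produce a single vector per utterance.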
@keonlee9420 Hi, were you able to train DelightfulTTS successfully?
Did you train the predictor and extractor simultaneously, or did you train the extractor for 100k steps first, then pause it and start predictor training with teacher forcing, as mentioned in AdaSpeech...
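The two-stage schedule asked about above can be sketched as a simple step-dependent switch. This is only a hedged illustration of the schedule described in the comment (the function and flag names are my own); real training loops would gate the losses and optimizer parameter groups accordingly.

```python
# Illustrative sketch of the AdaSpeech-style two-stage schedule: the
# extractor trains alone for the first 100k steps, after which it is
# paused and the predictor trains against the extractor's latents
# (teacher forcing). Names and structure are assumptions.

EXTRACTOR_WARMUP_STEPS = 100_000  # stage boundary mentioned in the comment


def training_stage(step: int) -> dict:
    """Return which components receive gradients at a given global step."""
    if step < EXTRACTOR_WARMUP_STEPS:
        # Stage 1: only the extractor learns; the predictor is idle.
        return {"train_extractor": True,
                "train_predictor": False,
                "teacher_forcing": False}
    # Stage 2: extractor paused; the predictor learns to match the
    # extractor's detached latents, i.e. teacher forcing.
    return {"train_extractor": False,
            "train_predictor": True,
            "teacher_forcing": True}


print(training_stage(0))        # stage 1: extractor only
print(training_stage(100_000))  # stage 2: predictor with teacher forcing
```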
Because in my case I made some modifications to the architecture: I used the same extractors as mentioned in the DelightfulTTS paper, but I am not using any predictor at the utterance level...
I suggest option 1.
@keonlee9420 In your experience, which performs better when you have only 20 hours of speech data: a normal Transformer encoder or a Conformer?
As per this [article](https://www.microsoft.com/en-us/research/blog/azure-ai-milestone-new-neural-text-to-speech-models-more-closely-mirror-natural-speech/), the Microsoft TTS API is built on DelightfulTTS.