Shivam Mehta

47 comments by Shivam Mehta

Is your LLM frozen or are you training any aspect of it?

I think since your input is not text but already some representation that should be able to capture hidden nuances of phonemization, it should be fine. It is definitely an...

Then, I would have to believe that the hidden representations might not capture what is required to synthesise speech. I am not sure what would be an easy fix to...

We used an experimental setup very similar to Grad-TTS's, including their optimizer settings. However, I encourage you to try some decay and experiment with it.
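As a rough illustration of what "trying some decay" could look like in PyTorch, here is a minimal sketch. The comment does not say which kind of decay is meant, so this shows both weight decay on the optimizer and an exponential learning-rate schedule. The function name `build_optimizer` and every numeric value are assumptions chosen for illustration, not settings taken from Grad-TTS or from this repository.

```python
import torch


def build_optimizer(model, lr=1e-4, weight_decay=1e-6, lr_gamma=0.999):
    """Hypothetical helper: Adam with weight decay plus exponential LR decay.

    All default values are placeholders to experiment with, not recommended settings.
    """
    optimizer = torch.optim.Adam(
        model.parameters(), lr=lr, weight_decay=weight_decay
    )
    # Multiplies the learning rate by lr_gamma each time scheduler.step() is called.
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=lr_gamma)
    return optimizer, scheduler


# Toy model and loop standing in for the real training code.
model = torch.nn.Linear(4, 2)
optimizer, scheduler = build_optimizer(model)

for epoch in range(3):
    loss = model(torch.randn(8, 4)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch
```

Setting `weight_decay=0.0` or `lr_gamma=1.0` disables the corresponding decay, which makes it easy to compare against a no-decay baseline.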

It logs the mel spectrogram; however, if you need the entire waveform, you will need to load a vocoder or run Griffin-Lim during training in order to log it. You...

Hopefully this is resolved; I am closing the issue for now. Feel free to reopen it in case of any further follow-up.

That is a good suggestion, and I will implement it. However, due to other commitments right now, I am having difficulty actively maintaining the package, but I accept PRs. Would love...