Josh comments

Results: 7 comments of Josh

Rendering fails with no errors

I know this is about a month late, but it's failing because `-` is an invalid character for a variable in this Mustache library. I can't find any indication of...

A possible approach to pronunciation customization

A couple notes on this based on my own experiences/preferences that you can take or leave:
- This is indeed an incredibly helpful addition to a TTS system, IMO necessary...

Some observations after 690k steps

I noticed that `data_utils_old` includes a speaker ID along with the audio data. I don't see any usage of speaker ID in the version of `train` that got committed; was...

Some observations after 690k steps

> As can be seen the model has nowhere to consume speaker id (it does not have an embedding table), it's pointless to pass a speaker id.

Yeah, that makes...

Some observations after 690k steps

The first thing I tried was finetuning the model for a couple thousand steps with some different data. I got reasonable results, but nothing groundbreaking. Challenging cases like the one...

Some observations after 690k steps

> accent is part of content.

I can tell that's the case here, but it strikes me as a strange definition of the word "content". Content should be _what_ is...

Some observations after 690k steps

If your dataset is sufficiently diverse, I think that kind of pitch issue is inevitable with FreeVC—the pretrained version might seem better because VCTK is less diverse than your data....