Aaron (Yinghao) Li

110 comments by Aaron (Yinghao) Li

Unfortunately, it seems to be a bug. I took the data directly from the VITS repo (https://github.com/jaywalnut310/vits/blob/main/filelists/ljs_audio_text_test_filelist.txt.cleaned) without scrutinizing it. @Kreevoz I guess you are correct😂. I just tested it and the...

Maybe I'll redo the preprocessing of the LJSpeech dataset and train a new model with the corrected data file when I get time.

@Kreevoz I found another problem. The quotes in the LibriTTS dataset are actually written as `"content"`, not ` ``content'' `: https://raw.githubusercontent.com/yl4579/StyleTTS2/main/Data/OOD_texts.txt, so the inference code for sentences with quotes is also wrong.
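
To illustrate the mismatch, here is a minimal sketch of mapping the ` ``content'' ` quote style to plain `"content"` before inference. The helper name `normalize_quotes` is hypothetical and is not taken from the StyleTTS2 code. It assumes that two consecutive backticks or two consecutive apostrophes only ever appear as quote marks in the input.

```python
def normalize_quotes(text: str) -> str:
    """Map ``content'' style quotes to plain "content" quotes.

    Hypothetical helper for illustration only; assumes that doubled
    backticks and doubled apostrophes never occur for other reasons.
    """
    # Opening quotes: two backticks become one straight double quote.
    text = text.replace("``", '"')
    # Closing quotes: two apostrophes become one straight double quote.
    text = text.replace("''", '"')
    return text


if __name__ == "__main__":
    print(normalize_quotes("He said ``hello'' to her."))
```

Text that already uses straight double quotes passes through unchanged, so the helper can be applied regardless of which style the input uses.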

The online API is not working right now. If it’s different though, since I’m running inference on A40, how do I get it working in the same way as the...

I just checked the output, and I'm pretty sure the default model produces output very similar to `13B (Beta)` in the Hugging Face space (though it is down now). How do I get...

Now I have confirmed that they give similar responses, but the responses differ from those I got a month ago (around early November). Did you change the model for your...

In your experience, which one is better? I changed to `eval_mdl_path = '../../pretrained_mdls/ltu_ori_paper.bin'` but got the following error: ``` RuntimeError Traceback (most recent call last) Cell In[3], line 50 47...

Thanks for your question. This was intentional. The masked indices are used for loss calculation here: https://github.com/yl4579/PL-BERT/blob/main/train.ipynb (see the `if len(_masked_indices) > 0:` line), so the masked token also includes...

@tekinek The token separator doesn't need to be predicted because it has a one-to-one correspondence between the grapheme and phoneme (i.e., the space token in the phoneme domain always corresponds...