picheny-nyu comments

24 comments by picheny-nyu

Tutorial Example seems to Fail?

Based on what I see in the files it looks like it only ran one iteration and then stopped. It only ran for a few seconds and printed nothing to...

Tutorial Example seems to Fail?

I assume the flags go in the flags file? It claims it does not know the flag "logtostderr" when I place it on the command line. Setting iters=100000 makes it do...

Use of Multi-speaker TTS system based on x-vector?

Thanks. Could you clarify the following? Some of the TTS pretrained multi speaker models in the model zoo use xvectors. 1. Do they also require kaldi to run? If so,...

Use of Multi-speaker TTS system based on x-vector?

The multi speaker TTS model (trained on libritts) in synth_wav.sh above is very slow for waveform generation with the default wavenet model. Do you have a pretrained multispeaker model that...

Use of Multi-speaker TTS system based on x-vector?

Can I assume the xvector implementation you used for all models is the one above (http://kaldi-asr.org/models/8/0008_sitw_v2_1a.tar.gz)? Thanks, Michael

Use of Multi-speaker TTS system based on x-vector?

I found that the different multi-speaker TTS models behave very differently, but did not do a comprehensive study. There are many of them: vctk_gst_tacotron2, vctk_gst_transformer, vctk_xvector_tacotron2, vctk_xvector_transformer, vctk_xvector_conformer_fastspeech2, vctk_gst+xvector_tacotron2, vctk_gst+xvector_transformer, vctk_gst+xvector_conformer_fastspeech2...

RNN-T Decoding - Large Number of Deletions Compared to Transformer/Conformer
