picheny-nyu

Results: 24 comments by picheny-nyu

Based on what I see in the files it looks like it only ran one iteration and then stopped. It only ran for a few seconds and printed nothing to...

I assume the flags go in the flags file? It claims it does not know the flag "logtostderr" when I place it on the command line. Setting iters=100000 makes it do...

Thanks. Could you clarify the following? Some of the TTS pretrained multi speaker models in the model zoo use xvectors. 1. Do they also require kaldi to run? If so,...

The multi-speaker TTS model (trained on LibriTTS) in synth_wav.sh above is very slow for waveform generation with the default WaveNet model. Do you have a pretrained multi-speaker model that...

Can I assume the xvector implementation you used for all models is the one above (http://kaldi-asr.org/models/8/0008_sitw_v2_1a.tar.gz)? Thanks, Michael

I found that the different multi-speaker TTS models behave very differently, but I did not do a comprehensive study. There are many of them: vctk_gst_tacotron2, vctk_gst_transformer, vctk_xvector_tacotron2, vctk_xvector_transformer, vctk_xvector_conformer_fastspeech2, vctk_gst+xvector_tacotron2, vctk_gst+xvector_transformer, vctk_gst+xvector_conformer_fastspeech2...

- No, I trained all of the models from scratch. I did not use any pre-trained models.
- Transformer was from the AMI egs2 recipe; conformer I copied from librispeech....

Thanks. Here is a good reference on MALACH https://www.isca-speech.org/archive_v0/Interspeech_2019/pdfs/1907.pdf

I tried decoding with a beam size of 60. It increased the deletion rate to 30%. Happy to try other methods but I need some parameter recommendations. Also, is there...

Sorry for the long delay in responding. It looks like transformer LMs are not supported for RNNTs:
```
Traceback (most recent call last):
  File "/ext3/miniconda3/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return...
```