Rafael Valle

133 comments by Rafael Valle

@kurbob does the attention map always look like that? You might have to change from byte to bool https://github.com/NVIDIA/flowtron/blob/master/flowtron.py#L33
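A minimal sketch of what the byte-to-bool change could look like, assuming the linked line builds a padding mask from sequence lengths. The function name and body below are illustrative assumptions, not copied from the repository. Recent PyTorch versions expect `torch.bool` masks, and on a `uint8` mask the `~` operator is a bitwise NOT rather than a logical one, which can silently corrupt masking.

```python
import torch

def get_mask_from_lengths(lengths: torch.Tensor) -> torch.Tensor:
    """Return a (batch, max_len) bool mask that is True inside each sequence.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    max_len = int(lengths.max().item())
    ids = torch.arange(max_len, device=lengths.device)
    # The comparison already yields torch.bool; .bool() replaces an older .byte() cast.
    return (ids.unsqueeze(0) < lengths.unsqueeze(1)).bool()

lengths = torch.tensor([3, 1])
mask = get_mask_from_lengths(lengths)

# Mask out padded positions in a dummy score matrix.
scores = torch.zeros(2, 3)
scores.masked_fill_(~mask, float("-inf"))
```

With a bool mask, `~mask` selects exactly the padded positions, so only those scores are set to negative infinity.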

@adrianastan There should be **no silence** at the beginning or at the end of an audio file.
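One way to check for and remove edge silence is a simple amplitude threshold, sketched below. This is an assumption about preprocessing, not the method used in the repository; the function name and the threshold value are hypothetical, and real pipelines usually trim on frame energy rather than on single samples.

```python
from typing import List, Sequence

def trim_silence(samples: Sequence[float], threshold: float = 0.01) -> List[float]:
    """Drop leading and trailing samples whose absolute amplitude is below threshold.

    Hypothetical helper; the threshold is an arbitrary illustrative value.
    """
    start = 0
    end = len(samples)
    # Advance past quiet samples at the beginning.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Step back over quiet samples at the end.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return list(samples[start:end])

audio = [0.0, 0.001, 0.2, -0.3, 0.0, 0.4, 0.002, 0.0]
trimmed = trim_silence(audio)
```

Only the edges are trimmed; quiet samples between loud ones are kept, so pauses inside the utterance survive.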

@Liujingxiu23 please share the training and validation losses and the attention plots for both the model with 1 step of flow and the model with 2 steps of flow.

Did you warm up the 2-flow model using a 1-flow model checkpoint from around 200k?

Yes! It's worth trying and can lead to lower validation losses. Let us know what your findings are.

The two samples are different, hence the alignment plots are different.

Take a look at the keys for `flowtron_libritts2p3k`. I think we saved the whole model instead of just the state dict. The code below should solve your issue. ``` ck =...

Yes, it is possible to get decent results with the amount of data you have. The closer your speaker is to existing speakers in flowtron_libritts2k, the better it will sound....

@shehrum you need to first train with the attention prior enabled and then disable it and resume training once the attention looks good.
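The two-phase schedule described here can be sketched as two configurations that differ only in the attention-prior flag and the checkpoint to resume from. All key names (`data_config`, `use_attn_prior`, `train_config`, `checkpoint_path`) and the checkpoint path are assumptions for illustration; check the project's own config file for the actual names.

```python
import copy

# Hypothetical base configuration; key names are assumptions, not the real schema.
base_config = {
    "data_config": {"use_attn_prior": False},
    "train_config": {"checkpoint_path": ""},
}

def phase_config(base: dict, use_attn_prior: bool, checkpoint_path: str = "") -> dict:
    """Return a copy of the base config for one training phase."""
    cfg = copy.deepcopy(base)
    cfg["data_config"]["use_attn_prior"] = use_attn_prior
    cfg["train_config"]["checkpoint_path"] = checkpoint_path
    return cfg

# Phase 1: train with the attention prior enabled until the attention looks good.
phase1 = phase_config(base_config, use_attn_prior=True)

# Phase 2: disable the prior and resume from the phase 1 checkpoint (hypothetical path).
phase2 = phase_config(base_config, use_attn_prior=False, checkpoint_path="outdir/model_phase1")
```

The copy keeps the base configuration unchanged, so both phases can be derived from the same starting point.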

We provide a checkpoint for libritts with over 2k speakers. Set the attention prior to True before training. After training for some time, set it to False once the model has...