Christian Schäfer
Hi, unfortunately we currently don't have the (time) resources to write papers. The closest would be FastSpeech, which the current implementation is based on: https://arxiv.org/abs/1905.09263 We did some non-scientific...
We will hopefully do a comparison soon. The latencies in the FastSpeech paper are measured with a batch size of 1, which is an unrealistic setting for production systems, so...
Hi, for torchscript it is necessary to make some changes to the model. I will investigate this if I have some spare time.
Hi, I tested some fine-tuning using a multispeaker model. Honestly, results were a bit mixed; I would mostly recommend just training on a single corpus for best quality. If data...
Hi, the current state is that I implemented a jit-compatible model here: https://github.com/as-ideas/ForwardTacotron/blob/experiments/enable_jit/models/forward_tacotron.py It will take a bit of experimenting though before I decide to merge it into master, feel free...
Hi, good news, the jit export is implemented now: https://github.com/as-ideas/ForwardTacotron#export-model-with-torchscript
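For readers unfamiliar with the TorchScript workflow referenced above, here is a minimal, generic sketch of scripting and reloading a PyTorch module. Note this uses a hypothetical stand-in module (`ToyModel`), not the actual ForwardTacotron model or its export script — see the linked README section for the real command:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a jit-compatible model (NOT ForwardTacotron itself).
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)

model = ToyModel().eval()

# Compile the model to TorchScript and serialize it for deployment.
scripted = torch.jit.script(model)
scripted.save('model_scripted.pt')

# The saved archive can be loaded without the Python class definition,
# e.g. from C++ via libtorch or from another Python process.
loaded = torch.jit.load('model_scripted.pt')
out = loaded(torch.randn(1, 4))
print(out.shape)
```

The practical benefit is that the scripted archive is self-contained: inference no longer needs the original model source code, which is what makes production deployment easier.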
Hi, you could use M-AILABS https://www.caito.de/2019/01/the-m-ailabs-speech-dataset/ for a male speaker. The format is pretty similar. Multispeaker is only possible with the multispeaker branch. Other languages are not a problem, check...
Good luck! I know that the branch is pretty far behind, but it should give an idea of how to do it. Unfortunately I don't have time currently to work on multispeaker.
Hi, I am currently experimenting with it. So far, it seems to improve the pauses predicted by the model, which are often missing with the standard implementation. Also, it is...
Hi, interesting idea - would this be applicable to mel spectra? As far as I understand, it's more of a metric to compare the final audio wav files, probably more...