vall-e issues

After 100 epochs training, the model can synthesize natural speech on LibriTTS

68

I trained vall-e on LibriTTS for about 100 epochs (almost 4 days on 8 A100 GPUs) and obtained plausible synthesized audio. Here is a demo. [1] prompt: [prompt_link](https://drive.google.com/file/d/149pHqb6TZzVwhF1vRN50H8A4AEYShpfp/view?usp=share_link)...

dohe0342

Could you please provide me with the specific parameter configurations in the command for training the LJSpeech dataset?

Could you please provide the specific parameter configuration of the command for training on the LJSpeech dataset? For example: `python3 bin/trainer.py --max-duration 80 --filter-min-duration 0.5 --filter-max-duration 14 --train-stage 1`...

mumuyeye

compute_and_store_features_batch OOM

When I tried to tokenize WenetSpeech, I got `RuntimeError: CUDA out of memory`. Is it possible to tokenize on the fly instead?

OswaldoBornemann
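
One general way to work around an out-of-memory error like the one reported above, independent of how this repository implements feature extraction, is to shrink the batch and retry when a batch fails. The sketch below illustrates that pattern in plain Python. `process_adaptively`, `encode_batch`, and `toy_encoder` are hypothetical names, and `MemoryError` stands in for the CUDA out-of-memory `RuntimeError`; none of this is taken from the vall-e code.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_adaptively(
    items: Sequence[T],
    encode_batch: Callable[[Sequence[T]], List[R]],
    batch_size: int,
) -> List[R]:
    """Encode items in batches, halving the batch size when a batch runs out of memory.

    MemoryError is a stand-in; with PyTorch one would catch the CUDA
    out-of-memory RuntimeError instead (assumption, not the repo's code).
    """
    results: List[R] = []
    start = 0
    while start < len(items):
        batch = items[start:start + batch_size]
        try:
            results.extend(encode_batch(batch))
        except MemoryError:
            if batch_size == 1:
                raise  # a single item does not fit; nothing left to shrink
            batch_size = max(1, batch_size // 2)
            continue  # retry from the same position with a smaller batch
        start += len(batch)
    return results


def toy_encoder(batch: Sequence[str]) -> List[int]:
    # Hypothetical encoder that can only hold two items at a time.
    if len(batch) > 2:
        raise MemoryError("batch too large")
    return [len(x) for x in batch]


if __name__ == "__main__":
    print(process_adaptively(["a", "bb", "ccc", "dddd", "eeeee"], toy_encoder, 8))
```

The batch size only ever shrinks, so one oversized batch early on slows down the rest of the run; whether that trade-off is acceptable depends on how uneven the utterance lengths are.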

Training on wenetspeech dataset

  File "/home/twlan/anaconda3/envs/valle/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/twlan/anaconda3/envs/valle/lib/python3.8/site-packages/encodec/modules/seanet.py", line 63, in forward
    return self.shortcut(x) + self.block(x)
  File "/home/twlan/anaconda3/envs/valle/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/twlan/anaconda3/envs/valle/lib/python3.8/site-packages/torch/nn/modules/module.py",...

codeking233

train loss of custom data

2

Hi, I tried to train on my dataset, but I seem to have an abnormal loss curve. Do you have any suggestions? Thanks.

The loss of AR: https://drive.google.com/file/d/1-gZJX-mwYZ-2vkKTl0dTwBcp1A8MHrmV/view?usp=drive_link ![image](https://github.com/lifeiteng/vall-e/assets/37279265/11e78b66-f71c-4039-b495-73b191fac760)...

Wangzhen-kris

Failed during inference [SyntaxError: well trained model shouldn't reach here.]

2

I get an error like this:
```
2023-10-19 10:10:09,510 INFO [infer.py:224] synthesize text: Selamat pagi
2023-10-19 10:10:09,513 WARNING [words_mismatch.py:88] words count mismatch on 500.0% of the lines (5/1)
2023-10-19 10:10:09,516...
```

kin0303

FYI: Our WeChat group (scan the QR code to join the next-generation Speech WeChat discussion group)

15

FYI: I have set up a WeChat group for discussing new speech technologies. Anyone interested can scan the following QR codes with the [WeChat app](https://www.wechat.com/) to join the group....

lifeiteng

Training on a custom dataset

3

Hello, I was reading the training instructions (and the prepare dataset scripts) and I don't understand how you'd create and use custom datasets with this model.

korakoe
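
As a rough illustration of what preparing a custom dataset usually involves, the sketch below pairs each audio file with its transcript and writes one JSON record per utterance. The directory layout (`<name>.wav` next to `<name>.txt`), the field names, and the `build_manifest` and `write_jsonl` helpers are all assumptions made for illustration; the repository's own dataset preparation scripts may expect a different format.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_manifest(data_dir: Path) -> List[Dict[str, str]]:
    """Collect one record per .wav file that has a matching .txt transcript."""
    records: List[Dict[str, str]] = []
    for wav in sorted(data_dir.glob("*.wav")):
        txt = wav.with_suffix(".txt")
        if not txt.exists():
            continue  # skip audio that has no transcript
        records.append({
            "id": wav.stem,
            "audio": str(wav),
            "text": txt.read_text(encoding="utf-8").strip(),
        })
    return records


def write_jsonl(records: List[Dict[str, str]], out_path: Path) -> None:
    """Write the records as JSON Lines, one utterance per line."""
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A manifest like this keeps the text and audio paths in one place, which makes it easier to convert to whatever format the training pipeline actually requires.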

how to train model with deepspeed

As the title says.

kingmpw2015

Training result

2

I'd like to ask about the training results. I combined the AISHELL3 and aidata datasets with another Chinese dataset, for a total of 600 hours of training data. Although the three audio files are not...

yiwei0730
