Yike Zhang

Results 8 comments of Yike Zhang

The code and training data are the same for both the single-host and multi-host training setups.

Here are some details. The training data is about 2000 hours. The hyper-parameters are: lr: 0.01, lr_batches: 5000, lr_epochs: 3.5. 1) On a single host with 8 GPUs, the losses are: epoch 1 cv_loss:...
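To make the three hyper-parameters above concrete, here is a minimal sketch of how they could shape the learning rate. It assumes they feed an Eden-style schedule like the one used in icefall's Zipformer recipes; the comment itself does not say which scheduler is used, so the formula and the function name `eden_style_lr` are assumptions. The warm-up term is left out on purpose.

```python
def eden_style_lr(base_lr, batch, epoch, lr_batches=5000.0, lr_epochs=3.5):
    """Decay-only part of an Eden-style schedule (assumed formula, no warm-up term).

    base_lr    -- initial learning rate (lr: 0.01 in the comment above)
    batch      -- number of optimizer steps taken so far
    epoch      -- number of epochs completed so far (may be fractional)
    """
    # Each factor equals 1.0 at the start and decays as the count grows
    # past its reference value (lr_batches or lr_epochs).
    batch_factor = ((batch**2 + lr_batches**2) / lr_batches**2) ** -0.25
    epoch_factor = ((epoch**2 + lr_epochs**2) / lr_epochs**2) ** -0.25
    return base_lr * batch_factor * epoch_factor


if __name__ == "__main__":
    # Illustrative (batch, epoch) pairs, not values taken from the training run.
    for batch, epoch in [(0, 0), (5000, 1), (20000, 4)]:
        print(batch, epoch, eden_style_lr(0.01, batch, epoch))
```

If this assumption holds, the decay depends on the number of optimizer steps. With the same per-GPU batch size, going from 8 to 16 GPUs roughly halves the steps per epoch, so the learning rate at a given epoch would be higher in the multi-host run.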

> losses on two hosts are: I use a hybrid CTC and AED architecture; the CTC weight is 0.2
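As an illustration of what a CTC weight of 0.2 could mean, the sketch below assumes the common interpolation convention for hybrid CTC/AED training, where the attention-decoder loss receives the remaining weight. The comment does not state the exact combination rule, so this formula and the name `hybrid_loss` are assumptions.

```python
def hybrid_loss(ctc_loss, aed_loss, ctc_weight=0.2):
    """Interpolate CTC and attention-decoder (AED) losses.

    Assumed convention: total = w * ctc + (1 - w) * aed.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * aed_loss


if __name__ == "__main__":
    # Made-up loss values, only to show the weighting.
    print(hybrid_loss(ctc_loss=10.0, aed_loss=5.0))
```

Under this convention the reported total is dominated by the AED term, which matters when comparing loss magnitudes against recipes that weight or normalize the terms differently.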

Thanks for your advice. I will first check whether the parameters on different hosts are the same, and also adjust the warm-up settings. In addition, I do not use fp16...
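One way to run the parameter check mentioned above is to hash each process's parameters and compare the digests. The sketch below is hypothetical and framework-free: parameters are plain lists of floats, and `param_digest` and `mismatched_ranks` are illustrative names, not functions from any training toolkit. In a real PyTorch job the digests would first have to be collected from all ranks, for example with `torch.distributed.all_gather_object`.

```python
import hashlib
import struct


def param_digest(params):
    """Return a SHA-256 hex digest over a dict of name -> list of floats."""
    h = hashlib.sha256()
    for name in sorted(params):  # fixed order so every rank hashes identically
        h.update(name.encode("utf-8"))
        for value in params[name]:
            h.update(struct.pack("<d", value))  # exact 64-bit representation
    return h.hexdigest()


def mismatched_ranks(digests):
    """Return the ranks whose digest differs from rank 0's digest."""
    reference = digests[0]
    return [rank for rank, d in enumerate(digests) if d != reference]


if __name__ == "__main__":
    # Toy parameters standing in for three processes; rank 2 has drifted.
    rank0 = {"encoder.weight": [0.1, 0.2], "encoder.bias": [0.0]}
    rank1 = {"encoder.weight": [0.1, 0.2], "encoder.bias": [0.0]}
    rank2 = {"encoder.weight": [0.1, 0.25], "encoder.bias": [0.0]}
    digests = [param_digest(p) for p in (rank0, rank1, rank2)]
    print(mismatched_ranks(digests))
```

Hashing the exact bit patterns flags any divergence at all, which suits synchronous data-parallel training where replicas are expected to stay identical after every step.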

> You must be normalizing the loss differently from us? Because normally our losses are around 0.1 or less. If it is failing to discover the alignment, it could be...

> Thank you for your suggestions. I found the reason why Zipformer could not converge when trained with 16 GPUs on my dataset. It is due to the model warmup setting. ```python...

I ran into the same problem: there is a high probability of an impulse noise at the very beginning of the audio.

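> Use cosyvoice-25hz-sft to use the pretrained voices. The CosyVoice1 model often produces a short "beep" noise at the very beginning, while with the CosyVoice2 model the proportion of such noise is noticeably lower. What optimization was made? Is it related to the flow model structure?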