
ExponentialLR and fine-tuning

Alexey322 opened this issue 4 years ago • 3 comments

Hi. I started training the model from scratch and noticed that the optimizer uses a decaying learning rate. By my calculations, after 2.5 million training steps the learning rate will have dropped to about 3e-7. Not only is such a low learning rate prone to floating point errors, it also makes it impossible to adapt other speakers from such a checkpoint, because the learning rate is too small. Does this mean it is better to set lr_decay = 1.0?
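For reference, a back-of-the-envelope check of that number. The values below follow what I understand to be the repo's config_v1.json defaults (learning_rate = 2e-4, lr_decay = 0.999) and assume the ExponentialLR scheduler is stepped once per epoch; the steps-per-epoch figure is a hypothetical placeholder that depends on dataset and batch size:

```python
# Rough estimate of the decayed learning rate after long training.
initial_lr = 2e-4        # learning_rate in config_v1.json (assumed default)
lr_decay = 0.999         # gamma of the ExponentialLR scheduler
steps_per_epoch = 400    # hypothetical; depends on dataset size and batch size
total_steps = 2_500_000

epochs = total_steps // steps_per_epoch
final_lr = initial_lr * lr_decay ** epochs
print(f"After ~{epochs} epochs the learning rate has decayed to ~{final_lr:.1e}")
```

With these placeholder numbers the script prints a rate on the order of a few times 1e-7, which is consistent with the estimate above.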

Alexey322 • Dec 26 '20 07:12

Hi. In our experiments, we observed that learning rate decay helps the quality improve stably. For transfer learning, I would recommend adjusting the learning rate that is loaded from the checkpoint.
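A minimal sketch of one way to "adjust the learning rate loaded from a checkpoint", assuming names and a checkpoint layout similar to the repo's train.py (the stand-in modules, the optim_g/optim_d identifiers, and the value of new_lr are illustrative assumptions, not quoted code):

```python
import torch

# Stand-ins for the generator/discriminator trained in train.py (assumed names).
generator = torch.nn.Linear(16, 16)
discriminator = torch.nn.Linear(16, 16)

lr_decay = 0.999   # lr_decay from the config (assumed default)
new_lr = 2e-5      # illustrative fine-tuning rate; tune for the target speaker

optim_g = torch.optim.AdamW(generator.parameters(), lr=2e-4, betas=(0.8, 0.99))
optim_d = torch.optim.AdamW(discriminator.parameters(), lr=2e-4, betas=(0.8, 0.99))

# In the real script the optimizer states would be restored from the checkpoint
# here, e.g. optim_g.load_state_dict(state_dict_do['optim_g']) (layout assumed).

# Overwrite the decayed learning rate carried inside the optimizer state.
for optimizer in (optim_g, optim_d):
    for param_group in optimizer.param_groups:
        param_group['lr'] = new_lr

# Rebuild the schedulers so the exponential decay restarts from new_lr.
scheduler_g = torch.optim.lr_scheduler.ExponentialLR(optim_g, gamma=lr_decay)
scheduler_d = torch.optim.lr_scheduler.ExponentialLR(optim_d, gamma=lr_decay)
```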

jik876 • Dec 26 '20 09:12

@jik876 I'm not sure that is the right thing to do. During training with a low learning rate, the optimizer searches for a minimum within a small neighborhood. If we then raise the learning rate from a low value to a higher one, the optimizer will start searching over a wider region, and much of what the model learned at the small learning rate may be forgotten. It seems to me that it would be more correct to train at a fixed learning rate, and then adapt speakers from that model with a decaying learning rate. A sketch of what that split could look like is below. Maybe I'm wrong. What do you think?
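Since gamma = 1.0 turns ExponentialLR into a constant schedule, the two phases described above could be expressed just by switching the decay factor between runs (a sketch with placeholder values, not code from the repo; in practice the two phases would be separate training runs):

```python
import torch

model = torch.nn.Linear(16, 16)  # stand-in for the generator
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4, betas=(0.8, 0.99))

# Phase 1, base training: lr_decay = 1.0 keeps the learning rate fixed at 2e-4.
base_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=1.0)

# Phase 2, speaker adaptation: lr_decay < 1.0 re-enables the exponential decay,
# starting from whatever rate the fine-tuning run is launched with.
adapt_scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999)
```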

Alexey322 • Dec 26 '20 09:12

If the purpose of training the source speaker is to build a basis for transfer learning to another speaker, then the method you mentioned may work well. But if you are going to use the source speaker model as it is, then based on our experiments, using learning rate decay leads to better quality. The choice of learning rate in transfer learning is also affected by the difference between the source speaker and the target speaker. If the speech characteristics of the two speakers are not very different, using a small learning rate for the training will yield quite good results, and vice versa.

jik876 • Jan 03 '21 05:01