CosineAnnealingLR
The params of the CosineAnnealingLR scheduler in valle_trainer.py seem different from the PyTorch docs.
code:
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    self.cfg.train.warmup_steps,
    self.optimizer,
    eta_min=self.cfg.train.base_lr,
)
PyTorch 2.0 docs:
torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1, verbose=False)
Should T_max be set to warmup_steps, or does it need some other special setting?
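For reference, a call following the documented argument order would look roughly like this (a sketch only; whether T_max should be warmup_steps or the total number of training steps is exactly my question):

scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    self.optimizer,                        # optimizer comes first per the docs
    T_max=self.cfg.train.warmup_steps,     # or should this be the total step count?
    eta_min=self.cfg.train.base_lr,        # eta_min is the minimum learning rate
)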
Hi, we updated a PR to fix the problem. You can check it! (we use: from diffusers.optimization import get_cosine_schedule_with_warmup)
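A minimal sketch of how the replacement call maps onto the values from the snippet above (the total-step config key here is a hypothetical name, not necessarily what the PR uses):

from diffusers.optimization import get_cosine_schedule_with_warmup

scheduler = get_cosine_schedule_with_warmup(
    self.optimizer,
    num_warmup_steps=self.cfg.train.warmup_steps,
    num_training_steps=self.cfg.train.total_training_steps,  # hypothetical key
)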
Thanks for the reply. I'm using the NoamScheduler with a base_lr of 0.05 and 2,000 warmup steps to train VALL-E. Have you tested the differences between NoamScheduler and cosine_schedule_with_warmup? If so, could you share which is better and what the best parameters are? Thanks again, and looking forward to your reply.
@HeCheng0625 Please follow up on this issue.
Hi, we haven't tested NoamScheduler. I think using AdamW with a learning rate between 5e-5 and 1e-4 and a cosine schedule with between 5K and 10K warmup steps will give a more stable training process.
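A self-contained sketch of that recommendation (AdamW, lr 1e-4, 5K warmup steps; the model and the total step count below are only placeholders):

import torch
from diffusers.optimization import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)            # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=5_000,              # 5K-10K suggested above
    num_training_steps=200_000,          # placeholder total step count
)

for step in range(200_000):
    ...                                  # forward / backward
    optimizer.step()
    scheduler.step()                     # step-based scheduler: call once per optimizer step
    optimizer.zero_grad()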
Thanks for sharing.