
about lr

Open tanakataiki opened this issue 6 years ago • 1 comment

Thanks for a good optimizer. According to the usage example, `optm = AdaBound(lr=1e-03, final_lr=0.1, gamma=1e-03, weight_decay=0., amsbound=False)`. Does the learning rate gradually increase with the number of steps?


`final_lr` is described as "Final learning rate", but is it actually a learning rate relative to the base lr and the current learning rate? https://github.com/titu1994/keras-adabound/blob/5ce819b6ca1cd95e32d62e268bd2e0c99c069fe8/adabound.py#L72
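For context, the line linked above rescales `final_lr` by the ratio of the current lr to the base lr, so it behaves as a relative target rather than an absolute value. A minimal sketch of that rescaling (a hedged reconstruction, not the exact library code; the function name `effective_final_lr` is made up for illustration):

```python
def effective_final_lr(final_lr, current_lr, base_lr):
    """Sketch of the rescaling at adabound.py#L72: final_lr tracks
    the ratio of the current lr to the base lr, so if an external
    schedule halves the lr, the effective final lr halves too."""
    return final_lr * current_lr / base_lr

# With the defaults final_lr=0.1 and base lr 1e-3, halving the
# current lr to 5e-4 halves the effective final lr as well:
print(effective_final_lr(0.1, 5e-4, 1e-3))  # -> 0.05
```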

tanakataiki avatar Mar 19 '19 04:03 tanakataiki

The final lr is approximately reached after 1/gamma update steps have occurred. At that point, the clipping bounds are fairly tight and force the actual lr close to the final lr after clipping.

In the initial updates, though, the lr bounds are loose around the initial lr, which allows Adam-style updates.
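The bound schedule described above can be sketched as follows (a simplified illustration following the AdaBound paper's bound functions, not the exact Keras implementation; the function name is made up for illustration):

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Lower/upper clipping bounds on the per-step learning rate.

    Early on the interval is very wide (near-unconstrained, Adam-like
    updates); after roughly 1/gamma steps both bounds converge toward
    final_lr, so updates behave like SGD with lr = final_lr.
    """
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return lower, upper

print(adabound_bounds(1))       # very wide interval: Adam-like regime
print(adabound_bounds(100000))  # tight around final_lr: SGD-like regime
```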

This means that if you use this optimizer on a dataset or task that SGD doesn't handle well (but Adam does), then this optimizer will get worse results than Adam alone. At least that's what I've experienced on language modelling tasks.

titu1994 avatar Mar 19 '19 05:03 titu1994