keras-contrib
Add LearningRateMultiplier wrapper for optimizers
Summary
Optimizers have a single, model-global learning rate. This PR adds a wrapper that can be used with existing optimizers to specify a different learning rate per layer in a network. The per-layer learning rate is given as a factor that is multiplied with the learning rate of the wrapped optimizer. The wrapper can be used in the following way:
from keras.optimizers import SGD
# import path assumes the wrapper is exposed via keras_contrib.optimizers
from keras_contrib.optimizers import LearningRateMultiplier

multipliers = {'dense_1': 0.5, 'dense_2': 0.4}
opt = LearningRateMultiplier(SGD, lr_multipliers=multipliers, lr=0.001, momentum=0.9)
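The wrapped optimizer behaves like any other Keras optimizer and can then be passed to model.compile(optimizer=opt, ...).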
The example wraps SGD and specifies lr and momentum for it. The layer whose name contains the string 'dense_1' gets a multiplier of 0.5, and the layer whose name contains the string 'dense_2' gets a multiplier of 0.4.
Different multipliers for kernel and bias can be specified with:
multipliers = {'dense_1/kernel': 0.5, 'dense_1/bias': 0.1}
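For intuition, here is a minimal sketch of how such a lookup might resolve a weight name to its factor. This is illustrative only, not the wrapper's actual code; get_multiplier and the longest-match rule are assumptions:

def get_multiplier(weight_name, multipliers, default=1.0):
    # Collect every key that occurs in the weight name ...
    matches = [key for key in multipliers if key in weight_name]
    if not matches:
        return default
    # ... and prefer the most specific one, e.g. 'dense_1/kernel' over 'dense_1'.
    return multipliers[max(matches, key=len)]

multipliers = {'dense_1': 0.5, 'dense_1/kernel': 0.2}
get_multiplier('dense_1/kernel', multipliers)  # -> 0.2
get_multiplier('dense_1/bias', multipliers)    # -> 0.5
get_multiplier('dense_2/kernel', multipliers)  # -> 1.0 (default)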
Related Issues
There are issues regarding this topic in Keras: https://github.com/keras-team/keras/issues/11934, https://github.com/keras-team/keras/issues/7912, and partially https://github.com/keras-team/keras/issues/5920.
It seems there are some PEP 8 errors and that the code isn't compatible with Python 2 because of super(). In Python 2, super() takes two arguments, usually the class and self.
You can find out more about the errors by looking at the Travis logs.
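For illustration, a generic example of the incompatibility (not the PR's code):

class Base(object):
    def value(self):
        return 1

class Child(Base):
    def value(self):
        # Python 3 only; raises TypeError on Python 2:
        # return super().value()
        # Works on both Python 2 and Python 3:
        return super(Child, self).value()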
Will there be updates on this? If not, can I make a new PR that adds this class to keras-contrib? @gabrieldemarmiesse @stante, it would enable DiscriminativeLearningRate in general, not only a learning rate multiplier.
I propose three settings: automatic learning rate decay (cosine) from the base learning rate of the wrapped optimizer by layer; the same decay by convolutional blocks/groups; and this learning rate multiplier. A sketch of the per-layer variant follows.
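To make the first setting concrete, here is a minimal sketch that derives per-layer multipliers from layer depth with a cosine curve; the function name, layer names, and the 0.1 floor are all assumptions, not an agreed API. The result could feed straight into LearningRateMultiplier:

import math

def cosine_depth_multipliers(layer_names, floor=0.1):
    """Factors decaying from `floor` (first layer) to 1.0 (last layer)
    along a cosine curve, so early layers train more slowly."""
    n = len(layer_names)
    factors = {}
    for i, name in enumerate(layer_names):
        t = i / (n - 1) if n > 1 else 1.0  # 0.0 at first layer, 1.0 at last
        factors[name] = floor + (1.0 - floor) * 0.5 * (1.0 - math.cos(math.pi * t))
    return factors

multipliers = cosine_depth_multipliers(['conv_1', 'conv_2', 'dense_1', 'dense_2'])
# {'conv_1': 0.1, 'conv_2': 0.325, 'dense_1': 0.775, 'dense_2': 1.0}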
Keras-contrib is currently deprecated. Please redirect the PRs to tensorflow/addons. It would be really nice if you could add that @Dicksonchin93, a lot of people are asking for this feature :)
@gabrieldemarmiesse Is there a reason why we shouldn't add this into Keras directly?
This was proposed a while back and rejected. The reason is that not enough people use it to justify an API change of Keras. It's also not clear that it's a best practice. Tensorflow addons was made exactly for this kind of feature.