Removing redundant learning_rate(s) parameter
Discussion related to #508
Right now there are both a learning_rate and a learning_rates parameter, which often confuses users.
Here I am not sure what the better solution would be (which one to keep), but the current situation is not good.
They are not exactly redundant or the same. learning_rate defines the initial and default learning rate. learning_rates sets predefined learning rates for some specified epochs.
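For illustration, a config excerpt using both could look like this (the values are made up, not recommendations):

```python
# Illustrative config excerpt (values are made up):
learning_rate = 0.001  # initial/default LR

# Predefined LRs for specific epochs, e.g. a short warmup:
learning_rates = {1: 0.0001, 2: 0.0003, 3: 0.001}
```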
Can the different behaviours be merged into a single variable? Maybe something like:
learning_rates = {"initial": FLOAT, "default": FLOAT, "predefined": LIST}
Currently learning_rates is either a list of LRs or a dict epoch -> LR (so a sparse list of LRs).
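So, as a small sketch, the two existing forms look like this (epoch numbers and values are made up):

```python
# Form 1: a list, one LR per epoch, starting at epoch 1:
learning_rates = [0.0001, 0.0005, 0.001]

# Form 2: a dict epoch -> LR, i.e. a sparse list of LRs:
learning_rates = {1: 0.0001, 10: 0.001}
```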
I'm not sure if extending/changing learning_rates to also cover the initial/default LR is really in any way more helpful (or easier) for the user than just specifying it separately. I personally think learning_rate as a separate option is cleaner.
I think @JackTemaki is right in the case where learning_rates also specifies the LR of epoch 1. However, I don't see this as too much of a problem. And this also does not necessarily need to be the case.
Maybe there can be an extra check (but independent from the new behavior) whether learning_rate has no effect and is different from max(learning_rates.values()) or so, and then just print a warning.
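Such a warning could look roughly like this (the exact condition for "has no effect" is an assumption here, not actual RETURNN logic):

```python
# Sketch of the suggested warning (assumed semantics, not actual RETURNN code).
# Assume learning_rate (float) and learning_rates (dict epoch -> LR) come from the config:
learning_rate = 0.001
learning_rates = {1: 0.0001, 2: 0.0003}

if isinstance(learning_rates, dict) and 1 in learning_rates:
    # Presumably learning_rate never takes effect when epoch 1 is covered.
    if learning_rate != max(learning_rates.values()):
        print("Warning: learning_rate=%s seems to have no effect, "
              "learning_rates already defines the LR of epoch 1." % learning_rate)
```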
I think I would still leave learning_rate as a config option.
But then we are still left with learning_rate and learning_rates; @JackTemaki wanted to change exactly that. Merging this into a single variable would not take away your options but mainly increase readability, since the keywords add information. That was my reasoning.
> But then we are still left with learning_rate and learning_rates,
Yes, this is what I meant. I would prefer to keep both.
> Merging this into a single variable would not take away your options but mainly increase readability since the keywords add information.
But what new information? It's exactly the same information, just expressed in a different way (which I think is worse than before). And you still have the same ambiguity problem, like:
learning_rates = {"initial": 1.0, "default": 0.5, 1: 0.3}
When does it use 'initial', 'default' or the one from epoch 1?
Actually I would also need to read the code to understand which is used in which case exactly. I'm not really sure now, when learning_rates specifies epoch 1, whether learning_rate is ever used.
Another suggestion:
I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.
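A sketch of how such a check could look (the config accessor names here are assumptions for illustration, not actual RETURNN code):

```python
# Hypothetical check enforcing the proposed restriction:
def check_lr_options(config):
    """Allow only one of learning_rate / learning_rates; require epoch 1 in the latter."""
    assert not (config.has("learning_rate") and config.has("learning_rates")), (
        "Set either learning_rate or learning_rates, not both.")
    if config.has("learning_rates"):
        learning_rates = config.typed_value("learning_rates")
        if isinstance(learning_rates, dict):
            assert 1 in learning_rates, "learning_rates must define the LR of the first epoch."
```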
> I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.
This sounds like a good compromise. I was not even aware that learning_rates can have this kind of dict syntax; I guess this should be documented at some point as well.
> Yes, this is what I meant. I would prefer to keep both.
OK.
> When does it use 'initial', 'default' or the one from epoch 1?
Good point.
> I would still keep both options, but only allow the user to set one of them. If learning_rates is used, it must define the LR of the first epoch.
I like this.