
[Fix] schedulers with no max_steps param

Open stevehuang52 opened this issue 2 years ago • 3 comments

Not all LR schedulers in PyTorch have a max_steps parameter, so we should not add max_steps to their scheduler_args unconditionally. The previous code tackled the problem in a case-by-case manner; here we solve it in a more general way.

To do so, we define a set of schedulers that don't have the max_steps parameter, and set the add_max_args_flag accordingly.
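The idea can be sketched roughly as follows. This is a hypothetical illustration, not NeMo's actual code: the set contents, the string-based lookup, and the helper names (`add_max_args_flag`, `build_scheduler_args`) are all made up for the example.

```python
# Hypothetical set of PyTorch schedulers whose constructors take no
# max_steps argument (illustrative names, not NeMo's real list).
SCHEDULERS_WITHOUT_MAX_STEPS = {"StepLR", "ExponentialLR", "ReduceLROnPlateau"}

def add_max_args_flag(scheduler_name: str) -> bool:
    # Only schedulers that accept max_steps should get it injected.
    return scheduler_name not in SCHEDULERS_WITHOUT_MAX_STEPS

def build_scheduler_args(scheduler_name: str, scheduler_args: dict, max_steps: int) -> dict:
    args = dict(scheduler_args)
    if add_max_args_flag(scheduler_name):
        args["max_steps"] = max_steps
    return args
```

With a set lookup like this, a new scheduler without max_steps only needs one line added to the set, instead of another special-case branch.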

Similarly, we define a set of epoch-based schedulers to determine the interval, whereas the previous code only checked for the ReduceLROnPlateau scheduler.
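The interval selection can be sketched the same way. Again a hypothetical illustration: the set contents and the helper name `get_interval` are assumptions, not NeMo's actual identifiers.

```python
# Hypothetical set of epoch-based schedulers; previously only
# ReduceLROnPlateau was special-cased.
EPOCH_BASED_SCHEDULERS = {"ReduceLROnPlateau", "StepLR", "MultiStepLR", "ExponentialLR"}

def get_interval(scheduler_name: str) -> str:
    # Epoch-based schedulers step once per epoch; everything else per step.
    return "epoch" if scheduler_name in EPOCH_BASED_SCHEDULERS else "step"
```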

Besides, the previous code also had a bug: it didn't set the correct add_max_args_flag and interval when `'args' in scheduler_config` was true.

stevehuang52 avatar Jul 19 '22 21:07 stevehuang52

@ericharper for approval

titu1994 avatar Jul 22 '22 22:07 titu1994

Hey, so with this PR I don't need to define the max_steps param for any scheduler?

evilc3 avatar Aug 08 '22 11:08 evilc3

> Hey, so with this PR I don't need to define the max_steps param for any scheduler?

Not really. This PR fixes a bug where the current code adds a max_steps param to some PyTorch schedulers that don't accept max_steps.

For schedulers that do have a max_steps param, the current NeMo code calculates max_steps automatically if the user doesn't specify it explicitly.
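That automatic calculation typically derives max_steps from the trainer and dataloader settings. The sketch below shows one plausible way to do it; the function name, parameters, and rounding behavior are assumptions for illustration, not NeMo's exact implementation.

```python
import math

def compute_max_steps(max_epochs: int, num_samples: int, batch_size: int,
                      accumulate_grad_batches: int = 1, num_devices: int = 1,
                      drop_last: bool = True) -> int:
    # Hypothetical sketch: infer total optimizer steps from epochs,
    # dataset size, batch size, gradient accumulation, and device count.
    samples_per_device = num_samples // num_devices
    if drop_last:
        steps_per_epoch = samples_per_device // batch_size
    else:
        steps_per_epoch = math.ceil(samples_per_device / batch_size)
    return (steps_per_epoch // accumulate_grad_batches) * max_epochs
```

For example, with 1000 samples, batch size 32, one device, and 10 epochs, this yields 31 steps per epoch, so 310 total steps.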

stevehuang52 avatar Aug 08 '22 13:08 stevehuang52

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot] avatar Oct 07 '22 02:10 github-actions[bot]