Awni Hannun
That reminds me of the way [optax does schedules](https://optax.readthedocs.io/en/latest/optax-101.html#weight-decay-schedules-and-clipping), which I actually find pretty nice (and am hoping we will follow something similar). Basically the `learning_rate` (and some other...
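For reference, a minimal sketch of that optax pattern (the values here are just placeholders): the learning rate argument can be either a plain float or a schedule, i.e. a function of the step count, and the optimizer resolves the current value internally on each update.

```python
import optax

# A schedule is just a callable mapping step -> learning rate.
schedule = optax.exponential_decay(
    init_value=1e-3,        # starting learning rate
    transition_steps=1000,  # steps per decay period
    decay_rate=0.9,         # multiplicative decay factor
)

# The optimizer accepts either a schedule or a constant.
optimizer = optax.adamw(learning_rate=schedule)
constant_optimizer = optax.adamw(learning_rate=1e-3)
```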
> Also the main reason is not only to avoid calling the callable but also to move the responsibility of state keeping one layer above, to the trainer. That makes...
> It only works for whatever the author of the optimizer (and only the optimizer) had in mind to make dynamic. For instance, does it work for the betas values?...
> For instance how would we even log the learning rate.

Calling `optimizer.learning_rate` should give the right learning rate for logging?

> Would we save the step as an array...
Thanks for the revision. This is a bit in between what I was suggesting and what @angeloskath was suggesting. I think if we go with the route of explicit schedule classes...
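To make the comparison concrete, here is a rough sketch of what the explicit-schedule-class route could look like. The names (`ExponentialDecay`, `SGDWithSchedule`) and the optimizer internals are purely illustrative, not a proposed `mlx.optimizers` API:

```python
# Hypothetical sketch of explicit schedule classes; names are illustrative only.
class ExponentialDecay:
    def __init__(self, init_value, decay_rate, decay_steps):
        self.init_value = init_value
        self.decay_rate = decay_rate
        self.decay_steps = decay_steps

    def __call__(self, step):
        # Current learning rate as a pure function of the step count.
        return self.init_value * self.decay_rate ** (step / self.decay_steps)


class SGDWithSchedule:
    def __init__(self, schedule):
        self.schedule = schedule
        self.step = 0  # kept as optimizer state so it can be saved/restored

    @property
    def learning_rate(self):
        # Always reflects the value used for the current step, which also
        # answers the logging question quoted above.
        return self.schedule(self.step)

    def update(self, params, grads):
        lr = self.learning_rate
        self.step += 1
        return {k: params[k] - lr * grads[k] for k in params}
```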
@postmalloc are you still working on this? Just curious what the plan is as there have been some requests for schedulers :)
> I wasn't actually sure if a consensus had been reached between the approach you suggested and what

Makes sense. I think it will be easier to criticize the pros/cons...
Hi @postmalloc checking in on this. Are you still working on the PR?
Hi @postmalloc are you planning to work on this PR at all? If not, let's close it so someone else can work on schedulers.
The part you shared looks nice; it's pretty simple. One of our optimizers (AdaFactor) already has the step as state, so we'd need to refactor that into the base class...
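As a rough illustration of that refactor (a simplified stand-in, not the actual `mlx.optimizers` code), the base class could own the step counter and bump it on every update, so AdaFactor and any schedule can read a shared counter:

```python
# Simplified stand-in for the base-class refactor discussed above.
class Optimizer:
    def __init__(self):
        self.state = {"step": 0}  # saved/loaded with the rest of the optimizer state

    def update(self, params, grads):
        self.state["step"] += 1
        return self.apply_gradients(params, grads)

    def apply_gradients(self, params, grads):
        raise NotImplementedError


class AdaFactor(Optimizer):
    def apply_gradients(self, params, grads):
        step = self.state["step"]  # shared counter from the base class
        ...  # AdaFactor update rule elided
```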