optax
Add sophia-h optimizer
This PR adds the Sophia-H optimizer. It's mostly based on Levanter's implementation, with some changes and added features here and there.
One note: I had to change the contrib common test file twice. The first change passes the loss_fn out of the parabola and rosenbrock functions (this could be useful later for other optimizers that need the loss function). The second bypasses the check that update arguments are values (the loss function is not). Please advise if these changes are not OK or not the most correct approach.
fixes #968
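
For context on why the update step needs `loss_fn`: Sophia-H preconditions with a Hutchinson estimate of the Hessian diagonal, which requires differentiating the loss a second time. The snippet below is only a rough sketch of that estimator under simplifying assumptions. It is not the code in this PR: `hutchinson_diag` is a hypothetical helper name, the parameters are a flat array rather than a pytree, the toy loss is made up, and the Rademacher probe is one possible choice (Gaussian probes are another common one).

```python
import jax
import jax.numpy as jnp


def hutchinson_diag(loss_fn, params, key):
    """Single-sample Hutchinson estimate of the Hessian diagonal (hypothetical helper)."""
    # Rademacher probe: each entry is +1 or -1 with equal probability.
    u = jax.random.rademacher(key, params.shape, dtype=params.dtype)
    # Hessian-vector product H @ u via forward-over-reverse differentiation.
    _, hvp = jax.jvp(jax.grad(loss_fn), (params,), (u,))
    # E[u * (H u)] equals diag(H) whenever E[u u^T] is the identity.
    return u * hvp


def loss_fn(x):
    # Toy loss (an assumption for illustration); its Hessian is diag(2, 4, 6).
    return jnp.sum(jnp.array([1.0, 2.0, 3.0]) * x**2)


params = jnp.array([0.5, -1.0, 2.0])
estimate = hutchinson_diag(loss_fn, params, jax.random.PRNGKey(0))
```

Because the toy Hessian is diagonal and the probe entries square to 1, a single sample already recovers the diagonal here; for a general loss the estimate is noisy and is typically averaged over steps.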
Hi Vincent, thank you for the notes! They all make perfect sense to me, and I'll get to updating the code and answering them tomorrow.
@evanatyourservice please ping us whenever you're ready for another round of reviews :-)
@fabianp Will do! Sorry, I've been moving, but I'll try to get this going ASAP.
there's no rush, just wanted to make sure you were not waiting on us :-)
@vroulet @fabianp I got some updates pushed, let me know if anything needs to be changed! Thanks
Hello @evanatyourservice, sorry for the very long delay on our end. I think your code looks great! If it's OK with you, could you merge the code with head once #1060 is merged? (#1060 adds more tests for the contrib optimizers to ensure compatibility.) Then I should be able to approve, and we can finish on our side if there are still minor details to fine-tune. Thank you again!
sounds good!
Hello @evanatyourservice, #1060 got merged. You can merge in the tests and address Fabian's comment, and then I can approve (and maybe fix minor issues on our end, if there are any). Thank you again!
OK, sounds good! Sorry, I should get to this tomorrow.
@fabianp @vroulet wow, OK, 1 day turned into 3 weeks. Let me revert rademacher to normal and make sure the tests are all good, and I'll push an update :)
gentle ping @evanatyourservice :-)
no rush but it would be a really nice feature to have landed in optax ❤️
Thank you again @evanatyourservice! We added tests and made some final fixes that were necessary for the tests to pass.
Awesome! Thank you for finishing it up! I got kinda stumped by some of the tests and was debating another strategy, but then it slipped my mind while I worked on other tasks. Glad we were able to get it merged!