ray icon indicating copy to clipboard operation
ray copied to clipboard

Adjust the TuneCallback class for Pytorch-lightning v1.6+ (removing o…

Open davzaman opened this issue 2 years ago • 3 comments

Why are these changes needed?

This is a duplicate of #27766 but with the changes that were supposed to be there (accidentally made changes while the PR was being accepted while trying to change the commits to have the sign off).

Getting a ton of warnings from pytorch-lightning about deprecation of certain hooks. The TuneCallback has old hooks such as keyboard_interrupt which have been deprecated. I updated the callback to the new hooks and updated the integration tests.

The deprecated hooks and their new versions (In pl v1.6+) are:

  • on_keyboard_interrupt -> on_exception
  • on_init_start -> just deprecated no new version
  • on_init_end -> just deprecated no new version
  • on_batch_start -> on_train_batch_start
  • on_batch_end -> on_train_batch_end
  • on_epoch_end -> on_<train/validation/test>_epoch_end
  • on_epoch_start -> on_<train/validation/test>_epoch_start

Related issue number

Closes #27487

Checks

  • [x] I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/

I spent a few hours trying to wrangle my environment. Unfortunately I'm on windows using WSL and have an old version of Debian for some reason which only supports gcc-6 (not even close to gcc-9 which is required for building the project so I can run the tests). Given this roadblock and the fact that the fixes were minimal and simple I decided to push out the changes.

  • Testing Strategy
    • [x] Unit tests
    • [ ] Release tests
    • [ ] This PR is not tested :(

davzaman avatar Aug 11 '22 01:08 davzaman

Thanks @davzaman for the contribution and @matthewdeng for the initial review!

Ping @richardliaw for the question "should we ensure backwards compatible support for PTL < 1.6?"

zhe-thoughts avatar Aug 19 '22 20:08 zhe-thoughts

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

stale[bot] avatar Sep 22 '22 01:09 stale[bot]

@richardliaw Does everything look good to go?

davzaman avatar Sep 22 '22 01:09 davzaman

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

stale[bot] avatar Oct 08 '22 08:10 stale[bot]

Sorry it took me a while to get to this. I'm seeing @amogkam's commit 60cde113ee16f9bd1d17c210a22fb3329283e034 from a PR #30045 that uses a different solution (dynamic rather than using a flag to decide on using one set of hooks over the other). In that case, the changes I made have already been integrated in a sense. The only thing missing is the update to https://github.com/ray-project/ray/blob/master/python/ray/tests/ray_lightning/simple_tune.py as you mentioned, as well as rerunning https://github.com/ray-project/ray/blob/master/doc/source/tune/examples/tune-pytorch-lightning.ipynb to regenerate the outputs without the deprecation warnings.

In that case, how would you like to proceed? We can probably close this pr and just add a commit for the remaining couple of things?

davzaman avatar Nov 10 '22 23:11 davzaman

Checking in to see your thoughts re: my previous comment @matthewdeng :)

davzaman avatar Dec 08 '22 23:12 davzaman

@davzaman yeah that sounds great! Closing this PR and creating a new one with the remaining changes makes sense to me.

matthewdeng avatar Dec 13 '22 01:12 matthewdeng

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

stale[bot] avatar Jan 15 '23 16:01 stale[bot]

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

stale[bot] avatar Feb 02 '23 02:02 stale[bot]

Hi @davzaman , I use your modification and run the test, but get the following error, do you have any suggestions? TypeError: on_exception() takes 3 positional arguments but 4 were given

Minxiangliu avatar Feb 14 '23 03:02 Minxiangliu