[AIR] `on_trial_complete` callback hook happens before trial resources are freed
Description
When implementing a custom Callback, users may do some long computation in `on_trial_complete`. This hook runs before the actors associated with the trial are killed, so the trial's resources stay allocated until the computation finishes. It might make sense to call this hook after the trial has been stopped.
https://github.com/ray-project/ray/blob/8e49d2aa54426842430f70c7060c3c6ac0f513f3/python/ray/tune/execution/trial_runner.py#L692-L710
In general, what should callbacks be recommended for?
- Long computation in ANY callback hook blocks the Tune main event loop, which prevents other trials from progressing.
- In the current state, callbacks are mostly meant to read some state and not do any significant work.
- Callbacks that need to do work must do it in a separate process, which needs to be implemented by the callback itself. See `SyncerCallback` for an example.
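
The recommendation above can be sketched roughly as follows. This is a minimal illustration, not how `SyncerCallback` is implemented: the hook copies the data it needs, hands the work to a background worker, and returns immediately so the event loop is not blocked. Several things here are assumptions: `OffloadingCallback` and `summarize_trial` are hypothetical names, the class is a plain Python class so the snippet runs without Ray installed (in real use it would subclass `ray.tune.Callback`), and the trial is assumed to expose `trial_id` and `last_result`. A thread pool is used for simplicity; for CPU-bound work, a `ProcessPoolExecutor` would be closer to the "separate process" advice.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Dict, List


def summarize_trial(trial_id: str, result: Dict[str, Any]) -> str:
    """Hypothetical stand-in for an expensive post-processing step."""
    return f"{trial_id}: {sorted(result.items())}"


class OffloadingCallback:
    """Sketch of a callback that never does heavy work inside a hook.

    Assumption: in real use this would subclass ray.tune.Callback; it is a
    plain class here so the example is self-contained.
    """

    def __init__(self, max_workers: int = 1) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: List[Future] = []
        self.summaries: List[str] = []

    def on_trial_complete(self, iteration, trials, trial, **info) -> None:
        # Copy only the data the worker needs, so it does not touch the
        # trial object while the main loop continues to use it.
        result = dict(getattr(trial, "last_result", None) or {})
        # submit() returns right away; the hook does not wait for the work.
        future = self._pool.submit(summarize_trial, str(trial.trial_id), result)
        self._pending.append(future)

    def on_experiment_end(self, trials, **info) -> None:
        # Drain outstanding work once, at the end of the experiment.
        self.summaries = [f.result() for f in self._pending]
        self._pending.clear()
        self._pool.shutdown(wait=True)
```

With this shape, the time spent in `on_trial_complete` is just the cost of copying the result and queueing the task, so the hook's position relative to resource cleanup matters much less.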
Use case
No response
This P2 issue has seen no activity in the past 2 years. It will be closed in 2 weeks as part of ongoing cleanup efforts.
Please comment and remove the pending-cleanup label if you believe this issue should remain open.
Thanks for contributing to Ray!