[RLlib] Make Learner more standalone with regards to LearnerHyperparameters
Description
Today, LearnerHyperparameters, AlgorithmConfig and Learner relate as follows:
1. PPOLearner needs PPOLearnerHyperparameters.
2. LearnerHyperparameters does not have sensible defaults (only Nones there).
3. An instance of PPOLearnerHyperparameters with sensible defaults can only be obtained from the PPOConfig.
4. When instantiating PPOLearner without providing this argument, it defaults to LearnerHyperparameters().
5. This default is not helpful because it lacks the HPs needed to run PPOLearner.
6. Instead, we should default to PPOLearnerHyperparameters() inside PPOLearner.
7. Because of 6, PPOLearnerHyperparameters() must have sensible defaults without 3.
8. Because of 7, AlgorithmConfig should get its defaults from PPOLearnerHyperparameters.
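A rough sketch of the proposed relationship, using plain dataclasses as stand-ins rather than the real RLlib classes. The class bodies, the field names (lr, clip_param, vf_loss_coeff, entropy_coeff), their values, and the get_learner_hyperparameters helper shown here are assumptions for illustration only and do not reflect RLlib's actual code or defaults:

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class LearnerHyperparameters:
    # Stand-in for the base class: generic fields default to None.
    lr: Optional[float] = None


@dataclass
class PPOLearnerHyperparameters(LearnerHyperparameters):
    # Hypothetical fields and values, chosen only for illustration.
    # The point is that the subclass itself carries usable defaults.
    lr: Optional[float] = 5e-5
    clip_param: float = 0.3
    vf_loss_coeff: float = 1.0
    entropy_coeff: float = 0.0


class PPOLearner:
    # Stand-in learner that only stores its hyperparameters.
    def __init__(self, hps: Optional[PPOLearnerHyperparameters] = None):
        # Fall back to the algorithm-specific hyperparameters,
        # not to the generic LearnerHyperparameters().
        self.hps = hps if hps is not None else PPOLearnerHyperparameters()


class PPOConfig:
    # Stand-in config that reads its defaults from the hyperparameter class.
    def __init__(self):
        defaults = PPOLearnerHyperparameters()
        for f in fields(defaults):
            setattr(self, f.name, getattr(defaults, f.name))

    def get_learner_hyperparameters(self) -> PPOLearnerHyperparameters:
        # Hypothetical helper: pack the current config values back into
        # a hyperparameter instance for the learner.
        values = {
            f.name: getattr(self, f.name)
            for f in fields(PPOLearnerHyperparameters)
        }
        return PPOLearnerHyperparameters(**values)


# Standalone use: no config object is needed to get a runnable learner.
standalone_learner = PPOLearner()

# Config-driven use: overrides flow from the config into the learner.
config = PPOConfig()
config.clip_param = 0.2
configured_learner = PPOLearner(config.get_learner_hyperparameters())
```

With this layout the defaults live in one place (the hyperparameter class), and both the learner and the config read from it, so the two cannot drift apart.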
Use case
No response
This P2 issue has seen no activity in the past 2 years. It will be closed in 2 weeks as part of ongoing cleanup efforts.
Please comment and remove the pending-cleanup label if you believe this issue should remain open.
Thanks for contributing to Ray!