[RLlib] Make Learner more standalone with regards to LearnerHyperparameters

Open ArturNiederfahrenhorst opened this issue 2 years ago • 1 comment

Description

Today, LearnerHyperparameters, AlgorithmConfig and Learner relate as follows:

  1. PPOLearner needs PPOLearnerHyperparameters.
  2. LearnerHyperparameters does not have sensible defaults (only Nones there).
  3. An instance of PPOLearnerHyperparameters with sensible defaults can only be obtained from the PPOConfig.
  4. When PPOLearner is instantiated without this argument, it defaults to LearnerHyperparameters().
  5. This default is not helpful because it lacks the HPs needed to run PPOLearner.
  6. Instead, PPOLearner should default to PPOLearnerHyperparameters().
  7. Because of 6, PPOLearnerHyperparameters() must have sensible defaults without relying on 3.
  8. Because of 7, AlgorithmConfig should get its defaults from PPOLearnerHyperparameters.
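
A minimal sketch of how points 6 to 8 could fit together. It does not import RLlib: every class below is a simplified stand-in that only borrows the names from this issue. The field names, default values, and the `get_learner_hyperparameters()` helper are assumptions chosen for illustration, not the actual RLlib definitions.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class LearnerHyperparameters:
    """Stand-in base class: generic fields only, with no usable defaults."""

    learning_rate: Optional[float] = None


@dataclass
class PPOLearnerHyperparameters(LearnerHyperparameters):
    """Stand-in PPO hyperparameters that carry their own defaults (point 7).

    Field names and values are illustrative assumptions.
    """

    learning_rate: Optional[float] = 5e-5
    clip_param: float = 0.3
    kl_coeff: float = 0.2
    vf_loss_coeff: float = 1.0


class PPOLearner:
    """Stand-in learner that is usable without any config object."""

    def __init__(self, hps: Optional[PPOLearnerHyperparameters] = None):
        # Point 6: fall back to the PPO-specific defaults rather than to
        # the base LearnerHyperparameters(), whose fields are all None.
        self.hps = hps if hps is not None else PPOLearnerHyperparameters()


class PPOConfig:
    """Stand-in config that reads its defaults from the hyperparameters class."""

    def __init__(self):
        # Point 8: the defaults live in one place (the dataclass) and the
        # config copies them instead of defining its own.
        defaults = PPOLearnerHyperparameters()
        for f in fields(defaults):
            setattr(self, f.name, getattr(defaults, f.name))

    def get_learner_hyperparameters(self) -> PPOLearnerHyperparameters:
        # Rebuild the hyperparameters from the (possibly overridden) config values.
        names = [f.name for f in fields(PPOLearnerHyperparameters)]
        return PPOLearnerHyperparameters(**{n: getattr(self, n) for n in names})
```

With this layout, `PPOLearner()` works on its own, and a config that has not been modified yields the same hyperparameters as the standalone learner, so the two cannot drift apart.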

Use case

No response

ArturNiederfahrenhorst, May 25 '23 19:05

This P2 issue has seen no activity in the past 2 years. It will be closed in 2 weeks as part of ongoing cleanup efforts.

Please comment and remove the pending-cleanup label if you believe this issue should remain open.

Thanks for contributing to Ray!

cszhu, Jun 17 '25 00:06