
[RLlib] Move learning_starts logic into execution plans

Open ArturNiederfahrenhorst opened this issue 2 years ago • 2 comments

Why are these changes needed?

  • learning_starts should be renamed to something more descriptive: num_steps_sampled_before_learning_starts

  • It should be moved out of the replay buffer config, in line with our philosophy: the Algorithm should define what should happen and when, but NOT how it should happen. num_steps_sampled_before_learning_starts answers the "when" and "what" questions and should thus be handled and configured at the top Algorithm level (not inside replay buffers).
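To illustrate the intended split, here is a minimal, self-contained sketch. It does not use RLlib: ToyReplayBuffer, ToyAlgorithm, rollout_fragment_length and train_batch_size as used here are hypothetical names chosen for illustration, and only num_steps_sampled_before_learning_starts comes from this PR. The buffer only knows how to store and sample; the algorithm's training step decides when learning begins.

```python
import random
from collections import deque


class ToyReplayBuffer:
    """Hypothetical buffer: knows HOW to store and sample, not WHEN to learn."""

    def __init__(self, capacity=1000):
        self._storage = deque(maxlen=capacity)

    def add(self, item):
        self._storage.append(item)

    def sample(self, batch_size):
        # Uniform sampling with replacement from everything stored so far.
        return random.choices(list(self._storage), k=batch_size)


class ToyAlgorithm:
    """Hypothetical algorithm: owns the "when" via a top-level setting."""

    def __init__(
        self,
        num_steps_sampled_before_learning_starts=100,
        rollout_fragment_length=10,
        train_batch_size=32,
    ):
        self.num_steps_sampled_before_learning_starts = (
            num_steps_sampled_before_learning_starts
        )
        self.rollout_fragment_length = rollout_fragment_length
        self.train_batch_size = train_batch_size
        self.buffer = ToyReplayBuffer()
        self.num_steps_sampled = 0
        self.num_updates = 0

    def training_step(self):
        # 1) Always sample and store, regardless of whether learning has started.
        for _ in range(self.rollout_fragment_length):
            self.buffer.add(self.num_steps_sampled)  # stand-in for a transition
            self.num_steps_sampled += 1

        # 2) The algorithm, not the buffer, gates learning.
        if self.num_steps_sampled < self.num_steps_sampled_before_learning_starts:
            return None

        # 3) Learning has started: draw a batch and count one update.
        batch = self.buffer.sample(self.train_batch_size)
        self.num_updates += 1
        return batch


if __name__ == "__main__":
    algo = ToyAlgorithm(num_steps_sampled_before_learning_starts=25)
    for i in range(4):
        result = algo.training_step()
        print(i, algo.num_steps_sampled, "learned" if result else "waiting")
```

In this sketch the buffer carries no threshold of its own, so swapping in a different buffer would not change when learning starts.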

Checks

  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [ ] Unit tests
    • [ ] Release tests
    • [x] This PR is not tested :(

ArturNiederfahrenhorst, Jun 23 '22 10:06

Can someone rename this PR? This thing goes way beyond renaming a parameter, IIUC XD

avnishn, Aug 01 '22 20:08

@ArturNiederfahrenhorst This PR still needs to address all the TODOs

kouroshHakha, Aug 10 '22 06:08