[RLlib] Moving sampling coordination for `batch_mode=complete_episodes` to `synchronous_parallel_sample`.
Why are these changes needed
When sampling complete episodes, each `EnvRunner` sampled a full `train_batch_size` before returning. This made sampling inefficient and led to long waiting times when slow environments were used. Furthermore, adding more `EnvRunner`s could not reduce the sampling workload. This PR changes this and moves the coordination of sampling, when `complete_episodes` are needed, fully into `synchronous_parallel_sample`, which can coordinate better across all `EnvRunner`s. This should reduce sampling duration linearly with the number of `EnvRunner`s chosen.
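To illustrate the idea only, here is a minimal, self-contained sketch of coordinating complete-episode sampling centrally. It does not use RLlib and is not the implementation in this PR: `ToyEnvRunner`, `sample_episode`, and `coordinated_sample` are hypothetical names invented for this example, and the runners are called sequentially where a real setup would call them in parallel. The point it shows is that the `train_batch_size` target is checked against the combined timesteps of all runners, so no single runner has to fill the whole batch on its own.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyEnvRunner:
    """Hypothetical stand-in for an EnvRunner; not an RLlib class."""

    seed: int

    def __post_init__(self):
        self._rng = random.Random(self.seed)

    def sample_episode(self):
        # A "complete episode" is modeled as a list of timestep indices
        # with a random length between 5 and 20.
        return list(range(self._rng.randint(5, 20)))


def coordinated_sample(runners, train_batch_size):
    """Collect complete episodes until the combined batch is large enough."""
    episodes = []
    total_steps = 0
    while total_steps < train_batch_size:
        # One round: every runner contributes one complete episode.
        # (Sequential here; assumed to be parallel in a real system.)
        for runner in runners:
            episode = runner.sample_episode()
            episodes.append(episode)
            total_steps += len(episode)
    return episodes, total_steps


runners = [ToyEnvRunner(seed=i) for i in range(4)]
episodes, total_steps = coordinated_sample(runners, train_batch_size=200)
print(f"collected {len(episodes)} episodes, {total_steps} timesteps")
```

Because the target is checked after each round, the batch can overshoot `train_batch_size` by at most one round of episodes, which is the usual trade-off when only complete episodes may be returned.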
Related issue number
Closes #45826
Checks
- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
- [x] Unit tests
- [x] Release tests
- [ ] This PR is not tested :(