[RLlib] Initial design for Ray-Data based offline RL Algos (on new API stack).
Why are these changes needed?
With the new API stack, we will deprecate the `RolloutWorker`, which the old stack uses for sampling from offline data. This PR instead proposes a simple offline data class, based on `ray.data.Dataset`, that offline RL algorithms on the new stack can use to sample from offline data.
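To make the idea concrete, here is a minimal sketch of what sampling through such a class could look like. It is not the class added in this PR: `OfflineDataSketch`, its `sample()` method, and the `InMemoryDataset` stand-in are hypothetical names used for illustration only. The sketch assumes only that the wrapped dataset exposes `iter_batches(batch_size=...)`, as `ray.data.Dataset` does, so it runs without a Ray cluster.

```python
from typing import Any, Dict, Iterator, List


class InMemoryDataset:
    """Hypothetical stand-in exposing the one method the sketch relies on."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self.rows = rows

    def iter_batches(self, *, batch_size: int) -> Iterator[Dict[str, List[Any]]]:
        # Yield column-oriented batches of at most `batch_size` rows.
        for start in range(0, len(self.rows), batch_size):
            chunk = self.rows[start:start + batch_size]
            yield {key: [row[key] for row in chunk] for key in chunk[0]}


class OfflineDataSketch:
    """Hypothetical sampler; NOT the class introduced by this PR."""

    def __init__(self, dataset: Any, batch_size: int = 2) -> None:
        # Assumption: `dataset` offers `iter_batches(batch_size=...)`,
        # which is the case for `ray.data.Dataset`.
        self.dataset = dataset
        self.batch_size = batch_size
        self._batches: Iterator[Dict[str, Any]] = iter(())

    def sample(self) -> Dict[str, Any]:
        """Return the next batch, restarting from the top once exhausted."""
        try:
            return next(self._batches)
        except StopIteration:
            self._batches = iter(
                self.dataset.iter_batches(batch_size=self.batch_size)
            )
            return next(self._batches)


rows = [{"obs": i, "reward": float(i)} for i in range(4)]
offline_data = OfflineDataSketch(InMemoryDataset(rows), batch_size=2)
first_batch = offline_data.sample()

# With Ray installed, the stand-in could be swapped for a real dataset:
#   import ray
#   offline_data = OfflineDataSketch(ray.data.from_items(rows), batch_size=2)
```

Restarting the iterator when it runs out is one possible design choice; it lets a training loop call `sample()` indefinitely over a finite offline dataset.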
Related issue number
Checks
- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [x] Unit tests
    - [x] Release tests
    - [ ] This PR is not tested :(