[RLlib; Offline RL] - Enable reading old-stack `SampleBatch` data in new stack Offline RL.
Why are these changes needed?
Right now the new Offline RL stack cannot use data recorded with the old stack. Many users have costly recorded data from the old stack (i.e. in `SampleBatch` format). This PR proposes an option to read old-stack `SampleBatch` recordings via the `OfflinePreLearner`. In its current form it comes with some limitations, which might be removed in future PRs:
- If multiple episodes are recorded inside a single `SampleBatch`, the data will be packed into a single `SingleAgentEpisode`.
- If `train_batch_size_per_learner` is defined, this argument defines the number of `SampleBatch`es pulled from the offline data per training iteration and NOT the number of agent/env steps recorded. For example, a `train_batch_size_per_learner=2000` and recorded `SampleBatch`es with `200` agent steps inside each batch would result in an actual training batch of `2000 * 200` agent/env steps.
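The arithmetic behind the second limitation can be sketched in plain Python. This is only an illustration: it makes no RLlib calls, the helper names are hypothetical and not part of RLlib, and it assumes every recorded `SampleBatch` holds the same number of steps.

```python
def effective_train_batch_steps(
    train_batch_size_per_learner: int, steps_per_sample_batch: int
) -> int:
    """Agent/env steps per training iteration when the batch size
    counts recorded SampleBatches instead of individual steps."""
    return train_batch_size_per_learner * steps_per_sample_batch


def sample_batches_for_target_steps(
    target_steps: int, steps_per_sample_batch: int
) -> int:
    """Smallest number of SampleBatches that covers `target_steps`
    (ceiling division, hypothetical helper for choosing the setting)."""
    return -(-target_steps // steps_per_sample_batch)


# The example from the description: 2000 batches of 200 steps each.
print(effective_train_batch_steps(2000, 200))  # 400000

# To train on roughly 2000 steps per iteration instead, the setting
# would have to be about 10 under the same assumption.
print(sample_batches_for_target_steps(2000, 200))  # 10
```

Under these assumptions, users reading old-stack recordings would likely want to divide their intended step count by the recorded batch length when choosing `train_batch_size_per_learner`.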
Related issue number
Checks
- [x] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [x] Release tests
   - [ ] This PR is not tested :(