ray [RLlib; Offline RL] - Enable reading old-stack `SampleBatch` data in new stack Offline RL.

Open simonsays1980 opened this issue 6 months ago • 0 comments

Why are these changes needed?

Right now the new Offline RL stack cannot read data recorded with the old stack. Many users have costly recorded data from the old stack (i.e. in SampleBatch format). This PR proposes an option to read old-stack SampleBatch recordings via the OfflinePreLearner. In its current form it comes with some limitations, which might be removed in future PRs:

- If a single recorded SampleBatch contains multiple episodes, the data will be packed into a single SingleAgentEpisode.
- If train_batch_size_per_learner is defined, it sets the number of SampleBatches pulled from the offline data per training iteration and NOT the number of agent/env steps recorded. For example, train_batch_size_per_learner=2000 with recorded SampleBatches of 200 agent steps each would result in an actual training batch of 2000 * 200 = 400,000 agent/env steps.
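
The arithmetic behind the second limitation can be sketched in plain Python. This is only an illustration, not RLlib code: the helper names `effective_train_steps` and `batches_for_target_steps` are hypothetical, and the sketch assumes every recorded SampleBatch holds the same number of agent steps.

```python
import math


def effective_train_steps(train_batch_size_per_learner: int,
                          steps_per_sample_batch: int) -> int:
    """Agent/env steps per training iteration when the batch size
    counts SampleBatches instead of individual steps."""
    return train_batch_size_per_learner * steps_per_sample_batch


def batches_for_target_steps(target_steps: int,
                             steps_per_sample_batch: int) -> int:
    """Hypothetical helper: the batch-size value that yields roughly
    `target_steps` agent/env steps (rounded up, at least 1)."""
    return max(1, math.ceil(target_steps / steps_per_sample_batch))


# The example from the description: 2000 SampleBatches of 200 steps each.
steps = effective_train_steps(2000, 200)

# To actually train on about 2000 steps per iteration with such data,
# the setting would need to be much smaller.
suggested = batches_for_target_steps(2000, 200)
```

Under these assumptions, users with old-stack recordings would likely want to divide their intended step count by the recorded batch length before setting train_batch_size_per_learner.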

Related issue number

Checks

- [x] I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- [x] I've run scripts/format.sh to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
  - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - [x] Unit tests
  - [x] Release tests
  - [ ] This PR is not tested :(

Aug 27 '24 16:08 simonsays1980
