mbrl-lib
[Feature Request] Training data selection: Create more "interesting" Replay Buffer Iterators
🚀 Feature Request
Create replay buffer iterators that can select training and validation data in various "interesting" ways, similar to TransitionIterator and BootStrapIterator in https://github.com/facebookresearch/mbrl-lib/blob/b0aabd79941efe8b56bcabbd1b43bf497b9b1746/mbrl/replay_buffer.py
Examples:
- Select transitions from highly-rewarding trajectories - this could be used to perform analyses of how data selection impacts MBRL, objective mismatch, etc.
- Select transitions randomly from the replay buffer to have a fixed size of training/validation data.
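The two examples above can be sketched as plain index selection over a flat buffer. This is only an illustration under stated assumptions, not mbrl-lib code: the function names `top_return_indices` and `fixed_size_split`, the `top_fraction` parameter, and the layout (flat `rewards` and `dones` arrays in insertion order, with `dones` marking the last transition of each trajectory) are all hypothetical. An iterator in the library could consume indices like these.

```python
import numpy as np


def top_return_indices(rewards, dones, top_fraction=0.5):
    """Indices of transitions that belong to the highest-return trajectories.

    Assumes flat arrays in insertion order; a True in `dones` marks the last
    transition of a trajectory. A trailing unfinished trajectory is treated
    as a trajectory of its own.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=bool)
    # Exclusive end index of every trajectory.
    ends = np.flatnonzero(dones) + 1
    if len(ends) == 0 or ends[-1] != len(rewards):
        ends = np.append(ends, len(rewards))
    starts = np.concatenate(([0], ends[:-1]))
    # Undiscounted return of each trajectory.
    returns = np.array([rewards[s:e].sum() for s, e in zip(starts, ends)])
    num_keep = max(1, int(np.ceil(top_fraction * len(returns))))
    best = np.argsort(-returns, kind="stable")[:num_keep]
    # Keep whole trajectories so episodic boundaries are never split.
    return np.concatenate([np.arange(starts[t], ends[t]) for t in sorted(best)])


def fixed_size_split(num_stored, train_size, val_size, rng=None):
    """Random, disjoint train/validation index sets of fixed sizes."""
    if train_size + val_size > num_stored:
        raise ValueError("train_size + val_size exceeds the number of stored transitions")
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(num_stored)
    return perm[:train_size], perm[train_size:train_size + val_size]
```

Selecting whole trajectories rather than individual transitions is one way to sidestep the boundary issue: the returned indices always cover complete episodes.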
Motivation
This would make analysis similar to https://arxiv.org/abs/2002.04523 and https://arxiv.org/abs/2102.13651 easy to perform.
Pitch
It should be fairly easy to implement, similar to TransitionIterator and BootStrapIterator above. (Taking care of trajectory/episodic boundaries could be a bit tricky.)
Thanks @RaghuSpaceRajan. cc'ing @natolambert since this is highly relevant to his work. I think this proposal is the most straightforward way to do this on the data management side.
Yes, I have a version of this in my private repo; I will create a PR for it soon. The way I did it was to associate a "weight" with each transition, and part of the core functionality was a function to "update weights" for each trajectory. When updating the weights, it would be easy to apply a ranking or a heuristic mapping of some sort.
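For readers following the thread, a per-transition weighting scheme of this kind might look roughly like the sketch below. It is not the private implementation mentioned above; the class `WeightedTransitionSampler`, its methods, and the helper `rank_weights` are hypothetical names, and it assumes each stored transition carries an integer trajectory id.

```python
import numpy as np


class WeightedTransitionSampler:
    """Samples transition indices in proportion to per-transition weights."""

    def __init__(self, trajectory_ids, rng=None):
        # trajectory_ids[i] is the (hypothetical) id of the trajectory that
        # produced transition i.
        self.trajectory_ids = np.asarray(trajectory_ids)
        self.weights = np.ones(len(self.trajectory_ids), dtype=float)
        self.rng = np.random.default_rng() if rng is None else rng

    def update_weights(self, trajectory_weights):
        """Assign one non-negative weight to every transition of a trajectory.

        `trajectory_weights` maps trajectory id -> weight.
        """
        for traj_id, weight in trajectory_weights.items():
            if weight < 0:
                raise ValueError("weights must be non-negative")
            self.weights[self.trajectory_ids == traj_id] = weight

    def sample(self, batch_size):
        total = self.weights.sum()
        if total <= 0:
            raise ValueError("at least one transition needs a positive weight")
        probs = self.weights / total
        return self.rng.choice(len(self.weights), size=batch_size, p=probs)


def rank_weights(returns):
    """Rank-based mapping: the lowest return gets weight 1, the highest gets N."""
    returns = np.asarray(returns, dtype=float)
    ranks = np.empty(len(returns))
    ranks[np.argsort(returns, kind="stable")] = np.arange(1, len(returns) + 1)
    return ranks
```

A ranking heuristic then plugs in as, for example, `sampler.update_weights(dict(enumerate(rank_weights(returns))))`, assuming trajectory ids run from 0 to N-1.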
Related comment: I think it may be worthwhile to have an optional "rich logging" mode, where things like candidate actions, action sequences (plans) at each step, trajectories, and more are saved for every trial in the learning process. It accumulates a lot of data, but having access to it is useful for debugging.
Feel free to open a feature request issue for this as well @natolambert