DecisionTransformerInterpretability Vectorize get trajectory minibatches method of memory class (useful for TrajPPO model)

Vectorize get trajectory minibatches method of memory class (useful for TrajPPO model)

Open jbloomAus opened this issue 1 year ago • 0 comments

I recently wrote a version of get_minibatches in the memory class of the ppo subpackage.

https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/ppo/memory.py#L210-L393

TLDNR: This is important for sampling sections of trajectories which is necessary for online training of trajectory models as opposed to models which only respond to the latest observation. I have a few ideas for what to do here:

Keep the logic more or less the same, but vectorize it. It's way to serialized and it doesn't have to be. Obviously write lots of tests.

Mar 19 '23 09:03 jbloomAus

DecisionTransformerInterpretability DecisionTransformerInterpretability copied to clipboard

Vectorize get trajectory minibatches method of memory class (useful for TrajPPO model)

DecisionTransformerInterpretability
DecisionTransformerInterpretability copied to clipboard