DecisionTransformerInterpretability
Bug: Trajectory dataset contains prematurely truncated trajectories from where PPO gets cut off
This is a bug in the trajectory writer / offline dataset: when online training finishes, any trajectories still in progress get cut off, so the dataset ends up with "short" truncated trajectories, which are bad for our data. It would be good to remove them. They are visible in the plot of reward against trajectory length as points on the x-axis at lengths below the max length.
Link to the method I use to make sure these get labelled as truncated, to avoid downstream bugs: https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/utils.py#L59
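
Since these trajectories are already labelled as truncated, removing them could be a simple filter at dataset load time. A minimal sketch of the idea follows. The `Trajectory` class, its fields, and `drop_short_truncated` are hypothetical names for illustration and are not taken from this repo; it also assumes the max episode length is known.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trajectory:
    # Hypothetical container, not the repo's actual trajectory format.
    rewards: List[float]
    terminated: bool  # episode ended on its own (goal reached, failure, ...)
    truncated: bool   # episode was cut off from outside


def drop_short_truncated(trajectories: List[Trajectory], max_len: int) -> List[Trajectory]:
    """Keep everything except trajectories cut off before reaching max_len."""
    kept = []
    for traj in trajectories:
        cut_off_early = (
            traj.truncated
            and not traj.terminated
            and len(traj.rewards) < max_len
        )
        if not cut_off_early:
            kept.append(traj)
    return kept


# Example: max episode length of 10 steps (assumed value).
trajs = [
    Trajectory(rewards=[0.0] * 3, terminated=False, truncated=True),   # cut off when training stopped
    Trajectory(rewards=[0.0] * 10, terminated=False, truncated=True),  # legitimate time-limit truncation
    Trajectory(rewards=[0.0, 1.0], terminated=True, truncated=False),  # normal short episode
]
kept = drop_short_truncated(trajs, max_len=10)
```

Trajectories truncated at the max length are kept on purpose, since hitting the time limit is normal environment behaviour; only the ones cut short by the end of training are dropped.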