chainerrl
Observations passed to an agent must not be overwritten
Some agents, e.g. chainerrl.agents.DQN, store the state argument of act_and_train without copying it. This implementation doesn't always work well, e.g. if env.step always returns the same array, such as env._state, that is backed by a fixed buffer and updated in place.
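The problem described above can be reproduced with plain NumPy. The following is only a minimal sketch: FixedBufferEnv and StoringAgent are hypothetical stand-ins written for this illustration, not ChainerRL classes, and the act_and_train signature is simplified. It assumes the agent keeps the previous observation in order to build a (state, next_state) transition on the next call.

```python
import numpy as np


class FixedBufferEnv:
    """Hypothetical env whose reset/step always return the same array object."""

    def __init__(self):
        self._state = np.zeros(2, dtype=np.float32)

    def reset(self):
        self._state[:] = 0
        return self._state

    def step(self, action):
        self._state += 1  # in-place update of the fixed buffer
        return self._state, 0.0, False, {}


class StoringAgent:
    """Hypothetical agent that remembers the last observation it was given."""

    def __init__(self, copy_obs):
        self.copy_obs = copy_obs
        self.last_state = None
        self.transitions = []

    def act_and_train(self, state, reward):
        if self.last_state is not None:
            # Snapshot both arrays so the recorded pair can be inspected later.
            self.transitions.append((self.last_state.copy(), state.copy()))
        # Without a copy, last_state aliases the env's buffer.
        self.last_state = state.copy() if self.copy_obs else state
        return 0  # dummy action


def run(copy_obs, steps=3):
    env = FixedBufferEnv()
    agent = StoringAgent(copy_obs)
    obs, reward = env.reset(), 0.0
    for _ in range(steps):
        action = agent.act_and_train(obs, reward)
        obs, reward, done, info = env.step(action)
    return agent.transitions


aliased = run(copy_obs=False)
copied = run(copy_obs=True)
print("without copy:", [(p.tolist(), n.tolist()) for p, n in aliased])
print("with copy:   ", [(p.tolist(), n.tolist()) for p, n in copied])
```

Without the copy, every recorded transition has identical state and next_state, because the stored reference was overwritten by env.step. Copying the observation, either in the agent or in the environment before returning it, avoids this.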