Feature: Observations stacking

Open ulricharmel opened this issue 2 years ago • 1 comments

What?

  • Add an environment wrapper that stacks each new observation onto past observations for training.

Why?

  • This is one of the suggestions in Yu et al. (2021) for Atari environments. Remembering past observations can help the agent choose a better policy.

How?

  • Added a new wrapper class that uses a deque to store past observations.
  • During reset, we repeat the first observation for the required number of frames.
  • During a step, we delete the oldest obs (at the front of the queue) and add the most recent one to the end of the queue.
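
The three steps above can be sketched roughly as follows. This is a minimal illustration and not the wrapper added in this PR: the class name `ObservationStacker`, the `num_frames` parameter, and the `reset`/`step` signatures are all hypothetical, and observations are kept as a plain list per agent rather than concatenated into arrays.

```python
from collections import deque
from typing import Any, Deque, List


class ObservationStacker:
    """Hypothetical helper: keeps the last `num_frames` observations of one agent."""

    def __init__(self, num_frames: int) -> None:
        if num_frames < 1:
            raise ValueError("num_frames must be at least 1")
        self._num_frames = num_frames
        # With maxlen set, append() discards the oldest entry once the deque is full.
        self._frames: Deque[Any] = deque(maxlen=num_frames)

    def reset(self, first_obs: Any) -> List[Any]:
        """Fill the whole stack with the first observation of the episode."""
        self._frames.clear()
        for _ in range(self._num_frames):
            self._frames.append(first_obs)
        return list(self._frames)

    def step(self, new_obs: Any) -> List[Any]:
        """Drop the oldest observation (front) and add the newest one (end)."""
        self._frames.append(new_obs)
        return list(self._frames)
```

With `num_frames=3`, resetting on `"o0"` gives a stack of three copies of `"o0"`, and each later step shifts the window by one observation. In a multi-agent setting, one such stack per agent would presumably be needed.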

Extra

  • Closes #788

ulricharmel avatar Oct 11 '22 16:10 ulricharmel

Codecov Report

Merging #793 (6f92bc7) into develop (999d31e) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop     #793   +/-   ##
========================================
  Coverage    93.12%   93.12%           
========================================
  Files          167      167           
  Lines         9253     9253           
========================================
  Hits          8617     8617           
  Misses         636      636           

codecov[bot] avatar Oct 11 '22 16:10 codecov[bot]