Mava

Feature: Observations stacking

Open ulricharmel opened this issue 2 years ago • 1 comments

What?

Add an environment wrapper that stacks each new observation onto past observations for training.

Why?

This is one of the suggestions in Yu et al. (2021) for Atari environments. Access to past observations gives the agent short-term temporal context, which can help it learn a better policy.

How?

Added a new wrapper class that uses a deque to store past observations.
During reset, we repeat the initial observation for the required number of frames.
During a step, we drop the oldest observation (at the front of the queue) and append the most recent one to the end of the queue.
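
The reset and step behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the wrapper added in this PR: `CounterEnv`, `StackObservations`, `num_frames`, and the single-agent `reset()`/`step()` signatures are all hypothetical names chosen for the example and do not reflect Mava's actual environment API.

```python
from collections import deque


class CounterEnv:
    """Toy environment (hypothetical): the observation is a step counter."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [self.t]

    def step(self, action):
        self.t += 1
        # Returns (observation, reward, done).
        return [self.t], 0.0, False


class StackObservations:
    """Wraps an environment and returns the last `num_frames` observations."""

    def __init__(self, env, num_frames=4):
        self.env = env
        self.num_frames = num_frames
        # With maxlen set, appending to a full deque drops the oldest entry.
        self.frames = deque(maxlen=num_frames)

    def _stacked(self):
        # Oldest frame first, newest frame last.
        return list(self.frames)

    def reset(self):
        obs = self.env.reset()
        self.frames.clear()
        # No history exists yet, so repeat the first observation.
        for _ in range(self.num_frames):
            self.frames.append(obs)
        return self._stacked()

    def step(self, action):
        obs, reward, done = self.env.step(action)
        # The oldest frame falls off the front; the new one goes on the end.
        self.frames.append(obs)
        return self._stacked(), reward, done


env = StackObservations(CounterEnv(), num_frames=3)
first = env.reset()
second, _, _ = env.step(action=0)
third, _, _ = env.step(action=0)
```

In a multi-agent setting the same idea would presumably need one deque per agent, and array observations would typically be concatenated or stacked along a new axis rather than returned as a list.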

Extra

Closes #788

Oct 11 '22 16:10 ulricharmel

Codecov Report

Merging #793 (6f92bc7) into develop (999d31e) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop     #793   +/-   ##
========================================
  Coverage    93.12%   93.12%           
========================================
  Files          167      167           
  Lines         9253     9253           
========================================
  Hits          8617     8617           
  Misses         636      636

Oct 11 '22 16:10 codecov[bot]