[WIP] Added BCQ

Open sampreet-arthi opened this issue 5 years ago • 6 comments

Stuff implemented:

  • Added BCQ under genrl/agents/offline
  • BCQ inherits from OffPolicyAgentAC. The architecture is very similar to TD3. The major differences are that the actor takes both state and action as input, and that BCQ adds a VAE.
  • OfflineTrainer inherits from OffPolicyTrainer. The only difference is that it loads the buffer.
  • Refactored buffers and rollouts to inherit from BaseBuffer, removed redundant functions, and converted all code to torch. None of the buffer files use numpy now.
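
To illustrate the "loads the buffer" point, here is a minimal sketch of offline training data flow. It is not the genrl implementation: `ToyReplayBuffer` and `offline_batches` are hypothetical names, and the sketch uses only the standard library (plain tuples and `pickle`) instead of torch to stay self-contained. The idea it shows is that the trainer restores saved transitions and samples from them without ever stepping an environment.

```python
import pickle
import random
from collections import deque


class ToyReplayBuffer:
    """Hypothetical buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=1000):
        self.memory = deque(maxlen=capacity)

    def push(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement, transposed into per-field tuples.
        batch = random.sample(list(self.memory), batch_size)
        return tuple(zip(*batch))

    def save(self, path):
        with open(path, "wb") as f:
            pickle.dump(list(self.memory), f)

    @classmethod
    def load(cls, path, capacity=1000):
        # Rebuild a buffer from transitions written by save().
        buffer = cls(capacity)
        with open(path, "rb") as f:
            buffer.memory.extend(pickle.load(f))
        return buffer

    def __len__(self):
        return len(self.memory)


def offline_batches(path, batch_size, n_updates):
    """Yield training batches from a saved buffer; no environment is stepped."""
    buffer = ToyReplayBuffer.load(path)
    for _ in range(n_updates):
        yield buffer.sample(batch_size)
```

An agent's update step would consume each yielded batch in place of freshly collected experience.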

Stuff to do:

  • [x] Not yet properly tested. So far I have only created a toy replay buffer from DDPG on Pendulum-v0 with just 100 experiences.
  • [ ] Find a simple way to make the actor take in both state and action.
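
One simple option for the second item is to concatenate state and action along the feature dimension before the first layer. The sketch below assumes PyTorch and follows the perturbation-style actor described in the BCQ paper; `PerturbationActor`, `phi`, and `hidden` are illustrative names and defaults, not existing genrl code.

```python
import torch
import torch.nn as nn


class PerturbationActor(nn.Module):
    """Hypothetical actor that nudges a candidate action, given the state."""

    def __init__(self, state_dim, action_dim, max_action=1.0, phi=0.05, hidden=64):
        super().__init__()
        # The first layer sees state and action side by side.
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, action_dim),
            nn.Tanh(),
        )
        self.max_action = max_action
        self.phi = phi  # assumed bound on the perturbation, as a fraction of max_action

    def forward(self, state, action):
        x = torch.cat([state, action], dim=-1)
        perturbation = self.phi * self.max_action * self.net(x)
        # Keep the perturbed action inside the valid action range.
        return (action + perturbation).clamp(-self.max_action, self.max_action)
```

In BCQ the `action` argument would come from the VAE's decoder, so the actor only adjusts actions that resemble those in the dataset.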

sampreet-arthi avatar Oct 09 '20 18:10 sampreet-arthi

The buffers were tested, but not since BCQ was added, so tests are currently failing.

sampreet-arthi avatar Oct 09 '20 19:10 sampreet-arthi

This pull request introduces 3 alerts when merging 6c271efd645828be529cbc43d2ddad248b199d86 into b8a45ab7fd058d120acd058ddead1db77c9bb616 - view on LGTM.com

new alerts:

  • 1 for Unused local variable
  • 1 for Unused import
  • 1 for Wrong number of arguments in a call

lgtm-com[bot] avatar Oct 09 '20 19:10 lgtm-com[bot]

Codecov Report

Merging #378 into master will decrease coverage by 2.76%. The diff coverage is 58.76%.

@@            Coverage Diff             @@
##           master     #378      +/-   ##
==========================================
- Coverage   91.28%   88.51%   -2.77%     
==========================================
  Files          90       93       +3     
  Lines        3809     3944     +135     
==========================================
+ Hits         3477     3491      +14     
- Misses        332      453     +121     

Impacted Files                        Coverage Δ
genrl/agents/deep/base/base.py        93.75% <ø> (ø)
genrl/agents/deep/base/onpolicy.py    96.15% <ø> (ø)
genrl/trainers/onpolicy.py            92.00% <ø> (ø)
genrl/agents/offline/bcq/bcq.py       23.86% <23.86%> (ø)
genrl/trainers/offline.py             27.77% <27.77%> (ø)
genrl/core/models.py                  33.33% <33.33%> (ø)
genrl/trainers/base.py                81.30% <47.05%> (-6.87%) :arrow_down:
genrl/core/buffers.py                 92.94% <91.80%> (-2.30%) :arrow_down:
genrl/core/rollouts.py                96.77% <96.77%> (ø)
genrl/agents/__init__.py              100.00% <100.00%> (ø)
... and 13 more

codecov[bot] avatar Oct 10 '20 18:10 codecov[bot]

This pull request introduces 4 alerts when merging b28c1e650ec4bc19daf7b7c04691a9d4e4a5563c into a2c8c7e137167219ea262db5b56c3197a86e05b0 - view on LGTM.com

new alerts:

  • 3 for Unused import
  • 1 for Signature mismatch in overriding method

lgtm-com[bot] avatar Oct 10 '20 18:10 lgtm-com[bot]

This pull request introduces 4 alerts when merging 3db4733211352506fa5c339a7b40f738a994aa44 into 25eb018f18a9a1d0865c16e5233a2a7ccddbfd78 - view on LGTM.com

new alerts:

  • 3 for Unused import
  • 1 for Signature mismatch in overriding method

lgtm-com[bot] avatar Oct 16 '20 19:10 lgtm-com[bot]

This pull request introduces 4 alerts when merging 43a483ee54b3bd86b7bfd1249115eb76cde9b942 into 25eb018f18a9a1d0865c16e5233a2a7ccddbfd78 - view on LGTM.com

new alerts:

  • 3 for Unused import
  • 1 for Signature mismatch in overriding method

lgtm-com[bot] avatar Oct 16 '20 20:10 lgtm-com[bot]