Add rnd_ppo.py documentation and refactor

Open yooceii opened this issue 2 years ago • 4 comments

Description

Closes #127

Types of changes

  • [ ] Bug fix
  • [ ] New feature
  • [ ] New algorithm
  • [x] Documentation

Checklist:

  • [x] I've read the CONTRIBUTION guide (required).
  • [x] I have ensured pre-commit run --all-files passes (required).
  • [x] I have updated the documentation and previewed the changes via mkdocs serve.
  • [ ] I have updated the tests accordingly (if applicable).

If you are adding new algorithms or your change could result in a performance difference, you may need to (re-)run tracked experiments. See https://github.com/vwxyzjn/cleanrl/pull/137 as an example PR.

  • [x] I have contacted @vwxyzjn to obtain access to the openrlbenchmark W&B team (required).
  • [ ] I have tracked applicable experiments in openrlbenchmark/cleanrl with --capture-video flag toggled on (required).
  • [ ] I have added additional documentation and previewed the changes via mkdocs serve.
    • [x] I have explained note-worthy implementation details.
    • [ ] I have explained the logged metrics.
    • [x] I have added links to the original paper and related papers (if applicable).
    • [ ] I have added links to the PR related to the algorithm.
    • [ ] I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
    • [ ] I have added the learning curves (in PNG format with width=500 and height=300).
    • [ ] I have added links to the tracked experiments.
  • [ ] I have updated the tests accordingly (if applicable).

yooceii avatar Apr 03 '22 01:04 yooceii

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/vwxyzjn/cleanrl/FVfp6xKi7pTtnaXPFTa7dhPRKnqL
✅ Preview: https://cleanrl-git-rnd-doc-vwxyzjn.vercel.app

vercel[bot] avatar Apr 03 '22 01:04 vercel[bot]

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
cleanrl ✅ Ready (Inspect) Visit Preview Aug 25, 2022 at 3:56AM (UTC)

vercel[bot] avatar Jun 10 '22 05:06 vercel[bot]

Hey @yooceii, would you mind reverting the formatting changes? They make it harder to review and to identify the code that specifically relates to RND.

image

Formatting changes should be done together in a larger PR in #167...

vwxyzjn avatar Jun 26 '22 23:06 vwxyzjn

Refactor Check (compatible with the performance in the tracked experiment)

I compared the SPS (steps per second) of the latest refactor against the old script used in the tracked experiment. I named the old script old_ppo_rnd_envpool.py and ran both it and the latest script without initializing the observation normalization parameters.
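For reference, a comparison of this kind can be sketched as follows. This is only a minimal illustration, not the code behind the screenshot: it assumes SPS is computed as total environment steps divided by elapsed wall-clock time, and the function names and sample numbers are hypothetical.

```python
import time


def steps_per_second(global_step: int, start_time: float, now: float) -> int:
    """Average environment steps per second between start_time and now.

    Assumption: SPS = total steps / elapsed wall-clock seconds, truncated to int.
    """
    elapsed = now - start_time
    if elapsed <= 0:
        return 0  # avoid dividing by zero right after start
    return int(global_step / elapsed)


def relative_speedup(new_sps: float, old_sps: float) -> float:
    """Fractional SPS change of the new script relative to the old one."""
    return (new_sps - old_sps) / old_sps


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only (not measured results).
    start = time.time()
    old_sps = steps_per_second(global_step=100_000, start_time=start, now=start + 100.0)
    new_sps = steps_per_second(global_step=100_000, start_time=start, now=start + 80.0)
    print(f"old: {old_sps} SPS, new: {new_sps} SPS, "
          f"change: {relative_speedup(new_sps, old_sps):+.0%}")
```

SPS only measures throughput. Sample efficiency has to be checked separately, by comparing the episodic return curves of the two scripts at the same number of environment steps.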

The following screenshot confirms that the refactor does not change learning performance, and that we did the refactor correctly: it is faster without impacting sample efficiency.

image

vwxyzjn avatar Aug 24 '22 15:08 vwxyzjn