mbrl-lib icon indicating copy to clipboard operation
mbrl-lib copied to clipboard

Dreamer

Open Rohan138 opened this issue 2 years ago • 19 comments

Types of changes

Still a WIP, but I've managed to add most of Dreamer. The main thing left is computing the loss in planning/dreamer_agent/DreamerAgent::train().

  • [X] Docs change / refactoring / dependency upgrade
  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [X] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Motivation and Context / Related issue

How Has This Been Tested (if it applies)

Checklist

  • [ ] The documentation is up-to-date with the changes I made.
  • [x] I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
  • [ ] All tests passed, and additional code has been covered with new tests.

Rohan138 avatar Jun 05 '22 06:06 Rohan138

Hi @Rohan138, took a quick high-level glance and so far it looks good. I will start looking at the code in algorithms/dreamer and the DreamerAgent more closely now.

I noticed you added some changes to other files like the pre-commit config, the requirements, pyproject.toml, etc. Were you running into some errors? If that's the case, would you mind opening a separate PR for these?

luisenp avatar Jun 10 '22 14:06 luisenp

@Rohan138 trying to pull from your fork as shown below, but I run into access permission errors. Could you see if I can get read access?

git checkout -b Rohan138-dreamer main
git pull [email protected]:Rohan138/mbrl-lib.git dreamer

luisenp avatar Jun 10 '22 15:06 luisenp

I can add you to my fork, but do you want to add my repo as a remote and try git checkout -t Rohan138/dreamer?

Rohan138 avatar Jun 10 '22 15:06 Rohan138

git checkout -t Rohan138/dreamer

I tried a couple of things and I got errors:

image

image

luisenp avatar Jun 10 '22 18:06 luisenp

Just sent you a contributor invite; maybe https://github.com/Rohan138/mbrl-lib.git would've worked better than git@github?

Rohan138 avatar Jun 10 '22 18:06 Rohan138

HTTP worked for me before accepting the invitation, thanks!

luisenp avatar Jun 10 '22 19:06 luisenp

On the non-Dreamer fixes:

  • I renamed pyproyect.toml to pyproject.toml
  • I removed the lines pinning python==3.7 on all commits
  • There's no black version 21.4b2 on pypi, so I upgraded to the latest stable version instead

I can definitely move these to a different PR, though.

Rohan138 avatar Jun 11 '22 19:06 Rohan138

On the non-Dreamer fixes:

  • I renamed pyproyect.toml to pyproject.toml
  • I removed the lines pinning python==3.7 on all commits
  • There's no black version 21.4b2 on pypi, so I upgraded to the latest stable version instead

I can definitely move these to a different PR, though.

Another PR for these would be great, so that we can merge them w/o waiting for this more involved PR to be ready. Thanks!

luisenp avatar Jun 13 '22 20:06 luisenp

@Rohan138 planning to spend most of today and then Friday playing around with your code. Is there anything in particular you'd like for me to focus on or help with? It seems I'm able to run Dreamer, but I haven't checked if it learns correctly yet. What's the current status?

luisenp avatar Jun 15 '22 15:06 luisenp

Hi! So I tried running it, but despite the losses dropping, it doesn't seem to learn right now. Here are the results; I'm planning to look through the agent.train() implementation over the weekend to figure out what's going on.

Rohan138 avatar Jun 15 '22 18:06 Rohan138

I'm thinking about starting a project that builds off the Dreamer style of dynamics model. If I spend a couple hours here and there on this PR, what would be most useful?

natolambert avatar Aug 02 '22 21:08 natolambert

Sorry for the delay-I'll try to address all of the comments across the next day or so. I moved the non-Dreamer fixes to #161.

The compute_return is taken from this PyTorch implementation, which takes it from the original TF1 code. A unit test for it is probably a good idea, though.

@natolambert-The core dreamer implementation is in the DreamerAgent.train(...) function here. If you'd like, could you look through it?

Rohan138 avatar Aug 03 '22 06:08 Rohan138

@luisenp @Rohan138 -- is there anything I or @RajGhugare19 can do to get this moving again?

natolambert avatar Nov 29 '22 22:11 natolambert

@luisenp @Rohan138 -- is there anything I or @RajGhugare19 can do to get this moving again?

Hi @natolambert. Unfortunately, it's almost impossible for me at this point to take the lead in development, due to other more pressing commitments. But I'm happy to support with reviews, general advice, and some amount of coding, if someone else is willing to drive this feature to completion.

luisenp avatar Nov 30 '22 13:11 luisenp

Gotcha, so I'm guessing it's at the point where there are small issues and need to verify performance? @luisenp I just wanted to make sure there wasn't any more issues / blocking problems I missed skimming it.

natolambert avatar Nov 30 '22 17:11 natolambert

Gotcha, so I'm guessing it's at the point where there are small issues and need to verify performance? @luisenp I just wanted to make sure there wasn't any more issues / blocking problems I missed skimming it.

There were some comments I left early that I'm not sure were addressed (mostly high level stuff). But leaving that aside I don't think the implementation was fully working yet, @Rohan138 would have more details though.

luisenp avatar Nov 30 '22 18:11 luisenp

Great. I want to take a look, and I have chatted with @danijar who didn't know it was being worked on. Let's see if I can un-stick it and if needed talk to Danijar.

natolambert avatar Nov 30 '22 18:11 natolambert

Hello @natolambert -- I can take a lead developing this. You can review and sanity check the code afterwards. I will take a deeper look at the code and what changes are still required today.

RajGhugare19 avatar Dec 01 '22 05:12 RajGhugare19

Pitching in-I can help answer questions and debug the implementation over the weekend. The main function I'm unsure about is the DreamerAgent.train(...) function here.

There's also some minor conflicts due to gym versions and gym's type checking that seem to be breaking CI; there's an open PR #161 here.

Rohan138 avatar Dec 01 '22 14:12 Rohan138