Dominik Jain comments

Results: 111 comments by Dominik Jain

Create high level interfaces for config and experiments

I propose that we make the abstraction layer that adds support for other frameworks a separate issue. It can be developed separately from the high-level API as such.

Doesn't Work: Blank Screen

Same issue; I switched to using [slickgpt](https://github.com/ShipBit/slickgpt)

Poetry update the torch versioned from cuda (2.0.1+cu118) to cpu (2.1.1) defaultly on Windows

This is a well-known Poetry limitation. By default, installing torch via Poetry will use a torch build that was built against a default version of CUDA (it is _not_ a...

Poetry update the torch versioned from cuda (2.0.1+cu118) to cpu (2.1.1) defaultly on Windows

> > This is a well-known Poetry limitation. By default, installing torch via Poetry will use a torch build that was built against a default version of CUDA (it is...

Poetry update the torch versioned from cuda (2.0.1+cu118) to cpu (2.1.1) defaultly on Windows

@coolermzb3 I had literally suggested the same solution in this link:

> * a cleaner way to do the latter is to [configure a "source" in Poetry and then specify...

Unable to replicate original PPO performance

NOTE: I will update this post as I get further results.

# Breakout Experiments

Breakout is [reportedly](https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/) a good indicator: "Rule of thumb: 400 episodic return in breakout: Check if...

Unable to replicate original PPO performance

The initial Breakout results are in (see comment above). I am now running an experiment with Alien and Asteroids using the latest version, i.e. I am simply running `python atari_ppo.py --task...

Unable to replicate original PPO performance

@MischaPanch I suspect `repeat` maps to the trainer parameter `repeat_per_collect`? If so, it was set to 4, not 1 (both in our example and in @rajfly's code), so it could have...
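To make the difference between 4 and 1 concrete, here is a minimal sketch. It assumes that `repeat_per_collect` is the number of full passes made over each collected batch, and that every pass splits the batch into equally sized minibatches. The function name and the numbers are hypothetical and chosen only for illustration; they are not taken from either code base.

```python
def gradient_steps_per_collect(step_per_collect: int, batch_size: int,
                               repeat_per_collect: int) -> int:
    """Count gradient steps performed after one collect (illustrative only).

    Assumes step_per_collect is an exact multiple of batch_size, so the
    handling of a final partial minibatch does not matter here.
    """
    if step_per_collect % batch_size != 0:
        raise ValueError("sketch assumes step_per_collect is a multiple of batch_size")
    minibatches_per_pass = step_per_collect // batch_size
    return repeat_per_collect * minibatches_per_pass


# Hypothetical values: 1024 collected transitions, minibatches of 256.
steps_with_repeat_4 = gradient_steps_per_collect(1024, 256, repeat_per_collect=4)
steps_with_repeat_1 = gradient_steps_per_collect(1024, 256, repeat_per_collect=1)
print(steps_with_repeat_4, steps_with_repeat_1)
```

Under these assumptions, a setting of 4 performs four times as many gradient steps on the same data as a setting of 1.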

Unable to replicate original PPO performance

@rajfly I have experimented with your implementation a bit, adding a [repository](https://github.com/opcode81/ppo-rajfly) for it. I noticed that you did not have a proper testing/evaluation configuration that would allow you to...

Unable to replicate original PPO performance

@rajfly I've now found the issue. After [changing the learning rate scheduler](https://github.com/opcode81/ppo-rajfly/commit/ec2495b4fdb0c81e9fba1c0a8565afa60f1ad943) to the one we use in Tianshou, the problem disappeared. I can now train Alien flawlessly with your...
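The comment does not say which scheduler is meant. As a minimal sketch, assuming it is a linear decay of the learning rate to zero over the total number of policy updates, the schedule can be written in plain Python. The function names, the base learning rate, and the update count below are hypothetical.

```python
def linear_decay_factor(update_idx: int, max_update_num: int) -> float:
    """Multiplicative learning-rate factor after `update_idx` updates.

    Assumption: the factor falls linearly from 1.0 at the first update
    to 0.0 once `max_update_num` updates have been performed.
    """
    if max_update_num <= 0:
        raise ValueError("max_update_num must be positive")
    return 1.0 - update_idx / max_update_num


def scheduled_lr(base_lr: float, update_idx: int, max_update_num: int) -> float:
    """Learning rate in effect after `update_idx` updates."""
    return base_lr * linear_decay_factor(update_idx, max_update_num)


# Hypothetical run: base learning rate 2.5e-4, 100 updates in total.
for idx in (0, 50, 100):
    print(idx, scheduled_lr(2.5e-4, idx, 100))
```

With PyTorch, the same factor function could be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, which multiplies the optimizer's initial learning rate by the returned value each time the scheduler is stepped.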