Pranjal Tandon comments

Results: 6 comments by Pranjal Tandon

Unable to reproduce results on Humanoid-v2 in new SAC

Hmm, I don't know why this would happen, although I have never tested on Humanoid for 10 million steps. The result of 8000 on Humanoid is for learned temperature (alpha). For fixed...

Unable to reproduce results on Humanoid-v2 in new SAC

I just ran Humanoid for 10 million steps, and unfortunately cannot reproduce the problems you're observing. Here are the results I see across 2 seeds: ![Screenshot from 2019-09-22 12-07-24](https://user-images.githubusercontent.com/18737539/65383351-fd0af300-dd31-11e9-8824-b0c6e36073a8.png) Maybe...

Unable to reproduce results on Humanoid-v2 in new SAC

> For `--automatic_entropy_tuning = False` 6000 is the expected result. For fixed temperature, the results should be around 6000. You can also check this in the [paper](https://arxiv.org/pdf/1812.05905.pdf)
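For reference, the difference between learned and fixed temperature can be written down using the temperature objective from the linked paper. This is a sketch in that paper's notation, where $\bar{\mathcal{H}}$ denotes the target entropy; how this repository implements it is not shown here. With `--automatic_entropy_tuning` disabled, $\alpha$ is presumably held constant rather than optimized against this objective:

$$
J(\alpha) = \mathbb{E}_{a_t \sim \pi_t}\left[ -\alpha \log \pi_t(a_t \mid s_t) - \alpha \bar{\mathcal{H}} \right]
$$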

Resume training

That shouldn't happen. I will look into it. I might require more detail on how you resume training. (Sorry for the late reply.)

Running SAC: Operation failed to compute its gradient

It is working. Git clone again and try.

Can I use this in custom gym env?

Sure, it should work with custom gym envs. It won't work if the env has discrete actions.
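
A quick way to check a custom env before training is to inspect its action space. The snippet below is only a minimal sketch: `has_continuous_actions` is a hypothetical helper (not part of this repository), and it assumes the usual gym convention that continuous `Box` spaces expose `low` and `high` bounds while `Discrete` spaces expose `n` instead. Stand-in objects are used so the snippet runs without gym installed; with a real env you would pass `env.action_space`.

```python
from types import SimpleNamespace


def has_continuous_actions(action_space):
    """Heuristic check (hypothetical helper, not from the repo).

    Box-like spaces carry `low`/`high` bounds; Discrete-like spaces do not.
    """
    return hasattr(action_space, "low") and hasattr(action_space, "high")


# Stand-ins that mimic the attributes of gym.spaces.Box and gym.spaces.Discrete.
box_like = SimpleNamespace(low=[-1.0, -1.0], high=[1.0, 1.0], shape=(2,))
discrete_like = SimpleNamespace(n=4, shape=())

for name, space in [("box_like", box_like), ("discrete_like", discrete_like)]:
    if has_continuous_actions(space):
        print(f"{name}: continuous actions, should be usable")
    else:
        print(f"{name}: discrete actions, not supported")
```

If the check fails for your env, the action space would need to be continuous before this SAC implementation can be used with it.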