DRL-code-pytorch
Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.
Nice experiments on PPO tricks. I've been trying to use PPO on PyBullet envs, but I find that many of the tricks used in this repo are actually detrimental. (I have created my...
As the picture shows, the curve fluctuates around -120. I did not change anything, so I am confused by this result.
PPO training issue
Hello, and thanks for providing this code; it has been a great help to me. However, I ran into a problem when using PPO. I am a beginner, and during training I found that after connecting the continuous PPO algorithm to my custom environment, the reward of every episode is exactly the same. The actions output by the network are different, but only by a very small amount, and I don't know what went wrong.
How should the PPO code be used with a hybrid discrete-continuous action space (one discrete action variable and one continuous one)?
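A common approach (not implemented in this repo; the class and parameter names below are illustrative assumptions) is to give the actor two output heads sharing one trunk: a Categorical head for the discrete action and a Gaussian head for the continuous one, and to sum their log-probabilities when forming the PPO ratio, which assumes the two actions are conditionally independent given the observation:

```python
# Minimal sketch of a hybrid-action PPO actor: one Categorical head
# (discrete action) plus one Gaussian head (continuous action).
# HybridActor and its dimensions are illustrative, not the repo's code.
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

class HybridActor(nn.Module):
    def __init__(self, obs_dim, n_discrete, cont_dim):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh())
        self.logits = nn.Linear(64, n_discrete)       # discrete head
        self.mean = nn.Linear(64, cont_dim)           # continuous head (mean)
        self.log_std = nn.Parameter(torch.zeros(cont_dim))  # state-independent std

    def forward(self, obs):
        h = self.trunk(obs)
        return Categorical(logits=self.logits(h)), Normal(self.mean(h), self.log_std.exp())

actor = HybridActor(obs_dim=4, n_discrete=3, cont_dim=1)
cat_dist, gauss_dist = actor(torch.randn(1, 4))
a_d = cat_dist.sample()                               # discrete action, shape (1,)
a_c = gauss_dist.sample()                             # continuous action, shape (1, 1)
# Joint log-prob under the independence assumption: sum the two heads.
logp = cat_dist.log_prob(a_d) + gauss_dist.log_prob(a_c).sum(-1)
```

The rest of the PPO update (clipped surrogate, GAE, value loss) stays unchanged; only the joint `logp` replaces the single-distribution log-probability in the ratio.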
In the ppo-discrete-RNN code, shouldn't the RNN hidden state be stored in the buffer and then retrieved during the update to restore the RNN's state? I see that the code resets the hidden state once per mini-batch; is that correct?
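For context, resetting the hidden state per mini-batch is only consistent if each mini-batch row is a whole sequence starting from its first timestep, so the fresh zero state matches what the policy saw at collection time; the alternative is to store per-step hidden states in the buffer, as the question suggests. A minimal sketch of the whole-sequence replay pattern (illustrative, assuming sequences are stored episode-by-episode; not the repo's exact code):

```python
# Replaying an RNN policy over stored full sequences during the PPO update:
# the hidden state is zeroed once at the start of each sequence and then
# carried forward through the timesteps by the GRU itself.
import torch
import torch.nn as nn

rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

def evaluate_sequences(obs_seq):
    # obs_seq: (batch, seq_len, obs_dim) -- one full episode per row.
    h0 = torch.zeros(1, obs_seq.size(0), 16)  # reset only at sequence start
    out, _ = rnn(obs_seq, h0)                 # state carried across all steps
    return out                                # (batch, seq_len, hidden)

feats = evaluate_sequences(torch.randn(2, 5, 8))
```

If transitions were instead shuffled into mini-batches of arbitrary mid-episode steps, a zeroed `h0` would no longer match the collection-time state, and the stored per-step hidden states would need to be restored instead.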