Zihan Ding

15 comments by Zihan Ding

I didn't see the website after installation either.

Why would a negative value cause a failure in the actor loss? You can also refer to OpenAI Baselines [here](https://github.com/openai/baselines/tree/master/baselines/ppo1), which follows a similar process to our repo.
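For context, here is a minimal sketch of the PPO-Clip actor loss, assuming TensorFlow 2; the toy `ratio`/`adv` values and `eps` are illustrative and this is not RLzoo's actual code. The point is that `tf.minimum` keeps the pessimistic surrogate, so negative advantages are expected and handled correctly:

```python
import tensorflow as tf

# Toy PPO-Clip surrogate (illustrative values, not RLzoo's code).
ratio = tf.constant([1.2, 0.8, 1.5])   # pi_new(a|s) / pi_old(a|s)
adv = tf.constant([1.0, -0.5, -2.0])   # advantages may be negative
eps = 0.2
surr1 = ratio * adv
surr2 = tf.clip_by_value(ratio, 1.0 - eps, 1.0 + eps) * adv
# tf.minimum keeps the smaller (pessimistic) surrogate per sample, which
# is well defined for negative advantages; the actor minimizes its
# negative mean.
actor_loss = -tf.reduce_mean(tf.minimum(surr1, surr2))
print(actor_loss.numpy())
```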

Sorry for the late reply. What you mentioned might be caused by a numerical issue in `tf.minimum`, if I understood correctly. Could you please print out an example case and...

Hi, I've cleaned the code and updated the log. Thanks!

Hi, I would expect end-to-end training with RLzoo algorithms on RLBench to be hard in practice. As you said, it seems RLBench provides the reward value of either...

Hi guys, I tried to replicate the problem you met, but it doesn't happen on my side. I used the PPO-Clip algorithm on the *ReachTarget* environment in RLBench, and the robot is...

Hi, did you use the default hyper-parameters provided in RLzoo? If so, we will look into this problem.

Hi, it supports dict states, but you need a wrapper for your env. Please take a look at the `FlattenDictWrapper` (./common/env_wrappers.py) used for the robotics envs.
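To illustrate the idea, here is a minimal sketch of such a wrapper, assuming a Gym-style env API; this is not the actual implementation in ./common/env_wrappers.py. It concatenates the chosen dict fields into one flat vector so algorithms expecting Box observations can run:

```python
import numpy as np

# Illustrative sketch only -- not the repo's FlattenDictWrapper.
class FlattenDictWrapper:
    def __init__(self, env, dict_keys):
        self.env = env
        self.dict_keys = dict_keys  # e.g. ['observation', 'desired_goal']

    def _flatten(self, obs_dict):
        # Concatenate the selected dict fields into one flat vector.
        return np.concatenate([np.ravel(obs_dict[k]) for k in self.dict_keys])

    def reset(self):
        return self._flatten(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._flatten(obs), reward, done, info
```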

This problem can be fixed on the ElegantRL side by changing [this line](https://github.com/AI4Finance-Foundation/ElegantRL/blob/4ae8351f88965cc64ee5ac56d80c847e45e8215d/elegantrl/agents/AgentBase.py#L282) to:

```python
traj_list1 = list(map(list, zip(*traj_list)))  # state, reward, done, action, noise
```

However, there are...
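As a toy illustration of the transpose idiom above (hypothetical per-step tuples, not ElegantRL's real buffer contents), `zip(*traj_list)` regroups step-wise tuples into one list per field:

```python
# Each tuple is one step: (state, reward, done, action, noise).
traj_list = [
    ('s0', 1.0, False, 'a0', 0.1),
    ('s1', 0.5, False, 'a1', 0.2),
    ('s2', 0.0, True,  'a2', 0.3),
]
# Transpose: list of steps -> list per field.
traj_list1 = list(map(list, zip(*traj_list)))
states, rewards, dones, actions, noises = traj_list1
print(states)   # ['s0', 's1', 's2']
print(rewards)  # [1.0, 0.5, 0.0]
```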

Just replace `states, rewards, masks, actions = [torch.cat(item, dim=0) for item in traj_items]` with `[states, rewards, masks, actions] = traj_list`, since the samples are already shaped.
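A minimal sketch of why the direct unpacking works (the tensor shapes here are made up for illustration): after the transpose fix, each entry of `traj_list` is already one stacked tensor per field, so no further `torch.cat` is needed:

```python
import torch

# Hypothetical shapes: one stacked tensor per field after the fix.
traj_list = [
    torch.zeros(128, 4),  # states
    torch.zeros(128, 1),  # rewards
    torch.zeros(128, 1),  # masks
    torch.zeros(128, 2),  # actions
]
# Direct unpacking replaces the torch.cat call.
[states, rewards, masks, actions] = traj_list
print(states.shape)  # torch.Size([128, 4])
```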