Adam Yanxiao Zhao comments

Results: 7 comments of Adam Yanxiao Zhao

Poor Evaluation Performance in PPO

Thank you so much for your response, this indeed is an intriguing issue. I have gone through #310 and it makes sense that we need to save `obs_rms` and `return_rms`....
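As a rough illustration of what saving `obs_rms` and `return_rms` could involve, here is a minimal sketch. It assumes scalar running statistics and uses a hypothetical `RunningMeanStd` class plus hypothetical `save_stats` / `load_stats` helpers. Real normalization wrappers track array-valued statistics, and this is not the code from #310.

```python
import json


class RunningMeanStd:
    """Hypothetical scalar running mean/variance tracker (illustration only)."""

    def __init__(self, mean=0.0, var=1.0, count=1e-4):
        self.mean, self.var, self.count = mean, var, count

    def update(self, batch):
        # Merge the batch statistics into the running statistics.
        n = len(batch)
        b_mean = sum(batch) / n
        b_var = sum((x - b_mean) ** 2 for x in batch) / n
        delta = b_mean - self.mean
        total = self.count + n
        m2 = self.var * self.count + b_var * n + delta ** 2 * self.count * n / total
        self.mean += delta * n / total
        self.var = m2 / total
        self.count = total


def save_stats(path, obs_rms, return_rms):
    # Store both sets of statistics next to the model checkpoint.
    with open(path, "w") as f:
        json.dump({"obs_rms": vars(obs_rms), "return_rms": vars(return_rms)}, f)


def load_stats(path):
    # Rebuild both trackers so evaluation normalizes inputs as training did.
    with open(path) as f:
        state = json.load(f)
    return RunningMeanStd(**state["obs_rms"]), RunningMeanStd(**state["return_rms"])
```

At evaluation time, the loaded statistics would be applied without further updates, so that the policy sees observations on the same scale as during training.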

Poor Evaluation Performance in PPO

I want to confirm whether we need to save the `NormalizeReward_wrapper`. If we don't save it, our policy won't be able to continue training after being downloaded and can only...

handle num_envs > 1 in DQN

Here we may need to pay attention to how `global_step` is incremented (`+1` or `+args.num_envs`?). If `+args.num_envs`, then we need to consider the divisibility relationship between...
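The divisibility concern can be illustrated with a small sketch. It assumes a loop that advances `global_step` by `num_envs` per iteration and gates a periodic action on `global_step % frequency == 0`. The function name `count_triggers` and its parameters are hypothetical and do not come from the CleanRL code.

```python
def count_triggers(total_steps, num_envs, frequency):
    """Count how often a modulo-gated action fires when the step counter
    advances by num_envs per iteration (hypothetical illustration)."""
    triggers = 0
    global_step = 0
    while global_step < total_steps:
        global_step += num_envs
        # The check only passes when global_step is a multiple of frequency.
        if global_step % frequency == 0:
            triggers += 1
    return triggers


# With one environment, an action every 10 steps fires 12 times in 120 steps.
single_env = count_triggers(120, 1, 10)
# With 4 environments the counter only visits multiples of 4, so the check
# passes only at common multiples of 4 and 10, that is, multiples of 20.
multi_env = count_triggers(120, 4, 10)
```

Under these assumptions the action fires half as often with 4 environments as with 1, unless `frequency` is a multiple of `num_envs`.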

Enhancing Termination and Truncation Handling in CleanRL's PPO Algorithm

Thank you for this PR! I had previously noticed the issue and made a simple modification to the CleanRL PPO by referencing the fix done in https://github.com/DLR-RM/stable-baselines3/pull/658. Here is my...

TypeError: file must have 'read', 'readinto' and 'readline' attributes

I've also come across some quirky bugs in the latest version 😬, which I suspect got slipped in during code refactoring. Going through the [init commit](https://github.com/google-research/reincarnating_rl/tree/3eb419c43685d8f8f46cb20960c956ae1c19f91b) from git, it appeared...

Fix README and run problem

WIP

Run script problem

Thank you for your reply! I modified `JaxFullRainbowAgent.epsilon_fn` in file `reincarnating_rl/configs/qdagger_drq.gin` and file `reincarnating_rl/configs/qdagger_rainbow.gin` to `@loss_helpers.reincarnation_linearly_decaying_epsilon`, but it still didn't work. Error info: ```shell 2023-03-10 09:52:30.681166: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary...