Jiayi Weng
Let me set up the Discord server so that we can discuss the details there.
Another request: I'm trying to [use the mujoco source code to build envpool](https://github.com/sail-sg/envpool/pull/141). However, there are some small precision issues (https://github.com/deepmind/mujoco/issues/294). The corresponding wheels are at https://github.com/sail-sg/envpool/actions/runs/2381544251 Not sure if it...
Feel free to submit a pull request to fix that issue.
I guess it may be because -1e18 is too large, so it affects the other weights of the network?
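For context, a rough numeric sketch of the concern: if a huge mask value like -1e18 ever leaks into a regression target, its gradient swamps every other update. The values below are illustrative, not taken from your code:

```python
import numpy as np

pred = np.float32(0.5)
normal_target = np.float32(1.0)
masked_target = np.float32(-1e18)  # mask value leaking into the loss

# MSE gradient w.r.t. pred is 2 * (pred - target)
grad_normal = 2 * (pred - normal_target)
grad_masked = 2 * (pred - masked_target)
print(grad_normal)  # a small, sensible update
print(grad_masked)  # astronomically large, dominates all other gradients
```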
As long as https://github.com/openai/gym/pull/3019 is not merged, it's okay not to restrict the gym version.
The 10 comes from the `batch_size` of https://github.com/thu-ml/tianshou/blob/278c91a2228a46049a29c8fa662a467121680b10/tianshou/policy/modelfree/ppo.py#L111 https://github.com/thu-ml/tianshou/blob/278c91a2228a46049a29c8fa662a467121680b10/tianshou/policy/base.py#L277 https://github.com/thu-ml/tianshou/blob/278c91a2228a46049a29c8fa662a467121680b10/tianshou/trainer/onpolicy.py#L131-L136 Could you please print `len(batch)` at the beginning of the `PPOPolicy.learn` function to see what happens?
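Roughly speaking, the trainer splits the collected data into minibatches of `batch_size` before each `learn()` call, so `len(batch)` should equal `batch_size` (except possibly for the last chunk). A toy sketch, with illustrative names rather than tianshou's actual internals:

```python
# Pretend we collected 64 transitions and split them like the trainer does.
data = list(range(64))
batch_size = 10

minibatches = [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
print([len(b) for b in minibatches])  # six full batches of 10, then a remainder of 4
```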
Can you share the training script and the detailed error message? Either here or send it to my email. I guess you are using something incorrectly.
What's your observation space and action space? Can you print them and paste them here? Because I don't know what your `stimulateEnv-v0` is. Or you can delete the unnecessary parts of...
Since your action space is multi-discrete, I'd recommend using BranchingDQN; or, if you can convert it to a continuous action space, you can use all the examples under `test/continuous/`....
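If you instead want to reduce it to a single discrete action space, one common trick (sketched below with hypothetical dimension sizes, not your env's) is to flatten the multi-discrete space into one index and unflatten it inside the env:

```python
import numpy as np

# Suppose the multi-discrete space is MultiDiscrete([3, 4, 2]): 3*4*2 = 24 combos.
nvec = (3, 4, 2)

# multi-discrete action -> single flat index in Discrete(24)
flat = np.ravel_multi_index((2, 1, 0), nvec)

# flat index -> back to the multi-discrete components inside env.step()
multi = np.unravel_index(flat, nvec)
print(flat, multi)
```

This keeps the original DQN examples usable, at the cost of the action space size growing multiplicatively with each branch.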