YonV1943 曾伊言

Results 40 comments of YonV1943 曾伊言

OK. I plan to open an issue before creating a Pull Request. That way, my PR can be linked to the issue, and I can write the description of...

In Stable Baselines3, `self._last_obs` is set in the training pipeline, in `class OffPolicyAlgorithm` or `class OnPolicyAlgorithm`. Their code sets `self._last_obs` as follows: ``` self._last_obs = new_obs # Save the unnormalized...

Thanks. I will build on the PR of @shixun404 and upload a Pull Request to fix these bugs, following the stable version of ElegantRL helloworld. (Wednesday) [the PR of...

In addition: - This bug does not occur on the DGX64 server (A100 GPU), or on a 3080 or 3090 GPU. - This bug does not occur in the fork [ElegantRL/tree/IsaacGym-Single-Process](https://github.com/AI4Finance-Foundation/ElegantRL/tree/IsaacGym-Single-Process) - `check_isaac_gym()` runs normally. https://github.com/AI4Finance-Foundation/ElegantRL/blob/5cf6190d114374b975232f37fcdbae6e10f6e22e/elegantrl/envs/IsaacGym.py#L213 -...

I have provided a demo that shows how to train using the PPO algorithm on the `StockTradingEnv`. I hope it can be helpful to you. https://github.com/AI4Finance-Foundation/ElegantRL/blob/68bf0ea4ef3fb461026ece8897deabb92aeead32/examples/demo_A2C_PPO.py#L325-L326

We have fixed these bugs for the main algorithms (DQN, DoubleDQN, DuelingDQN, D3QN, DDPG, TD3, SAC and PPO). The algorithms related to Hterm (DDPG_H, PPO_H) have not been updated yet. **thank...

This fix covers the following. Agents folder: - In AgentXXX.py, in both single-env and vectorized-env modes, make `agent.last_state.shape == (num_envs, state_dim)` so that the shape of this tensor stays consistent....
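The shape convention described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not ElegantRL's actual code: `as_batched_state` is a hypothetical helper, and it assumes a single env returns a state of shape `(state_dim,)` while a vectorized env returns `(num_envs, state_dim)`.

```python
import numpy as np


def as_batched_state(state, num_envs: int, state_dim: int) -> np.ndarray:
    """Return `state` with shape (num_envs, state_dim).

    Hypothetical helper for illustration only. A single env is treated
    as a vectorized env with num_envs == 1, so downstream code always
    sees a 2-D array.
    """
    state = np.asarray(state, dtype=np.float32)
    if state.ndim == 1:
        # Single env: add a leading env axis, (state_dim,) -> (1, state_dim).
        state = state[np.newaxis, :]
    if state.shape != (num_envs, state_dim):
        raise ValueError(f"expected {(num_envs, state_dim)}, got {state.shape}")
    return state


# Single env and vectorized env end up with the same rank.
single = as_batched_state(np.zeros(4), num_envs=1, state_dim=4)
vector = as_batched_state(np.zeros((8, 4)), num_envs=8, state_dim=4)
```

With both modes normalized to a 2-D array, code that indexes the env axis does not need a separate branch for the single-env case.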

Additionally: - use `states, actions, rewards` instead of `state, action, reward` as tensor names. - do not use spaces in file names - rename `get_returns` to `get_cumulative_rewards`

Yes, thank you. I need to check the code. 👍 Although this bug does not affect training, it causes errors in the training log output.

I would like to fix the "tutorial is incorrect" problem you mentioned. Could you please post a link in this issue to the code that might be wrong? > the three...