YonV1943 曾伊言

Results 38 comments of YonV1943 曾伊言

Thank you for reporting this bug. We will reproduce the bug and fix it using the following code. (The following code uses `PPO`, while you use `SAC`.) https://github.com/AI4Finance-Foundation/FinRL/blob/bacca166e083f7ead676aad6965f6f1004044b5f/finrl/main.py#L62-L76

In PPO, in order to compute a value for PPO to fit, we are required to provide a complete trajectory `sar-sar-...-sar-SAR` (for ease of description, the `SAR` at the final step where `done=True` is written in uppercase). The role of the `splice list` is this: when the collected trajectory is incomplete, it cuts out a complete segment for PPO's subsequent training. That is, `sar-sar-...-sar-SAR-sar-...-sar-SAR-sar-sar` is cut down to `sar-sar-...-sar-SAR-sar-...-sar-SAR`, and the trailing `sar-sar` that was cut off is temporarily cached and processed in the next round once it becomes complete. The hyperparameter `if_use_old_trajectory` controls whether the old trajectory fragment cut off in the previous round is reused in the next round.
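A minimal sketch of this splicing idea, not ElegantRL's actual implementation; the function name and the `(state, action, reward, done)` tuple layout are illustrative only:

```
# Cut the collected steps at the last `done=True`, hand the complete part to PPO,
# and cache the incomplete tail for the next round (illustrative names only).
def splice_trajectory(steps, cached_tail):
    """`steps`: list of (state, action, reward, done) from this round.
    `cached_tail`: the incomplete fragment left over from the previous round."""
    full = cached_tail + steps                    # reuse the old fragment (if_use_old_trajectory=True)
    last_done = max((i for i, s in enumerate(full) if s[3]), default=-1)
    if last_done < 0:                             # no completed episode yet: cache everything
        return [], full
    complete_part = full[:last_done + 1]          # ends exactly at a done=True step
    new_tail = full[last_done + 1:]               # cached until it becomes complete
    return complete_part, new_tail
```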

Yes, it is possible. For example, see https://github.com/AI4Finance-Foundation/ElegantRL/blob/fb35e25f01c50af61fa4697824025be50b2e53f1/elegantrl/agents/AgentPPO.py#L276 The function `agent.update_net( )` returns the training logging. The training process will **print** this information on the terminal. https://github.com/AI4Finance-Foundation/ElegantRL/blob/fb35e25f01c50af61fa4697824025be50b2e53f1/elegantrl/train/evaluator.py#L79-L81 ``` ################################################################################ ID...
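A self-contained toy sketch of how the returned logging tuple could be surfaced; the dummy agent and its field values are assumptions, the real fields and print format live in ElegantRL's evaluator:

```
# Dummy agent standing in for an ElegantRL agent: `update_net()` returns a tuple
# of training statistics, which the outer loop prints to the terminal.
class DummyAgent:
    def update_net(self, buffer):
        obj_critic, obj_actor = 0.123, -0.456     # placeholder loss values
        return obj_critic, obj_actor

agent = DummyAgent()
for total_step in (1024, 2048):
    logging_tuple = agent.update_net(buffer=None)
    print(f"| step {total_step:>8} | " + " ".join(f"{v:8.3f}" for v in logging_tuple))
```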

I'll write down some short answers and then wait until tonight to refine. --- Stable Baselines 3 SubprocVecEnv is a vectorized environment. Isaac Gym is a vectorized environment too. We...
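A minimal sketch of what "vectorized environment" means here, assuming the classic Gym API (`reset()` returns an observation, `step()` returns four values); the class and names are illustrative, not SB3's or Isaac Gym's code:

```
import numpy as np
import gym

class SimpleVecEnv:
    """Steps N sub-environments together and returns batched arrays of shape [N, ...]."""
    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return np.stack([env.reset() for env in self.envs])

    def step(self, actions):                      # actions: shape [N, action_dim]
        obs, rewards, dones = [], [], []
        for env, act in zip(self.envs, actions):
            o, r, d, _ = env.step(act)
            if d:                                 # auto-reset finished sub-envs
                o = env.reset()
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return np.stack(obs), np.array(rewards), np.array(dones)

vec_env = SimpleVecEnv([lambda: gym.make("Pendulum-v1")] * 4)
states = vec_env.reset()                          # shape (4, 3): one state per sub-env
```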

It is difficult to add an RNN (LSTM or GRU) to Reinforcement Learning. The more mature current solution is the unrolled-RNN approach of R2D2. We plan to add...
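A rough sketch of the unrolled-RNN idea from R2D2: store short sequences instead of single transitions and unroll the recurrent network over each sequence during the update. The network and shapes below are illustrative, not ElegantRL code:

```
import torch
import torch.nn as nn

class RecurrentQNet(nn.Module):
    def __init__(self, obs_dim, act_dim, hid_dim=128):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hid_dim, batch_first=True)
        self.head = nn.Linear(hid_dim, act_dim)

    def forward(self, obs_seq, h0=None):
        # obs_seq: [batch, seq_len, obs_dim] -> Q-values at every step of the unroll
        out, hn = self.gru(obs_seq, h0)
        return self.head(out), hn

net = RecurrentQNet(obs_dim=8, act_dim=4)
obs_seq = torch.randn(32, 16, 8)     # 32 sequences of length 16 sampled from the buffer
q_values, _ = net(obs_seq)           # shape [32, 16, 4]
```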

Forgot to say, it is easier to add the _lookback window_ to ElegantRL. We have designed a ReplayBuffer suitable for the lookback window. This function will be added when the...
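A minimal sketch of a ReplayBuffer with a lookback window, not ElegantRL's actual buffer (it ignores wrap-around at the ring-buffer boundary for brevity); each sample returns the `lookback` most recent states ending at the sampled index:

```
import numpy as np

class LookbackReplayBuffer:
    def __init__(self, max_size, state_dim, lookback):
        self.states = np.zeros((max_size, state_dim), dtype=np.float32)
        self.rewards = np.zeros(max_size, dtype=np.float32)
        self.max_size = max_size
        self.lookback = lookback
        self.size = 0

    def append(self, state, reward):
        idx = self.size % self.max_size
        self.states[idx] = state
        self.rewards[idx] = reward
        self.size += 1

    def sample(self, batch_size):
        high = min(self.size, self.max_size)
        # only sample indices that have a full lookback window behind them
        idx = np.random.randint(self.lookback - 1, high, size=batch_size)
        windows = np.stack([self.states[i - self.lookback + 1: i + 1] for i in idx])
        return windows, self.rewards[idx]         # windows: [batch, lookback, state_dim]
```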

The following briefly shows our discussion, including what was changed and why: [net.py](https://github.com/AI4Finance-Foundation/ElegantRL/blob/master/elegantrl_helloworld/net.py) - `mid_layer_num` -> `num_layer`, the number of layers of the multilayer perceptron - `def get_action(self, state):` return...

[agent.py](https://github.com/AI4Finance-Foundation/ElegantRL/blob/master/elegantrl_helloworld/agent.py) - remove `env_num`: it belongs to the vectorized-env code and should not be here (ElegantRL helloworld) - `ten` --> `tensor` - `self.convert_trajectory(traj_list)` --> `convert_trajectory_transition()` --- [env.py](https://github.com/AI4Finance-Foundation/ElegantRL/blob/master/elegantrl_helloworld/env.py)...

I know how to fix this bug. Your code:
```
class AgentPPO(AgentBase):
    def __init__(self):
        super().__init__()
        self.ClassAct = ActorPPO
        self.ClassCri = CriticAdv
```
should be
```
class AgentPPO(AgentBase):
    def __init__(self):
        self.ClassAct ...
```
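A self-contained toy illustrating why the assignment order likely matters; the assumption (not confirmed from the original thread) is that the base `__init__` reads `ClassAct`/`ClassCri` to build the networks, so they must be set before `super().__init__()`:

```
class ActorPPO: ...
class CriticAdv: ...

class AgentBase:
    def __init__(self):
        # the base constructor builds networks from whatever the subclass has set
        self.act = self.ClassAct() if getattr(self, "ClassAct", None) else None
        self.cri = self.ClassCri() if getattr(self, "ClassCri", None) else None

class AgentPPO(AgentBase):
    def __init__(self):
        self.ClassAct = ActorPPO     # set BEFORE calling the base constructor
        self.ClassCri = CriticAdv
        super().__init__()

agent = AgentPPO()
print(type(agent.act).__name__)      # ActorPPO, because the classes were set first
```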

Thank you for helping us find these bugs. I am fixing them. 1. The "IsaacGym-Single-Process" branch doesn't need the attribute `if_use_per`. We will remove this attribute. 2. OK. We...