
Massively Parallel Deep Reinforcement Learning. 🔥

156 ElegantRL issues

The following code shows that the policy used to explore the env (i.e., to generate the action and logprob) is `self.act`:

```
get_action = self.act.get_action
convert = self.act.convert_action_for_env
for i in range(horizon_len):
    ...
```

discussion
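A minimal sketch, under assumptions, of how such an exploration loop could look. The attribute `self.last_state`, the gym-style 4-tuple return of `env.step`, and the tensor shapes are assumptions for illustration, not the exact ElegantRL source; only `self.act.get_action` and `self.act.convert_action_for_env` are taken from the snippet above.

```
import torch

# Hypothetical sketch (not the actual ElegantRL code): the behavior policy is
# self.act, which returns both the sampled action and its log-probability, so
# the same network that is later updated is the one that explored the env.
def explore_one_env(self, env, horizon_len):
    get_action = self.act.get_action           # stochastic policy head
    convert = self.act.convert_action_for_env  # map raw action into the env's range
    states, actions, logprobs, rewards, dones = [], [], [], [], []

    state = self.last_state                    # assumed attribute carrying the current env state
    for _ in range(horizon_len):
        action, logprob = get_action(state.unsqueeze(0))
        next_state, reward, done, _ = env.step(convert(action).squeeze(0).detach().cpu().numpy())

        states.append(state)
        actions.append(action.squeeze(0))
        logprobs.append(logprob.squeeze())
        rewards.append(reward)
        dones.append(done)
        state = torch.as_tensor(next_state, dtype=torch.float32, device=self.device)

    self.last_state = state
    return (torch.stack(states), torch.stack(actions), torch.stack(logprobs),
            torch.tensor(rewards, device=self.device),
            torch.tensor(dones, dtype=torch.bool, device=self.device))
```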

Running tutorial_LunarLanderContinuous_v2.ipynb raises an error at ElegantRL\elegantrl\agents\AgentBase.py:58, in AgentBase.__init__(self, net_dim, state_dim, action_dim, gpu_id, args): 56 cri_class = getattr(self, "cri_class", None) 57 print(act_class) ---> 58 self.act = act_class(net_dim, state_dim, action_dim).to(self.device) 59 self.cri...

bug
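For context, a hedged reconstruction of the constructor pattern visible in the traceback: the agent looks up `act_class` / `cri_class` on the subclass and instantiates them, so if the subclass never set `act_class`, or the tutorial's arguments do not match the network's constructor, the highlighted line fails as reported. This is a sketch, not the actual `AgentBase` source.

```
import torch

# Hypothetical reconstruction; argument names follow the error message above.
class AgentBase:
    def __init__(self, net_dim, state_dim, action_dim, gpu_id=0, args=None):
        self.device = torch.device(f"cuda:{gpu_id}" if (torch.cuda.is_available() and gpu_id >= 0) else "cpu")

        # The subclass (e.g. AgentPPO) is expected to set act_class / cri_class
        # before calling super().__init__; if it did not, act_class is None and
        # the call below raises exactly as in the reported traceback.
        act_class = getattr(self, "act_class", None)
        cri_class = getattr(self, "cri_class", None)

        self.act = act_class(net_dim, state_dim, action_dim).to(self.device)
        self.cri = cri_class(net_dim, state_dim, action_dim).to(self.device) if cri_class else self.act
```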

Hi! I am trying to use ElegantRL for multi-agent RL training, as it seems very well written. I tried to use MADDPG or MATD3, but none of these agents seem...

bug

seed_rl speeds up neural-network training by tens to hundreds of times. If ElegantRL supported it, this framework would become practical for all kinds of large-scale projects.

Suggestion

When I tested the Isaac Gym tutorial with ElegantRL (using this [code](https://github.com/AI4Finance-Foundation/ElegantRL/blob/IsaacGym-Single-Process/demo_IsaacGym.py) that the author mentioned in [this issue](https://github.com/AI4Finance-Foundation/ElegantRL/issues/169)), I found that neither the config files nor the asset files...

bug

The `update_net` function of the `AgentPPOHterm` class parses the trajectory previously generated in `train_and_evaluate` by `AgentPPO.explore_one_env` (and since PPO is an on-policy algorithm, the trajectory is not put into a replay buffer and re-sampled). However, the trajectory layout that `AgentPPOHterm.update_net` unpacks does not match the layout in which the trajectory was stored (nor the layout unpacked by `AgentPPO.update_net`), which looks like a bug.

Trajectory parsing in `AgentPPOHterm.update_net`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L671

Trajectory format produced by `AgentPPO.explore_one_env`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L92

Trajectory parsing in `AgentPPO.update_net`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L139

Besides this ordering problem, the two fields `buf_mask` and `buf_noise` that `AgentPPOHterm.update_net` needs do not seem to map directly onto `undones` and `logprobs`. Is this part of the implementation simply not finished yet?

bug
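To make the reported mismatch concrete, here is a runnable toy illustration. The field names (`buf_mask`, `buf_noise`, `undones`, `logprobs`) come from the issue above, while the specific orderings are assumptions for demonstration, not code copied from the repository.

```
import torch

# Illustrative only: exact orderings below are assumed for demonstration.
horizon_len, state_dim, action_dim = 4, 3, 2
states   = torch.zeros(horizon_len, state_dim)
actions  = torch.zeros(horizon_len, action_dim)
logprobs = torch.zeros(horizon_len)
rewards  = torch.zeros(horizon_len)
undones  = torch.ones(horizon_len)   # 1.0 - done flags

# Layout described for AgentPPO.explore_one_env:
buffer = (states, actions, logprobs, rewards, undones)

# AgentPPO.update_net unpacks in the same order -> consistent:
b_states, b_actions, b_logprobs, b_rewards, b_undones = buffer

# AgentPPOHterm.update_net is reported to expect a different layout, e.g. one
# with buf_mask (roughly gamma * undones) and buf_noise (raw exploration noise),
# which do not map one-to-one onto undones and logprobs -> the flagged mismatch:
buf_state, buf_reward, buf_mask, buf_action, buf_noise = buffer  # wrong order/semantics
```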

When we use SB3 and gym to construct the agent and environment, we can define an (N, M)-dimensional state space. But in ElegantRL, we can only define state_dim as an int, same...

bug
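A common workaround (not an ElegantRL API) is to flatten the (N, M) observation into a 1-D vector with a small gym wrapper, so that `state_dim` can still be passed as a single int. The class below is a hypothetical sketch.

```
import gym
import numpy as np

# Hypothetical wrapper: flattens a 2-D (N, M) observation into a 1-D vector.
class FlattenObservation(gym.ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        low = np.asarray(env.observation_space.low, dtype=np.float32).reshape(-1)
        high = np.asarray(env.observation_space.high, dtype=np.float32).reshape(-1)
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        return np.asarray(obs, dtype=np.float32).reshape(-1)

# Usage: wrap the env and pass state_dim = N * M to the agent.
```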

ElegantRL 0.3.6 currently fails to pip install against Python 3.11.9 on Windows. The issue is that pygame==2.1.0 is an enforced requirement on Windows, which causes elegantrl 0.3.6...

When using demo_DQN_Dueling_Double_DQN, the .pt file saved at the end of training cannot be used as the weight file for testing. Should the save call be changed from torch.save(actor, actor_path) to torch.save(actor.state_dict(), actor_path)?
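A minimal sketch of the two saving styles mentioned in the question; `QNet` and `"actor.pt"` are placeholders for the actual actor network and path, not ElegantRL names. Saving `state_dict()` and rebuilding the network before `load_state_dict` is the usual, more portable pattern.

```
import torch
import torch.nn as nn

# Placeholder stand-in for the actual actor/Q network.
class QNet(nn.Module):
    def __init__(self, state_dim=8, action_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, action_dim))

    def forward(self, state):
        return self.net(state)

actor = QNet()

# Style 1: save the whole pickled module. Loading it back requires the original
# class definition to be importable at the same path, which is brittle.
torch.save(actor, "actor.pt")
loaded_whole = torch.load("actor.pt")

# Style 2 (suggested in the issue): save only the parameters, then rebuild the
# network and load the weights before evaluation.
torch.save(actor.state_dict(), "actor_state.pt")
restored = QNet()
restored.load_state_dict(torch.load("actor_state.pt"))
restored.eval()
```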