ElegantRL
Massively Parallel Deep Reinforcement Learning. 🔥
The following code shows that the policy used to explore the env (generating the action and logprob) is `self.act`:
```
get_action = self.act.get_action
convert = self.act.convert_action_for_env
for i in range(horizon_len):
    ...
```
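For readers unfamiliar with that pattern, here is a minimal, self-contained sketch of such an exploration loop. The actor class, the dummy environment transition, and the dimensions are illustrative assumptions; only the aliases `get_action` and `convert_action_for_env` come from the snippet above.

```python
import torch
import torch.nn as nn

# Sketch only: this actor and the dummy transition are illustrative, not
# ElegantRL's actual classes; only the aliases below mirror the snippet.
class SketchActor(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, action_dim))
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def get_action(self, state):
        # Sample from a Gaussian policy and keep the log-prob, which PPO
        # later needs for the importance-sampling ratio.
        mean = self.net(state)
        dist = torch.distributions.Normal(mean, self.log_std.exp())
        action = dist.sample()
        return action, dist.log_prob(action).sum(dim=-1)

    @staticmethod
    def convert_action_for_env(action):
        # Squash the unbounded sample into [-1, 1] before env.step().
        return action.tanh()

state_dim, action_dim, horizon_len = 8, 2, 4
act = SketchActor(state_dim, action_dim)
get_action = act.get_action                   # same aliases as in the snippet
convert = act.convert_action_for_env

state = torch.zeros(1, state_dim)             # dummy initial observation
for i in range(horizon_len):
    action, logprob = get_action(state)
    env_action = convert(action)              # what would be passed to env.step()
    state = torch.randn(1, state_dim)         # stand-in for the real env transition
```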
1. Running tutorial_LunarLanderContinuous_v2.ipynb raises an error: 2. ElegantRL\elegantrl\agents\AgentBase.py:58, in AgentBase.__init__(self, net_dim, state_dim, action_dim, gpu_id, args) 56 cri_class = getattr(self, "cri_class", None) 57 print(act_class) ---> 58 **self.act = act_class(net_dim, state_dim, action_dim).to(self.device)** 59 self.cri...
Hi! I am trying to use ElegantRL for multi-agent RL training as it seems very well written. I tried MADDPG and MATD3, but none of these agents seem...
## Bugs When I tested the Isaac tutorial with ElegantRL (using this [code](https://github.com/AI4Finance-Foundation/ElegantRL/blob/IsaacGym-Single-Process/demo_IsaacGym.py) that the author mentioned in [this issue](https://github.com/AI4Finance-Foundation/ElegantRL/issues/169)), I found that **neither the config files nor the asset files...
The `update_net` function of the `AgentPPOHterm` class parses the trajectory previously generated in `train_and_evaluate` by `AgentPPO.explore_one_env` (and since PPO is an on-policy algorithm, the trajectory is not put into a replay buffer and re-sampled). However, the way `AgentPPOHterm.update_net` parses the trajectory does not match the format in which it was stored (and is also inconsistent with `AgentPPO.update_net`), which looks like a bug. Trajectory parsing in `AgentPPOHterm.update_net`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L671 Trajectory format produced by `AgentPPO.explore_one_env`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L92 Trajectory parsing in `AgentPPO.update_net`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L139 Besides this parsing-order problem, the two fields `buf_mask` and `buf_noise` required by `AgentPPOHterm.update_net` do not seem to map directly onto `undones` and `logprobs`; is this part of the implementation perhaps not finished yet?
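To make the reported mismatch concrete, here is a minimal sketch (not ElegantRL's actual code; the tensor names are illustrative) of why the packing order used when a trajectory is stored must match the unpacking order used when it is consumed:

```python
import torch

horizon_len, state_dim, action_dim = 4, 3, 1
states   = torch.zeros(horizon_len, state_dim)
actions  = torch.zeros(horizon_len, action_dim)
logprobs = torch.zeros(horizon_len)
rewards  = torch.zeros(horizon_len)
undones  = torch.ones(horizon_len)          # 1.0 - done flags

# The exploration side packs the trajectory as a tuple in a fixed order ...
buffer = (states, actions, logprobs, rewards, undones)

# ... so the update side must unpack it in exactly the same order.
buf_state, buf_action, buf_logprob, buf_reward, buf_undone = buffer

# Unpacking in a different order (e.g. expecting `buf_mask`/`buf_noise` where
# `undones`/`logprobs` were stored) silently mixes up tensors with different
# meanings, which is the kind of inconsistency reported above.
assert buf_logprob.shape == (horizon_len,)
```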
When we use SB3 and gym to construct the agent and environment, we can define an (N, M)-dimensional state space. But in ElegantRL, we can only define state_dim as an int, same...
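A common workaround, assuming the classic `gym` API and a Box observation space, is to flatten the (N, M) observation into a 1-D vector so that `state_dim` can stay an int. The wrapper below is an illustrative sketch (recent gym versions also ship `gym.wrappers.FlattenObservation` for the same purpose):

```python
import numpy as np
import gym

class FlattenObs(gym.ObservationWrapper):
    """Flatten an (N, M) Box observation into a 1-D vector."""

    def __init__(self, env):
        super().__init__(env)
        low = env.observation_space.low.reshape(-1)
        high = env.observation_space.high.reshape(-1)
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        # (N, M) -> (N * M,) so downstream code can treat state_dim as an int.
        return np.asarray(obs, dtype=np.float32).reshape(-1)

# Usage sketch (hypothetical env):
# env = FlattenObs(MyMatrixObsEnv())
# state_dim = env.observation_space.shape[0]   # N * M
```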
ElegantRL 0.3.6 currently fails on pip install against Python 3.11.9 on Windows. The issue is that pygame==2.1.0 is an enforced requirement on Windows, which causes elegantrl 0.3.6...
When using demo_DQN_Dueling_Double_DQN, the .pt file saved at the end of training cannot be used as the weight file for testing. Should the save call be changed from torch.save(actor, actor_path) to torch.save(actor.state_dict(), actor_path)?
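For context, here is a small sketch of the two saving styles with plain PyTorch; the `QNet` class, the dimensions, and `actor_path` are illustrative, not the demo's exact code:

```python
import torch
import torch.nn as nn

# Illustrative network standing in for the trained actor; not the demo's code.
class QNet(nn.Module):
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, action_dim))

    def forward(self, state):
        return self.net(state)

actor = QNet(state_dim=4, action_dim=2)
actor_path = "actor.pt"

# Style 1: pickle the whole module. Reloading requires the identical class
# definition to be importable, and newer PyTorch versions may additionally
# need torch.load(..., weights_only=False).
torch.save(actor, actor_path)

# Style 2: save only the weights, the more portable choice for reloading at
# test time (what the question above suggests).
torch.save(actor.state_dict(), actor_path)
actor_test = QNet(state_dim=4, action_dim=2)
actor_test.load_state_dict(torch.load(actor_path))
actor_test.eval()
```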