
Massively Parallel Deep Reinforcement Learning. 🔥

156 ElegantRL issues

The following code shows that the policy used to explore the env (i.e., to generate the action and logprob) is `self.act`:

```
get_action = self.act.get_action
convert = self.act.convert_action_for_env
for i in range(horizon_len):
    ...
```

discussion
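A minimal sketch, under assumptions, of how such an exploration loop could look. The attribute `self.last_state`, the gym-style 4-tuple return of `env.step`, and the tensor shapes are assumptions for illustration, not the exact ElegantRL source; only `self.act.get_action` and `self.act.convert_action_for_env` are taken from the snippet above.

```
import torch

# Hypothetical sketch (not the actual ElegantRL code): the behavior policy is
# self.act, which returns both the sampled action and its log-probability, so
# the same network that is later updated is the one that explored the env.
def explore_one_env(self, env, horizon_len):
    get_action = self.act.get_action           # stochastic policy head
    convert = self.act.convert_action_for_env  # map raw action into the env's range
    states, actions, logprobs, rewards, dones = [], [], [], [], []

    state = self.last_state                    # assumed attribute carrying the current env state
    for _ in range(horizon_len):
        action, logprob = get_action(state.unsqueeze(0))
        next_state, reward, done, _ = env.step(convert(action).squeeze(0).detach().cpu().numpy())

        states.append(state)
        actions.append(action.squeeze(0))
        logprobs.append(logprob.squeeze())
        rewards.append(reward)
        dones.append(done)
        state = torch.as_tensor(next_state, dtype=torch.float32, device=self.device)

    self.last_state = state
    return (torch.stack(states), torch.stack(actions), torch.stack(logprobs),
            torch.tensor(rewards, device=self.device),
            torch.tensor(dones, dtype=torch.bool, device=self.device))
```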

Running tutorial_LunarLanderContinuous_v2.ipynb raises an error at ElegantRL\elegantrl\agents\AgentBase.py:58, in AgentBase.__init__(self, net_dim, state_dim, action_dim, gpu_id, args): 56 cri_class = getattr(self, "cri_class", None) 57 print(act_class) ---> 58 self.act = act_class(net_dim, state_dim, action_dim).to(self.device) 59 self.cri...

bug
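For context, a hedged reconstruction of the constructor pattern visible in the traceback: the agent looks up `act_class` / `cri_class` on the subclass and instantiates them, so if the subclass never set `act_class`, or the tutorial's arguments do not match the network's constructor, the highlighted line fails as reported. This is a sketch, not the actual `AgentBase` source.

```
import torch

# Hypothetical reconstruction; argument names follow the error message above.
class AgentBase:
    def __init__(self, net_dim, state_dim, action_dim, gpu_id=0, args=None):
        self.device = torch.device(f"cuda:{gpu_id}" if (torch.cuda.is_available() and gpu_id >= 0) else "cpu")

        # The subclass (e.g. AgentPPO) is expected to set act_class / cri_class
        # before calling super().__init__; if it did not, act_class is None and
        # the call below raises exactly as in the reported traceback.
        act_class = getattr(self, "act_class", None)
        cri_class = getattr(self, "cri_class", None)

        self.act = act_class(net_dim, state_dim, action_dim).to(self.device)
        self.cri = cri_class(net_dim, state_dim, action_dim).to(self.device) if cri_class else self.act
```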

Hi! I am trying to use ElegantRL for multi-agent RL training, as it seems very well written. I tried to use MADDPG or MATD3, but none of these agents seem...

bug

seed_rl speeds up neural-network training by tens to hundreds of times. If ElegantRL supported it, this framework would become practical for all kinds of large-scale projects.

Suggestion

When I tested the Isaac Gym tutorial with ElegantRL (using this [code](https://github.com/AI4Finance-Foundation/ElegantRL/blob/IsaacGym-Single-Process/demo_IsaacGym.py) that the author mentioned in [this issue](https://github.com/AI4Finance-Foundation/ElegantRL/issues/169)), I found that neither the config files nor the asset files...

bug

The `update_net` function of the `AgentPPOHterm` class parses the trajectory previously generated in `train_and_evaluate` by `AgentPPO.explore_one_env` (and since PPO is an on-policy algorithm, the trajectory is not put into a replay buffer and re-sampled). However, the trajectory layout that `AgentPPOHterm.update_net` unpacks does not match the layout in which the trajectory was stored (nor the layout unpacked by `AgentPPO.update_net`), which looks like a bug.

Trajectory parsing in `AgentPPOHterm.update_net`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L671

Trajectory format produced by `AgentPPO.explore_one_env`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L92

Trajectory parsing in `AgentPPO.update_net`: https://github.com/AI4Finance-Foundation/ElegantRL/blob/1d5bf9e1639222c5d2a462adcc0c4eab453bbe70/elegantrl/agents/AgentPPO.py#L139

Besides this ordering problem, the two fields `buf_mask` and `buf_noise` that `AgentPPOHterm.update_net` needs do not seem to map directly onto `undones` and `logprobs`. Is this part of the implementation simply not finished yet?

bug
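To make the reported mismatch concrete, here is a runnable toy illustration. The field names (`buf_mask`, `buf_noise`, `undones`, `logprobs`) come from the issue above, while the specific orderings are assumptions for demonstration, not code copied from the repository.

```
import torch

# Illustrative only: exact orderings below are assumed for demonstration.
horizon_len, state_dim, action_dim = 4, 3, 2
states   = torch.zeros(horizon_len, state_dim)
actions  = torch.zeros(horizon_len, action_dim)
logprobs = torch.zeros(horizon_len)
rewards  = torch.zeros(horizon_len)
undones  = torch.ones(horizon_len)   # 1.0 - done flags

# Layout described for AgentPPO.explore_one_env:
buffer = (states, actions, logprobs, rewards, undones)

# AgentPPO.update_net unpacks in the same order -> consistent:
b_states, b_actions, b_logprobs, b_rewards, b_undones = buffer

# AgentPPOHterm.update_net is reported to expect a different layout, e.g. one
# with buf_mask (roughly gamma * undones) and buf_noise (raw exploration noise),
# which do not map one-to-one onto undones and logprobs -> the flagged mismatch:
buf_state, buf_reward, buf_mask, buf_action, buf_noise = buffer  # wrong order/semantics
```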

When we use SB3 and gym to construct the agent and environment, we can define an (N, M)-dimensional state space. But in ElegantRL, we can only define state_dim as an int, same...

bug
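A common workaround (not an ElegantRL API) is to flatten the (N, M) observation into a 1-D vector with a small gym wrapper, so that `state_dim` can still be passed as a single int. The class below is a hypothetical sketch.

```
import gym
import numpy as np

# Hypothetical wrapper: flattens a 2-D (N, M) observation into a 1-D vector.
class FlattenObservation(gym.ObservationWrapper):
    def __init__(self, env):
        super().__init__(env)
        low = np.asarray(env.observation_space.low, dtype=np.float32).reshape(-1)
        high = np.asarray(env.observation_space.high, dtype=np.float32).reshape(-1)
        self.observation_space = gym.spaces.Box(low=low, high=high, dtype=np.float32)

    def observation(self, obs):
        return np.asarray(obs, dtype=np.float32).reshape(-1)

# Usage: wrap the env and pass state_dim = N * M to the agent.
```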

ElegantRL 0.3.6 currently fails to pip install against Python 3.11.9 on Windows. The issue is that pygame==2.1.0 is an enforced requirement on Windows, which causes elegantrl 0.3.6...

When using demo_DQN_Dueling_Double_DQN, the .pt file saved at the end of training cannot be used as the weight file for testing. Should the save call be changed from torch.save(actor, actor_path) to torch.save(actor.state_dict(), actor_path)?
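A minimal sketch of the two saving styles mentioned in the question; `QNet` and `"actor.pt"` are placeholders for the actual actor network and path, not ElegantRL names. Saving `state_dict()` and rebuilding the network before `load_state_dict` is the usual, more portable pattern.

```
import torch
import torch.nn as nn

# Placeholder stand-in for the actual actor/Q network.
class QNet(nn.Module):
    def __init__(self, state_dim=8, action_dim=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, action_dim))

    def forward(self, state):
        return self.net(state)

actor = QNet()

# Style 1: save the whole pickled module. Loading it back requires the original
# class definition to be importable at the same path, which is brittle.
torch.save(actor, "actor.pt")
loaded_whole = torch.load("actor.pt")

# Style 2 (suggested in the issue): save only the parameters, then rebuild the
# network and load the weights before evaluation.
torch.save(actor.state_dict(), "actor_state.pt")
restored = QNet()
restored.load_state_dict(torch.load("actor_state.pt"))
restored.eval()
```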