wolpertinger_ddpg
Wolpertinger training with DDPG (PyTorch), from "Deep Reinforcement Learning in Large Discrete Action Spaces". Compatible with multi-GPU, single-GPU, and CPU setups.
I sent you an email on 2022/07/22; please check it. Your email is the qq.com address, right? Please contact me, thanks.
When I change k_ratio in args to generate multiple actions, an AssertionError is raised:
Traceback (most recent call last):
File "/Users/xx/Downloads/DROO-master/mec/rlmodel/LDAS/main.py", line 211, in train(args.train_iter, agent, env, evaluate,
File "/Users/xx/Downloads/DROO-master/mec/rlmodel/LDAS/main.py",...
Hi, if not all actions are valid, how can I apply an action mask to filter out the invalid ones? Do you have any suggestions for modifying your code to support this?
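One possible approach, sketched here under assumptions rather than taken from this repository: apply the mask at the refinement step, where the critic scores the k nearest-neighbour candidates, and skip any candidate that is invalid. The function name `select_valid_action` and the `is_valid` callable are hypothetical and do not exist in the repo.

```python
import math


def select_valid_action(candidates, q_values, is_valid):
    """Return the valid candidate with the highest Q-value.

    candidates: action indices proposed by the nearest-neighbour lookup
    q_values:   critic estimates, one per candidate (same order)
    is_valid:   hypothetical callable mapping an action index to True/False
    """
    best_action, best_q = None, -math.inf
    for action, q in zip(candidates, q_values):
        # Invalid candidates are skipped, so they can never be selected.
        if is_valid(action) and q > best_q:
            best_action, best_q = action, q
    # None signals that every candidate was masked out; the caller could
    # then enlarge k or fall back to any valid action.
    return best_action


if __name__ == "__main__":
    valid_actions = {1, 3}
    chosen = select_valid_action(
        [0, 1, 2, 3], [0.9, 0.2, 0.7, 0.5], lambda a: a in valid_actions
    )
    print(chosen)
```

Masking after the neighbour lookup leaves the actor untouched, but if k is small all candidates may be invalid, which is why the sketch returns None instead of raising.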
I believe that episode_steps and episode_reward should be reset to zero after each episode finishes.
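A minimal sketch of the reset being suggested, using a toy loop rather than the repository's actual train function. The function `run_episodes` and its parameters are hypothetical; only the names episode_steps and episode_reward come from the issue.

```python
def run_episodes(episode_lengths, reward_per_step=1.0):
    """Toy training loop: each entry is the number of steps until 'done'."""
    log = []
    episode_steps, episode_reward = 0, 0.0
    for length in episode_lengths:
        for _ in range(length):
            episode_steps += 1
            episode_reward += reward_per_step
        log.append((episode_steps, episode_reward))
        # Reset the per-episode counters; without this, the values logged
        # for later episodes would include all earlier episodes as well.
        episode_steps, episode_reward = 0, 0.0
    return log


if __name__ == "__main__":
    print(run_episodes([3, 2]))
```

With the reset in place, each logged pair reflects a single episode only.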