Training with rllib

Open Maxwell2017 opened this issue 3 years ago • 5 comments

Hi @Miffyli, I found rllib/configs/vizdoom_ppo.yaml in your repo. Is this the config you have verified for training ViZDoom with the PPO algorithm (within RLlib)? :)

Maxwell2017 avatar Mar 05 '21 06:03 Maxwell2017

Yes, that is the config for PPO with rllib, but note that rllib was only used to run the continuous-control experiments in ViZDoom. The other experiments used stable-baselines here, with all arguments left at their argparse defaults here.

Miffyli avatar Mar 05 '21 10:03 Miffyli

Got it. PPO addresses continuous action spaces, so it is not suitable for ViZDoom scenarios with discrete action spaces, such as basic and health gathering.

Maxwell2017 avatar Mar 05 '21 10:03 Maxwell2017

More or less, yes (see the paper for results). Continuous spaces seem to be much harder to learn than discrete ones, so try to avoid them.

Miffyli avatar Mar 05 '21 10:03 Miffyli
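
To illustrate the advice above about avoiding continuous spaces: one common way to do so is to discretize each continuous axis into a few bins and let the agent pick an index. The following is a minimal sketch in plain Python, not the code from this repo. The two-axis (turn, move) action, its [-1, 1] bounds, and the helper names are all hypothetical and chosen only for illustration.

```python
from itertools import product


def discretize_axis(low, high, n_bins):
    """Return n_bins evenly spaced values covering [low, high]."""
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    step = (high - low) / (n_bins - 1)
    return [low + i * step for i in range(n_bins)]


def build_discrete_actions(bounds, n_bins):
    """Enumerate every combination of binned values, one bin list per axis."""
    axes = [discretize_axis(low, high, n_bins) for low, high in bounds]
    return list(product(*axes))


# Hypothetical action layout: (turn, move), each axis in [-1, 1].
ACTIONS = build_discrete_actions([(-1.0, 1.0), (-1.0, 1.0)], n_bins=3)


def to_continuous(action_index):
    """Map a discrete action index back to a continuous action tuple."""
    return ACTIONS[action_index]
```

With 3 bins on 2 axes the agent chooses among 9 discrete actions. The count grows exponentially with the number of axes, so the bin count usually has to stay small.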

@Miffyli By the way, do you know of DQN training hyperparameters that work well in other ViZDoom scenarios (besides basic and health gathering)? :)

Maxwell2017 avatar Mar 08 '21 06:03 Maxwell2017

Sadly no, I have mainly used A2C or PPO for ViZDoom tasks lately. I think the default parameters used for Atari games should work reasonably well out-of-the-box, though :) .

Miffyli avatar Mar 08 '21 11:03 Miffyli