baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Dear author, thank you for providing these baselines; they are very useful for my research. I have a question about applying deepcopy to SubprocVecEnv. In my code, I...
When training ppo2 on a MuJoCo environment, I find that the episode reward reported in infos['episode']['r'] does not equal the sum of the per-step rewards. In the Humanoid environment, summing up...
When I tried to run the [PPO2 baseline](https://colab.research.google.com/drive/1rU20zJ281sZuMD1DHbsODFr1DbASL0RH#scrollTo=f3AsF_nuTpOj), I encountered this error: `Module 'tensorflow' has no attribute 'set_random_seed'`. As I dug deeper, I realized that in TF2 this function...
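For context on the error above, here is a minimal sketch of a version-tolerant seeding helper. It assumes the standard TensorFlow entry points `tf.random.set_seed` (TF2) and `tf.set_random_seed` (TF1); the helper name `set_global_seed` is hypothetical and is not part of baselines. The demo uses stand-in objects so it runs without TensorFlow installed.

```python
from types import SimpleNamespace


def set_global_seed(tf_module, seed):
    """Seed TensorFlow's global RNG via whichever API the module exposes.

    Hypothetical helper (not from baselines). Returns the name of the
    API that was called, which makes the chosen branch easy to inspect.
    """
    random_ns = getattr(tf_module, "random", None)
    if random_ns is not None and hasattr(random_ns, "set_seed"):
        # TF2 style: tf.random.set_seed(seed)
        random_ns.set_seed(seed)
        return "tf.random.set_seed"
    # TF1 style: tf.set_random_seed(seed)
    tf_module.set_random_seed(seed)
    return "tf.set_random_seed"


# Stand-ins for the two API shapes, used only for this demo.
seen = []
fake_tf2 = SimpleNamespace(
    random=SimpleNamespace(set_seed=lambda s: seen.append(("tf2", s)))
)
fake_tf1 = SimpleNamespace(set_random_seed=lambda s: seen.append(("tf1", s)))

used_tf2 = set_global_seed(fake_tf2, 7)
used_tf1 = set_global_seed(fake_tf1, 3)
```

With a real installation, the call would be `import tensorflow as tf; set_global_seed(tf, 0)`.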
Hi, on applying PPO2 to a custom MuJoCo environment, the policy entropy increases continuously even with a small entropy coefficient of 0.01 or less. In my understanding, ideally...
In /home/yxh/anaconda3/envs/tensorenv/lib/python3.6/site-packages/baselines/baselines/run.py, in 2 places: change `env._entry_point.split(':')[0].split('.')[-1]` to `env.entry_point.split(':')[0].split('.')[-1]`
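The attribute rename described above can also be handled without editing for one specific Gym version. The sketch below is one possible approach, not the change made in the report: the helper name `env_type_from_spec` is hypothetical, the specs are stand-in objects, and it assumes the entry point is a `module:Class` string.

```python
from types import SimpleNamespace


def env_type_from_spec(spec):
    """Return the last module component of a spec's entry point.

    Hypothetical helper: prefers the public `entry_point` attribute and
    falls back to the older private `_entry_point`.
    Assumes the entry point is a string such as "pkg.module:ClassName".
    """
    entry_point = getattr(spec, "entry_point", None)
    if entry_point is None:
        entry_point = spec._entry_point
    # "gym.envs.mujoco:Walker2dEnv" -> "gym.envs.mujoco" -> "mujoco"
    return entry_point.split(":")[0].split(".")[-1]


# Stand-in specs for the two attribute spellings (illustrative only).
new_spec = SimpleNamespace(entry_point="gym.envs.mujoco:Walker2dEnv")
old_spec = SimpleNamespace(_entry_point="gym.envs.mujoco:Walker2dEnv")

new_type = env_type_from_spec(new_spec)
old_type = env_type_from_spec(old_spec)
```

Both stand-in specs resolve to the same environment type, so the calling code does not need to know which attribute spelling is present.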
I have trained the PPO2 model on the Walker2d-v2 environment with nminibatches=64, using the following command: `python -m baselines.run --alg=ppo2 --env=Walker2d-v2 --num_timesteps=1e6 --seed=30 --network=mlp --num_env=1 --save_path="/home/surabhi/Downloads/github/baselines/result/walker2d/30/ppo2" --log_path="/home/surabhi/Downloads/github/baselines/result/walker2d/30/"` But when I run...
Used the random keyword instead of the set keyword for TensorFlow 2.
After installing the tf2 version, I tried to run the check command from the README and got the error above: > python -m baselines.run --alg=ppo2 --env=Humanoid-v2 --network=mlp --num_timesteps=2e7 Logging to /tmp/openai-2019-10-30-11-49-36-171979...
Signed-off-by: Fabrice Normandin
Hello, I am trying to write a loop to test the training performance of a DQN agent, which needs to load the model multiple times and reset the environment and tensorflow...