
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Results 127 baselines issues

Dear author, Thank you for providing these useful baselines. They are very useful for my research. However, I have a question about applying deepcopy to SubprocVecEnv. In my code, I...

When training ppo2 on a MuJoCo environment, I find that the episode reward read from infos['episode']['r'] does not equal the sum of the per-step rewards. In the Humanoid environment, summing up...
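One possible explanation, offered as an assumption since the post is truncated, is that a reward-rescaling wrapper sits above the episode monitor, so the monitor records raw rewards while the training loop sees rescaled ones. The sketch below reproduces that effect with toy classes; `ToyEnv`, `ToyMonitor`, and `ToyRewardScale` are hypothetical stand-ins, not baselines classes.

```python
class ToyEnv:
    """Hypothetical 5-step environment that pays a raw reward of 10 per step."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= 5
        return 0.0, 10.0, done, {}


class ToyMonitor:
    """Stand-in for an episode monitor: stores the raw return in info['episode']['r']."""

    def __init__(self, env):
        self.env = env
        self.total = 0.0

    def reset(self):
        self.total = 0.0
        return self.env.reset()

    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        self.total += rew  # accumulates the reward BEFORE any outer wrapper rescales it
        if done:
            info = dict(info, episode={"r": self.total})
        return obs, rew, done, info


class ToyRewardScale:
    """Stand-in for a reward-normalizing wrapper applied outside the monitor."""

    def __init__(self, env, scale):
        self.env = env
        self.scale = scale

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, rew, done, info = self.env.step(action)
        return obs, rew * self.scale, done, info


def run_episode(env):
    """Return (sum of rewards seen by the caller, return reported by the monitor)."""
    env.reset()
    summed, done, info = 0.0, False, {}
    while not done:
        _, rew, done, info = env.step(None)
        summed += rew
    return summed, info["episode"]["r"]


summed, reported = run_episode(ToyRewardScale(ToyMonitor(ToyEnv()), scale=0.5))
print(summed, reported)
```

If the wrapper order in the real setup matches this sketch, the two numbers are expected to differ, and the monitor's value is the one expressed in the environment's original reward units.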

When I tried to run the [PPO2 baseline](https://colab.research.google.com/drive/1rU20zJ281sZuMD1DHbsODFr1DbASL0RH#scrollTo=f3AsF_nuTpOj), I encountered this error: `Module 'tensorflow' has no attribute 'set_random_seed'` As I dug deeper, I realized that in TF2 this function...
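A version-tolerant seeding helper is one way to sidestep this error. The sketch below is a minimal illustration: it assumes the TF2 spelling `tf.random.set_seed` and the TF1 spelling `tf.set_random_seed`, and `resolve_seed_fn` is a hypothetical helper name, not part of baselines. Fake module objects are used so the snippet runs without TensorFlow installed.

```python
from types import SimpleNamespace


def resolve_seed_fn(tf_module):
    """Return the global seeding function exposed by the given TensorFlow-like module."""
    random_ns = getattr(tf_module, "random", None)
    if random_ns is not None and hasattr(random_ns, "set_seed"):
        return random_ns.set_seed          # TF2-style: tf.random.set_seed
    if hasattr(tf_module, "set_random_seed"):
        return tf_module.set_random_seed   # TF1-style: tf.set_random_seed
    raise AttributeError("no known seeding function found on this module")


# Fake modules that mimic only the attribute layout of each major version.
calls = []
fake_tf2 = SimpleNamespace(random=SimpleNamespace(set_seed=lambda s: calls.append(("tf2", s))))
fake_tf1 = SimpleNamespace(set_random_seed=lambda s: calls.append(("tf1", s)))

resolve_seed_fn(fake_tf2)(30)
resolve_seed_fn(fake_tf1)(30)
print(calls)
```

With a real installation, the call would look like `resolve_seed_fn(tf)(seed)` after `import tensorflow as tf`.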

Hi, When applying PPO2 to a custom MuJoCo environment, the policy entropy increases continuously, even with a small entropy coefficient of 0.01 or less. In my understanding, ideally...
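For context, if the policy is a diagonal Gaussian (an assumption; the post does not say which policy head is used), its entropy depends only on the log standard deviations, so logging them directly shows what drives the increase. The helper name `diag_gaussian_entropy` below is hypothetical.

```python
import math


def diag_gaussian_entropy(logstd):
    """Entropy of a diagonal Gaussian: sum over dimensions of logstd + 0.5 * log(2 * pi * e)."""
    const = 0.5 * math.log(2.0 * math.pi * math.e)
    return sum(ls + const for ls in logstd)


# The entropy grows linearly with each log standard deviation and ignores the mean.
narrow = diag_gaussian_entropy([0.0, 0.0])
wide = diag_gaussian_entropy([0.5, 0.5])
print(narrow, wide)
```

Under this assumption, a steadily rising entropy means the learned standard deviations are growing, independent of how the action means evolve.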

In /home/yxh/anaconda3/envs/tensorenv/lib/python3.6/site-packages/baselines/baselines/run.py, change `env._entry_point.split(':')[0].split('.')[-1]` to `env.entry_point.split(':')[0].split('.')[-1]` in 2 places.
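A lookup that tolerates both attribute names avoids editing the installed package for each gym version. This is a minimal sketch: `env_type_from_spec` is a hypothetical helper, the spec objects are fakes, and the entry point is assumed to be a `module.path:ClassName` string.

```python
from types import SimpleNamespace


def env_type_from_spec(spec):
    """Extract the environment family (e.g. 'mujoco') from a spec's entry point string."""
    # Prefer the public attribute; fall back to the older private one.
    entry_point = getattr(spec, "entry_point", None)
    if entry_point is None:
        entry_point = getattr(spec, "_entry_point")
    # 'gym.envs.mujoco:Walker2dEnv' -> 'gym.envs.mujoco' -> 'mujoco'
    return entry_point.split(":")[0].split(".")[-1]


# Fake specs that mimic only the attribute name used by each variant.
new_spec = SimpleNamespace(entry_point="gym.envs.mujoco:Walker2dEnv")
old_spec = SimpleNamespace(_entry_point="gym.envs.mujoco:Walker2dEnv")
print(env_type_from_spec(new_spec), env_type_from_spec(old_spec))
```

The sketch does not handle entry points that are callables rather than strings.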

I trained the PPO2 model on the Walker2d-v2 environment with nminibatches=64, using the following command: `python -m baselines.run --alg=ppo2 --env=Walker2d-v2 --num_timesteps=1e6 --seed=30 --network=mlp --num_env=1 --save_path="/home/surabhi/Downloads/github/baselines/result/walker2d/30/ppo2" --log_path="/home/surabhi/Downloads/github/baselines/result/walker2d/30/"` ![30](https://user-images.githubusercontent.com/51375621/70596643-65b88c00-1c0c-11ea-92e1-95cdaaa1af76.png) But when I run...

Used the random keyword instead of the set keyword for TensorFlow 2.

After installing the tf2 version, I tried to run the check command from the README and got the error above: > python -m baselines.run --alg=ppo2 --env=Humanoid-v2 --network=mlp --num_timesteps=2e7 Logging to /tmp/openai-2019-10-30-11-49-36-171979...

tf2

Hello, I am trying to write a loop to test the training performance of a DQN agent, which needs to load the model multiple times and reset the environment and tensorflow...