deep-rl-tensorflow

TensorFlow implementation of Deep Reinforcement Learning papers

20 deep-rl-tensorflow issues, sorted by most recently updated

Hello, I tried to reproduce the results (with n_action_repeat=1) on a machine with a GTX 1080, but the performance is not as good as shown in the figure. After 2.88...

Whether I am training or testing, there is a warning like ` [!] Load FAILED: checkpoints/Breakout-v0/env_name=Breakout-v0/agent_type=DQN/batch_size=32/beta=0.01/data_format=NHWC/decay=0.99/discount_r=0.99/double_q=False/ep_end=0.01/ep_start=1.0/gamma=0.99/history_length=4/learning_rate=0.00025/learning_rate_decay=0.96/learning_rate_decay_step=50000/learning_rate_minimum=0.00025/max_delta=None/max_grad_norm=None/max_r=1/min_delta=None/min_r=-1/momentum=0.0/n_action_repeat=1/network_header_type=nips/network_output_type=normal/observation_dims=80,80/random_start=True/t_ep_end=1000000/t_learn_start=50000/t_target_q_update_freq=10000/t_test=10000/t_train_freq=4/t_train_max=500000/use_cumulated_reward=False/` Yet there is indeed an output file generated in the path of...
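For what it's worth, that `[!] Load FAILED` line is what a typical TF 1.x checkpoint loader prints when no checkpoint exists in the directory yet, e.g. on a fresh run before the first save. A minimal sketch of that mechanism (an illustration only, not the repository's exact loader; `load_model` is a hypothetical name):

```python
import tensorflow as tf  # TF 1.x style API, as used by this repository

def load_model(sess, saver, checkpoint_dir):
    """Try to restore the latest checkpoint; return True on success (sketch)."""
    ckpt = tf.train.get_checkpoint_state(checkpoint_dir)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        print(" [*] Load SUCCESS: %s" % ckpt.model_checkpoint_path)
        return True
    print(" [!] Load FAILED: %s" % checkpoint_dir)  # expected on a first run
    return False
```

Under that assumption, the warning on a first training run is harmless, and the output file being written is the checkpoint that later runs would restore from.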

Got an InvalidArgumentError after 26 minutes of training. I upgraded to the most recent TensorFlow as suggested and did `$ pip install -U 'gym[all]' tqdm scipy`. I ran this on...

I think agent.py needs some modification: it seems wrong to keep using the old history when we start playing a new game (see the sketch below). ``` for self.t in tqdm(range(start_t, t_max), ncols=70, initial=start_t):...
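If I understand the concern, the fix would be to reset the stacked frame history whenever a new episode begins instead of carrying frames over from the previous game. A self-contained sketch of that idea (the `History` class, `start_new_game`, and the random frames are stand-ins, not the repository's actual code):

```python
import numpy as np
from tqdm import tqdm

class History:
    """Minimal stacked-frame buffer (illustrative, not the repository's History class)."""
    def __init__(self, length, dims):
        self.buf = np.zeros((length,) + dims, dtype=np.float32)

    def add(self, frame):
        self.buf[:-1] = self.buf[1:]  # shift old frames out
        self.buf[-1] = frame

    def reset(self):
        self.buf.fill(0)

def start_new_game(history, first_frame):
    """Clear the stack and refill it with the first frame of the new episode."""
    history.reset()
    for _ in range(history.buf.shape[0]):
        history.add(first_frame)

dims, history_length, t_max = (80, 80), 4, 1000
history = History(history_length, dims)
start_new_game(history, np.zeros(dims, dtype=np.float32))

for t in tqdm(range(t_max), ncols=70):
    frame = np.random.rand(*dims).astype(np.float32)  # stand-in for env.step(action)
    terminal = (t % 100 == 99)                        # pretend an episode ends every 100 steps
    history.add(frame)
    if terminal:
        # The point of the issue: do not carry frames from the finished game
        # into the next one; reset the history at every new episode.
        start_new_game(history, np.zeros(dims, dtype=np.float32))
```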

When running a test with python3 (the command from the README), numpy.ravel complains about the data type: `python3 main.py --network_header_type=mlp --network_output_type=normal --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025` [2017-03-20 13:28:03,027] Making...
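Since the full error is cut off, this is only a guess at the cause, but a common workaround for `numpy.ravel` dtype complaints on discrete observations is to cast explicitly before flattening (the `flatten_observation` helper below is hypothetical, not the repository's code):

```python
import numpy as np

def flatten_observation(observation):
    """Cast to a numeric dtype before flattening, so np.ravel does not
    complain about an object / mixed-type observation (illustrative workaround)."""
    return np.asarray(observation, dtype=np.float32).ravel()

print(flatten_observation(3))         # -> [3.]  (a bare state index)
print(flatten_observation([0] * 16))  # -> a 16-dimensional float vector
```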

`Traceback (most recent call last): File "main.py", line 172, in tf.app.run() File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 43, in run sys.exit(main(sys.argv[:1] + flags_passthrough)) File "main.py", line 169, in main agent.play(conf.ep_end) File "../deep-rl-tensorflow/agents/agent.py", line...`

Could you please tell me how you set the reward at each state? It seems that all F states receive a reward, so an agent might just keep...
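For context, the usual convention in FrozenLake/Corridor-style grids is that only the goal cell pays out, so an agent cannot farm reward by cycling over F cells. The sketch below illustrates that convention (an assumption on my part, not a dump of this environment's actual reward table):

```python
def cell_reward(cell):
    """Illustrative reward scheme for an S/F/H/G grid world.

    Only the goal returns a positive reward; frozen (F) cells return nothing,
    so there is no incentive to loop over them. A small step cost (e.g. -0.01)
    is sometimes added to discourage wandering even further.
    """
    if cell == 'G':
        return 1.0   # reaching the goal
    if cell == 'H':
        return 0.0   # falling in a hole: the episode ends with no reward
    return 0.0       # 'S' and 'F' cells

assert cell_reward('F') == 0.0
assert cell_reward('G') == 1.0
```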

I think DDPG could be added; this algorithm performs better for continuous action spaces. Looking forward to it. :)
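For anyone picking this up, here is a tiny sketch of one piece of DDPG that differs from the DQN variants already in the repo: Polyak-averaged ("soft") target-network updates, shown with plain NumPy parameter vectors rather than TensorFlow variables (purely illustrative, not a full implementation):

```python
import numpy as np

TAU = 0.001  # soft-update coefficient from the DDPG paper (Lillicrap et al., 2015)

def soft_update(target_params, online_params, tau=TAU):
    """theta_target <- tau * theta_online + (1 - tau) * theta_target, per parameter."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]

# Toy example: the target parameters slowly track the online ones.
target = [np.zeros(3)]
online = [np.ones(3)]
for _ in range(5):
    target = soft_update(target, online)
print(target[0])  # approaches [1, 1, 1] very gradually
```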