pytorch-dqn
Unmatching size and error
Hi, thanks for sharing your wonderful code. I ran into some errors when running it, though.
- In lines 197~205 of `dqn_learn.py`, the sizes of `target_Q_values` and `current_Q_values` do not match. I changed the code to `next_max_q = next_max_q.unsqueeze(-1)` to correct the sizes. I also changed line 203 to use `rew_batch[0]`.
- (IMO) After records are stacked in the replay buffer, queuing the action does not work properly. I changed line 158 to `action = select_epilson_greedy_action(Q, recent_observations, t)`, but a different action value was queued.
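To make the first point concrete, here is a minimal sketch of how such a size mismatch can arise and one way to align the shapes. It is not the code from `dqn_learn.py`: the random tensors are hypothetical stand-ins for the network outputs and replay-buffer batches, `batch_size`, `num_actions`, and `gamma` are made-up values, and it is written against a recent PyTorch release, so behavior on 0.2.0 may differ.

```python
import torch

batch_size, num_actions, gamma = 4, 3, 0.99

# Hypothetical stand-ins for the network outputs and replay-buffer batches.
q_values = torch.randn(batch_size, num_actions)       # plays the role of Q(obs_batch)
next_q_values = torch.randn(batch_size, num_actions)  # plays the role of target_Q(next_obs_batch)
act_batch = torch.randint(0, num_actions, (batch_size,))
rew_batch = torch.randn(batch_size)
not_done_mask = torch.ones(batch_size)

# gather keeps the action dimension, so this has shape (batch_size, 1).
current_Q_values = q_values.gather(1, act_batch.unsqueeze(1))

# max over dim 1 drops that dimension by default, giving shape (batch_size,).
next_max_q = next_q_values.max(1)[0]

# (batch_size,) minus (batch_size, 1) silently broadcasts to
# (batch_size, batch_size) instead of raising an error.
target_flat = rew_batch + gamma * not_done_mask * next_max_q
bad_error = target_flat - current_Q_values

# Giving every per-sample term a trailing dimension keeps the shapes aligned.
target_Q_values = (
    rew_batch.unsqueeze(-1)
    + gamma * not_done_mask.unsqueeze(-1) * next_max_q.unsqueeze(-1)
)
bellman_error = target_Q_values - current_Q_values

print(bad_error.shape)      # mismatched case
print(bellman_error.shape)  # aligned case
```

Unsqueezing only `next_max_q` while leaving the reward and mask one-dimensional would still broadcast to a square matrix in this sketch, which may be why a further change on line 203 seemed necessary.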
I am still working on these but am having trouble. Could you help me fix them?
Thanks for your question. I won't be available for a few days, but I will revisit it when I have time. Which PyTorch version are you using? I haven't updated to the latest version, so that might be the problem.
@transedward Thanks for your reply. I tested with PyTorch 0.2.0.post1 (0.2.0.1) and Python 3.5.3 with Anaconda on Ubuntu 16.04.
@tegg89: Check out #8. Let us know whether it works.