keras-rl2
DDPGAgent is incompatible with MultiInputProcessor for HandReach-v0 env
`DDPGAgent` fails to train the critic model when used with a `MultiInputProcessor`. The failure occurs in its `backward` method, specifically at lines 260-263:

    if len(self.critic.inputs) >= 3:
        state1_batch_with_action = state1_batch[:]
    else:
        state1_batch_with_action = [state1_batch]
    state1_batch_with_action.insert(self.critic_action_input_idx, target_actions)

This raises `TypeError: unhashable type: 'slice'` because `state1_batch` is a dictionary with three keys, as returned from the processor, and a dictionary cannot be sliced. The code appears to assume that `state1_batch` is always a list rather than a dictionary. The same applies a few lines further down to `state0_batch`. I would love to fix this myself, but I am unsure why `3` is hardcoded in the check or why the number of critic inputs should make a difference. I'd appreciate it if someone could explain.
Here is the script: hand_reach.py
Please make sure that the boxes below are checked before you submit your issue. If your issue is an implementation question, please ask your question in the Discord.
Thank you!
- [x] Check that you are up-to-date with the master branch of Keras-RL. You can update with:
  `pip install git+git://github.com/wau/keras-rl2.git --upgrade --no-deps`
- [x] Check that you are up-to-date with the master branch of Keras. You can update with:
  `pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps`
- [x] Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short). If you report an error, please include the error message and the backtrace.