irl-lab
An IndexError occurred when I ran the sample usage code.
Hello irl-lab :) I ran into the IndexError below. I tried casting those indices to int, but it didn't help. Is there anything I missed? Thanks a lot :)
```
IndexError Traceback (most recent call last)
<ipython-input-26-533cf5bf2336> in <module>()
7 # Obtain the optimal policy for the environment to generate expert demonstrations
8 pi = PolicyIteration(gw)
----> 9 optimal_policy = pi.policy_iteration(100)
10
11 expert_demos = gw.generate_trajectory(optimal_policy)
C:\Documents\irl-lab-master\algo\PolicyIteration.py in policy_iteration(self, num_iters)
16
17 def policy_iteration(self,num_iters=10):
---> 18 print("pi_")
19 for i in range(num_iters):
20 self.policy_evaluation()
C:\Documents\irl-lab-master\algo\PolicyIteration.py in policy_evaluation(self, num_iters, gamma)
11 transition_probs = np.zeros((self.env.num_states,self.env.num_states))
12 print(self.env.num_states)
---> 13 for j in range(self.env.num_states):
14 transition_probs[int(j)] = self.env.get_transition_probabilities(j,self.policy[int(j)])
15 self.values = self.env.get_rewards() + gamma*np.dot(transition_probs,self.values)
C:\Documents\irl-lab-master\env\GridWorld.py in get_transition_probabilities(self, state, action)
68 transition_probs[state_coords1,int(min(self.grid_size-1,state_coords2+1))] += 0.1
69 elif action == "right":
---> 70 transition_probs[int(max(0,state_coords1-1)),state_coords2] += 0.1
71 transition_probs[state_coords1,int(max(0,state_coords2-1))] += 0.1
72 transition_probs[int(min(self.grid_size-1,state_coords1+1)),state_coords2] += 0.1
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
```
It looks like you have modified the implementation of `get_transition_probabilities` inside `GridWorld.py`. This is how it looks inside the repository.
I can still help you with the specific problem you are facing if you share how you are assigning the variables `state_coords1` and `state_coords2`.
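In the meantime, here is a guess at what might be going on. NumPy raises that message when an index is a float rather than an integer, and on line 70 of the traceback `state_coords2` is passed without an `int(...)` cast. The sketch below reproduces the error under the assumption that the coordinates come from dividing a flat state index by the grid size. The names `grid_size`, `state`, `row_float`, `row` and `col` are made up for illustration and are not taken from `GridWorld.py`.

```python
import numpy as np

grid_size = 4   # hypothetical grid size, for illustration only
state = 6       # hypothetical flat state index
transition_probs = np.zeros((grid_size, grid_size))

# Assumption: the coordinates are derived by division.
# In Python 3, `/` is true division and always returns a float.
row_float = state / grid_size
col = state % grid_size

try:
    # A float index is rejected by NumPy with the IndexError from the traceback.
    transition_probs[row_float, col] += 0.1
    float_index_failed = False
except IndexError:
    float_index_failed = True

# divmod uses floor division, so both coordinates stay plain ints.
row, col = divmod(state, grid_size)
transition_probs[row, col] += 0.1
```

If the coordinates in your copy are computed some other way (for example through `np.floor`, which also returns floats), casting each one to `int` once, right where it is assigned, should have the same effect as the `divmod` call here.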