An IndexError occurred when I ran the sample usage code.

Open • ShenTw opened this issue 6 years ago • 1 comment

Hello irl-lab :) I ran into the IndexError below. I tried casting those indices to integers, but it didn't help. Is there anything I missed? Thanks a lot :)

IndexError                                Traceback (most recent call last)
<ipython-input-26-533cf5bf2336> in <module>()
      7 # Obtain the optimal policy for the environment to generate expert demonstrations
      8 pi = PolicyIteration(gw)
----> 9 optimal_policy = pi.policy_iteration(100)
     10 
     11 expert_demos = gw.generate_trajectory(optimal_policy)

C:\Documents\irl-lab-master\algo\PolicyIteration.py in policy_iteration(self, num_iters)
     16 
     17     def policy_iteration(self,num_iters=10):
---> 18         print("pi_")
     19         for i in range(num_iters):
     20             self.policy_evaluation()

C:\Documents\irl-lab-master\algo\PolicyIteration.py in policy_evaluation(self, num_iters, gamma)
     11             transition_probs = np.zeros((self.env.num_states,self.env.num_states))
     12             print(self.env.num_states)
---> 13             for j in range(self.env.num_states):
     14                 transition_probs[int(j)] = self.env.get_transition_probabilities(j,self.policy[int(j)])
     15             self.values = self.env.get_rewards() + gamma*np.dot(transition_probs,self.values)

C:\Documents\irl-lab-master\env\GridWorld.py in get_transition_probabilities(self, state, action)
     68             transition_probs[state_coords1,int(min(self.grid_size-1,state_coords2+1))] += 0.1
     69         elif action == "right":
---> 70             transition_probs[int(max(0,state_coords1-1)),state_coords2] += 0.1
     71             transition_probs[state_coords1,int(max(0,state_coords2-1))] += 0.1
     72             transition_probs[int(min(self.grid_size-1,state_coords1+1)),state_coords2] += 0.1

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

ShenTw • Aug 27 '18 08:08

It looks like you have modified the implementation of get_transition_probabilities inside GridWorld.py. This is how it looks inside the repository.

I can still help with the specific problem you are facing if you share how you are assigning the variables state_coords1 and state_coords2.
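As a rough illustration of why the assignment matters: NumPy raises this exact IndexError whenever any index is a float. In line 70 of the traceback the first index is wrapped in int() but the second (state_coords2) is not, so a float there would still fail. The sketch below assumes a row-major mapping from a flat state to grid coordinates; the names grid_size, state, and try_index are hypothetical and not taken from the repository.

```python
import numpy as np

grid_size = 5  # hypothetical grid dimension
probs = np.zeros((grid_size, grid_size))

# Hypothetical float coordinates, e.g. produced by "/" (true division
# in Python 3) or by np.floor, both of which return floats.
row, col = 1.0, 2.0


def try_index(r, c):
    """Return True if probs[r, c] can be updated, False on IndexError."""
    try:
        probs[r, c] += 0.1
        return True
    except IndexError:
        return False


# Casting only one index is not enough: col is still a float.
partial_cast_ok = try_index(int(max(0, row - 1)), col)

# Casting every index works.
full_cast_ok = try_index(int(max(0, row - 1)), int(col))

# Alternatively, keep the coordinates integral from the start.
# Assumption: states are numbered row by row.
state = 7
row_i, col_i = divmod(state, grid_size)
int_coords_ok = try_index(row_i, col_i)

print(partial_cast_ok, full_cast_ok, int_coords_ok)
```

If the coordinates come from a division, using divmod or // instead of / keeps them as integers, so no int() casts are needed at the indexing sites.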

aravindsiv • Aug 30 '18 10:08