adventures-in-ml-code
Policy Gradient Issue: ValueError: Shapes (20, 1) and (20, 2) are incompatible
Hi.
The code fails on this line: loss = network.train_on_batch(states, discounted_rewards)
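Try this, it should work. The error suggests the network outputs two values per state (shape (20, 2)), so the targets need to be one-hot actions of the same shape, and the discounted rewards go in as sample weights instead: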
target_actions = np.array([[1 if a == i else 0 for i in range(2)] for a in actions])
loss = network.train_on_batch(states, target_actions, sample_weight=discounted_rewards)
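For anyone wondering why this works, here is a minimal NumPy-only sketch (no Keras involved) of what the weighted loss computes. The helper names `one_hot` and `weighted_cross_entropy` and all the numbers are made up for illustration, and it assumes the model is compiled with categorical cross-entropy. The exact way Keras reduces sample weights over the batch may differ by version, so treat this as a sketch of the idea rather than the library's implementation.

```python
import numpy as np

def one_hot(actions, num_actions):
    """Turn integer action indices into one-hot rows of shape (batch, num_actions)."""
    return np.eye(num_actions)[np.asarray(actions)]

def weighted_cross_entropy(probs, targets, weights):
    """Per-sample categorical cross-entropy, scaled by a weight, then averaged."""
    per_sample = -np.sum(targets * np.log(probs), axis=1)
    return float(np.mean(weights * per_sample))

# Hypothetical values: 3 steps, 2 possible actions
actions = [0, 1, 1]
probs = np.array([[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])  # stand-in for network output
discounted_rewards = np.array([1.0, 0.5, -0.5])

target_actions = one_hot(actions, 2)  # shape (3, 2), matches the output shape
loss = weighted_cross_entropy(probs, target_actions, discounted_rewards)

# The same quantity written directly as the policy-gradient objective:
# minus the mean of (discounted reward * log-probability of the action taken)
log_prob_taken = np.log(probs[np.arange(len(actions)), actions])
direct = float(-np.mean(discounted_rewards * log_prob_taken))
```

With one-hot targets, the cross-entropy for each step reduces to the negative log-probability of the action that was taken, so weighting it by the discounted reward gives the usual policy-gradient loss. That is why `loss` and `direct` agree here.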