adventures-in-ml-code
ValueError: Shapes (31, 1) and (31, 2) are incompatible
loss = update_network(network, rewards, states, actions, num_actions)
loss = network.train_on_batch(states, discounted_rewards)
I also had this issue. If you are on TensorFlow 2.2, downgrading to either 2.0 or 2.1 may fix it; it did for me at least. However, after fixing that issue the model never converges: after thousands of episodes it still gets stuck around a reward of 10-20, with an average loss in the negative thousands. Not sure how to fix this.
Adding these lines will fix the issue:
target_actions = np.array([[1 if a == i else 0 for i in range(2)] for a in actions])
loss = network.train_on_batch(states, target_actions, sample_weight=discounted_rewards)
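The shape error comes from passing the `(batch, 1)` discounted rewards as the training target for a network whose softmax output has shape `(batch, 2)`. The fix above one-hot encodes the sampled actions so the target shape matches the output, and moves the rewards into `sample_weight` so each example's cross-entropy is scaled by its return. A minimal NumPy sketch of just the one-hot construction (the `one_hot_actions` helper name and the batch of actions are illustrative, not from the repo):

```python
import numpy as np

def one_hot_actions(actions, num_actions):
    """Convert integer action indices, shape (batch,), into one-hot
    target rows, shape (batch, num_actions), so the targets line up
    with the network's softmax output."""
    targets = np.zeros((len(actions), num_actions))
    targets[np.arange(len(actions)), actions] = 1.0
    return targets

# Example: 4 sampled actions from a 2-action (CartPole-style) policy
actions = np.array([0, 1, 1, 0])
targets = one_hot_actions(actions, num_actions=2)
print(targets.shape)  # (4, 2) — now compatible with the (batch, 2) output
```

With targets shaped this way, `network.train_on_batch(states, targets, sample_weight=discounted_rewards)` computes the reward-weighted cross-entropy that the policy-gradient update needs, assuming the model was compiled with a categorical cross-entropy loss.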