Fix the bug in the per-action probabilities

Open fstonezst opened this issue 8 years ago • 0 comments

The probability of the best action is 1 - epsilon, so the probabilities of the remaining actions sum to epsilon, and the number of remaining actions should be nA - 1.
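A minimal sketch of the distribution described above, not the repository's actual code. It assumes the leftover epsilon is split equally across the nA - 1 non-best actions. The function name `epsilon_greedy_probs` and its parameters are hypothetical, chosen only for illustration.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Return action probabilities for a list of action values.

    Hypothetical helper: the best action gets 1 - epsilon, and the
    remaining nA - 1 actions share epsilon equally (an assumption).
    """
    n_actions = len(q_values)
    if n_actions == 1:
        # Only one action exists, so it is always chosen.
        return [1.0]
    # Index of the highest-valued action (first one on ties).
    best = max(range(n_actions), key=lambda a: q_values[a])
    # Every non-best action gets an equal share of epsilon.
    probs = [epsilon / (n_actions - 1)] * n_actions
    probs[best] = 1.0 - epsilon
    return probs


probs = epsilon_greedy_probs([0.1, 0.5, 0.2, 0.0], epsilon=0.1)
```

With this split the probabilities sum to 1: (1 - epsilon) + (nA - 1) * epsilon / (nA - 1) = 1.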

May 26 '17 09:05 fstonezst