basic_reinforcement_learning
about the code
In tutorial1, qlearn_mod_random.py
line 32:
    if random.random() < self.epsilon:
        minQ = min(q)
        mag = max(abs(minQ), abs(maxQ))
        # add random values to all the actions, recalculate maxQ
        q = [q[i] + random.random() * mag - .5 * mag for i in range(len(self.actions))]
        maxQ = max(q)
Why use this approach (versus the one in qlearn.py)?
I restructured your code to make it more configurable, if you'll pardon me. The link is mycode. The question above is still bothering me, and I would appreciate it very much if you could give an explanation.
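To make the question concrete, here is a minimal sketch that isolates the exploration step quoted above and puts it next to plain epsilon-greedy selection. The function names `perturbed_greedy` and `uniform_epsilon_greedy` are hypothetical, and the uniform rule is only my assumption about what qlearn.py does; I have not confirmed it against the repository.

```python
import random


def perturbed_greedy(q, epsilon, rng=random):
    """Noise-based exploration, mirroring the quoted snippet.

    With probability epsilon, add uniform noise in [-mag/2, mag/2) to every
    Q-value, then act greedily on the (possibly perturbed) values.
    """
    q = list(q)
    max_q = max(q)
    if rng.random() < epsilon:
        min_q = min(q)
        mag = max(abs(min_q), abs(max_q))
        # add random values to all the actions, recalculate the maximum
        q = [v + rng.random() * mag - 0.5 * mag for v in q]
        max_q = max(q)
    # break ties between equally good actions at random
    best = [i for i, v in enumerate(q) if v == max_q]
    return rng.choice(best)


def uniform_epsilon_greedy(q, epsilon, rng=random):
    """Plain epsilon-greedy (assumed behaviour of qlearn.py, not confirmed)."""
    if rng.random() < epsilon:
        # explore: every action is equally likely, whatever its Q-value
        return rng.randrange(len(q))
    max_q = max(q)
    best = [i for i, v in enumerate(q) if v == max_q]
    return rng.choice(best)


if __name__ == "__main__":
    rng = random.Random(0)
    q_values = [0.0, 0.0, 10.0]
    for policy in (perturbed_greedy, uniform_epsilon_greedy):
        counts = [0, 0, 0]
        for _ in range(1000):
            counts[policy(q_values, 1.0, rng)] += 1
        print(policy.__name__, counts)
```

If I read it correctly, the noise has width `mag`, so an action whose Q-value leads the others by `mag` or more can never be displaced by the perturbation, while the uniform rule ignores the Q-values entirely when it explores. Is that value-dependent exploration the intended reason for the change?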