deep-reinforcement-learning
Show differences from optimal
Shows where the MC blackjack policy plot differs from the optimal policy. It just puts little red X's on the graph wherever your blackjack policy deviates from the optimal one.
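Roughly, the idea can be sketched like this. This is a minimal illustration, not the actual diff: it assumes both policies are dicts keyed by `(player_sum, dealer_card, usable_ace)`, that the plot has the player's sum on the x axis and the dealer's card on the y axis, and the names `policy_deviations` and `mark_deviations` are made up for the example.

```python
def policy_deviations(policy, optimal_policy):
    """Return the sorted states where `policy` and `optimal_policy` disagree.

    Assumption: both are dicts mapping (player_sum, dealer_card, usable_ace)
    to an action. States missing from `optimal_policy` are skipped.
    """
    return sorted(
        state
        for state, action in policy.items()
        if state in optimal_policy and optimal_policy[state] != action
    )


def mark_deviations(ax, deviations, usable_ace):
    """Draw a red X on a matplotlib Axes for each deviating state.

    Only states whose usable-ace flag equals `usable_ace` are drawn, so the
    function can be called once per subplot. Returns the number of marks.
    Assumption: x axis is the player's sum, y axis is the dealer's card.
    """
    xs = [player_sum for player_sum, _, ace in deviations if ace == usable_ace]
    ys = [dealer_card for _, dealer_card, ace in deviations if ace == usable_ace]
    ax.scatter(xs, ys, marker="x", color="red")
    return len(xs)
```

Calling `mark_deviations(ax, policy_deviations(policy, optimal_policy), usable_ace=True)` on the usable-ace subplot, and again with `usable_ace=False` on the other, would overlay the marks on an existing policy plot.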
Minor improvement but I found it useful. Here's what it looks like: