
Accompanying repository for the "Let's make a DQN / A3C" series.

10 AI-blog issues

I am trying to adapt your code to train the agent to play Breakout. I tried to use both the CartPole-basic file as well as the Seaquest-DDQN-PER file, but the...

First, what are EPS_START, EPS_STOP, and EPS_STEPS? If I want episodes to last until the game naturally terminates an episode, how would I modify these? Could I just set EPS_STEPS...
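In the A3C example these constants typically control linear annealing of the exploration rate, not episode length. A minimal sketch of that interpretation (the specific values below are assumptions, not taken from the repository):

```python
# Sketch of linear epsilon annealing, assuming EPS_START/EPS_STOP/EPS_STEPS
# work as in the A3C example: epsilon decays linearly from EPS_START to
# EPS_STOP over the first EPS_STEPS environment steps, then stays constant.

EPS_START = 0.4    # initial exploration rate (assumed value)
EPS_STOP = 0.15    # final exploration rate (assumed value)
EPS_STEPS = 75000  # number of steps over which to anneal (assumed value)

def epsilon(step):
    """Return the exploration rate after `step` environment steps."""
    if step >= EPS_STEPS:
        return EPS_STOP
    # linear interpolation between EPS_START and EPS_STOP
    return EPS_START + step * (EPS_STOP - EPS_START) / EPS_STEPS
```

Under this reading, these constants do not terminate episodes; episodes end whenever the environment returns `done`, regardless of the annealing schedule.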

Given that the OpenAI Gym environment [MountainCar-v0](https://github.com/openai/gym/blob/master/gym/envs/classic_control/mountain_car.py) ALWAYS returns -1.0 as a reward (even when the goal is achieved), I don't understand how DQN with experience replay converges, yet I know it...
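One way to see why a constant -1 reward can still be informative: because the episode terminates at the goal, shorter episodes accumulate less negative return, which pushes the agent toward finishing faster. A small sketch of this reasoning (the discount factor is an assumption):

```python
# Sketch: with a -1 reward at every step and termination at the goal,
# the return of an episode depends only on its length, so reaching the
# goal sooner yields a strictly higher (less negative) return.

GAMMA = 0.99  # assumed discount factor

def episode_return(length, gamma=GAMMA):
    """Discounted return of an episode that gives -1 reward at every step."""
    return sum(-1.0 * gamma**t for t in range(length))

# Reaching the goal in 100 steps beats timing out at 200 steps.
assert episode_return(100) > episode_return(200)
```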

In line 202 of the A3C version, R is divided by GAMMA. But in line 211, it is not. According to your blog: R_1 = (R_0 - r_0 + gamma^n * r_n) / gamma. I think line...
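The formula quoted above describes sliding the n-step return window forward by one step: drop the oldest reward, append the newest, and rescale by 1/gamma. A minimal sketch checking the algebra (variable names and the horizon are illustrative, not from the repository):

```python
# Sketch of the sliding n-step return R_1 = (R_0 - r_0 + gamma**n * r_n) / gamma.
# When the window moves from [r_0 .. r_{n-1}] to [r_1 .. r_n], the update
# removes r_0, adds r_n discounted by gamma**n, and divides by gamma so that
# every remaining reward shifts one discount step closer.

GAMMA = 0.99  # assumed discount factor

def nstep_return(rewards, gamma=GAMMA):
    """Direct computation: R = sum_i gamma**i * r_i over the window."""
    return sum(gamma**i * r for i, r in enumerate(rewards))

def advance(R_old, r_oldest, r_newest, n, gamma=GAMMA):
    """Incremental update when the n-step window slides forward by one."""
    return (R_old - r_oldest + gamma**n * r_newest) / gamma
```

The incremental form agrees with recomputing the return from scratch on the shifted window, which is a quick way to check which of the two lines in the code should carry the division.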

Although I tried and tested it in a terminal with Python 3.6, where it runs successfully, the indentation in "files changed" looks a bit off.

The Brain class uses the global stateCnt and actionCnt instead of local ones.

The `threading` library used in the A3C example is not truly parallel. See https://docs.python.org/3/library/threading.html. > In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at...
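The point can be illustrated with a small sketch: CPU-bound pure-Python threads interleave under the GIL rather than running simultaneously, so `threading` mainly helps when the work releases the GIL (I/O, or native calls inside NumPy/TensorFlow). The constants below are arbitrary:

```python
# Sketch: two CPU-bound pure-Python threads run concurrently (interleaved)
# but not in parallel, because each holds the GIL while executing bytecode.

import threading

COUNT = 200_000  # arbitrary amount of work per thread

def burn(out, idx):
    """Pure-Python CPU-bound loop; holds the GIL while running."""
    total = 0
    for i in range(COUNT):
        total += i
    out[idx] = total

results = [None, None]
threads = [threading.Thread(target=burn, args=(results, i)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads finish with correct results -- concurrency works -- but
# their bytecode executed interleaved, not on two cores at once.
assert results == [sum(range(COUNT))] * 2
```

In the A3C example this is arguably less damaging than it sounds, since much of each worker's time is spent inside framework calls that can release the GIL; `multiprocessing` would be the usual fix for genuinely CPU-bound Python workers.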

I could be wrong, but it does not seem that you are annealing the bias with importance sampling as suggested in the PER paper (section 3.4). w_i = (1/N * 1/P(i))^beta...
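For reference, a hedged sketch of the importance-sampling correction the PER paper describes: w_i = (1/N * 1/P(i))^beta, with beta annealed toward 1 and the weights normalized by their maximum for stability. The annealing horizon below is an assumption; BETA_START = 0.4 matches the value used in the paper's experiments:

```python
# Sketch of PER importance-sampling weights (section 3.4 of the paper):
# w_i = (1/N * 1/P(i))**beta, beta annealed linearly from BETA_START to 1,
# weights divided by max_i w_i so updates are only ever scaled down.

import numpy as np

BETA_START = 0.4       # initial beta (value from the paper's experiments)
BETA_FRAMES = 100_000  # annealing horizon in frames (assumed)

def beta_by_frame(frame):
    """Anneal beta linearly from BETA_START to 1.0 over BETA_FRAMES."""
    return min(1.0, BETA_START + frame * (1.0 - BETA_START) / BETA_FRAMES)

def is_weights(sample_probs, buffer_size, frame):
    """IS weights for sampled transitions, given their probabilities P(i)."""
    probs = np.asarray(sample_probs, dtype=np.float64)
    w = (buffer_size * probs) ** (-beta_by_frame(frame))
    return w / w.max()  # normalize by the max weight for stability
```

Transitions sampled with higher probability get smaller weights, which compensates for the non-uniform sampling that prioritization introduces.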

Hello, first of all, congrats on the article. I'm using it for study, and I'm trying to run your code to understand it better. So, I have some questions: - Do...