async-rl
Scores Too Low
I don't get any errors, but when I run `play.py` for Breakout using your sample weights, it gets scores of only a few points. And if I train the model myself (either from scratch or by resuming training from your saved model), it gets those same low scores. I tried all three of your versions:

- 1-step Q-learning
- n-step Q-learning
- A3C
I tested A3C just now and it seems to work fine. Here are the scores from 10 games:

```
Game # 1; Reward 193;
Game # 2; Reward 319;
Game # 3; Reward 270;
Game # 4; Reward 75;
Game # 5; Reward 229;
Game # 6; Reward 292;
Game # 7; Reward 152;
Game # 8; Reward 295;
Game # 9; Reward 364;
Game # 10; Reward 361;
```
Can you post your scores and the versions of `gym`, `keras`, and `theano` you are using?
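For reference, one quick way to print those versions from a single interpreter session; this is just a convenience sketch using `pkg_resources`, which ships with setuptools:

```python
# Print the Python and package versions relevant to this issue.
import sys
import pkg_resources

print('Python:', sys.version.split()[0])
for pkg in ('gym', 'keras', 'theano'):
    print(pkg + ':', pkg_resources.get_distribution(pkg).version)
```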
I don't have the exact results, but the rewards for the 10 games were all 0 through 5, using your sample weights. Here's what I am using:

- gym: 0.9.3 (from git)
- keras: 2.1.2
- theano: 1.0.1
- Python: 3.5.2
This is strange. Can you check what the input image for the network looks like (see the `transform_screen` function)? I suspect there might be a problem with the channel layout in the convolutions or with the `atari_py` screen data.
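For anyone hitting the same issue, a minimal debugging sketch along those lines is below. It is not code from the repo: the import location of `transform_screen`, the environment id, and the single-frame output shape are all assumptions to adjust against your checkout.

```python
# Debugging sketch (assumptions noted inline): dump what the network sees.
import gym
import numpy as np
import matplotlib
matplotlib.use('Agg')              # render to file, no display required
import matplotlib.pyplot as plt
import keras.backend as K

# Keras 2 defaults to 'channels_last', while models written for the
# Theano backend usually expect 'channels_first' -- print the setting.
print('image_data_format:', K.image_data_format())

env = gym.make('Breakout-v0')      # assumed environment id
raw = env.reset()                  # raw RGB frame, shape (210, 160, 3)

from play import transform_screen  # assumed import location
processed = transform_screen(raw)  # assumed to return one grayscale frame

print('raw:      ', raw.shape, raw.dtype)
print('processed:', processed.shape, processed.dtype,
      'min:', processed.min(), 'max:', processed.max())

# A transposed, tiled, or all-black image here would point at the
# channel-layout / screen-data problem suspected above. If the output
# stacks several frames, index one of them before plotting.
plt.imshow(np.squeeze(processed), cmap='gray')
plt.title('transform_screen output')
plt.savefig('transform_screen_debug.png')
```

A mismatch between the `image_data_format` in `~/.keras/keras.json` and the layout the saved weights were trained with would not raise an error; it could silently scramble the convolution inputs and explain this kind of score collapse.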