
Results 16 comments of lker

The async mode is not "thread safe" in the classical sense at all; it just happens to work. I didn't see NaNs while I was running long experiments with the latest...
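
To illustrate what I mean, here's a minimal Hogwild-style sketch (names and sizes are illustrative, not our actual code): workers write into shared parameters without any locking, and in practice training tolerates the races.

```lua
require 'torch'
local threads = require 'threads'
threads.Threads.serialization('threads.sharedserialize') -- share tensors instead of copying

local params = torch.zeros(1000) -- flattened network parameters, shared by all workers

local pool = threads.Threads(4, function()
  require 'torch'
end)

for step = 1, 100 do
  pool:addjob(function()
    -- Stand-in for a real gradient computed from the worker's own experience.
    local grad = torch.randn(params:size(1))
    -- Lock-free update: concurrent writes can race, but each update is
    -- small enough that learning still works in practice.
    params:add(-0.001, grad)
  end)
end

pool:synchronize()
pool:terminate()
```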

This is the Caffe implementation from the paper: https://github.com/mhauskn/dqn/tree/recurrent. Although I've never looked at Caffe, it will probably help.

@Kaixhin I see you started working on this, cool. I'll have some time now, so I'll look at the multigpu and async modes.

@Kaixhin Awesome! I have no experience with `rnn` either; I'll need to study it to get an idea. I have two 980 Tis and will be able to run longer...

@Kaixhin cool, I'll have my hands full with async for now, but in the meantime I'll be able to help with running longer rdqn experiments on my workstation when you...

@Kaixhin I'm not getting the error you mentioned when doing validation on the last batch with size 20 when running `demo`. I'm using the `master` code, which has `sequencer:remember('both')` enabled...
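
For reference, this is the behaviour I'm relying on (a standalone sketch; sizes are illustrative):

```lua
require 'rnn'

local lstm = nn.Sequencer(nn.FastLSTM(4, 4))
lstm:remember('both') -- keep hidden state across calls in both train and eval mode

lstm:evaluate()
local batch = {torch.randn(20, 4)} -- one step, batch size 20
lstm:forward(batch) -- hidden state carries over into the next call...
lstm:forward(batch)
lstm:forget()       -- ...until it is explicitly reset
```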

OK, so there are two issues:

1. With `nn.FastLSTM.usenngraph = true`: `nngraph/gmodule.lua:335: split(4) cannot split 32 outputs`. This is an issue in both `rnn` and `master`.
2. With `nn.FastLSTM.usenngraph = false`: `Wrong size...
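
In case it helps reproduction, this is roughly how I'm triggering 1 (a standalone sketch with illustrative sizes, not necessarily the minimal repro):

```lua
require 'rnn'

nn.FastLSTM.usenngraph = true -- set to false to hit issue 2 instead
local net = nn.Sequencer(nn.FastLSTM(32, 32))

-- Forwarding a single step with a batch of 32 is where the nngraph path
-- fails for me with "split(4) cannot split 32 outputs".
local out = net:forward({torch.randn(32, 32)})
```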

@Kaixhin Re: 2, agreed, this error is bad, so returning before it is not a solution. I'm not sure if learning is bad with the normal batch sizes; it could be only...

@Kaixhin I need to refresh `async` from `master` for the recurrent work. Should I do a merge or a rebase (I'm leaning towards a merge)? Does it even matter when merging back...

Done with the merge, and I added `recurrent` support for 1-step Q in `async`. This is after 7 minutes of training and it seems to work well: ![scores](https://cloud.githubusercontent.com/assets/15705158/15519479/f4ae47dc-2201-11e6-9828-67541fcc2b2f.png) The agent sees only the latest frame per...
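
For context, this is the rough shape of the setup (illustrative layer sizes; the real `async` code differs in detail):

```lua
require 'rnn'

-- Sketch of a 1-step recurrent Q-head: history lives in the LSTM state
-- rather than in a stack of input frames.
local lstm = nn.FastLSTM(32 * 20 * 20, 256)

local net = nn.Sequential()
net:add(nn.SpatialConvolution(1, 32, 8, 8, 4, 4)) -- one frame in: 1x84x84 -> 32x20x20
net:add(nn.ReLU(true))
net:add(nn.View(-1):setNumInputDims(3))           -- flatten to batch x 12800
net:add(lstm)                                     -- recurrent layer carries the history
net:add(nn.Linear(256, 6))                        -- one Q-value per action

local q = net:forward(torch.randn(1, 1, 84, 84))  -- one frame per step
lstm:forget()                                     -- reset state between episodes
```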