
Results 16 comments of lker

The async mode is not "thread safe" in the classical sense at all; it just happens to work. I didn't see NaNs while I was running long experiments with the latest...
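
To illustrate what I mean, here's a minimal Hogwild-style sketch (names and sizes are illustrative, not our actual code): workers write into shared parameters without any locking, and in practice training tolerates the races.

```lua
require 'torch'
local threads = require 'threads'
threads.Threads.serialization('threads.sharedserialize') -- share tensors instead of copying

local params = torch.zeros(1000) -- flattened network parameters, shared by all workers

local pool = threads.Threads(4, function()
  require 'torch'
end)

for step = 1, 100 do
  pool:addjob(function()
    -- Stand-in for a real gradient computed from the worker's own experience.
    local grad = torch.randn(params:size(1))
    -- Lock-free update: concurrent writes can race, but each update is
    -- small enough that learning still works in practice.
    params:add(-0.001, grad)
  end)
end

pool:synchronize()
pool:terminate()
```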

This is the Caffe implementation from the paper: https://github.com/mhauskn/dqn/tree/recurrent. Although I've never looked at Caffe, it will probably help.

@Kaixhin I see you started working on this, cool. I'll have some time now, so I'll look at the multigpu and async modes.

@Kaixhin Awesome! I have no experience with `rnn` either; I'll need to study it to get an idea. I have two 980 Tis and will be able to run longer...

@Kaixhin cool, I'll have my hands full with async for now, but in the meantime I'll be able to help with running longer rdqn experiments on my workstation when you...

@Kaixhin I'm not getting the error you mentioned when doing validation on the last batch with size 20 when running `demo`. I'm using the `master` code, which has `sequencer:remember('both')` enabled...
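
For reference, this is the behaviour I'm relying on (a standalone sketch; sizes are illustrative):

```lua
require 'rnn'

local lstm = nn.Sequencer(nn.FastLSTM(4, 4))
lstm:remember('both') -- keep hidden state across calls in both train and eval mode

lstm:evaluate()
local batch = {torch.randn(20, 4)} -- one step, batch size 20
lstm:forward(batch) -- hidden state carries over into the next call...
lstm:forward(batch)
lstm:forget()       -- ...until it is explicitly reset
```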

OK, so there are two issues:

1. With `nn.FastLSTM.usenngraph = true`: `nngraph/gmodule.lua:335: split(4) cannot split 32 outputs`. This is an issue in both `rnn` and `master`.
2. With `nn.FastLSTM.usenngraph = false`: `Wrong size...
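
In case it helps reproduction, this is roughly how I'm triggering 1 (a standalone sketch with illustrative sizes, not necessarily the minimal repro):

```lua
require 'rnn'

nn.FastLSTM.usenngraph = true -- set to false to hit issue 2 instead
local net = nn.Sequencer(nn.FastLSTM(32, 32))

-- Forwarding a single step with a batch of 32 is where the nngraph path
-- fails for me with "split(4) cannot split 32 outputs".
local out = net:forward({torch.randn(32, 32)})
```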

@Kaixhin Re: 2, agreed, this error is bad, so returning before it is not a solution. I'm not sure if learning is bad with the normal batch sizes; it could be only...

@Kaixhin I need to refresh `async` from `master` for the recurrent work. Should I do a merge or a rebase (I'm leaning towards a merge)? Does it even matter when merging back...

Done with the merge, and I added `recurrent` support for 1-step Q in `async`. This is after 7 minutes of training and it seems to work well: ![scores](https://cloud.githubusercontent.com/assets/15705158/15519479/f4ae47dc-2201-11e6-9828-67541fcc2b2f.png) The agent sees only the latest frame per...
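
For context, this is the rough shape of the setup (illustrative layer sizes; the real `async` code differs in detail):

```lua
require 'rnn'

-- Sketch of a 1-step recurrent Q-head: history lives in the LSTM state
-- rather than in a stack of input frames.
local lstm = nn.FastLSTM(32 * 20 * 20, 256)

local net = nn.Sequential()
net:add(nn.SpatialConvolution(1, 32, 8, 8, 4, 4)) -- one frame in: 1x84x84 -> 32x20x20
net:add(nn.ReLU(true))
net:add(nn.View(-1):setNumInputDims(3))           -- flatten to batch x 12800
net:add(lstm)                                     -- recurrent layer carries the history
net:add(nn.Linear(256, 6))                        -- one Q-value per action

local q = net:forward(torch.randn(1, 1, 84, 84))  -- one frame per step
lstm:forget()                                     -- reset state between episodes
```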