Kai Arulkumaran

86 comments by Kai Arulkumaran

Awesome - I'll try and have a look soon or next week! Would you be able to test it to try and replicate one of the results from the paper?...

In my experience the smallest details in a paper can be key to reproducing results - and these may be missing or ambiguous. If anyone is reasonably confident in their...

Looks fine to me but I'll leave it to @lake4790k's discretion.

**Note:** It might be worth subclassing the Heap from [torchlib](https://github.com/vzhong/torchlib) for the priority queue.
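To illustrate the subclassing idea in the note above, here is a minimal sketch in Python rather than Lua. The `Heap` base class is a hypothetical stand-in and does not reflect torchlib's actual API; `PriorityQueue`, `push`, and `pop` are likewise names assumed for illustration.

```python
import heapq
import itertools


class Heap:
    """Hypothetical generic min-heap base class (not torchlib's API)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        heapq.heappush(self._items, item)

    def pop(self):
        return heapq.heappop(self._items)

    def __len__(self):
        return len(self._items)


class PriorityQueue(Heap):
    """Max-priority queue obtained by subclassing the heap."""

    def __init__(self):
        super().__init__()
        # Tie-breaker so that equal priorities pop in insertion order
        # and stored values are never compared directly.
        self._counter = itertools.count()

    def push(self, value, priority):
        # Negate the priority so the min-heap pops the largest priority first.
        super().push((-priority, next(self._counter), value))

    def pop(self):
        neg_priority, _, value = super().pop()
        return value, -neg_priority
```

The subclass only changes how entries are keyed, so the heap bookkeeping stays in one place.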

Code looks fine, but what is this trying to achieve? If the chosen action may not be deterministically executed in the environment, the agent should still treat it as if...

Got it - can you add a short 2nd paragraph to the [custom docs](https://github.com/Kaixhin/Atari#custom) to make people aware of this modification from the `rlenvs` API, along with a use-case as...

Yep, a switch for using a DRQN architecture would be great. For now I'd go for using `histLen` as the number of frames to use BPTT on for a single-frame...
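One possible reading of this suggestion, sketched in Python rather than Lua: a stream of single frames is cut into consecutive windows of `hist_len` frames, and each window is one sequence to backpropagate through time over. The function name `bptt_windows` and the choice to drop an incomplete trailing window are assumptions made for illustration, not taken from the repository.

```python
def bptt_windows(frames, hist_len):
    """Split a list of single frames into consecutive windows of hist_len frames.

    An incomplete window at the end is dropped (an assumption of this sketch).
    """
    last_start = len(frames) - hist_len
    return [frames[i:i + hist_len] for i in range(0, last_start + 1, hist_len)]
```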

@lake4790k Almost have something working. Disabling [this line](https://github.com/Kaixhin/Atari/blob/rnn/Model.lua#L157) lets the DRQN train, as otherwise it crashes [here](https://github.com/Kaixhin/Atari/blob/rnn/Agent.lua#L462), somehow propagating a batch of size 20 forward but expecting the normal batch...

@lake4790k I'd have to delve into the original paper/code, but it looks like they train the network every step (as opposed to every 4). This seems like it'll be a...
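To make the difference in update frequency concrete, here is a small Python sketch that counts how many training updates occur over a run. The name `learn_every` is hypothetical and stands for the number of environment steps between updates; the loop body only counts updates and does not train anything.

```python
def count_updates(total_steps, learn_every):
    """Count training updates over total_steps environment steps."""
    updates = 0
    for step in range(1, total_steps + 1):
        # The agent would act in the environment here (omitted).
        if step % learn_every == 0:
            updates += 1  # a training update would run at this point
    return updates
```

Under these assumptions, training every step performs four times as many updates as training every 4 steps over the same number of environment steps.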

Here's the result of running `./run.sh demo -recurrent true`, so I'm reasonably confident that the DRQN is capable of learning, but I'm not testing this further for now so I'm...