hlsafin

Results 5 issues of hlsafin

I could be wrong about this, but looking at the implementation, the LSTM doesn't seem to take the previous reward as input alongside the state and previous action, no?...

bug

I tried this R2D2 implementation with Gravitar instead of Pacman, and it didn't seem to be learning. I even increased the buffer limit to 1_000_000.

When can we expect the release of these models on ACME? Is this in the pipeline for the future? Thanks.

I know the paper doesn't mention hard-exploration problems on Atari. Did you run any on the side? What were the results?

### ❓ Question

I like your work! However, training can take a while; are there any plans to incorporate multi-GPU training into the code? Thank you

### Checklist

- [X]...

question