etienne87
Another confusion I have about this (because I have little experience with TF): it seems we need two graphs, one for prediction (taking a dynamic_rnn), and one (maybe taking a static...
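A minimal sketch of why two graphs can share one set of weights: a tiny hand-rolled RNN cell (hypothetical, NumPy only, not the actual TF code) stepped frame by frame for prediction gives the same hidden states as a full unroll for training, as long as both paths use the same parameters.

```python
import numpy as np

# Hypothetical toy RNN cell; W_x / W_h stand in for the shared weights.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 8))   # input -> hidden
W_h = rng.normal(size=(8, 8))   # hidden -> hidden

def step(x, h):
    """One RNN step: the 'prediction graph', called one frame at a time."""
    return np.tanh(x @ W_x + h @ W_h)

def unroll(xs, h0):
    """Full unroll over a sequence: the 'training graph' (like dynamic_rnn)."""
    h = h0
    outs = []
    for x in xs:
        h = step(x, h)
        outs.append(h)
    return np.stack(outs)

xs = rng.normal(size=(5, 4))          # T = 5 frames of 4 features
h = np.zeros(8)
stepwise = []
for x in xs:                          # prediction side, frame by frame
    h = step(x, h)
    stepwise.append(h)

# Both paths agree because they share W_x and W_h.
assert np.allclose(np.stack(stepwise), unroll(xs, np.zeros(8)))
```

So the "two graphs" are just two ways of calling the same cell; the weights live in one place.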
@mbz ok! What I mean: in classic A3C, it seems we can just backprop at the end of an episode (T_MAX), by re-using the already computed predictions. On...
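For reference, the backprop at the end of a T_MAX segment uses discounted n-step returns, bootstrapped from the value prediction of the last state (this is the standard A3C recipe; the function name and shapes here are my own sketch, not GA3C's code):

```python
import numpy as np

def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1} over one T_MAX
    segment, seeded with V(s_T) predicted for the last state."""
    R = bootstrap_value
    returns = np.empty(len(rewards))
    for t in reversed(range(len(rewards))):
        R = rewards[t] + gamma * R
        returns[t] = R
    return returns

# 3 steps of reward, with V(s_3) predicted as 0.5:
r = n_step_returns([1.0, 0.0, 1.0], 0.5, gamma=0.9)
# r[2] = 1 + 0.9*0.5 = 1.45; r[1] = 0.9*1.45 = 1.305; r[0] = 1 + 0.9*1.305
```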
@ricky1203: could you perhaps provide an example/link in context?
@Golly not so much, to be honest. Also I think I first need to test the idea referred to in #16; otherwise the LSTM version will need re-computation of T_MAX steps before each...
Coming back to this problem with a slightly better understanding of variable-length RNNs: I think the easiest way to code the LSTM version is to keep track...
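One way to "keep track" could look like this: a small store of per-agent (h, c) state that the predictor reads before stepping the LSTM and writes back after (all names here are hypothetical, just to illustrate the bookkeeping):

```python
import numpy as np

HIDDEN = 8  # assumed LSTM hidden size for the sketch

class StateStore:
    """Hypothetical per-agent recurrent state store."""
    def __init__(self):
        self.states = {}

    def get(self, agent_id):
        # Unknown agents (or agents whose episode just reset) start at zeros.
        return self.states.get(agent_id,
                               (np.zeros(HIDDEN), np.zeros(HIDDEN)))

    def put(self, agent_id, h, c):
        self.states[agent_id] = (h, c)

    def reset(self, agent_id):
        # Call at episode end so the next episode starts from a fresh state.
        self.states.pop(agent_id, None)

store = StateStore()
h, c = store.get(3)        # fresh agent -> zero state
store.put(3, h + 1.0, c)   # pretend one LSTM step updated h
```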
Anyway, here is a first implementation that works fine if you don't have too many unfinished experiences (of length < Config.TIME_MAX) [here](https://github.com/etienne87/GA3C/blob/lstm/ga3c/NetworkVP.py). I "solved" the issue by padding sequences in...
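The padding idea, sketched outside TF (function and constants are placeholders, not the linked code): zero-pad each variable-length experience sequence up to TIME_MAX and remember the true lengths for later.

```python
import numpy as np

TIME_MAX = 4   # stands in for Config.TIME_MAX
FEATURES = 3   # assumed per-step feature size

def pad_batch(sequences):
    """Zero-pad variable-length sequences to TIME_MAX and
    return the padded batch plus the true lengths."""
    batch = np.zeros((len(sequences), TIME_MAX, FEATURES))
    lengths = np.array([len(s) for s in sequences])
    for i, seq in enumerate(sequences):
        batch[i, :len(seq)] = seq
    return batch, lengths

seqs = [np.ones((2, FEATURES)), np.ones((4, FEATURES))]
batch, lengths = pad_batch(seqs)   # batch: (2, 4, 3); lengths: [2, 4]
```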
Hmm, actually there was still an error in my code: I forgot to mask the loss for the padded inputs! I propose a first fix [here](https://github.com/etienne87/GA3C/blob/lstm_cartpole/ga3c/NetworkVP.py#L152). Apparently this now works better...
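The masking fix boils down to averaging the per-step loss only over real timesteps, so whatever the network outputs on padded steps contributes nothing to the gradient. A NumPy sketch of that idea (not the linked TF code):

```python
import numpy as np

def masked_mean_loss(per_step_loss, lengths):
    """Mean loss over real (non-padded) steps only.
    per_step_loss: (batch, TIME_MAX); lengths: true length per sequence."""
    T = per_step_loss.shape[1]
    mask = np.arange(T)[None, :] < lengths[:, None]   # True on real steps
    return (per_step_loss * mask).sum() / mask.sum()

loss = np.ones((2, 4))
loss[0, 2:] = 100.0            # garbage loss on the padded steps
lengths = np.array([2, 4])
result = masked_mean_loss(loss, lengths)   # the 100s are masked out
```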
I did a quick trial in one of my [branches](https://github.com/etienne87/GA3C/blob/lstm/ga3c/NetworkVP_torch.py). Actually, TF is almost twice as fast, because the naive way I did the vectorized loss probably involves...
Interesting, @ppwwyyxx! My naive implementation gives something like this: I am not sure if the problem is in the batching rather than in the explicit calls &...
I don't understand how the batch can be large while t-max is small: you need to accumulate frames during t-max steps before doing a backprop, right?
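To make the accumulation point concrete, here is a toy version of the idea (all names are placeholders, not GA3C's classes): each trainer buffers transitions and only triggers a training step once t-max of them have arrived.

```python
T_MAX = 5  # assumed segment length for the sketch

class Experiences:
    """Toy buffer: accumulate up to T_MAX transitions, then 'train'."""
    def __init__(self):
        self.buffer = []
        self.train_calls = 0

    def add(self, transition):
        self.buffer.append(transition)
        if len(self.buffer) == T_MAX:
            self.train()

    def train(self):
        self.train_calls += 1   # a real agent would backprop here
        self.buffer.clear()

exp = Experiences()
for step in range(12):
    exp.add((step, 0.0))        # (state, reward) placeholder
# 12 transitions -> training fired twice (at 5 and 10), 2 left buffered
```

So the effective batch fed to the GPU can only grow by gathering segments from many agents at once, not by stretching a single agent past t-max.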