iuri frosio

18 comments by iuri frosio

1. We cannot answer this without experimenting. The best approach may be to do a grid search and make sure that dynamic scheduling is close to optimal, as we...
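For illustration, a minimal sketch of what such a grid search could look like, assuming a hypothetical measure_tps profiling helper (here replaced by a toy analytic model, not a real training run):

```python
import itertools

def measure_tps(agents, trainers, predictors):
    # Stand-in for a short profiling run; this toy model just assumes
    # throughput saturates when there are too few trainers or
    # predictors relative to the number of agents.
    return min(agents, 8 * trainers, 16 * predictors)

# Exhaustively try every combination and keep the best-throughput one.
best = max(itertools.product([16, 32, 64], [1, 2, 4], [1, 2, 4]),
           key=lambda cfg: measure_tps(*cfg))
print("best (agents, trainers, predictors):", best,
      "-> TPS:", measure_tps(*best))
```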

It's hard to comment since you are running the experiments yourself, but I'll try anyway. The possible experiments that come to mind are, not necessarily in the...

Based on the previous comment, your agent seems to be short-sighted: it cannot see rewards in the far future... Maybe because it is receiving rewards at every frame, and fall...
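A toy computation showing why dense per-frame rewards can crowd out a far-future reward under discounting (discounted_return is an illustrative helper, not from any released code):

```python
def discounted_return(rewards, gamma):
    # Standard discounted return, accumulated backwards in time.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

dense = [1.0] * 100            # small reward at every frame
sparse = [0.0] * 99 + [100.0]  # one large reward 100 steps away

# Rewards more than roughly 1/(1-gamma) steps away contribute little,
# so with gamma = 0.9 the far-future reward is almost invisible.
for gamma in (0.9, 0.99):
    print(gamma, discounted_return(dense, gamma),
          discounted_return(sparse, gamma))
```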

Hi, thanks for noticing this. Our implementation of A3C is indeed consistent with the original algorithm (see https://arxiv.org/pdf/1602.01783.pdf, page 14, where the reward is set to 0 for a terminal...
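A minimal sketch of that computation, following the pseudocode on page 14 of the linked paper; n_step_returns and value_fn are illustrative names, not identifiers from any released code:

```python
def n_step_returns(rewards, last_state, terminal, value_fn, gamma=0.99):
    # Bootstrap value R: 0 at a terminal state, V(s_t) otherwise,
    # as in the A3C pseudocode (Mnih et al., 2016, p. 14).
    R = 0.0 if terminal else value_fn(last_state)
    returns = []
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    return list(reversed(returns))

# Toy usage with a constant critic and a terminal episode:
print(n_step_returns([0.0, 1.0, 0.0], last_state=None, terminal=True,
                     value_fn=lambda s: 0.5))
```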

It would also be interesting to understand whether the number of agents increases during training; that may explain the increase in CPU usage.
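A hypothetical snippet for checking this, assuming the agents run as child processes of the main training process:

```python
import multiprocessing as mp

def log_agent_count(step):
    # active_children() lists the live child processes of the current
    # process; logging this periodically during training shows whether
    # the count of agent processes drifts upward over time.
    print("step %d: %d live child processes"
          % (step, len(mp.active_children())))

log_agent_count(0)
```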

The GPU version of A3C is not distributed on this page.

Hi, can you provide some more details? Are you using the basic version of the algorithm with Pong? Automatic scheduling enabled? How many agents / trainers / predictors? Is the...
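For reference, a hypothetical settings snippet in the spirit of GA3C's Config.py, gathering the details asked about above; the identifier names are illustrative and may differ from the released code:

```python
# Illustrative GA3C-style configuration (names are assumptions,
# not necessarily the exact identifiers in the released code).
ATARI_GAME = 'PongDeterministic-v0'  # the basic Pong setup
AGENTS = 32              # game-playing processes
TRAINERS = 2             # threads applying gradient updates
PREDICTORS = 2           # threads batching forward passes on the GPU
DYNAMIC_SETTINGS = True  # automatic scheduling of the counts above

print("agents=%d trainers=%d predictors=%d dynamic=%s"
      % (AGENTS, TRAINERS, PREDICTORS, DYNAMIC_SETTINGS))
```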