Steven Kapturowski
@sangjin-park I was checking out commit 452d5735551c672e2ce44740514b105cb045075e and noticed something funny: the ordering of the context window is backwards, which I would expect to hurt performance. See https://github.com/steveKapturowski/tensorflow-rl/blob/452d5735551c672e2ce44740514b105cb045075e/utils/fast_cts.pyx#L305-L308 as compared to...
'render' should only be called if you set the flag '-v 1', so that isn't a strict dependency, but I did notice when trying out GuessingGame-v0 that I'm not properly...
I noticed this as well and believe it's a significant cause of performance degradation. Additionally, you don't seem to be adding the entropy term to the objective which they mention...
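For reference, here is a minimal sketch of what adding an entropy term to a policy-gradient objective usually looks like. It is not taken from either codebase under discussion: the function names, the single-sample loss form, and the `beta` coefficient are all assumptions made for illustration.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy of the action distribution, in nats.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def policy_loss(logits, action, advantage, beta=0.01):
    # Hypothetical single-sample loss: the policy-gradient term minus an
    # entropy bonus. beta is an assumed coefficient, not a value from the thread.
    probs = softmax(logits)
    # Clamp to avoid log(0) if the chosen action's probability underflows.
    pg_term = -math.log(max(probs[action], 1e-12)) * advantage
    return pg_term - beta * entropy(probs)
```

Minimizing this loss rewards higher-entropy policies, which discourages premature collapse onto a single action. With `beta=0.0` the bonus disappears, and that is the case described above as missing the entropy term.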