jtwin

Results 4 issues of jtwin

In the paper and the OpenSpiel implementation, the NeuRD clip value is set to 10,000. https://github.com/google-deepmind/open_spiel/blob/931e39a99ee73412500def0227925f8f19f033fe/open_spiel/python/algorithms/rnad/rnad.py#L604 From my testing, this is a major source of instability, since the V-trace operator...
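
To illustrate why the size of the clip matters, here is a minimal sketch. It assumes the clip symmetrically bounds a per-action advantage estimate before it enters the NeuRD loss. The helper `clip_advantages` and the sample values are hypothetical and are not taken from OpenSpiel.

```python
def clip_advantages(advantages, clip):
    """Bound each advantage to [-clip, clip] (hypothetical helper)."""
    return [max(-clip, min(clip, a)) for a in advantages]


# Hypothetical advantage estimates, one of them much larger than the rest.
advantages = [0.5, -2.0, 750.0]

# With a threshold of 10,000, none of these values is altered.
loose = clip_advantages(advantages, 10_000)

# A tighter, purely illustrative threshold bounds the outlier.
tight = clip_advantages(advantages, 10.0)
```

Under these assumptions, a threshold of 10,000 only takes effect once an estimate exceeds that magnitude, so large values pass through unchanged.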

In the RNaD example, the importance sampling correction passed to `get_loss_nerd` is 1. This is because the example provided is the on-policy case, and there are synchronous updates of the...
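
As a small illustration of why the correction is 1 on-policy, the sketch below computes a standard importance sampling ratio for one sampled action. The function name `importance_ratio` and the probabilities are hypothetical; this is not the OpenSpiel code.

```python
def importance_ratio(target_prob, behaviour_prob):
    """Ratio pi(a|s) / mu(a|s) for a single sampled action."""
    return target_prob / behaviour_prob


# On-policy: the acting policy is the policy being trained, so the ratio is 1.
on_policy = importance_ratio(0.3, 0.3)

# Off-policy (for example, stale actor parameters): the ratio departs from 1.
off_policy = importance_ratio(0.3, 0.6)
```

When the policy that generated the data differs from the one being updated, the ratio is no longer 1 and a constant correction would not account for the mismatch.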

Based on the formulae in the paper, the reward transformation is given by adding the log policy ratio: ![image](https://github.com/deepmind/open_spiel/assets/72776130/5b21ae33-48be-4bfd-9f67-3c1aa3f88318) However, the code contains an entropy term instead: https://github.com/deepmind/open_spiel/blob/db0f4a78b1fd0bee0263d46d62fb4d693897329e/open_spiel/python/algorithms/rnad/rnad.py#L422 Which one is...
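
To make the two forms concrete, here is a minimal sketch of a per-action log policy ratio term next to its expectation under the current policy. The sign convention (subtracting for the acting player, adding for the opponent) is my assumption about the paper's formula, and the names `transform_reward`, `expected_penalty`, and `eta` are hypothetical. It does not settle which form the code is meant to implement.

```python
import math


def transform_reward(reward, pi, pi_reg, eta, is_acting_player):
    """Add the log-policy-ratio term for one sampled action.

    Assumed sign convention: subtract eta * log(pi / pi_reg) when the
    rewarded player chose the action, add it when the opponent did.
    """
    log_ratio = math.log(pi / pi_reg)
    sign = -1.0 if is_acting_player else 1.0
    return reward + sign * eta * log_ratio


def expected_penalty(policy, policy_reg, eta):
    """Mean of the acting player's term under `policy`.

    This equals -eta * KL(policy || policy_reg), a sum over all actions
    rather than a quantity for the sampled action only.
    """
    return -eta * sum(
        p * math.log(p / q) for p, q in zip(policy, policy_reg) if p > 0.0
    )
```

The per-action term depends on which action was sampled, whereas the expected form sums over all actions, which may be why a summed, entropy-like expression can appear in place of a single log ratio.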

I am running a local Showdown server for use in a deep reinforcement learning algorithm. I have noticed that the Showdown server itself is the bottleneck for learning in terms...