open_spiel RNaD negative loss and barely any correlation of loss with NashConv

Open kingsharaman opened this issue 8 months ago • 7 comments

I am training RNaDSolver on Kuhn Poker with this config:

from open_spiel.python.algorithms.rnad import rnad

config = rnad.RNaDConfig(
    game_name="kuhn_poker",
    trajectory_max=10,
    state_representation=rnad.StateRepresentation.INFO_SET,
    policy_network_layers=(128,),
    batch_size=256,
    learning_rate=1e-6,
    seed=42,
)

Two things I cannot explain:

1. There are negative loss values.
2. While the loss goes down more or less, it does not really correlate with the NashConv metric. In the beginning it does, but then the loss seems to stagnate while NashConv keeps going down nicely and smoothly, reaching 0.008.

The second one is a problem because I cannot calculate NashConv for more complicated games, so I am not sure what I should monitor to identify convergence.
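To make the metric concrete, here is a minimal sketch of what NashConv measures: the sum, over players, of how much each could gain by switching to a best response against the other's current policy. It uses rock-paper-scissors as an illustrative stand-in rather than Kuhn Poker, and the `nash_conv` helper below is a hypothetical pure-Python function written for this post, not the open_spiel API.

```python
def nash_conv(payoff, row_policy, col_policy):
    """NashConv of a policy pair in a two-player zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player gets the negative.
    Hypothetical helper for illustration only (not part of open_spiel).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])

    # Expected payoff of each pure action against the opponent's mixed policy.
    row_action_values = [
        sum(payoff[i][j] * col_policy[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    col_action_values = [
        sum(-payoff[i][j] * row_policy[i] for i in range(n_rows))
        for j in range(n_cols)
    ]

    # Expected payoff of each player under the current policy pair.
    row_value = sum(row_policy[i] * row_action_values[i] for i in range(n_rows))
    col_value = -row_value

    # Best-response improvement of each player, summed.
    return (max(row_action_values) - row_value) + (max(col_action_values) - col_value)


# Rock-paper-scissors payoffs for the row player (rows/cols: rock, paper, scissors).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
UNIFORM = [1 / 3, 1 / 3, 1 / 3]

print(nash_conv(RPS, UNIFORM, UNIFORM))    # uniform play is the equilibrium
print(nash_conv(RPS, [1, 0, 0], UNIFORM))  # always-rock can be exploited
```

Computing this requires a best response for every player, which is cheap for a matrix game or Kuhn Poker but is exactly what becomes infeasible in larger games.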

Jun 20 '24 10:06 kingsharaman
