fuyuan-li

10 comments by fuyuan-li

Hi @lanctot and @alexunderch, I’d be happy to take a look at this. I’m new to this codepath, so I may need a bit of time to reproduce the issue...

Hi @lanctot and @alexunderch, I want to give a quick update before I submit a PR (and hear your thoughts). Straight answer: I changed a few hyperparameters to make the model converge...

Thank you @lanctot -- I realize the JAX implementation has a different issue (as @alexunderch pointed out, it is another warning from the tf dataset, not a convergence issue). I'm happy to...

Hey @alexunderch, thanks a lot for the detailed explanation and plan! This is great, and I’m really glad we’re aligning both implementations and revisiting the paper. I’m happy to take...

At first glance, I like your t/2T in the JAX implementation! I think this is what the original paper proposed. Thank you! Will follow up with more.

Quick update @alexunderch: the refactored torch impl cannot converge as the original impl, using the same hyperparameter set as in #1287 (https://github.com/google-deepmind/open_spiel/issues/1287). Will dig in and keep you posted!

@alexunderch probably a late update, but yes, confirmed too: both impls converged in Kuhn poker! Exploitability drops to 0.05 and the policy value converges to the theoretical value (-0.06 for player 0), tested with...

Thank you @alexunderch and @lanctot. Quick comments for us: 1. Convergence and consistency on Kuhn in both PyTorch and JAX are confirmed. 2. Convergence on Leduc in PyTorch, based on...

Quick update, @alexunderch: it looks like the PyTorch and JAX impls don't converge on Leduc poker. On a set of (the same) 25 seeds (which is a subset of the above 40...

@alexunderch Thanks for the update! I realized I was about 5 commits behind your latest HEAD ([508192f](https://github.com/google-deepmind/open_spiel/pull/1408/commits/508192f5ccb57598de856c6e6616f8f7f1a3c7e8)) 2 days ago when I first started the simulation. Just now rebased onto your...