DeepCFR convergence

Open perpetualglow opened this issue 4 years ago • 4 comments

Hi, thank you for this project!

I am trying to use Deep CFR for a 4-player card game, but training makes no progress even when I play with only 2 players. I suspect convergence might not even be possible with 4 players.

Have you guys had any luck with applying Deep CFR to one of the games?

perpetualglow, Apr 18 '20 22:04

@JonasBaes We have tried our best to check and tune DeepCFR, but we could not get it to converge. If you manage to, please let us know.

daochenzha, Apr 18 '20 23:04

Hi @daochenzha, do you have any idea why DeepCFR doesn't converge?

AdrianP-, May 03 '20 18:05

Hi @AdrianP-, based on our tuning experience, we think DeepCFR is very sensitive to the network structure. Tuning the network structure and the hyperparameters might be a direction worth exploring.

lhenry15, May 04 '20 17:05

It looks like there is a bug in the algorithm. Can you explain what's going on in lines 322-327? Why isn't step_back called inside the loop on line 329? Is self._traverse_game_tree responsible for returning self._env to its previous state?

https://github.com/datamllab/rlcard/blob/b15c257df55a05df370aa252b0f1991564db3d4e/rlcard/agents/deep_cfr_agent.py#L322-L327

This code looks strange too:

https://github.com/datamllab/rlcard/blob/b15c257df55a05df370aa252b0f1991564db3d4e/rlcard/agents/deep_cfr_agent.py#L311-L314

Why not just take the step back in the same block as the step?

cfytrok, Oct 11 '20 09:10
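
For readers following the step/step_back question, here is a minimal sketch of the pattern being suggested: undo each step in the same block that took it, so the traversal always returns with the environment unchanged. This is not the rlcard code. ToyEnv, its methods, and traverse are hypothetical names made up for illustration, and the traversal simply takes the best payoff rather than computing any CFR quantity.

```python
class ToyEnv:
    """Hypothetical toy environment (not rlcard): state is the list of actions taken."""

    def __init__(self, depth=3, num_actions=2):
        self.depth = depth
        self.num_actions = num_actions
        self.history = []

    def is_over(self):
        return len(self.history) == self.depth

    def legal_actions(self):
        return list(range(self.num_actions))

    def step(self, action):
        self.history.append(action)

    def step_back(self):
        self.history.pop()

    def payoff(self):
        return sum(self.history)


def traverse(env):
    """Return the best payoff reachable from the current state.

    Every step is undone in the same loop iteration that took it, so the
    environment is back in its original state when this function returns.
    """
    if env.is_over():
        return env.payoff()
    values = []
    for action in env.legal_actions():
        env.step(action)
        try:
            values.append(traverse(env))
        finally:
            # Undo the step even if the recursive call raises.
            env.step_back()
    return max(values)


env = ToyEnv()
best = traverse(env)
assert env.history == []  # the traversal left the environment untouched
```

With the step and its undo paired like this, a caller never has to know whether the recursive call restores the state, which is the ambiguity the question above points at.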