rlcard DeepCFR convergence

Open perpetualglow opened this issue 4 years ago • 4 comments

Hi, thank you for this project!

I am trying to use Deep CFR for a 4-player card game, but training doesn't get anywhere even when I play with only 2 players. I suspect it might not even be possible to converge with 4 players.

Have you guys had any luck with applying Deep CFR to one of the games?

Apr 18 '20 22:04 perpetualglow

@JonasBaes We have done our best to check and tune DeepCFR, but we cannot make it converge. If you manage to, please let us know.

Apr 18 '20 23:04 daochenzha

Hi @daochenzha, do you have any idea why DeepCFR doesn't converge?

May 03 '20 18:05 AdrianP-

Hi @AdrianP-, based on our tuning experience, we think DeepCFR is very sensitive to the network structure. Tuning the network structure and hyperparameters might be a direction to explore.

May 04 '20 17:05 lhenry15

It looks like there is a bug in the algorithm. Can you explain what's going on in lines 322-327? Why isn't the step_back called inside the iteration on line 329? Is it the responsibility of self._traverse_game_tree to return self._env to the previous state?
https://github.com/datamllab/rlcard/blob/b15c257df55a05df370aa252b0f1991564db3d4e/rlcard/agents/deep_cfr_agent.py#L322-L327

This code looks strange too:
https://github.com/datamllab/rlcard/blob/b15c257df55a05df370aa252b0f1991564db3d4e/rlcard/agents/deep_cfr_agent.py#L311-L314
Why not just take the step back in the same block as the step?
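
To make the question concrete, here is a minimal sketch of the pattern I would expect: every step is undone in the same loop body that made it, so the caller never has to guess who restores the state. ToyEnv, its methods, and traverse are hypothetical stand-ins written for this comment; this is not rlcard's Env API and not the code in deep_cfr_agent.py.

```python
class ToyEnv:
    """Hypothetical toy environment with step/step_back (not rlcard's API)."""

    def __init__(self, max_depth=3, num_actions=2):
        self.history = []  # actions taken so far
        self.max_depth = max_depth
        self.num_actions = num_actions

    def is_over(self):
        return len(self.history) >= self.max_depth

    def legal_actions(self):
        return list(range(self.num_actions))

    def step(self, action):
        self.history.append(action)

    def step_back(self):
        self.history.pop()

    def payoff(self):
        # Arbitrary payoff, chosen only for illustration.
        return float(sum(self.history))


def traverse(env):
    """Return the mean leaf payoff below the current state.

    Invariant: env is in the same state on return as on entry.
    """
    if env.is_over():
        return env.payoff()
    values = []
    for action in env.legal_actions():
        env.step(action)
        values.append(traverse(env))
        env.step_back()  # undo right next to the step it reverses
    return sum(values) / len(values)


env = ToyEnv()
root_value = traverse(env)
assert env.history == []  # the traversal left the environment untouched
```

With the undo paired like this, the invariant is local to each loop iteration. If step_back is instead called somewhere else, each recursive call has to rely on its callee having restored the state, which is much harder to check by reading the code.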

Oct 11 '20 09:10 cfytrok
