rlcard
DeepCFR convergence
Hi, thank you for this project!
I am trying to use Deep CFR for a 4-player card game, but training doesn't get anywhere even when I play with only 2 players. I suspect it might not be possible to converge at all with 4 players.
Have you guys had any luck with applying Deep CFR to one of the games?
@JonasBaes We have tried our best to check and tune DeepCFR, but we could not get it to converge. If you manage it, please let us know.
Hi, @daochenzha do you have some clue why deepCFR doesn't converge?
Hi, @AdrianP- based on our tuning experience, we think DeepCFR is very sensitive to the network structure. Tuning the network structure and hyper-parameters might be a direction to explore.
It looks like there may be a bug in the algorithm. Can you explain what's going on in lines 322-327? Why isn't step_back called inside the iteration on line 329? Is it the responsibility of self._traverse_game_tree to return self._env to the previous state? https://github.com/datamllab/rlcard/blob/b15c257df55a05df370aa252b0f1991564db3d4e/rlcard/agents/deep_cfr_agent.py#L322-L327

This code looks strange too: https://github.com/datamllab/rlcard/blob/b15c257df55a05df370aa252b0f1991564db3d4e/rlcard/agents/deep_cfr_agent.py#L311-L314 Why not take the step back in the same block as the step?
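To make the question concrete, here is a minimal sketch of the pattern I would expect, where every step is undone in the same block that took it. This is not the rlcard code: `ToyEnv`, its methods, and `traverse` are hypothetical names made up for illustration, and the "game" is just a fixed-depth tree whose payoff is the sum of the actions taken.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment that supports step/step_back."""

    def __init__(self, depth=3, num_actions=2):
        self.depth = depth
        self.num_actions = num_actions
        self.history = []  # actions taken so far; this is the whole state

    def is_over(self):
        return len(self.history) >= self.depth

    def legal_actions(self):
        return list(range(self.num_actions))

    def step(self, action):
        self.history.append(action)

    def step_back(self):
        self.history.pop()

    def payoff(self):
        # Toy payoff: sum of the actions along the path.
        return float(sum(self.history))


def traverse(env):
    """Return the average payoff over all leaves below the current state.

    Each step is paired with a step_back in the same loop body, so the
    environment is back in the caller's state when this function returns.
    """
    if env.is_over():
        return env.payoff()
    values = []
    for action in env.legal_actions():
        env.step(action)
        try:
            values.append(traverse(env))
        finally:
            env.step_back()  # undo exactly the step taken above
    return sum(values) / len(values)


env = ToyEnv()
value = traverse(env)
```

With the step and the step back in one place, the caller never has to know whether the recursive call restores the environment. If the traversal instead relies on the callee to undo the step, every return path of the callee has to do so, which is the part I can't verify in the linked lines.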