
How to access the DeepCFR agent? Not seeing it in the Agents anymore + other questions about scoping a new game addition

Open DeepTitan opened this issue 1 year ago • 2 comments

Hey guys! I am attempting to repurpose the library for my favorite TCG, and I'd be happy to contribute my additions once I'm done, since this game gives each player their own deck. I want to train with Deep CFR, since that's what I understand to be the state of the art, at least as far as available infrastructure goes. I have a pretty good first draft of modeling the state, actions, and rewards of the game.
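Here is roughly what my first-draft env scaffold looks like, in case it helps frame the questions below. The method names (`_extract_state`, `get_payoffs`, `_decode_action`, `_get_legal_actions`) are copied from the existing envs under `rlcard/envs/`, but `MyTCGGame`, `'my-tcg'`, `OBS_SIZE`, `NUM_ACTIONS`, and the helper `get_legal_action_ids` are my own placeholders, so please treat this as a sketch rather than a working env:

```python
from collections import OrderedDict

import numpy as np
from rlcard.envs import Env

OBS_SIZE = 256      # placeholder: length of my flattened observation vector
NUM_ACTIONS = 128   # placeholder: size of my discrete action space


class MyTCGEnv(Env):
    """Two-player TCG where each player brings their own deck."""

    def __init__(self, config):
        self.name = 'my-tcg'          # placeholder env id
        self.game = MyTCGGame()       # my own Game class, not shown here
        super().__init__(config)
        # one fixed-length observation vector and flat action space per player
        self.state_shape = [[OBS_SIZE] for _ in range(self.num_players)]
        self.action_shape = [None for _ in range(self.num_players)]

    def _extract_state(self, state):
        # Encode the raw game state (hands, board, life totals, ...) into the
        # flat vector the agents consume.
        obs = np.zeros(OBS_SIZE, dtype=np.float32)
        # ... fill obs from `state` here ...
        return {'obs': obs,
                'legal_actions': self._get_legal_actions(),
                'raw_obs': state}

    def get_payoffs(self):
        # Terminal rewards: +1 for the winner, -1 for the loser.
        return np.array(self.game.get_payoffs())

    def _decode_action(self, action_id):
        # Map an integer action id back to one of my game's action objects.
        return self.game.all_actions[action_id]

    def _get_legal_actions(self):
        # The built-in envs appear to return an OrderedDict keyed by action id;
        # older versions used a plain list, so this may need adjusting.
        return OrderedDict({i: None for i in self.game.get_legal_action_ids()})
```

The plan is to register it via `rlcard.envs.registration.register` once the Game logic is finished, so it can be created with `rlcard.make('my-tcg')` like the built-in games -- please correct me if that's not the intended extension point.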

I have a few questions:

  1. Why is DeepCFR no longer among the agents? I can see it on this branch back in 2021: https://github.com/datamllab/rlcard/blob/34830b6b41fed0ab9119e67c02eab2898efe3c5d/tests/agents/test_deepcfr.py

but after that it appears to have been removed. Is there a particular reason for that, or is it just somewhere I'm not seeing?
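For what it's worth, here is the quick check I ran on my install to make sure I'm not simply missing an import path. The `DeepCFR` class name is only a guess based on that old `tests/agents/test_deepcfr.py`:

```python
# Confirm which rlcard version is installed and whether a DeepCFR agent
# can still be imported from it. The import path below is only a guess
# taken from the old tests/agents/test_deepcfr.py.
from importlib.metadata import version

print('rlcard version:', version('rlcard'))

try:
    from rlcard.agents import DeepCFR  # guessed name; may never have been exported like this
    print('DeepCFR agent found:', DeepCFR)
except ImportError as err:
    print('No DeepCFR agent in this version:', err)
```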

  2. Any advice for actually training this thing? I am proficient with AWS. Any tips or notes for trying to train something that approaches superhuman ability? I read that the Bridges supercomputer, with 196 nodes of 128 GB of memory each (25,088 GB, or roughly 25.09 TB, of RAM in total), was used to train a superhuman No-Limit Poker agent.

What instance types, instance count, training duration, third-party libraries, etc. would you recommend to make the distributed training effective? What do you estimate the training cost would be, and what should I expect in terms of experimentation and failures?
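For scale, here is my back-of-envelope math on that Bridges figure, just to sanity-check what matching that RAM on AWS would even mean. The 768 GiB per r5.24xlarge is from memory and I'm ignoring GB vs GiB, so please correct me if it's off:

```python
# Back-of-envelope check of the Bridges numbers quoted above.
nodes = 196
gb_per_node = 128
total_gb = nodes * gb_per_node
print(f'{total_gb} GB total (~{total_gb / 1000:.2f} TB)')   # 25088 GB, ~25.09 TB

# Very rough AWS equivalent, ignoring GB vs GiB, networking, and everything
# else. 768 GiB per r5.24xlarge is my recollection -- double-check it.
gb_per_instance = 768
print(f'~{total_gb / gb_per_instance:.0f} such instances just to match that RAM')
```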

DeepTitan avatar Dec 25 '23 10:12 DeepTitan

Same question. BTW, thank you for pointing out where it used to exist.

ZJsheep avatar May 26 '24 14:05 ZJsheep

I think I have found the reason. In this old issue, the authors say they could not get DeepCFR to converge: https://github.com/datamllab/rlcard/issues/38

ZJsheep avatar May 27 '24 14:05 ZJsheep