rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
I am doing secondary development based on RLCard (training DouDizhu after changing a small part of its rules). This is my first time working with deep RL training and my background is limited, so I have a few questions: 1. If training is interrupted and then resumed with the `--load_model` flag, will the result after the same total training time match that of an uninterrupted run from scratch? 2. As training time grows, resource usage keeps dropping; in particular, memory started near 100% but after 13 days of training sits at only 30%. Is this normal? 3. Evaluation shows overall performance steadily improving, but occasionally a model is worse than one from a day or two earlier; this persists even after accounting for noise and re-evaluating with different seeds. Is this normal? 4. Can the saved file names be interpreted as the number of games trained, or at least as proportional to it? Across different GPU models or GPU counts, can training efficiency be compared by the difference in saved file names per unit time? 5. With multiple GPUs, when one GPU is dedicated to training and the others run actors, the training GPU's utilization is very low. Can all GPUs run actors while one of them also does the training, and would that be more efficient?
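On question 1, whether a resumed run matches an uninterrupted one depends on whether *all* training state is saved and restored: network weights, optimizer state, replay buffer, exploration schedule, and RNG state. If only the weights are reloaded, the two runs will generally differ. A minimal stdlib sketch (all names here are hypothetical, not RLCard's API) illustrating that a run resumed from a full state snapshot exactly reproduces an uninterrupted run:

```python
import random

def train(steps, state=None):
    """Toy 'training' loop that accumulates noisy updates.
    `state` carries everything needed to resume: the current
    parameter value and the RNG state (stand-ins for weights,
    optimizer state, replay buffer, and epsilon schedule)."""
    if state is None:
        rng = random.Random(42)
        param = 0.0
    else:
        rng = random.Random()
        rng.setstate(state["rng"])
        param = state["param"]
    for _ in range(steps):
        param += rng.uniform(-1.0, 1.0)  # stand-in for one gradient step
    return {"param": param, "rng": rng.getstate()}

# Uninterrupted run of 1000 steps.
full = train(1000)

# Interrupted run: 400 steps, "checkpoint", then resume for 600 more.
ckpt = train(400)
resumed = train(600, state=ckpt)

assert resumed["param"] == full["param"]  # identical when *all* state is restored
```

If the checkpoint omits any of these pieces (most commonly the replay buffer or the RNG state), the resumed run explores a different trajectory, so equal total training time need not give bitwise-equal results, though final performance is usually comparable.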
Thanks for the great lib btw! If I'm interested in adding support for a new game, can I go ahead and make a fork and open a PR myself? Or...
Hi, I trained a DQN model in No-Limit Hold'em and the training session returned a `.pth` file. How can I use it to play against the trained agent? P.S. Thank you very...
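The usual pattern (hedged; exact entry points depend on the RLCard version, see the repository's `examples/` scripts) is to load the saved agent with `torch.load(...)` and pass it to the environment via `env.set_agents([...])` alongside a human agent. The core idea is an object round-trip: the `.pth` file stores the trained agent, and the loaded object is then used for action selection. A library-free sketch of that round-trip using stdlib `pickle` (the class, file names, and state format here are hypothetical):

```python
import os
import pickle
import tempfile

class GreedyAgent:
    """Hypothetical stand-in for a trained DQN agent: it stores
    learned 'Q-values' and picks the best legal action at play time."""
    def __init__(self, q_values):
        self.q_values = q_values

    def step(self, state):
        # Choose the legal action with the highest learned value.
        legal = state["legal_actions"]
        return max(legal, key=lambda a: self.q_values.get(a, float("-inf")))

# "Training" produced an agent; save it, as torch.save does for a .pth file.
trained = GreedyAgent({"fold": -1.0, "call": 0.2, "raise": 0.9})
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(trained, f)

# Later session: load the file back and use the agent to act.
with open(path, "rb") as f:
    agent = pickle.load(f)

action = agent.step({"legal_actions": ["fold", "call", "raise"]})
print(action)  # -> raise
```

With RLCard itself the loaded object plays the same role: create the no-limit hold'em environment, then set the loaded agent plus a human agent on it and run episodes interactively.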
I am trying to implement the German Schafkopf game using RLCard. I got a running implementation but the agents are NOT learning. So I am looking for support. Schafkopf is...
Hi, while looking through the nolimitholdem code, I found some lines that I don't fully understand. [Here](https://github.com/datamllab/rlcard/blob/2f1211ec7839d38604e0d573807dc08da4be742c/rlcard/games/nolimitholdem/round.py#L156) it says that when the raise amount is smaller than the "pot", we...
Good day to you all! Recently I got familiar with the RLCard package and I found a bug: in the function **plot_curve(csv_path, save_path, algorithm)** in the **rlcard/utils/logger.py** file,...
Good day to you all! Recently I tried to modify the DQNAgent from the source code and I noticed a bug (though I am not sure it is actually a bug):...
Would it be a good idea to include Gong Zhu https://en.wikipedia.org/wiki/Gong_Zhu and Sheng Ji https://en.wikipedia.org/wiki/Sheng_ji in this library, as they are popular among Chinese communities (and both are trick-taking games)?