Hi there, suppose I have a PyTorch model and I want to use it to generate trajectories for reinforcement learning, for example with REINFORCE. The code in Python should look like...
Hi, I'm reading your documentation and found a conflict in the definitions of Plane with Solo and Quad with Solo. According to Wikipedia, in both of these categories the kickers should be different from each other. >...
I don't think the statement "Suits are irrelevant in DouDizhu" is accurate. For example, the three landlord cards may include the 8 of spades, while the landlord may also hold the 8 of diamonds. If the landlord plays...
Hi, I'm interested in the papers on AlphaMu search ([link](https://arxiv.org/pdf/1911.07960) and [link](https://arxiv.org/pdf/2101.12639)). It is an algorithm based on alpha-beta search and applied to the game of Bridge. Are you planning to implement...
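The paper states that the imperfect feature has size 23\*12\*15 + 6\*1 = 4146 and the perfect feature has size 25\*12\*15 + 8\*1 = 4508, but the obs["x_no_action"] I get has length 4688, and its last few entries are non-integer values. Could you explain why?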
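
For reference, the arithmetic in the question above can be checked with a short Python sketch. It only reproduces the sizes quoted from the paper and compares them with the observed length of 4688. The helper `feature_length` and the constant names are hypothetical and are not part of any library, and the sketch does not explain where the extra entries come from.

```python
# Hypothetical helper: size of a feature made of `num_planes` planes of
# shape 12 x 15 plus `num_scalars` extra scalar entries.
PLANE_ROWS, PLANE_COLS = 12, 15
PLANE_SIZE = PLANE_ROWS * PLANE_COLS  # 180 entries per plane


def feature_length(num_planes: int, num_scalars: int) -> int:
    return num_planes * PLANE_SIZE + num_scalars


# Sizes quoted in the question.
imperfect = feature_length(23, 6)  # 23*12*15 + 6*1
perfect = feature_length(25, 8)    # 25*12*15 + 8*1

# Length reported for obs["x_no_action"] in the question.
observed = 4688

gap_to_imperfect = observed - imperfect
gap_to_perfect = observed - perfect

print("imperfect:", imperfect)
print("perfect:", perfect)
print("observed - imperfect:", gap_to_imperfect)
print("observed - perfect:", gap_to_perfect)
print("gap to perfect in whole planes:", divmod(gap_to_perfect, PLANE_SIZE))
```

The observed length exceeds the quoted perfect-feature size by 180 entries, which happens to equal one 12\*15 plane. That is only a numerical observation, not a confirmed explanation of the mismatch.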