gym
[Question] How to verify who is the winner of a game?
Question
Take the Tennis env, for example: the returned info only includes the lives. Is this a valid lives counter, as in an RPG game? I'm confused about the meaning of the returned value.
Also, I'm interested in how to check who the winner is in a sports game. There is only the DONE flag to check for the end of the game.
Thanks.
An agent only wants to maximise its rewards. For symmetric competitive games, the outcome is normally just the sum of rewards: if it is positive, your agent wins; otherwise, the opponent wins.
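To make the rule above concrete, here is a minimal sketch that classifies an episode from the agent's per-step rewards. The function name `match_outcome` and the reward trace are made up for illustration, and treating a total of exactly zero as "undecided" is an assumption on my part rather than something the environment defines.

```python
from typing import Iterable


def match_outcome(rewards: Iterable[float]) -> str:
    """Classify an episode from the agent's per-step rewards."""
    total = sum(rewards)
    if total > 0:
        return "agent"
    if total < 0:
        return "opponent"
    # Assumption: a zero total (e.g. an episode cut short) has no winner.
    return "undecided"


# Hypothetical reward trace: +1 when the agent scores, -1 when it concedes.
example_rewards = [0.0, 1.0, 0.0, -1.0, 1.0, 1.0]
print(match_outcome(example_rewards))
```

In practice you would accumulate the rewards returned by `env.step` until the DONE flag is set and pass them to a function like this.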
Thanks so much.
BTW, I'm testing the PPO algorithm with the Tennis env. I found that the rewards increased to -1 and then stopped rising. Does this mean that my agent always loses the game? It looks like the policy has converged to a locally optimal strategy. However, the logs show that the env always stops at step 99999. Is there a maximum step limit for the env?
Also, is there any way to evaluate the trained model, or to render frames from the trained agent to see how it really performs?
From the looks of it, the optimal solution is a positive value; see page 11 of https://arxiv.org/pdf/1710.02298.pdf. I would look at some renderings of the agent playing the environment to understand what is happening.
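For evaluating a trained policy, a rough sketch of an evaluation loop might look like the following. It assumes a gym-style env with `reset`, `step`, and `render`, and tries to cope with both the older 4-tuple and the newer 5-tuple `step` return. `evaluate` and `ToyMatchEnv` are hypothetical names: the toy env is only a stand-in so the snippet runs without Atari installed, and it is not part of gym. The reset handling also assumes that observations themselves are not tuples.

```python
from typing import Any, Callable, List


def evaluate(env: Any, policy: Callable[[Any], Any],
             episodes: int = 5, render: bool = False) -> List[float]:
    """Run the policy for a few episodes and return each episode's total reward."""
    returns = []
    for _ in range(episodes):
        out = env.reset()
        # Newer gym versions return (obs, info); older ones return obs only.
        obs = out[0] if isinstance(out, tuple) else out
        total, done = 0.0, False
        while not done:
            if render:
                env.render()
            step = env.step(policy(obs))
            if len(step) == 5:  # newer API: terminated and truncated flags
                obs, reward, terminated, truncated, _info = step
                done = terminated or truncated
            else:  # older API: single done flag
                obs, reward, done, _info = step
            total += float(reward)
        returns.append(total)
    return returns


class ToyMatchEnv:
    """Hypothetical stand-in env: action 1 wins a point, anything else loses one."""

    def __init__(self, points: int = 3):
        self.points = points
        self.t = 0

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else -1.0
        return self.t, reward, self.t >= self.points, {}

    def render(self):
        pass


print(evaluate(ToyMatchEnv(), lambda obs: 1, episodes=2))
```

With a real environment you would pass the env created by `gym.make` and wrap your trained model's action selection in `policy`. Note that in newer gym versions rendering is configured through the `render_mode` argument of `gym.make` rather than by calling `env.render()` each step, so the `render` flag here mainly suits older versions.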