How is Elo calculated with agent self-play?

Open dbsxdbsx opened this issue 2 years ago • 0 comments

This question came up while I was reading the AlphaGo Zero / AlphaZero papers. I know the formula for calculating Elo.

But when it comes to self-play, suppose players A and B (which are the same agent) both start with a rating of 1400 in chess, and after one round A's rating is 1500 and B's is 1300.
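For reference, the update I have in mind is the standard two-player Elo formula, sketched below. The function names and the K-factor of 32 are only assumptions for illustration, not anything taken from the papers or from this repository.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses.
    k is the K-factor (assumed value; rating systems choose their own).
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum update: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Both copies of the agent start at 1400 and A wins one game.
new_a, new_b = elo_update(1400.0, 1400.0, score_a=1.0)
print(new_a, new_b)  # 1416.0 1384.0
```

With equal starting ratings the expected score is 0.5, so a single win moves each rating by K/2. The 1500/1300 numbers in my example would correspond to K = 200, so please read them as illustrative only.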

Then, whether or not the agent has been trained in between, it plays against itself again. Which rating is used as the baseline for computing the agent's Elo after the second round: 1500, 1300, or something else? I haven't been able to work out the right way to do this.

In short, I don't understand how the data for the Elo trend graph (training steps on the x-axis, Elo on the y-axis) is generated when the game is a two-player zero-sum game and the agent is trained only through self-play.

dbsxdbsx, May 09 '22 08:05