KataGo
Is it possible to create a dataset à la ImageNet?
Hey,
I think such a dataset could really accelerate NN architecture research. The biggest win is that a wide array of researchers would be able to start their research without having to understand the existing complex scripts and code.
I understand that a fixed dataset is not as good as the iterative 'XYzero' process, but isn't it likely that the best architecture, initialization, optimizer, ... would be mostly independent of whether the dataset is static or not?
Perhaps we could even set up a benchmark here: https://paperswithcode.com/datasets ?
Right. The only problem I can see is that ImageNet is a classification task with definitive labels, while a Go dataset is a regression task without definitive labels.
There are multiple heads with various targets for value function, policy function and various auxiliary heads (https://arxiv.org/pdf/1902.10565.pdf).
Why not save all (or some) of the labels of all (or some) of the heads used during the regular training?
I think it would be fine to release even a small dataset or dataset made of old games.
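To make the "save the labels of several heads" idea concrete, here is a minimal sketch of what one record in such a static dataset might look like. The field names, the JSON Lines layout, and the choice of score as the auxiliary target are assumptions for illustration only; this is not KataGo's actual training data format.

```python
import json
from dataclasses import dataclass, asdict
from typing import List

BOARD_SIZE = 19
NUM_MOVES = BOARD_SIZE * BOARD_SIZE + 1  # every board point plus pass


@dataclass
class Sample:
    """One position with targets for several heads (hypothetical layout)."""
    stones: List[int]    # flattened board: 0 empty, 1 black, 2 white
    policy: List[float]  # target move distribution, length NUM_MOVES
    value: List[float]   # [win, loss, no result] probabilities
    score: float         # final score difference, an example auxiliary target


def validate(sample: Sample) -> None:
    """Check shapes and that the probability targets sum to 1."""
    assert len(sample.stones) == BOARD_SIZE * BOARD_SIZE
    assert len(sample.policy) == NUM_MOVES
    assert abs(sum(sample.policy) - 1.0) < 1e-6
    assert len(sample.value) == 3
    assert abs(sum(sample.value) - 1.0) < 1e-6


def to_jsonl(samples: List[Sample]) -> str:
    """Serialize samples as one JSON object per line."""
    return "\n".join(json.dumps(asdict(s)) for s in samples)


def from_jsonl(text: str) -> List[Sample]:
    """Parse samples back from JSON Lines text."""
    return [Sample(**json.loads(line)) for line in text.splitlines() if line]


# Toy record: empty board, uniform policy, made-up value and score targets.
example = Sample(
    stones=[0] * (BOARD_SIZE * BOARD_SIZE),
    policy=[1.0 / NUM_MOVES] * NUM_MOVES,
    value=[0.5, 0.5, 0.0],
    score=0.5,
)
validate(example)
restored = from_jsonl(to_jsonl([example]))
```

A real release would probably use a compact binary format rather than JSON, but the point stands: every head's target is stored next to the position, so researchers can train on whichever subset of heads they want.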
If all you need is a dataset of older games, try here: https://katagoarchive.org/
These contain the full data for older runs, with both g104 and g170 runs reaching far into superhuman levels of strength.
Yes. It seems that this has all the data that is needed.