KataGo: Is it possible to create a dataset à la ImageNet?

Open lukaszlew opened this issue 3 years ago • 5 comments

Hey,

I think it could really accelerate NN architecture research if we could create such a dataset. The biggest win is that a wide array of researchers would be able to start their research without having to understand the existing complex scripts and code.

I understand that a fixed dataset is not as good as the iterative 'XYzero' process, but isn't it likely that the best architecture, initialization, optimizer, ... would be mostly independent of whether the dataset is static or not?

Perhaps we could even set up a benchmark here: https://paperswithcode.com/datasets?

Oct 19 '22 04:10 lukaszlew

Right. The only problem I can see is that ImageNet is for classification, where we can have definitive labels, while a Go dataset is for regression, where we can't have definitive labels.
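To make the distinction concrete, here is a minimal sketch of how non-definitive labels can still be scored. It assumes the policy label is a soft distribution over moves (for example, search visit fractions) and the value label is a single number; the function names and the three-move toy position are hypothetical and are not taken from KataGo's code.

```python
import math

def soft_cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a soft (non one-hot) target and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def squared_error(target, predicted):
    """Regression loss for a scalar target such as a value estimate."""
    return (target - predicted) ** 2

# Hypothetical three-move position: the label is a distribution, not one class.
policy_target = [0.7, 0.2, 0.1]
good_prediction = [0.6, 0.3, 0.1]
bad_prediction = [0.1, 0.1, 0.8]

loss_good = soft_cross_entropy(policy_target, good_prediction)
loss_bad = soft_cross_entropy(policy_target, bad_prediction)
loss_self = soft_cross_entropy(policy_target, policy_target)
```

Under these assumptions the loss is lowest when the prediction matches the soft target, so a fixed set of such labels can still rank models even though no label is definitive.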

Oct 20 '22 10:10 mega-optimus

There are multiple heads with various targets for value function, policy function and various auxiliary heads (https://arxiv.org/pdf/1902.10565.pdf).
Why not save all (or some) of the labels of all (or some) of the heads used during the regular training?
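As a rough illustration of what saving the labels of several heads could look like, here is a minimal sketch of one record per position written as JSON lines. The schema, the field names, and the helper functions are all hypothetical; this is not KataGo's actual training data format.

```python
import io
import json
from dataclasses import dataclass, asdict
from typing import List

@dataclass
class TrainingRecord:
    """One position plus targets for several heads (hypothetical schema)."""
    board: List[int]            # flattened board: 0 empty, 1 black, 2 white
    policy_target: List[float]  # soft distribution over candidate moves
    value_target: float         # outcome from the player to move's perspective
    score_target: float         # auxiliary target: final score difference

def dump_records(records, fp):
    """Write one JSON object per line to an open text file object."""
    for record in records:
        fp.write(json.dumps(asdict(record)) + "\n")

def load_records(fp):
    """Read records back from an open text file object."""
    return [TrainingRecord(**json.loads(line)) for line in fp if line.strip()]

# Round trip through an in-memory buffer with a toy 2x2 "board".
example = TrainingRecord(
    board=[0, 1, 2, 0],
    policy_target=[0.5, 0.0, 0.0, 0.5],
    value_target=1.0,
    score_target=3.5,
)
buffer = io.StringIO()
dump_records([example], buffer)
buffer.seek(0)
restored = load_records(buffer)
```

Keeping every head's target in the same record means a researcher can train on any subset of heads without rerunning the pipeline that produced the labels.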

Oct 21 '22 21:10 lukaszlew

I think it would be fine to release even a small dataset, or a dataset made of old games.

Oct 23 '22 15:10 lukaszlew

If all you need is a dataset of older games, try here: https://katagoarchive.org/

These contain the full data for older runs, with both g104 and g170 runs reaching far into superhuman levels of strength.

Oct 25 '22 23:10 lightvector

Yes. It seems that this has all the data that is needed.

Oct 26 '22 18:10 lukaszlew
